Store Module#
Functionality#
The store serves as a temporary or persistent storage interface for each step. This abstracts away the runtime environment and the mode in which the pipeline is executed. To achieve this, each store module is built like a key-value database.
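To make the key-value idea concrete, here is a minimal sketch of what such an interface could look like. The names `KeyValueStore`, `DictStore`, `put`, `get`, `exists` and `run_step` are hypothetical and chosen for illustration only; they are not taken from expectmine's actual API.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class KeyValueStore(ABC):
    """Hypothetical minimal store interface (not expectmine's real API)."""

    @abstractmethod
    def put(self, key: str, value: Any) -> None:
        """Save a value under the key, overwriting any previous value."""

    @abstractmethod
    def get(self, key: str, default: Any = None) -> Any:
        """Return the stored value, or the default if the key is unknown."""

    @abstractmethod
    def exists(self, key: str) -> bool:
        """Tell whether a value is stored under the key."""


class DictStore(KeyValueStore):
    """Keeps everything in a plain dict, so data is lost when the process ends."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        return self._data.get(key, default)

    def exists(self, key: str) -> bool:
        return key in self._data


def run_step(store: KeyValueStore) -> int:
    """A step only talks to the interface, never to a concrete backend."""
    runs = store.get("runs", 0) + 1
    store.put("runs", runs)
    return runs
```

Because `run_step` depends only on the abstract interface, the same step code would work whether the backing store is volatile or persistent.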
Base types#
The currently supported datatypes are:
Boolean
String
Integer
Float
File (Stored as Blob)
Object (Pickled and also stored as a blob)
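One way a store could map these types onto storage is to tag each value with its type and fall back to pickle for arbitrary objects. The sketch below is an assumption about how this might work, not a description of expectmine's implementation; `encode`, `decode` and the tag names are hypothetical.

```python
import pickle
from pathlib import Path
from typing import Any, Tuple


def encode(value: Any) -> Tuple[str, bytes]:
    """Turn a supported value into a (type tag, blob) pair."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool", (b"1" if value else b"0")
    if isinstance(value, str):
        return "str", value.encode("utf-8")
    if isinstance(value, int):
        return "int", str(value).encode("ascii")
    if isinstance(value, float):
        # repr() of a float round-trips exactly through float().
        return "float", repr(value).encode("ascii")
    if isinstance(value, Path):
        # Files are stored by content, i.e. as a blob.
        return "file", value.read_bytes()
    # Everything else is pickled and stored as a blob as well.
    return "object", pickle.dumps(value)


def decode(tag: str, blob: bytes) -> Any:
    """Reverse encode(); file blobs are returned as raw bytes."""
    if tag == "bool":
        return blob == b"1"
    if tag == "str":
        return blob.decode("utf-8")
    if tag == "int":
        return int(blob.decode("ascii"))
    if tag == "float":
        return float(blob.decode("ascii"))
    if tag == "file":
        return blob
    if tag == "object":
        return pickle.loads(blob)
    raise ValueError(f"unknown type tag: {tag}")
```

Note that unpickling executes code embedded in the data, so pickled objects should only be loaded from a store you trust.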
Current Adapters#
Currently, there are two adapters: one persists data, the other only stores data temporarily.
| Adapter | Functionality | When to use |
|---|---|---|
| InMemoryStoreAdapter | Stores files temporarily in local memory. | Good for volatile storage. Can also be used as the persistent store if you don't care about persisting data (e.g. when running Python code directly instead of the CLI). |
| Sqlite3StoreAdapter | Uses a local sqlite3 database to store the values. | Good for persisting data. Should be used as the persistent store in the CLI. |
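The difference between a volatile and a persistent adapter can be illustrated with the standard library alone. The following sketch is not expectmine's adapter code; `SqliteKV`, its table layout and `demo` are hypothetical. It shows why a value written to a SQLite file is still there after the store object is closed and reopened, which an in-memory dict cannot offer.

```python
import sqlite3
import tempfile
from pathlib import Path
from typing import Optional


class SqliteKV:
    """Hypothetical persistent key-value store on top of sqlite3."""

    def __init__(self, db_path: Path) -> None:
        self._conn = sqlite3.connect(str(db_path))
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value BLOB)"
        )
        self._conn.commit()

    def put(self, key: str, value: bytes) -> None:
        # INSERT OR REPLACE overwrites an existing row with the same key.
        self._conn.execute(
            "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
        )
        self._conn.commit()

    def get(self, key: str) -> Optional[bytes]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else row[0]

    def close(self) -> None:
        self._conn.close()


def demo() -> Optional[bytes]:
    """Write a value, simulate a restart by reopening the file, read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "store.db"
        first = SqliteKV(db_path)
        first.put("greeting", b"hello")
        first.close()
        # A new object on the same file stands in for a new process.
        second = SqliteKV(db_path)
        value = second.get("greeting")
        second.close()
        return value
```

Calling `demo()` returns the bytes written before the simulated restart. A dict-based store would start empty each time.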
Example#
Note
The get_quickstart_config already configures the storage for a pipeline.
If you want to keep it simple, stick to the config. The following example is
for a more fine-grained setup.
Let’s assume we want to use Sqlite3 as the persistent store of the pipeline instead of the default config. This requires us to initialize the pipeline manually, which means we also need to choose an adapter for logging.
# Default setup: the quickstart config already configures the stores.
from pathlib import Path

from expectmine.pipeline.pipeline import Pipeline
from expectmine.pipeline.utils import get_quickstart_config

pipeline = Pipeline(*get_quickstart_config(output_path=Path("output/")))

# Manual setup: choose the store and logger adapters explicitly.
from pathlib import Path

from expectmine.pipeline.pipeline import Pipeline
from expectmine.storage.adapters.in_memory_adapter import InMemoryStoreAdapter
from expectmine.storage.adapters.sqlite3_adapter import Sqlite3StoreAdapter
from expectmine.logger.adapters.cli_logger_adapter import CliLoggerAdapter
from expectmine.logger.base_logger import LogLevel

output_directory = Path("output")
temp_directory = Path("output/temp")
db_directory = Path()  # Path() with no arguments is the current working directory

pipeline = Pipeline(
    persistent_adapter=Sqlite3StoreAdapter(db_directory, temp_directory),
    volatile_adapter=InMemoryStoreAdapter(output_directory, temp_directory),
    logger_adapter=CliLoggerAdapter(LogLevel.ALL, True),
    output_directory=output_directory,
)