Pipeline Helper Functions
- expectmine.pipeline.utils.get_quickstart_config(output_path: Path | None = None) -> tuple[InMemoryStoreAdapter, InMemoryStoreAdapter, CliLoggerAdapter, Path]
- Parameters:
output_path (Optional[Path]) – Path where the pipeline should write its results.
- Returns:
4-tuple containing the configuration needed to set up a pipeline: two in-memory store adapters, a CLI logger adapter, and the output path.
- expectmine.pipeline.utils.validate_add_step(step: Type[BaseStep], io: BaseIo)
Validates the arguments passed when adding a step to the pipeline.
- Parameters:
step (Type[BaseStep]) – Step that should be added to the pipeline
io (BaseIo) – Io instance passed along with the step
- Example:
>>> validate_add_step(Step(...), Io(...))
>>> validate_add_step()  # raises TypeError("Step needs to be of type BaseStep")
- Raises:
TypeError – If the arguments have the wrong type.
- expectmine.pipeline.utils.validate_init(persistent_adapter: BaseStoreAdapter, volatile_adapter: BaseStoreAdapter, logger_adapter: BaseLoggerAdapter)
Validates the adapters passed when initializing the pipeline class.
- Parameters:
persistent_adapter (BaseStoreAdapter) – Persistent storage adapter for the pipeline
volatile_adapter (BaseStoreAdapter) – Volatile storage adapter for the pipeline
logger_adapter (BaseLoggerAdapter) – Logging adapter for the pipeline
- Example:
>>> validate_init(StorageAdapter(...), StorageAdapter(...), LoggingAdapter(...))
>>> validate_init()  # raises TypeError("Persistent adapter needs to be an instance of BaseStorageAdapter.")
- Raises:
TypeError – If the arguments have the wrong type.
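The type checks described above can be sketched roughly as follows. This is only an illustration, not expectmine's implementation: the two base classes are local placeholders standing in for the library's `BaseStoreAdapter` and `BaseLoggerAdapter`, the function name `check_init` is hypothetical, and the error messages for the volatile and logger adapters are made up.

```python
class BaseStoreAdapter:
    """Local placeholder for the library's store adapter base class."""


class BaseLoggerAdapter:
    """Local placeholder for the library's logger adapter base class."""


def check_init(persistent_adapter, volatile_adapter, logger_adapter):
    """Raise TypeError unless every adapter derives from the expected base class."""
    if not isinstance(persistent_adapter, BaseStoreAdapter):
        raise TypeError("Persistent adapter has the wrong type.")
    if not isinstance(volatile_adapter, BaseStoreAdapter):
        raise TypeError("Volatile adapter has the wrong type.")
    if not isinstance(logger_adapter, BaseLoggerAdapter):
        raise TypeError("Logger adapter has the wrong type.")
```

Checking each argument separately lets the error message name the offending adapter, which is what the documented example message suggests.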
- expectmine.pipeline.utils.validate_input_files(input_files: list[Path], current_input_filetypes: list[str] | None)
Validates the input files to the pipeline. The list can only contain Path objects and those must point to files.
- Parameters:
input_files (list[Path]) – List of input files to the pipeline
current_input_filetypes (list[str] | None) – List of the current input filetypes.
- Example:
>>> validate_input_files([Path("text.txt"), Path("test.xml")])
>>> validate_input_files()  # raises TypeError("input_files need to be of type list[Path].")
>>> validate_input_files([Path("/directory")])  # raises ValueError("input_files need to exclusively contain files.")
- Raises:
TypeError – If the arguments have the wrong type.
ValueError – If not all elements are files or if the input filetype gets changed while reassigning input_files.
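A minimal sketch of the checks described above, using only `pathlib`. It is not the library's code: the name `check_input_files` is hypothetical, and two points are assumptions made for illustration, namely that a "filetype" means the file suffix (such as `.txt`) and that a "changed" filetype means the set of suffixes differs from the current one.

```python
from pathlib import Path


def check_input_files(input_files, current_input_filetypes=None):
    """Raise TypeError or ValueError if the input files are not acceptable."""
    is_path_list = isinstance(input_files, list) and all(
        isinstance(path, Path) for path in input_files
    )
    if not is_path_list:
        raise TypeError("input_files need to be of type list[Path].")

    # Every path must point to an existing regular file, not a directory.
    if not all(path.is_file() for path in input_files):
        raise ValueError("input_files need to exclusively contain files.")

    # Assumption: filetypes are compared as sets of file suffixes.
    new_filetypes = {path.suffix for path in input_files}
    if (
        current_input_filetypes is not None
        and new_filetypes != set(current_input_filetypes)
    ):
        raise ValueError("Input filetypes changed while reassigning input_files.")
```

Passing `None` as the second argument skips the filetype comparison, which corresponds to a pipeline that has not been given any input yet.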
- expectmine.pipeline.utils.validate_output_directory(output_directory: Path)
Validates that the given path is either an empty directory or does not exist yet.
- Parameters:
output_directory (Path) – Directory the pipeline should write its output to
- Example:
>>> validate_output_directory(Path("empty_dir"))
>>> validate_output_directory(Path("not_empty_dir"))  # raises ValueError("Output directory is not empty.")
- Raises:
ValueError – If the path points to a non-empty directory or to a file.
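The rule above (accept an empty or not yet existing directory, reject files and non-empty directories) could be written along these lines. This is a sketch under that reading of the description; `check_output_directory` is a hypothetical name and the message for the file case is invented.

```python
from pathlib import Path


def check_output_directory(output_directory: Path) -> None:
    """Raise ValueError if the path is a file or a non-empty directory."""
    if output_directory.is_file():
        raise ValueError("Output directory is a file.")
    # any() stops at the first entry, so large directories are not fully listed.
    if output_directory.is_dir() and any(output_directory.iterdir()):
        raise ValueError("Output directory is not empty.")
```

A path that does not exist passes both checks, so the caller remains free to create the directory later.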
- expectmine.pipeline.utils.validate_step_can_run(step: Type[BaseStep], input_filetypes: list[str] | None)
Validates that a step can run on the given input filetypes. Raises an error otherwise.
- Parameters:
step (Type[BaseStep]) – Step that should be added to the pipeline
input_filetypes (list[str] | None) – Filetypes of the current input that the step needs to run on.
- Example:
>>> validate_step_can_run(Step(...), [".txt"])
>>> validate_step_can_run(Step(...), None)  # raises ValueError, since no input has been given to the pipeline yet
- Raises:
ValueError – If the step cannot run on the previous output or if no input has been given to the pipeline yet.
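The two failure cases listed above can be illustrated with a small stand-alone sketch. How a step declares which inputs it accepts is not stated here, so the `can_run` class method, the `TxtOnlyStep` class, and the function name `check_step_can_run` are all hypothetical, as are the error messages.

```python
class TxtOnlyStep:
    """Hypothetical step that only accepts ".txt" input files."""

    @classmethod
    def can_run(cls, input_filetypes):
        # Assumption: a step reports whether it can handle the given filetypes.
        return all(filetype == ".txt" for filetype in input_filetypes)


def check_step_can_run(step, input_filetypes):
    """Raise ValueError if there is no input yet or the step rejects it."""
    if input_filetypes is None:
        raise ValueError("No input has been given to the pipeline yet.")
    if not step.can_run(input_filetypes):
        raise ValueError("Step cannot run on the given input filetypes.")
```

For instance, `check_step_can_run(TxtOnlyStep, [".txt"])` returns without error, while passing `None` or `[".xml"]` raises `ValueError`.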