Pipeline Helper Functions

expectmine.pipeline.utils.get_quickstart_config(output_path: Path | None = None) → tuple[InMemoryStoreAdapter, InMemoryStoreAdapter, CliLoggerAdapter, Path]
Parameters:

output_path (Optional[Path]) – Path where the pipeline should write its results.

Returns:

4-tuple containing the configuration needed to set up a pipeline: two in-memory store adapters, a CLI logger adapter, and the output path.

expectmine.pipeline.utils.validate_add_step(step: Type[BaseStep], io: BaseIo)

Validates the arguments passed when adding a step to the pipeline.

Parameters:
  • step (Type[BaseStep]) – Step that should be added to the pipeline

  • io (BaseIo) – Io object that will configure the step

Example:

>>> validate_add_step(Step(...), Io(...))
>>> validate_add_step()
TypeError("Step needs to be of type BaseStep")
Raises:

TypeError – If the arguments have the wrong type.

expectmine.pipeline.utils.validate_init(persistent_adapter: BaseStoreAdapter, volatile_adapter: BaseStoreAdapter, logger_adapter: BaseLoggerAdapter)

Validates the adapters passed when initializing the pipeline class.

Parameters:
  • persistent_adapter (BaseStoreAdapter) – Persistent storage adapter for the pipeline

  • volatile_adapter (BaseStoreAdapter) – Volatile storage adapter for the pipeline

  • logger_adapter (BaseLoggerAdapter) – Logging adapter for the pipeline

Example:

>>> validate_init(StorageAdapter(...), StorageAdapter(...), LoggingAdapter(...))
>>> validate_init()
TypeError("Persistent adapter needs to be an instance of BaseStorageAdapter.")
Raises:

TypeError – If the arguments have the wrong type.
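
The type checks described for validate_init can be sketched as follows. This is a minimal illustration, not expectmine's implementation: BaseStoreAdapter, BaseLoggerAdapter, and check_init are hypothetical stand-ins defined locally, and the error messages are only modelled on the example above.

```python
class BaseStoreAdapter:
    """Hypothetical stand-in for a store adapter base class."""


class BaseLoggerAdapter:
    """Hypothetical stand-in for a logger adapter base class."""


def check_init(persistent_adapter, volatile_adapter, logger_adapter):
    """Raise TypeError unless every adapter derives from its expected base."""
    expected = [
        ("Persistent adapter", persistent_adapter, BaseStoreAdapter),
        ("Volatile adapter", volatile_adapter, BaseStoreAdapter),
        ("Logger adapter", logger_adapter, BaseLoggerAdapter),
    ]
    for label, value, base in expected:
        if not isinstance(value, base):
            raise TypeError(
                f"{label} needs to be an instance of {base.__name__}."
            )
```

Checking all three arguments in one loop keeps the error messages uniform and stops at the first argument that has the wrong type.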

expectmine.pipeline.utils.validate_input_files(input_files: list[Path], current_input_filetypes: list[str] | None)

Validates the input files to the pipeline. The list may only contain Path objects, and each of them must point to a file.

Parameters:
  • input_files (list[Path]) – List of input files to the pipeline

  • current_input_filetypes (list[str] | None) – List of the current input filetypes, if any.

Example:

>>> validate_input_files([Path("text.txt"), Path("test.xml")])
>>> validate_input_files()
TypeError("input_files need to be of type list[Path].")
>>> validate_input_files([Path("/directory")])
ValueError("input_files need to exclusively contain files.")
Raises:
  • TypeError – If the arguments have the wrong type.

  • ValueError – If not all elements are files or if the input filetype gets changed while reassigning input_files.
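
The checks described for validate_input_files can be sketched as below. The function check_input_files is a hypothetical stand-in, and it assumes that a "filetype" is the file suffix (for example ".txt"); how expectmine actually derives and compares filetypes is not stated here.

```python
from pathlib import Path


def check_input_files(input_files, current_input_filetypes=None):
    """Hypothetical sketch of the type, file, and filetype checks."""
    if not isinstance(input_files, list) or not all(
        isinstance(path, Path) for path in input_files
    ):
        raise TypeError("input_files need to be of type list[Path].")
    if not all(path.is_file() for path in input_files):
        raise ValueError("input_files need to exclusively contain files.")
    # Assumption: the filetype of a file is its suffix.
    new_filetypes = sorted({path.suffix for path in input_files})
    if current_input_filetypes is not None and new_filetypes != sorted(
        set(current_input_filetypes)
    ):
        raise ValueError("Input filetypes must not change on reassignment.")
    return new_filetypes
```

Passing None as the second argument skips the comparison, which corresponds to a pipeline that has no input assigned yet.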

expectmine.pipeline.utils.validate_output_directory(output_directory: Path)

Validates that the given path is either an empty directory or not in use yet.

Parameters:

output_directory (Path) – Directory that the pipeline should write its output to

Example:

>>> validate_output_directory(Path("empty_dir"))
>>> validate_output_directory(Path("not_empty_dir"))
ValueError("Output directory is not empty.")
Raises:

ValueError – If the path points to a non-empty directory or to a file.
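
A minimal sketch of such a check using only pathlib is shown below. The function check_output_directory is a hypothetical stand-in, and the message for the file case is an assumption; only "Output directory is not empty." appears in the example above.

```python
from pathlib import Path


def check_output_directory(output_directory: Path) -> None:
    """Accept a path that does not exist yet or is an empty directory."""
    if not output_directory.exists():
        return  # Unused path: the directory can still be created later.
    if not output_directory.is_dir():
        # Assumed wording; the source only documents the non-empty case.
        raise ValueError("Output directory is a file.")
    if any(output_directory.iterdir()):
        raise ValueError("Output directory is not empty.")
```

Using any() on iterdir() stops at the first entry, so the check does not need to list a large directory in full.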

expectmine.pipeline.utils.validate_step_can_run(step: Type[BaseStep], input_filetypes: list[str] | None)

Validates that a step can run on a given input. Raises an error otherwise.

Parameters:
  • step (Type[BaseStep]) – Step that should be added to the pipeline

  • input_filetypes (list[str] | None) – Filetypes of the current input that the step needs to run on.

Example:

>>> validate_step_can_run(Step(...), [".txt"])
>>> validate_step_can_run(Step(...), None)
TypeError("Step needs to be of type BaseStep")
Raises:

ValueError – If the step cannot run on the previous output or if no input has been given to the pipeline yet.
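
The compatibility check can be illustrated with the sketch below. It assumes that a step class advertises the filetypes it accepts through a class attribute; the attribute accepted_filetypes, the stand-in classes BaseStep and TextStep, and the function check_step_can_run are all hypothetical and do not reflect how expectmine steps declare their inputs.

```python
class BaseStep:
    """Hypothetical stand-in; real steps may declare inputs differently."""

    accepted_filetypes = []


class TextStep(BaseStep):
    """Hypothetical step that only accepts .txt files."""

    accepted_filetypes = [".txt"]


def check_step_can_run(step, input_filetypes):
    """Raise ValueError if the step class cannot process the given input."""
    if input_filetypes is None:
        raise ValueError("No input has been given to the pipeline yet.")
    unsupported = [
        filetype
        for filetype in input_filetypes
        if filetype not in step.accepted_filetypes
    ]
    if unsupported:
        raise ValueError(f"{step.__name__} cannot run on {unsupported}.")
```

In this sketch, check_step_can_run(TextStep, [".txt"]) passes silently, while passing None or [".xml"] raises ValueError.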