Validator¶
Validation primitives for ETL data quality checks.
Provides a small framework for validating extracted data prior to loading. It
includes a Validator base class, several concrete validators, a
ValidationSequence to compose multiple validators, and the structured
ValidationResult object returned from a validation run.
- class algomancy_data.validator.ValidationSeverity(*values)[source]¶
Bases: StrEnum
Severity levels used in validation messages.
- INFO = 'INFO'¶
- WARNING = 'WARNING'¶
- ERROR = 'ERROR'¶
- CRITICAL = 'CRITICAL'¶
- exception algomancy_data.validator.ValidationError(message='Validation failed.', context=None)[source]¶
Bases: Exception
Exception raised for validation errors in the data pipeline.
Retained for backwards compatibility. The modern flow (ETLPipeline.run returning ETLResult) no longer raises this exception for data-quality failures; callers should inspect ETLResult.validation_result instead.
- message¶
Explanation of the error.
- context¶
Optional dictionary or object with additional context.
- class algomancy_data.validator.ValidationMessage(severity, message, table=None, column=None, row=None, code=None)[source]¶
Bases: object
Container for a validation outcome with optional structured location.
- severity¶
- message¶
- table¶
- column¶
- row¶
- code¶
- class algomancy_data.validator.ValidationResult(is_valid, messages=<factory>, halt_on=ValidationSeverity.CRITICAL, counts_by_severity=<factory>)[source]¶
Bases: object
Structured outcome of a ValidationSequence run.
- is_valid¶
True if no message met or exceeded the halt threshold.
- Type:
bool
- messages¶
All messages collected during the run.
- Type:
List[ValidationMessage]
- halt_on¶
Severity threshold that determined is_valid.
- counts_by_severity¶
Count of messages per severity level.
- Type:
Dict[str, int]
- is_valid: bool¶
- messages: List[ValidationMessage]¶
- halt_on: ValidationSeverity = 'CRITICAL'¶
- counts_by_severity: Dict[str, int]¶
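The relationship between messages, halt_on, is_valid, and counts_by_severity can be illustrated with a small stand-alone sketch. It does not import algomancy_data: the Severity enum, the _RANK ordering (INFO < WARNING < ERROR < CRITICAL), and the summarize helper are assumptions made for illustration, not the library's implementation.

```python
from collections import Counter
from enum import Enum


class Severity(str, Enum):
    """Hypothetical stand-in for ValidationSeverity."""
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"
    CRITICAL = "CRITICAL"


# Assumed ordering: members defined later are more severe.
_RANK = {sev: i for i, sev in enumerate(Severity)}


def summarize(severities, halt_on=Severity.CRITICAL):
    """Return (is_valid, counts_by_severity) for a list of severities."""
    counts = Counter(sev.value for sev in severities)
    # Valid only if every message is strictly below the halt threshold.
    is_valid = all(_RANK[sev] < _RANK[halt_on] for sev in severities)
    return is_valid, dict(counts)


observed = [Severity.INFO, Severity.ERROR, Severity.ERROR]
valid_default, counts = summarize(observed)                    # threshold CRITICAL
valid_strict, _ = summarize(observed, halt_on=Severity.ERROR)  # threshold ERROR
```

With the default threshold the two ERROR messages do not invalidate the run; lowering halt_on to ERROR does.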
- class algomancy_data.validator.Validator[source]¶
Bases: ABC
Abstract validator that appends messages during validate.
- property messages: List[ValidationMessage]¶
- class algomancy_data.validator.DefaultValidator[source]¶
Bases: Validator
No-op validator that always returns a single success INFO message.
- class algomancy_data.validator.ExtractionSuccessVerification[source]¶
Bases: Validator
Validator that ensures extracted DataFrames are not empty.
- class algomancy_data.validator.SchemaValidator(schemas=None, severity=ValidationSeverity.ERROR)[source]¶
Bases: Validator
Validate DataFrames against a list of Schema declarations.
Checks each known table for unexpected columns and dtype mismatches.
- _schemas¶
Mapping of file name → Schema (or subschema).
- _severity¶
Severity used for column/schema mismatches.
- class algomancy_data.validator.RequiredColumnsValidator(schemas, severity=ValidationSeverity.ERROR)[source]¶
Bases: Validator
Fail when a schema’s required columns are missing from the extracted data.
Emits one structured message per missing column with table and column populated.
- _schemas¶
Schemas to enforce.
- _severity¶
Severity used for missing-column reports.
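A rough sketch of the "one message per missing column" behaviour described above, using plain dictionaries instead of DataFrames and Schema classes. The function name, the message dictionaries, and the choice to report every required column when a table is absent altogether are assumptions for illustration only.

```python
def missing_column_messages(required, extracted, severity="ERROR"):
    """Build one message dict per required column that was not extracted.

    required:  {table name: [required column names]}
    extracted: {table name: [column names actually present]}
    """
    messages = []
    for table, columns in required.items():
        # Assumption: a table that is absent counts as having no columns.
        present = set(extracted.get(table, []))
        for column in columns:
            if column not in present:
                messages.append({
                    "severity": severity,
                    "table": table,
                    "column": column,
                    "message": f"Required column '{column}' missing from '{table}'.",
                })
    return messages


required = {"orders": ["id", "customer_id", "total"]}
extracted = {"orders": ["id", "total", "note"]}
missing = missing_column_messages(required, extracted)
```

Here missing holds a single entry whose table is "orders" and whose column is "customer_id"; the extra "note" column is not reported by this check.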
- class algomancy_data.validator.PrimaryKeyValidator(schemas, severity=ValidationSeverity.ERROR)[source]¶
Bases: Validator
Enforce uniqueness and non-null over each schema’s primary key.
Supports joint primary keys. Skips schemas with no declared primary key.
- _schemas¶
Schemas to enforce primary-key constraints for.
- _severity¶
Severity used when violations are detected.
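To make the joint-key semantics concrete, the following stand-alone sketch checks a two-column key over a list of row dictionaries. It is not the library's code: the helper name, the row representation, and the decision to treat a key with any null part as a null violation are assumptions.

```python
from collections import Counter


def primary_key_violations(rows, key_columns):
    """Return (indexes of rows with a null key part, duplicated key tuples)."""
    null_rows = []
    seen = Counter()
    for index, row in enumerate(rows):
        key = tuple(row.get(col) for col in key_columns)
        if any(part is None for part in key):
            null_rows.append(index)   # non-null constraint violated
        else:
            seen[key] += 1            # count complete keys only
    duplicates = sorted(key for key, n in seen.items() if n > 1)
    return null_rows, duplicates


rows = [
    {"order_id": 1, "line": 1},
    {"order_id": 1, "line": 2},
    {"order_id": 1, "line": 2},     # duplicate of the joint key (1, 2)
    {"order_id": 2, "line": None},  # null in a key column
]
null_rows, duplicates = primary_key_violations(rows, ["order_id", "line"])
```

Note that order_id repeats three times without being a violation on its own; only the combination (1, 2) is duplicated.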
- class algomancy_data.validator.UniqueValueValidator(table, columns, severity=ValidationSeverity.ERROR)[source]¶
Bases: Validator
Flag duplicate values within one or more columns of a single table.
Each column is checked independently (not as a composite key).
- table¶
Table name to inspect.
- columns¶
Column names whose values must be unique.
- severity¶
Severity used for violations.
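The difference between checking columns independently and checking them as a composite key is easy to miss, so here is a small illustration on plain row dictionaries. The duplicate_values helper and the sample data are hypothetical and only mimic the documented behaviour.

```python
from collections import Counter


def duplicate_values(rows, columns):
    """Map each column to the sorted values that occur more than once in it."""
    result = {}
    for column in columns:
        counts = Counter(row[column] for row in rows)
        dupes = sorted(value for value, n in counts.items() if n > 1)
        if dupes:
            result[column] = dupes
    return result


rows = [
    {"email": "a@example.com", "badge": 10},
    {"email": "b@example.com", "badge": 10},
    {"email": "a@example.com", "badge": 11},
]
per_column = duplicate_values(rows, ["email", "badge"])

# For contrast: as a composite key, every (email, badge) pair is distinct.
composite_pairs = {(row["email"], row["badge"]) for row in rows}
```

Both columns are flagged in per_column, even though no (email, badge) pair repeats.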
- class algomancy_data.validator.MissingValueValidator(table, columns, severity=ValidationSeverity.ERROR)[source]¶
Bases: Validator
Flag null cells in columns that are declared non-nullable.
- table¶
Table name to inspect.
- columns¶
Column names that must not be null.
- severity¶
Severity used for violations.
- class algomancy_data.validator.ForeignKeyValidator(left_table, left_col, right_table, right_col, severity=ValidationSeverity.ERROR)[source]¶
Bases: Validator
Cross-table integrity check.
Verifies that every (non-null) value of left_table[left_col] exists in right_table[right_col]. Supports composite keys when left_col and right_col are lists of equal length.
- left_table¶
Table that holds the foreign key values.
- left_col¶
Column name (or list of names) on the left side.
- right_table¶
Table that holds the referenced values.
- right_col¶
Column name (or list of names) on the right side.
- severity¶
Severity used when a value is not found.
- left_col: List[str]¶
Column name (or list of names) on the left side.
- right_col: List[str]¶
Column name (or list of names) on the right side.
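The integrity check performed by ForeignKeyValidator can be pictured with the sketch below, which works on lists of row dictionaries rather than DataFrames. The function names and the rule that a composite key with any null part is skipped are assumptions; the source only states that non-null values are checked.

```python
def _as_list(col):
    """Normalise a single column name or a list of names to a list."""
    return [col] if isinstance(col, str) else list(col)


def orphaned_keys(left_rows, left_col, right_rows, right_col):
    """Return left-side key tuples that have no match on the right side."""
    left_cols, right_cols = _as_list(left_col), _as_list(right_col)
    if len(left_cols) != len(right_cols):
        raise ValueError("left_col and right_col must have equal length")
    referenced = {tuple(row[c] for c in right_cols) for row in right_rows}
    orphans = []
    for row in left_rows:
        key = tuple(row[c] for c in left_cols)
        if any(part is None for part in key):
            continue  # assumption: keys containing a null are not checked
        if key not in referenced:
            orphans.append(key)
    return orphans


customers = [{"id": 1}, {"id": 2}]
orders = [{"customer_id": 1}, {"customer_id": 3}, {"customer_id": None}]
orphans = orphaned_keys(orders, "customer_id", customers, "id")
```

Only the order pointing at customer 3 is reported; the null reference is ignored.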
- classmethod from_schemas(schemas, severity=ValidationSeverity.ERROR)[source]¶
Build a list of validators from Column.foreign_key declarations.
Walks each schema’s columns; for every column with a non-null foreign_key declaration, returns a ForeignKeyValidator instance covering that relation. Columns sharing the same parent table on the same schema are collapsed into a single composite-key validator.
- Parameters:
schemas (Iterable[type]) – Iterable of Schema subclasses.
severity (ValidationSeverity) – Severity for emitted FK-violation messages.
- Returns:
List of ForeignKeyValidator instances, one per derived relation. The list is empty if no schema declares a FK.
- Return type:
List[ForeignKeyValidator]
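The collapsing rule for columns that share a parent table can be sketched as a simple grouping step. The tuple layout (child table, child column, parent table, parent column) and the collapse_relations helper are hypothetical; how Column.foreign_key actually encodes a relation is not shown in this reference.

```python
def collapse_relations(declarations):
    """Group per-column FK declarations into one relation per table pair.

    declarations: iterable of (child_table, child_col, parent_table, parent_col)
    Returns a list of (child_table, [child_cols], parent_table, [parent_cols]).
    """
    grouped = {}
    for child_table, child_col, parent_table, parent_col in declarations:
        left_cols, right_cols = grouped.setdefault(
            (child_table, parent_table), ([], [])
        )
        left_cols.append(child_col)
        right_cols.append(parent_col)
    return [
        (child, left, parent, right)
        for (child, parent), (left, right) in grouped.items()
    ]


declarations = [
    ("order_lines", "order_id", "orders", "id"),
    ("order_lines", "order_region", "orders", "region"),
    ("order_lines", "product_id", "products", "id"),
]
relations = collapse_relations(declarations)
```

Three column-level declarations yield two relations: a composite one against orders and a single-column one against products.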
- class algomancy_data.validator.ValidationSequence(validators=None, logger=None, halt_on=ValidationSeverity.CRITICAL)[source]¶
Bases: object
A sequence of validators executed in order with message aggregation.
- halt_on¶
Severity at or above which the run is considered invalid. Defaults to ValidationSeverity.CRITICAL.
- property is_valid: bool¶
Return True when the run has completed and no message met the halt_on threshold.
- property messages: List[ValidationMessage]¶
- property completed: bool¶
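How completed, messages, and is_valid interact can be shown with a minimal stand-in class. Everything here is an assumption for illustration: the SequenceSketch name, the run method, validators modelled as plain callables returning (severity, text) pairs, and the severity ordering. The real ValidationSequence works with Validator objects and ValidationMessage instances.

```python
SEVERITY_ORDER = ["INFO", "WARNING", "ERROR", "CRITICAL"]  # assumed ordering


class SequenceSketch:
    """Hypothetical stand-in: runs callables in order and aggregates messages."""

    def __init__(self, validators, halt_on="CRITICAL"):
        self._validators = list(validators)
        self.halt_on = halt_on
        self.messages = []
        self.completed = False

    def run(self, data):
        self.messages, self.completed = [], False
        for validator in self._validators:
            self.messages.extend(validator(data))  # keep execution order
        self.completed = True
        return self

    @property
    def is_valid(self):
        threshold = SEVERITY_ORDER.index(self.halt_on)
        return self.completed and all(
            SEVERITY_ORDER.index(sev) < threshold for sev, _ in self.messages
        )


def not_empty(data):
    return [("INFO", "rows present")] if data else [("CRITICAL", "no rows")]


def has_ids(data):
    return [("ERROR", f"row {i} has no id")
            for i, row in enumerate(data) if row.get("id") is None]


sequence = SequenceSketch([not_empty, has_ids])
before_run = sequence.is_valid  # False: nothing has completed yet
sequence.run([{"id": 1}, {"id": None}])
```

After the run, the sequence holds one INFO and one ERROR message and is still valid under the default CRITICAL threshold; constructing it with halt_on="ERROR" would make the same data invalid.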