Data quality monitoring for DevOps

Keep the data quality checks alongside the data pipeline code

How hard is it to migrate definitions between development, test and production environments?

Data quality monitoring can help. Monitor the data sources that feed your dashboards and get warned about potential issues before more dashboards are affected.

Source Data Quality Rules

Manage the data quality rules for all source data in one place. Detect issues and instability in data sources before they affect the entire Data Warehouse or Data Lake.

DQO stores the data quality definitions as simple YAML files. All data quality rules for a source table can be edited with code completion in all of the most popular text editors. Simply copy a data quality definition file and make minor changes to monitor the quality of another, similar table, as sketched below.

  • The data quality of source tables is easy to define
  • All data quality rules for all source tables may be defined in the same way
  • Adding new tables to be observed is as simple as copying a YAML file
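
As a sketch of what such a file might contain (the field names below are illustrative assumptions, not the exact DQO YAML schema; consult the DQO documentation for the real file format):

```yaml
# Illustrative per-table data quality definition.
# All field names are hypothetical, not the exact DQO schema.
target:
  connection: warehouse          # named connection to the data source
  schema: landing_zone
  table: customer_orders
checks:
  row_count:
    min_count: 1                 # alert if the table is empty
  null_percent:
    column: customer_id
    max_percent: 0.0             # the key column must never be null
  freshness:
    max_age_hours: 24            # data must arrive at least daily
```

Monitoring another, similar table then amounts to copying this file and changing the `table` (and, if needed, the column) values.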

Data Quality Testing

Define data quality tests in code. Develop data pipelines following a Test Driven Development approach: develop the pipeline, test it, refactor... and retest after changes.

Data quality checks are defined in text files, so developers can follow a test-driven workflow: run the data loading scripts, then run the data quality checks, as in the CI sketch below.

  • Data quality checks defined in the code
  • Data quality checks may be instantly executed
  • Enable Test Driven Development and Integration Testing for databases and data lakes
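
As an example, the load-then-test loop can be scripted as a CI job. The layout below is a hypothetical sketch in a generic CI syntax, and the `dqo check run` invocation is an assumed command, not the verbatim DQO CLI:

```yaml
# Hypothetical CI job: load the data, then run the quality checks as tests.
steps:
  - name: Load the staging tables
    run: python pipelines/load_customer_orders.py --env dev
  - name: Run the data quality checks
    # assumed CLI call; a non-zero exit code fails the build
    run: dqo check run --table landing_zone.customer_orders
```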

Tested DEV -> TEST -> PROD migration

Define data quality rules in the DEV, TEST, or UAT environments. Run the data quality checks after migration to the production environment to verify that the migration was successful.

Data quality rules that are defined in text files are easy to store in the code repository, and no deployment is required to update the data quality checks. Simply migrate your pipelines to the production environment, run the pipelines, and run the DQO data quality checks to confirm a successful migration, as in the pipeline sketch below.

  • Manage multiple environments
  • Instantly update the data quality rules after migrating your pipelines to the production environment
  • Define data quality tests to be executed after migration
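
A deployment pipeline might wire this together as sketched below; the stage names, scripts, and the `dqo check run` call are illustrative assumptions:

```yaml
# Hypothetical deployment pipeline: the same rule files from the
# repository verify the production environment after migration.
stages:
  - name: deploy-pipelines
    run: ./deploy_pipelines.sh prod
  - name: run-pipelines
    run: ./run_pipelines.sh prod
  - name: verify-migration
    # assumed CLI call pointing the checks at the PROD connection
    run: dqo check run --connection prod_warehouse
```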

Data Quality Test Versioning

Store the data quality rule definitions in the source repository. Track how your data quality expectations change over time.

DQO data quality rules are just YAML files. Store them in the repository like any other code, create pull requests, and compare changes to the rules with standard Git tools, as in the example diff below.

  • Data quality rules are easy to version
  • Data quality rules may be released after a peer review (a pull request)
  • Check who has changed the data quality rules or a data lineage dependency
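
In a pull request, a change to a rule then shows up as an ordinary diff. The file path, hunk position, and threshold values below are illustrative:

```diff
--- a/checks/landing_zone/customer_orders.yaml
+++ b/checks/landing_zone/customer_orders.yaml
@@ -8,3 +8,3 @@ checks:
   null_percent:
     column: customer_id
-    max_percent: 1.0
+    max_percent: 0.0   # tightened: the key column must never be null
```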

Work with local environments

Verify the quality of the data generated by your data preparation scripts in local environments (local databases) before merging changes into a shared environment.

DQO does not need a server to run data quality checks. The DQO command-line tool can connect to your local database and run the data quality checks, as in the connection sketch below.

  • Build data quality checks without affecting shared environments (such as a developer database shared by other developers)
  • Verify changes to the data quality checks locally
  • Design and test custom data quality checks in isolated environments
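
For instance, pointing the checks at a developer database might take nothing more than a local connection definition like the sketch below (the field names are illustrative assumptions, not the exact DQO connection schema):

```yaml
# Hypothetical local connection definition, kept out of version control.
# The same table rule files from the repository can be executed against
# this connection before any changes are merged.
connection:
  name: local_postgres
  provider: postgresql
  host: localhost
  port: 5432
  database: dev_sandbox
  user: developer
```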

REACH A 100% DATA QUALITY SCORE