Are your tables healthy and of good quality today? As a Data Steward, can you answer this question instantly?
DQOps data monitoring lets you continuously monitor the data quality of all the tables you are responsible for. The quality can be checked every day or after each data refresh, and data quality KPIs answer questions about the current state of your data.
One place for Data Quality rules
Define all data quality checks in one place and apply similar rules to similar columns simply by copy-pasting their definitions.
Data quality checks are defined in YAML files, one file per table. The rules are easy to edit in any popular text editor, and editors that understand the schema can auto-suggest the available quality checks. A sketch of such a file follows the list below.
- Define your data quality requirements as code
- Profile your data sources and detect data quality issues
- Detect data integrity issues such as missing dimension rows
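For illustration, here is a minimal sketch of such a per-table file. The apiVersion/kind envelope follows the DQOps YAML format, but treat the specific category, check, and rule names as assumptions that may differ between DQOps versions.

```yaml
# Hedged sketch of a per-table check definition (*.dqotable.yaml).
# The exact category, check, and rule names are assumptions -- verify
# against your DQOps version before copying.
apiVersion: dqo/v1
kind: table
spec:
  columns:
    customer_id:
      monitoring_checks:
        daily:
          nulls:
            daily_nulls_count:                 # completeness: no missing keys
              error:
                max_count: 0
    country_code:
      monitoring_checks:
        daily:
          accepted_values:
            daily_text_found_in_set_percent:   # validity: only known codes
              parameters:
                expected_values: ["US", "CA", "MX"]
              warning:
                min_percent: 99.0
```

Because similar columns get nearly identical blocks, applying the same rule to another column really is a copy-paste of a few YAML lines.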
Data Quality under control
Monitor and detect data quality issues before your users ask what happened to your tables.
DQOps data observability runs the defined data quality checks on your tables every day or on a custom schedule. Issues with data availability or validity are detected before they affect downstream data consumers.
- Data quality checks are executed frequently
- All data quality rules promised to data customers are continuously verified
- Stale (not refreshed) tables are detected by timeliness (latency) checks, as sketched after this list
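As a sketch of how staleness detection could be expressed, the fragment below pairs a table-level freshness check with the column that serves as the event timestamp. The timestamp_columns section follows the DQOps YAML format; the daily_data_freshness check and max_days rule names are assumptions.

```yaml
# Hedged sketch: detect stale tables with a timeliness (freshness) check.
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    event_timestamp_column: created_at   # column used to measure freshness
  monitoring_checks:
    daily:
      timeliness:
        daily_data_freshness:
          warning:
            max_days: 1.0    # warn after one day without new data
          fatal:
            max_days: 3.0    # escalate when the table is three days stale
```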
Track multiple Data Quality dimensions
Analyze and measure data quality across multiple dimensions to detect different types of issues.
All built-in data quality checks are grouped into the dimensions commonly used in the data quality field. Monitor the validity, consistency, completeness, timeliness, reasonableness, availability, accuracy, reliability, accessibility, and integrity of your data separately, as in the sketch after the list below.
- Multiple dimensions of data quality are clearly separated
- Data quality is monitored from many angles
- Define custom data quality rules to monitor business-relevant metrics
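To make the separation concrete, the fragment below monitors a single column along three dimensions at once: completeness, uniqueness, and validity. It is a sketch of a table YAML fragment; the check names are illustrative assumptions in the DQOps naming style.

```yaml
# Hedged sketch: one column, three data quality dimensions.
spec:
  columns:
    email:
      monitoring_checks:
        daily:
          nulls:                                 # completeness
            daily_nulls_percent:
              warning:
                max_percent: 1.0
          uniqueness:                            # uniqueness
            daily_duplicate_count:
              error:
                max_count: 0
          patterns:                              # validity
            daily_invalid_email_format_found:
              error:
                max_count: 0
```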
Data source documentation
Treat your data quality check definitions as documentation of the quality guarantees that your tables uphold.
The data quality definition files are simple and self-descriptive: checks and alert thresholds carry easy-to-understand names and values. You can share the data quality definition file (YAML) without revealing any database credentials, as sketched below the list.
- Data Quality rules become your documentation
- This documentation is always valid because the Data Quality checks are verified daily
- Your data customers, such as data engineers, BI developers, or data scientists, can see which columns are not null, which are unique, and what the valid data ranges and formats are
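One way to picture why sharing is safe: in a file-per-table model, credentials can live in a separate connection file, so the table definition contains nothing but check metadata. The layout below is an illustrative assumption, not a documented directory structure.

```yaml
# Hedged sketch of an on-disk layout (all names are assumptions):
#
#   sources/
#     sales_dwh/
#       connection.dqoconnection.yaml   # host, user, password -- keep private
#       public.orders.dqotable.yaml     # checks only -- safe to share
```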
Data Quality issue notifications
Get notified when your tables fail to meet their quality requirements. Also get notified when upstream sources behave inconsistently in ways that will affect your tables.
DQOps data quality checks can be executed anywhere. Simply define how you want to be notified when fatal alerts are raised.
- Build the data quality check step into your data processing pipeline. DQOps can be called from any pipeline as a command line tool
- Define alerts at multiple severity levels: warning, error, and fatal (see the sketch after this list)
- Define dependencies on upstream tables in the data pipeline that can affect the quality of your tables
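As a sketch of the three severity levels, a single check can carry all three thresholds; the daily_nulls_percent check and max_percent rule names are assumptions in the DQOps naming style.

```yaml
# Hedged sketch: one check, three severity thresholds.
spec:
  columns:
    order_total:
      monitoring_checks:
        daily:
          nulls:
            daily_nulls_percent:
              warning:
                max_percent: 1.0    # notify, but the data is still usable
              error:
                max_percent: 5.0    # the promised quality contract is broken
              fatal:
                max_percent: 20.0   # stop the pipeline, page the on-call
```

In a pipeline, the same checks can be triggered from the DQOps command-line shell (a check run command scoped to a connection or table; consult the CLI reference for the exact flags), and the raised alerts can gate the next pipeline step.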
Ground Truth Checks
Compare the data in your database with other trusted data sources to detect deviations from the real world.
DQOps can compare aggregated business metrics across data sources. Select an aggregation function (count, sum, average) and a list of grouping dimensions for two related tables, then run the comparison as a data quality check, sketched after the list below. You can also upload reference data into the DQOps data quality database and compare it with your tables.
- Compare data with the real world, grouped by business dimensions (like a country or state code); perhaps you are missing data from one state
- Compare data between related databases
- Set up a custom Python script to pull reference data from external data sources for comparison
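A hedged sketch of such a comparison, grouped by a business dimension: table comparisons are a DQOps concept, but the field, check, and rule names below are assumptions to verify against the documentation.

```yaml
# Hedged sketch: compare this table with a trusted reference, per country.
apiVersion: dqo/v1
kind: table
spec:
  table_comparisons:
    orders_vs_crm:
      reference_table_connection_name: crm_postgres   # the trusted source
      reference_table_schema_name: public
      reference_table_name: orders
      grouping_columns:
        - compared_table_column_name: country_code
          reference_table_column_name: country
  monitoring_checks:
    daily:
      comparisons:
        orders_vs_crm:
          daily_row_count_match:   # row counts must agree per country group
            error:
              max_diff_percent: 0.0
```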