
Last updated: July 22, 2025

DQOps YAML file definitions

The definition of YAML files used by DQOps to configure the data sources, monitored tables, and the configuration of activated data quality checks.

TableProfilingCheckCategoriesSpec

Container of table-level profiling checks that are activated on a table.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| result_truncation | Defines how many profiling check results are stored for the table monthly. By default, DQOps will use the 'one_per_month' configuration and store only the most recent profiling check result executed during the month. By changing this value, it is possible to store one value per day or even store all profiling check results. | enum | store_the_most_recent_result_per_month, store_the_most_recent_result_per_week, store_the_most_recent_result_per_day, store_the_most_recent_result_per_hour, store_all_results_without_date_truncation | | |
| volume | Configuration of volume data quality checks on a table level. | TableVolumeProfilingChecksSpec | | | |
| timeliness | Configuration of timeliness checks on a table level. Timeliness checks detect issues such as stale data or delayed data ingestion. | TableTimelinessProfilingChecksSpec | | | |
| accuracy | Configuration of accuracy checks on a table level. Accuracy checks compare the tested table with another reference table. | TableAccuracyProfilingChecksSpec | | | |
| custom_sql | Configuration of data quality checks that evaluate custom SQL conditions and aggregated expressions. | TableCustomSqlProfilingChecksSpec | | | |
| availability | Configuration of the table availability data quality checks on a table level. | TableAvailabilityProfilingChecksSpec | | | |
| schema | Configuration of schema (column count and schema) data quality checks on a table level. | TableSchemaProfilingChecksSpec | | | |
| uniqueness | Configuration of uniqueness checks on a table level. | TableUniquenessProfilingChecksSpec | | | |
| comparisons | Dictionary of configuration of checks for table comparisons. The key that identifies each comparison must match the name of a data comparison that is configured on the parent table. | TableComparisonProfilingChecksSpecMap | | | |
| custom | Dictionary of custom checks. The keys are check names within this category. | CustomCheckSpecMap | | | |
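
The effect of the `result_truncation` options can be pictured with a small Python model. This is only a sketch of the idea described above (group results by a truncated timestamp and keep the most recent one per period), not the DQOps implementation. The function names are hypothetical, and the assumption that a week starts on Monday is ours.

```python
from datetime import datetime, timedelta

def truncate(ts: datetime, mode: str) -> datetime:
    """Return the period key of a timestamp for a given truncation mode."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if mode == "store_the_most_recent_result_per_month":
        return midnight.replace(day=1)
    if mode == "store_the_most_recent_result_per_week":
        # Assumption: weeks start on Monday.
        return midnight - timedelta(days=midnight.weekday())
    if mode == "store_the_most_recent_result_per_day":
        return midnight
    if mode == "store_the_most_recent_result_per_hour":
        return ts.replace(minute=0, second=0, microsecond=0)
    if mode == "store_all_results_without_date_truncation":
        return ts
    raise ValueError(f"unknown truncation mode: {mode}")

def stored_results(results, mode):
    """Keep only the most recent (timestamp, value) pair per truncated period."""
    latest = {}
    for ts, value in sorted(results):
        # Later results overwrite earlier ones that fall in the same period.
        latest[truncate(ts, mode)] = (ts, value)
    return sorted(latest.values())

results = [
    (datetime(2025, 7, 1, 8, 0), 100),
    (datetime(2025, 7, 1, 17, 30), 105),
    (datetime(2025, 7, 15, 9, 0), 110),
]
monthly = stored_results(results, "store_the_most_recent_result_per_month")
daily = stored_results(results, "store_the_most_recent_result_per_day")
```

In this model, `monthly` holds a single entry (the result from July 15), while `daily` holds one entry for July 1 and one for July 15.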

TableVolumeProfilingChecksSpec

Container of built-in preconfigured volume data quality checks on a table level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_row_count | Verifies that the tested table has at least a minimum accepted number of rows. The default configuration of the warning, error and fatal severity rules verifies a minimum row count of one row, which ensures that the table is not empty. | TableRowCountCheckSpec | | | |
| profile_row_count_anomaly | Detects when the row count has changed too much since the previous day. It uses time series anomaly detection to find the biggest volume changes during the last 90 days. | TableRowCountAnomalyDifferencingCheckSpec | | | |
| profile_row_count_change | Detects when the change in volume (row count) since the last known row count exceeds the maximum accepted change percentage. | TableRowCountChangeCheckSpec | | | |
| profile_row_count_change_1_day | Detects when the change in volume (an increase or decrease of the row count) since the previous day exceeds the maximum accepted change percentage. | TableRowCountChange1DayCheckSpec | | | |
| profile_row_count_change_7_days | Verifies that the percentage of change in the table's volume (row count) since seven days ago is below the maximum accepted percentage. Comparing the current row count to a value from a week ago overcomes the effect of weekly seasonality. | TableRowCountChange7DaysCheckSpec | | | |
| profile_row_count_change_30_days | Verifies that the percentage of change in the table's volume (row count) since thirty days ago is below the maximum accepted percentage. Comparing the current row count to a value from 30 days ago overcomes the effect of monthly seasonality. | TableRowCountChange30DaysCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
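
The row count change checks compare the current row count with an earlier one and test the difference against a maximum accepted percentage. The Python sketch below shows one plausible way to express that comparison. The formula (absolute difference relative to the earlier count) and the function names are assumptions made for illustration, and the actual DQOps rules may differ in detail.

```python
def row_count_change_percent(current: int, previous: int) -> float:
    """Percentage change of the row count relative to the earlier reading."""
    if previous == 0:
        # Assumption: a change relative to an empty table is undefined here.
        raise ValueError("previous row count must be greater than zero")
    return abs(current - previous) * 100.0 / previous

def exceeds_max_change(current: int, previous: int, max_percent: float) -> bool:
    """True when the change is above the maximum accepted percentage."""
    return row_count_change_percent(current, previous) > max_percent

# A table grew from 1000 to 1100 rows, and shrank from 1000 to 700 rows.
growth = row_count_change_percent(1100, 1000)
shrink = row_count_change_percent(700, 1000)
```

Both an increase and a decrease count as a change in this sketch, which mirrors the "increase or decrease of the row count" wording in the table.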

CustomCategoryCheckSpecMap

Dictionary of custom checks indexed by a check name that are configured on a category.


TableTimelinessProfilingChecksSpec

Container of timeliness data quality checks on a table level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_data_freshness | Calculates the number of days since the most recent event timestamp (freshness). | TableDataFreshnessCheckSpec | | | |
| profile_data_freshness_anomaly | Verifies that the number of days since the most recent event timestamp (freshness) changes at a rate within a percentile boundary during the last 90 days. | TableDataFreshnessAnomalyCheckSpec | | | |
| profile_data_staleness | Calculates the time difference in days between the current date and the most recent data ingestion timestamp (staleness). | TableDataStalenessCheckSpec | | | |
| profile_data_ingestion_delay | Calculates the time difference in days between the most recent event timestamp and the most recent ingestion timestamp. | TableDataIngestionDelayCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
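
The three timeliness measures differ only in which pair of timestamps they subtract. The following Python sketch restates the definitions from the table. The function and key names are hypothetical, and in practice the timestamps would come from the table's event and ingestion timestamp columns.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400.0

def days_between(later: datetime, earlier: datetime) -> float:
    """Time difference expressed in (fractional) days."""
    return (later - earlier).total_seconds() / SECONDS_PER_DAY

def timeliness_metrics(now, max_event_ts, max_ingestion_ts):
    """Compute freshness, staleness and ingestion delay in days."""
    return {
        # Days since the most recent event timestamp.
        "data_freshness": days_between(now, max_event_ts),
        # Days since the most recent ingestion timestamp.
        "data_staleness": days_between(now, max_ingestion_ts),
        # Days between the most recent event and its ingestion.
        "data_ingestion_delay": days_between(max_ingestion_ts, max_event_ts),
    }

metrics = timeliness_metrics(
    now=datetime(2025, 7, 22, 12, 0),
    max_event_ts=datetime(2025, 7, 20, 12, 0),
    max_ingestion_ts=datetime(2025, 7, 21, 0, 0),
)
```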

TableAccuracyProfilingChecksSpec

Container of built-in preconfigured accuracy data quality checks on a table level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_total_row_count_match_percent | Verifies that the total row count of the tested table matches the total row count of another (reference) table. | TableAccuracyTotalRowCountMatchPercentCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |

TableCustomSqlProfilingChecksSpec

Container of built-in preconfigured data quality checks on a table level that are using custom SQL expressions (conditions).

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_sql_condition_failed_on_table | Verifies that a custom SQL expression is met for each row. Counts the number of rows where the expression is not satisfied, and raises an issue if too many failures were detected. This check is also used to compare values between columns: `{alias}.col_price > {alias}.col_tax`. | TableSqlConditionFailedCheckSpec | | | |
| profile_sql_condition_passed_percent_on_table | Verifies that a minimum percentage of rows passed a custom SQL condition (expression). Reference the current table by using tokens, for example: `{alias}.col_price > {alias}.col_tax`. | TableSqlConditionPassedPercentCheckSpec | | | |
| profile_sql_aggregate_expression_on_table | Verifies that a custom aggregated SQL expression (MIN, MAX, etc.) is not outside the expected range. | TableSqlAggregateExpressionCheckSpec | | | |
| profile_sql_invalid_record_count_on_table | Runs a custom query that retrieves invalid records found in a table, returns their count, and raises an issue if too many failures were detected. This check is intended for test queries or for ready-made queries that users already run in their own systems (legacy SQL queries). For example, when this check is applied on an age column, the condition can find invalid records in which the age is lower than 18 using an SQL query: `SELECT age FROM {table} WHERE age < 18`. | TableSqlInvalidRecordCountCheckSpec | | | |
| profile_import_custom_result_on_table | Runs a custom query that retrieves the result of a data quality check performed in the data engineering pipeline, whose result (the severity level) is pulled from a separate table. | TableSqlImportCustomResultCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
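
To make the `{alias}` token and the two SQL condition measures more concrete, the Python sketch below substitutes the token into a condition and evaluates it against an in-memory SQLite table. It is an illustration only: the `render` helper, the `orders` table, the `analyzed_table` alias and the shape of the query are assumptions, not the SQL that DQOps generates.

```python
import sqlite3

def render(template: str, table: str, alias: str = "analyzed_table") -> str:
    """Replace the {alias} and {table} tokens in a user-supplied SQL snippet."""
    return template.replace("{alias}", alias).replace("{table}", table)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (col_price REAL, col_tax REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(10.0, 1.0), (5.0, 0.5), (1.0, 2.0), (8.0, 0.8)],
)

condition = render("{alias}.col_price > {alias}.col_tax", table="orders")

# Count the failing rows and compute the percentage of passing rows.
query = (
    "SELECT SUM(CASE WHEN NOT (" + condition + ") THEN 1 ELSE 0 END), "
    "100.0 * SUM(CASE WHEN " + condition + " THEN 1 ELSE 0 END) / COUNT(*) "
    "FROM orders AS analyzed_table"
)
failed_count, passed_percent = conn.execute(query).fetchone()
```

With the sample rows, one row violates the condition, so the failed count is 1 and the passed percentage is 75.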

TableAvailabilityProfilingChecksSpec

Container of built-in preconfigured table availability data quality checks on a table level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_table_availability | Verifies availability of a table in a monitored database using a simple query. | TableAvailabilityCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |

TableSchemaProfilingChecksSpec

Container of built-in preconfigured schema data quality checks on a table level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_column_count | Detects if the number of columns matches an expected number. Retrieves the metadata of the monitored table, counts the number of columns and compares it to an expected value (an expected number of columns). | TableSchemaColumnCountCheckSpec | | | |
| profile_column_count_changed | Detects if the count of columns has changed. Retrieves the metadata of the monitored table, counts the number of columns and compares it to the last known column count that was captured when this data quality check was last executed. | TableSchemaColumnCountChangedCheckSpec | | | |
| profile_column_list_changed | Detects if new columns were added or existing columns were removed. Retrieves the metadata of the monitored table and calculates an unordered hash of the column names. Compares the current hash to the previously known hash to detect any changes to the list of columns. | TableSchemaColumnListChangedCheckSpec | | | |
| profile_column_list_or_order_changed | Detects if new columns were added, existing columns were removed or the columns were reordered. Retrieves the metadata of the monitored table and calculates an ordered hash of the column names. Compares the current hash to the previously known hash to detect any changes to the list of columns or their order. | TableSchemaColumnListOrOrderChangedCheckSpec | | | |
| profile_column_types_changed | Detects if columns were added or removed, or if their data types have changed. Retrieves the metadata of the monitored table and calculates an unordered hash of the column names and the data types (including the length, scale, precision, nullability). Compares the current hash to the previously known hash to detect any changes to the list of columns or their types. | TableSchemaColumnTypesChangedCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
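
The difference between an ordered and an unordered hash of the column names can be shown in a few lines of Python. The sketch below uses SHA-256 and sorts the names to make the hash independent of column order. Both choices are assumptions made for illustration; the hash function that DQOps actually uses is not described here.

```python
import hashlib

def ordered_hash(columns):
    """Hash that changes when columns are added, removed or reordered."""
    joined = "\n".join(columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def unordered_hash(columns):
    """Hash that ignores column order by sorting the names first."""
    return ordered_hash(sorted(columns))

before = ["id", "name", "created_at"]
reordered = ["name", "id", "created_at"]
extended = ["id", "name", "created_at", "email"]

list_changed = unordered_hash(before) != unordered_hash(reordered)
order_changed = ordered_hash(before) != ordered_hash(reordered)
column_added = unordered_hash(before) != unordered_hash(extended)
```

Here a pure reordering leaves the unordered hash untouched (`list_changed` is false) but alters the ordered hash, while adding a column alters both.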

TableUniquenessProfilingChecksSpec

Container of built-in preconfigured uniqueness data quality checks on a table level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_duplicate_record_count | Verifies that the number of duplicate record values in a table does not exceed the maximum accepted count. | TableDuplicateRecordCountCheckSpec | | | |
| profile_duplicate_record_percent | Verifies that the percentage of duplicate record values in a table does not exceed the maximum accepted percentage. | TableDuplicateRecordPercentCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
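
As a rough illustration of the two uniqueness measures, the Python sketch below counts duplicate records in a list of rows. It assumes that every extra copy of a record beyond its first occurrence counts as one duplicate. That definition and the function name are assumptions, so the numbers produced by DQOps may be defined differently.

```python
from collections import Counter

def duplicate_stats(rows):
    """Return (duplicate_count, duplicate_percent) for a list of row tuples."""
    total = len(rows)
    if total == 0:
        return 0, 0.0
    counts = Counter(rows)
    # Assumption: each copy beyond the first occurrence is a duplicate.
    duplicate_count = sum(n - 1 for n in counts.values())
    return duplicate_count, duplicate_count * 100.0 / total

rows = [(1, "a"), (1, "a"), (2, "b"), (3, "c")]
duplicate_count, duplicate_percent = duplicate_stats(rows)
```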

TableComparisonProfilingChecksSpecMap

Container of comparison checks for each defined data comparison. The name of the key in this dictionary must match a name of a table comparison that is defined on the parent table.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| self | | Dict[string, TableComparisonProfilingChecksSpec] | | | |

TableComparisonProfilingChecksSpec

Container of built-in comparison (accuracy) checks on a table level that are using a defined comparison to identify the reference table and the data grouping configuration.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| profile_row_count_match | Verifies that the row count of the tested (parent) table matches the row count of the reference table. Compares each group of data with a GROUP BY clause. | TableComparisonRowCountMatchCheckSpec | | | |
| profile_column_count_match | Verifies that the column count of the tested (parent) table matches the column count of the reference table. Only one comparison result is returned, without data grouping. | TableComparisonColumnCountMatchCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
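
The per-group row count comparison can be modelled as counting rows per grouping value in both tables and reporting the groups whose counts differ. The Python sketch below is a simplified picture of that idea with hypothetical names; it does not reproduce the queries or the tolerance rules that DQOps applies.

```python
from collections import Counter

def row_count_mismatches(tested_groups, reference_groups):
    """Map each mismatching group to (tested_count, reference_count)."""
    tested = Counter(tested_groups)
    reference = Counter(reference_groups)
    all_groups = sorted(set(tested) | set(reference))
    return {
        group: (tested[group], reference[group])
        for group in all_groups
        if tested[group] != reference[group]
    }

# One grouping value per row, for example a country column.
tested_rows = ["US", "US", "DE"]
reference_rows = ["US", "US", "DE", "DE", "FR"]
mismatches = row_count_mismatches(tested_rows, reference_rows)
```

In this example the `US` group matches, while `DE` and `FR` are reported because the tested table has fewer rows than the reference table for those groups.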

CustomCheckSpecMap

Dictionary of custom checks indexed by a check name.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| self | | Dict[string, CustomCheckSpec] | | | |

CustomCheckSpec

Custom check specification. This check is usable only when there is a matching custom check definition that identifies the sensor definition and the rule definition.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| sensor_name | Optional custom sensor name. It is a folder name inside the user's home 'sensors' folder or the DQOps Home (DQOps distribution) home/sensors folder. Sample sensor name: table/volume/row_count. When this value is set, it overrides the default sensor definition defined for the named check definition. | string | | | |
| rule_name | Optional custom rule name. It is a path to a custom rule Python module that starts at the user's home 'rules' folder. The path should not end with the .py file extension. Sample rule: myrules/my_custom_rule. When this value is set, it overrides the default rule definition defined for the named check definition. | string | | | |
| parameters | Custom sensor parameters. | CustomSensorParametersSpec | | | |
| warning | Alerting threshold that raises a data quality warning that is considered as a passed data quality check. | CustomRuleParametersSpec | | | |
| error | Default alerting threshold that raises a data quality issue at an error severity level. | CustomRuleParametersSpec | | | |
| fatal | Alerting threshold that raises a fatal data quality issue which indicates a serious data quality problem. | CustomRuleParametersSpec | | | |
| schedule_override | Run check scheduling configuration. Specifies the schedule (a cron expression) when the data quality checks are executed by the scheduler. | CronScheduleSpec | | | |
| comments | Comments for change tracking. Please put comments in this collection because YAML comments may be removed when the YAML file is modified by the tool (serialization and deserialization will remove non-tracked comments). | CommentsListSpec | | | |
| disabled | Disables the data quality check. Only enabled data quality checks and monitorings are executed. The check should be disabled if it should not run, but the configuration of the sensor and rules should be preserved in the configuration. | boolean | | | |
| exclude_from_kpi | Data quality check results (alerts) are included in the data quality KPI calculation by default. Set this field to true in order to exclude this data quality check from the data quality KPI calculation. | boolean | | | |
| include_in_sla | Marks the data quality check as part of a data quality SLA (Data Contract). The data quality SLA is a set of critical data quality checks that must always pass and are considered as a Data Contract for the dataset. | boolean | | | |
| quality_dimension | Configures a custom data quality dimension name that is different than the built-in dimensions (Timeliness, Validity, etc.). | string | | | |
| display_name | Data quality check display name that can be assigned to the check, otherwise the check_display_name stored in the parquet result files is the check_name. | string | | | |
| data_grouping | Data grouping configuration name that should be applied to this data quality check. The data grouping is used to group the check's result by a GROUP BY clause in SQL, evaluating the data quality check for each group of rows. Use the name of one of the data grouping configurations defined on the parent table. | string | | | |
| always_collect_error_samples | Forces collecting error samples for this check whenever it fails, even if it is a monitoring check that is run by a scheduler, and running an additional query to collect error samples will impose additional load on the data source. | boolean | | | |
| do_not_schedule | Disables running this check by a DQOps CRON scheduler. When a check is disabled from scheduling, it can only be triggered from the user interface or by submitting a "run checks" job. | boolean | | | |

CustomSensorParametersSpec

Custom sensor parameters for custom checks.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| filter | SQL WHERE clause added to the sensor query. Both the table level filter and a sensor query filter are added, separated by an AND operator. | string | | | |
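
The way the two filters are joined can be sketched in Python as follows. The helper name and the decision to wrap each filter in parentheses are assumptions made for this example; the text above only states that both filters are combined with an AND operator.

```python
from typing import Optional

def combine_filters(table_filter: Optional[str], sensor_filter: Optional[str]) -> Optional[str]:
    """Join the table-level filter and the sensor filter with AND."""
    # Skip filters that are missing or empty, and parenthesize the rest
    # so that an OR inside one filter cannot change the overall logic.
    parts = [f"({f})" for f in (table_filter, sensor_filter) if f]
    return " AND ".join(parts) if parts else None

where_clause = combine_filters("country = 'US'", "amount > 0")
only_sensor = combine_filters(None, "amount > 0")
```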

CustomRuleParametersSpec

Custom data quality rule.


CronScheduleSpec

Cron job schedule specification.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| cron_expression | Unix style cron expression that specifies when to execute scheduled operations like running data quality checks or synchronizing the configuration with the cloud. | string | | | |
| disabled | Disables the schedule. When the value of this 'disabled' field is true, the schedule is stored in the metadata but it is not activated to run data quality checks. | boolean | | | |

CommentsListSpec

List of comments.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| self | | List[CommentSpec] | | | |

CommentSpec

Comment entry. Comments are added when a change was made and the change should be recorded in a persisted format.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| date | Comment date and time | datetime | | | |
| comment_by | Commented by | string | | | |
| comment | Comment text | string | | | |