Last updated: July 22, 2025

DQOps YAML file definitions

Definitions of the YAML files that DQOps uses to configure data sources, monitored tables, and the activated data quality checks.

ColumnDailyPartitionedCheckCategoriesSpec

Container of column-level data quality partitioned checks that are evaluated at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
nulls Daily partitioned checks of nulls in the column ColumnNullsDailyPartitionedChecksSpec
uniqueness Daily partitioned checks of uniqueness in the column ColumnUniquenessDailyPartitionedChecksSpec
accepted_values Configuration of accepted values checks on a column level ColumnAcceptedValuesDailyPartitionedChecksSpec
text Daily partitioned checks of text values in the column ColumnTextDailyPartitionedChecksSpec
whitespace Configuration of column level checks that detect blank and whitespace values ColumnWhitespaceDailyPartitionedChecksSpec
conversions Configuration of conversion testing checks on a column level. ColumnConversionsDailyPartitionedChecksSpec
patterns Daily partitioned pattern match checks on a column level ColumnPatternsDailyPartitionedChecksSpec
pii Daily partitioned checks of Personal Identifiable Information (PII) in the column ColumnPiiDailyPartitionedChecksSpec
numeric Daily partitioned checks of numeric values in the column ColumnNumericDailyPartitionedChecksSpec
anomaly Daily partitioned checks for anomalies in numeric columns ColumnAnomalyDailyPartitionedChecksSpec
datetime Daily partitioned checks of datetime in the column ColumnDatetimeDailyPartitionedChecksSpec
bool Daily partitioned checks for booleans in the column ColumnBoolDailyPartitionedChecksSpec
integrity Daily partitioned checks for integrity in the column ColumnIntegrityDailyPartitionedChecksSpec
custom_sql Daily partitioned checks using custom SQL expressions evaluated on the column ColumnCustomSqlDailyPartitionedChecksSpec
datatype Daily partitioned checks for datatype in the column ColumnDatatypeDailyPartitionedChecksSpec
comparisons Dictionary of configuration of checks for table comparisons at a column level. The key that identifies each comparison must match the name of a data comparison that is configured on the parent table. ColumnComparisonDailyPartitionedChecksSpecMap
custom Dictionary of custom checks. The keys are check names within this category. CustomCheckSpecMap
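
As an illustration of where these categories live, the sketch below shows a table YAML file with two categories configured for one column. The nesting above the category nodes (`spec`, `columns`, `partitioned_checks`, `daily`), the `timestamp_columns.partition_by_column` setting, the column names `event_date` and `customer_id`, and the `max_count` threshold name for the duplicate count check are assumptions that this section does not define. The category and check names come from the tables on this page.

```yaml
# Sketch only: the nesting above the category nodes is an assumption.
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: event_date   # hypothetical column that identifies the daily partition
  columns:
    customer_id:                      # hypothetical column name
      partitioned_checks:
        daily:
          nulls:                      # ColumnNullsDailyPartitionedChecksSpec
            daily_partition_nulls_count:
              error:
                max_count: 0
          uniqueness:                 # ColumnUniquenessDailyPartitionedChecksSpec
            daily_partition_duplicate_count:
              error:
                max_count: 0          # assumed threshold name
```

Each category key in the table above maps to one node at the level of `nulls` and `uniqueness` in this sketch.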

ColumnNullsDailyPartitionedChecksSpec

Container of nulls data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_nulls_count Detects incomplete columns that contain any null values. Counts the number of rows having a null value. Raises a data quality issue when the count of null values is above a max_count threshold. Stores a separate data quality check result for each daily partition. ColumnNullsCountCheckSpec
daily_partition_nulls_percent Detects incomplete columns that contain any null values. Measures the percentage of rows having a null value. Raises a data quality issue when the percentage of null values is above a max_percent threshold. Stores a separate data quality check result for each daily partition. ColumnNullsPercentCheckSpec
daily_partition_nulls_percent_anomaly Detects day-to-day anomalies in the percentage of null values. Raises a data quality issue when the rate of null values increases or decreases too much during the last 90 days. ColumnNullPercentAnomalyStationaryCheckSpec
daily_partition_not_nulls_count Verifies that a column contains a minimum number of non-null values. The default value of the min_count parameter is 1 to detect at least one value in a monitored column. Raises a data quality issue when the count of non-null values is below min_count. Stores a separate data quality check result for each daily partition. ColumnNotNullsCountCheckSpec
daily_partition_not_nulls_percent Detects columns that contain too many non-null values. Measures the percentage of rows that have non-null values. Raises a data quality issue when the percentage of non-null values is above max_percentage. Stores a separate data quality check result for each daily partition. ColumnNotNullsPercentCheckSpec
daily_partition_empty_column_found Detects empty columns that contain only null values. Counts the number of rows that have non-null values. Raises a data quality issue when the column is empty. Stores a separate data quality check result for each daily partition. ColumnEmptyColumnFoundCheckSpec
daily_partition_nulls_percent_change Verifies that the null percent value in a column changed by a fixed rate since the last readout. ColumnNullPercentChangeCheckSpec
daily_partition_nulls_percent_change_1_day Verifies that the null percent value in a column changed by a fixed rate since the last readout from yesterday. ColumnNullPercentChange1DayCheckSpec
daily_partition_nulls_percent_change_7_days Verifies that the null percent value in a column changed by a fixed rate since the last readout from the last week. ColumnNullPercentChange7DaysCheckSpec
daily_partition_nulls_percent_change_30_days Verifies that the null percent value in a column changed by a fixed rate since the last readout from the last month. ColumnNullPercentChange30DaysCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
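
The threshold names mentioned in the descriptions above (`max_count`, `max_percent`, `min_count`) could be set as in the following sketch of a `nulls` node. It assumes that each check accepts `warning`, `error`, and `fatal` severity nodes, as the check spec documented later on this page does, and that the node is placed under the daily partitioned checks of a column. All numeric values are illustrative only.

```yaml
# Sketch of a nulls category node; threshold values are illustrative.
nulls:
  daily_partition_nulls_count:
    warning:
      max_count: 0        # any null value raises a warning
    error:
      max_count: 10
  daily_partition_nulls_percent:
    error:
      max_percent: 5.0
  daily_partition_not_nulls_count:
    error:
      min_count: 1        # the documented default: at least one non-null value
```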

ColumnUniquenessDailyPartitionedChecksSpec

Container of uniqueness data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_distinct_count Verifies that the number of distinct values stays within an accepted range. Stores a separate data quality check result for each daily partition. ColumnDistinctCountCheckSpec
daily_partition_distinct_percent Verifies that the percentage of distinct values in a column does not fall below the minimum accepted percent. Stores a separate data quality check result for each daily partition. ColumnDistinctPercentCheckSpec
daily_partition_duplicate_count Verifies that the number of duplicate values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each daily partition. ColumnDuplicateCountCheckSpec
daily_partition_duplicate_percent Verifies that the percent of duplicate values in a column does not exceed the maximum accepted percent. Stores a separate data quality check result for each daily partition. ColumnDuplicatePercentCheckSpec
daily_partition_distinct_count_anomaly Verifies that the distinct count in a monitored column is within a two-tailed percentile from measurements made during the last 90 days. ColumnDistinctCountAnomalyStationaryPartitionCheckSpec
daily_partition_distinct_percent_anomaly Verifies that the distinct percent in a monitored column is within a two-tailed percentile from measurements made during the last 90 days. ColumnDistinctPercentAnomalyStationaryCheckSpec
daily_partition_distinct_count_change Verifies that the distinct count in a monitored column has changed by a fixed rate since the last readout. ColumnDistinctCountChangeCheckSpec
daily_partition_distinct_count_change_1_day Verifies that the distinct count in a monitored column has changed by a fixed rate since the last readout from yesterday. ColumnDistinctCountChange1DayCheckSpec
daily_partition_distinct_count_change_7_days Verifies that the distinct count in a monitored column has changed by a fixed rate since the last readout from the last week. ColumnDistinctCountChange7DaysCheckSpec
daily_partition_distinct_count_change_30_days Verifies that the distinct count in a monitored column has changed by a fixed rate since the last readout from the last month. ColumnDistinctCountChange30DaysCheckSpec
daily_partition_distinct_percent_change Verifies that the distinct percent in a monitored column has changed by a fixed rate since the last readout. ColumnDistinctPercentChangeCheckSpec
daily_partition_distinct_percent_change_1_day Verifies that the distinct percent in a monitored column has changed by a fixed rate since the last readout from yesterday. ColumnDistinctPercentChange1DayCheckSpec
daily_partition_distinct_percent_change_7_days Verifies that the distinct percent in a monitored column has changed by a fixed rate since the last readout from the last week. ColumnDistinctPercentChange7DaysCheckSpec
daily_partition_distinct_percent_change_30_days Verifies that the distinct percent in a monitored column has changed by a fixed rate since the last readout from the last month. ColumnDistinctPercentChange30DaysCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap

ColumnDistinctCountAnomalyStationaryPartitionCheckSpec

This check monitors the count of distinct values and detects anomalies in the changes of the distinct count. It monitors a 90-day time window. The check is configured by setting a desired percentage of anomalies to identify as data quality issues.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
parameters Data quality check parameters ColumnUniquenessDistinctCountSensorParametersSpec
warning Alerting threshold that raises a data quality warning that is considered as a passed data quality check AnomalyStationaryCountValuesRuleWarning1PctParametersSpec
error Default alerting threshold that raises a data quality error (alert) AnomalyStationaryCountValuesRuleError05PctParametersSpec
fatal Alerting threshold that raises a fatal data quality issue which indicates a serious data quality problem AnomalyStationaryCountValuesRuleFatal01PctParametersSpec
schedule_override Run check scheduling configuration. Specifies the schedule (a cron expression) when the data quality checks are executed by the scheduler. CronScheduleSpec
comments Comments for change tracking. Please put comments in this collection because YAML comments may be removed when the YAML file is modified by the tool (serialization and deserialization will remove non tracked comments). CommentsListSpec
disabled Disables the data quality check. Only enabled data quality checks and monitorings are executed. The check should be disabled if it should not work, but the configuration of the sensor and rules should be preserved in the configuration. boolean
exclude_from_kpi Data quality check results (alerts) are included in the data quality KPI calculation by default. Set this field to true in order to exclude this data quality check from the data quality KPI calculation. boolean
include_in_sla Marks the data quality check as part of a data quality SLA (Data Contract). The data quality SLA is a set of critical data quality checks that must always pass and are considered as a Data Contract for the dataset. boolean
quality_dimension Configures a custom data quality dimension name that is different than the built-in dimensions (Timeliness, Validity, etc.). string
display_name Data quality check display name that can be assigned to the check, otherwise the check_display_name stored in the parquet result files is the check_name. string
data_grouping Data grouping configuration name that should be applied to this data quality check. The data grouping is used to group the check's result by a GROUP BY clause in SQL, evaluating the data quality check for each group of rows. Use the name of one of data grouping configurations defined on the parent table. string
always_collect_error_samples Forces collecting error samples for this check whenever it fails, even if it is a monitoring check that is run by a scheduler, and running an additional query to collect error samples will impose additional load on the data source. boolean
do_not_schedule Disables running this check by a DQOps CRON scheduler. When a check is disabled from scheduling, it can only be triggered from the user interface or by submitting a "run checks" job. boolean
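
The check-level properties listed above could be combined as in this sketch. The `cron_expression` field inside `schedule_override` is an assumption about `CronScheduleSpec`, which is not described in this section. The dimension name, display name, and schedule are hypothetical values.

```yaml
# Sketch of one check node with check-level properties; values are hypothetical.
daily_partition_distinct_count_anomaly:
  error:
    anomaly_percent: 0.5
  schedule_override:
    cron_expression: "0 6 * * *"   # assumed field name; runs daily at 06:00
  disabled: false
  exclude_from_kpi: true           # keep the results out of the KPI calculation
  include_in_sla: false
  quality_dimension: Reasonableness
  display_name: Distinct customers per day
  do_not_schedule: false
```

Setting `disabled: true` instead of deleting the node keeps the sensor and rule configuration in the file, which is the use case the `disabled` description refers to.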

AnomalyStationaryCountValuesRuleWarning1PctParametersSpec

Data quality rule that detects anomalies in a stationary time series of counts of values. The rule identifies the top X% of anomalous values, based on the distribution of the changes using a standard deviation. The rule uses the time window of the last 90 days, but at least 30 historical measures must be present to run the calculation.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
anomaly_percent The probability (in percent) that the count of values (records) is an anomaly because the value is outside the regular range of counts. The default time window of 90 time periods (days, etc.) is used, but at least 30 readouts must exist to run the calculation. double
use_ai Use an AI model to predict anomalies. WARNING: anomaly detection by AI models is not supported in a trial distribution of DQOps. Please contact DQOps support to upgrade your instance to a full DQOps instance. boolean

AnomalyStationaryCountValuesRuleFatal01PctParametersSpec

Data quality rule that detects anomalies in a stationary time series of counts of values. The rule identifies the top X% of anomalous values, based on the distribution of the changes using a standard deviation. The rule uses the time window of the last 90 days, but at least 30 historical measures must be present to run the calculation.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
anomaly_percent The probability (in percent) that the count of values (records) is an anomaly because the value is outside the regular range of counts. The default time window of 90 time periods (days, etc.) is used, but at least 30 readouts must exist to run the calculation. double
use_ai Use an AI model to predict anomalies. WARNING: anomaly detection by AI models is not supported in a trial distribution of DQOps. Please contact DQOps support to upgrade your instance to a full DQOps instance. boolean
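
A minimal sketch of how the anomaly rule parameters could be set at the three severity levels follows. The `anomaly_percent` values mirror the percentages suggested by the rule class names (1%, 0.5%, 0.1%); this section does not confirm them as defaults. The `use_ai` flag is left off because the description above notes that it is not supported in a trial distribution.

```yaml
# Sketch: stricter severities flag a smaller share of the most extreme readouts.
daily_partition_distinct_count_anomaly:
  warning:
    anomaly_percent: 1.0
    use_ai: false
  error:
    anomaly_percent: 0.5
  fatal:
    anomaly_percent: 0.1
```

Because the rule needs at least 30 historical readouts within the 90-day window, a newly configured check of this kind would not be expected to evaluate the rule until enough daily partitions have been measured.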

ColumnAcceptedValuesDailyPartitionedChecksSpec

Container of accepted values data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_text_found_in_set_percent The check measures the percentage of rows whose value in a tested column is one of values from a list of expected values or the column value is null. Verifies that the percentage of rows having a valid column value does not fall below the minimum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnTextFoundInSetPercentCheckSpec
daily_partition_number_found_in_set_percent The check measures the percentage of rows whose value in a tested column is one of values from a list of expected values or the column value is null. Verifies that the percentage of rows having a valid column value does not fall below the minimum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnNumberFoundInSetPercentCheckSpec
daily_partition_expected_text_values_in_use_count Verifies that the expected string values were found in the column. Raises a data quality issue when too many expected values were not found (were missing). Stores a separate data quality check result for each daily partition. ColumnExpectedTextValuesInUseCountCheckSpec
daily_partition_expected_texts_in_top_values_count Verifies that the top X most popular column values contain all values from a list of expected values. Stores a separate data quality check result for each daily partition. ColumnExpectedTextsInTopValuesCountCheckSpec
daily_partition_expected_numbers_in_use_count Verifies that the expected numeric values were found in the column. Raises a data quality issue when too many expected values were not found (were missing). Stores a separate data quality check result for each daily partition. ColumnExpectedNumbersInUseCountCheckSpec
daily_partition_text_valid_country_code_percent Verifies that the percentage of valid country codes in a text column does not fall below the minimum accepted percentage. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextValidCountryCodePercentCheckSpec
daily_partition_text_valid_currency_code_percent Verifies that the percentage of valid currency codes in a text column does not fall below the minimum accepted percentage. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextValidCurrencyCodePercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
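
The found-in-set checks need a list of expected values and a minimum percentage. The sketch below assumes that the list is passed as an `expected_values` sensor parameter and that the rule threshold is named `min_percent`; neither name is defined in this section, and the currency codes are sample data.

```yaml
# Sketch of an accepted_values category node; parameter names are assumptions.
accepted_values:
  daily_partition_text_found_in_set_percent:
    parameters:
      expected_values:      # assumed sensor parameter name
        - USD
        - EUR
        - GBP
    error:
      min_percent: 100.0    # assumed rule parameter name; null values count as valid
```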

ColumnTextDailyPartitionedChecksSpec

Container of text data quality partitioned checks on a column level that are checking at a daily partition level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_text_min_length This check finds the length of the shortest text in a column. Then, it verifies that the minimum length is within an accepted range. It detects that the shortest text is too short. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextMinLengthCheckSpec
daily_partition_text_max_length This check finds the length of the longest text in a column. Then, it verifies that the maximum length is within an accepted range. It detects that the texts are too long or not long enough. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextMaxLengthCheckSpec
daily_partition_text_mean_length Verifies that the mean (average) length of texts in a column is within an accepted range. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextMeanLengthCheckSpec
daily_partition_text_length_below_min_length The check counts the text values in the column whose length is below the length defined by the user as a parameter. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextLengthBelowMinLengthCheckSpec
daily_partition_text_length_below_min_length_percent The check measures the percentage of text values in the column whose length is below the length defined by the user as a parameter. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextLengthBelowMinLengthPercentCheckSpec
daily_partition_text_length_above_max_length The check counts the text values in the column whose length is above the length defined by the user as a parameter. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextLengthAboveMaxLengthCheckSpec
daily_partition_text_length_above_max_length_percent The check measures the percentage of text values in the column whose length is above the length defined by the user as a parameter. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextLengthAboveMaxLengthPercentCheckSpec
daily_partition_text_length_in_range_percent The check measures the percentage of text values in the column whose length is within the range provided by the user. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextLengthInRangePercentCheckSpec
daily_partition_min_word_count This check finds the lowest word count of text in a column. Then, it verifies that the minimum word count is within an accepted range. It detects that the text contains too few words. ColumnTextMinWordCountCheckSpec
daily_partition_max_word_count This check finds the highest word count of text in a column. Then, it verifies that the maximum word count is within an accepted range. It detects that the text contains too many words. ColumnTextMaxWordCountCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap

ColumnWhitespaceDailyPartitionedChecksSpec

Container of whitespace values detection data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_empty_text_found Detects empty texts (not null, zero-length texts). This check counts empty texts and raises a data quality issue when their count exceeds a max_count parameter value. Stores a separate data quality check result for each daily partition. ColumnWhitespaceEmptyTextFoundCheckSpec
daily_partition_whitespace_text_found Detects texts that contain only spaces and other whitespace characters. It raises a data quality issue when their count exceeds a max_count parameter value. Stores a separate data quality check result for each daily partition. ColumnWhitespaceWhitespaceTextFoundCheckSpec
daily_partition_null_placeholder_text_found Detects texts that are well-known placeholders of null values, such as None, null, n/a. It counts null placeholders and raises a data quality issue when their count exceeds a max_count parameter value. Stores a separate data quality check result for each daily partition. ColumnWhitespaceNullPlaceholderTextFoundCheckSpec
daily_partition_empty_text_percent Detects empty texts (not null, zero-length texts) and measures their percentage in the column. This check verifies that the rate of empty strings in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnWhitespaceEmptyTextPercentCheckSpec
daily_partition_whitespace_text_percent Detects texts that contain only spaces and other whitespace characters and measures their percentage in the column. It raises a data quality issue when their rate exceeds a max_percent parameter value. Stores a separate data quality check result for each daily partition. ColumnWhitespaceWhitespaceTextPercentCheckSpec
daily_partition_null_placeholder_text_percent Detects texts that are well-known placeholders of null values, such as None, null, n/a, and measures their percentage in the column. It raises a data quality issue when their rate exceeds a max_percent parameter value. Stores a separate data quality check result for each daily partition. ColumnWhitespaceNullPlaceholderTextPercentCheckSpec
daily_partition_text_surrounded_by_whitespace_found Detects text values that are surrounded by whitespace characters on any side. This check counts whitespace-surrounded texts and raises a data quality issue when their count exceeds the max_count parameter value. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnWhitespaceTextSurroundedByWhitespaceFoundCheckSpec
daily_partition_text_surrounded_by_whitespace_percent This check detects text values that are surrounded by whitespace characters on any side and measures their percentage. This check raises a data quality issue when their percentage exceeds the max_percent parameter value. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnWhitespaceTextSurroundedByWhitespacePercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
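
The count-based and percent-based whitespace checks above differ only in the threshold they use. A sketch, assuming the usual `warning` and `error` severity nodes and illustrative limits:

```yaml
# Sketch of a whitespace category node; limits are illustrative.
whitespace:
  daily_partition_empty_text_found:
    warning:
      max_count: 0          # any zero-length text raises a warning
  daily_partition_whitespace_text_percent:
    error:
      max_percent: 1.0      # more than 1% whitespace-only texts is an error
  daily_partition_text_surrounded_by_whitespace_found:
    error:
      max_count: 100
```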

ColumnConversionsDailyPartitionedChecksSpec

Container of conversion test checks that are monitoring if text values are convertible to a target data type at a daily partition level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_text_parsable_to_boolean_percent Verifies that the percentage of text values that are parsable to a boolean value does not fall below the minimum accepted percentage. The text values identified as boolean placeholders are: 0, 1, true, false, t, f, yes, no, y, n. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextParsableToBooleanPercentCheckSpec
daily_partition_text_parsable_to_integer_percent Verifies that the percentage of text values that are parsable to an integer value in a column does not fall below the minimum accepted percentage. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextParsableToIntegerPercentCheckSpec
daily_partition_text_parsable_to_float_percent Verifies that the percentage of text values that are parsable to a float value in a column does not fall below the minimum accepted percentage. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextParsableToFloatPercentCheckSpec
daily_partition_text_parsable_to_date_percent Verifies that the percentage of text values that are parsable to a date value in a column does not fall below the minimum accepted percentage. DQOps uses a safe_cast when possible, otherwise the text is verified using a regular expression. Analyzes every daily partition and creates a separate data quality check result with the time period value that identifies the daily partition. ColumnTextParsableToDatePercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
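
Since all conversion checks verify a minimum accepted percentage, a configuration could look like the sketch below. The threshold name `min_percent` is an assumption, chosen by analogy with the `max_percent` name used elsewhere on this page; the values are illustrative.

```yaml
# Sketch of a conversions category node; min_percent is an assumed name.
conversions:
  daily_partition_text_parsable_to_integer_percent:
    warning:
      min_percent: 100.0    # every text value should parse as an integer
    error:
      min_percent: 99.0
  daily_partition_text_parsable_to_date_percent:
    error:
      min_percent: 95.0
```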

ColumnPatternsDailyPartitionedChecksSpec

Container of built-in preconfigured daily partition checks on a column level that are checking for values matching patterns (regular expressions) in text columns.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_text_not_matching_regex_found Verifies that the number of text values not matching the custom regular expression pattern does not exceed the maximum accepted count. ColumnTextNotMatchingRegexFoundCheckSpec
daily_partition_texts_not_matching_regex_percent Verifies that the percentage of strings not matching the custom regular expression pattern does not exceed the maximum accepted percentage. ColumnTextsNotMatchingRegexPercentCheckSpec
daily_partition_invalid_email_format_found Verifies that the number of invalid emails in a text column does not exceed the maximum accepted count. ColumnInvalidEmailFormatFoundCheckSpec
daily_partition_invalid_email_format_percent Verifies that the percentage of invalid emails in a text column does not exceed the maximum accepted percentage. ColumnInvalidEmailFormatPercentCheckSpec
daily_partition_text_not_matching_date_pattern_found Verifies that the number of texts not matching the date format regular expression does not exceed the maximum accepted count. ColumnTextNotMatchingDatePatternFoundCheckSpec
daily_partition_text_not_matching_date_pattern_percent Verifies that the percentage of texts not matching the date format regular expression in a column does not exceed the maximum accepted percentage. ColumnTextNotMatchingDatePatternPercentCheckSpec
daily_partition_text_not_matching_name_pattern_percent Verifies that the percentage of texts not matching the name regular expression does not exceed the maximum accepted percentage. ColumnTextNotMatchingNamePatternPercentCheckSpec
daily_partition_invalid_uuid_format_found Verifies that the number of invalid UUIDs in a text column does not exceed the maximum accepted count. ColumnInvalidUuidFormatFoundCheckSpec
daily_partition_invalid_uuid_format_percent Verifies that the percentage of invalid UUIDs in a text column does not exceed the maximum accepted percentage. ColumnInvalidUuidFormatPercentCheckSpec
daily_partition_invalid_ip4_address_format_found Verifies that the number of invalid IP4 addresses in a text column does not exceed the maximum accepted count. ColumnInvalidIp4AddressFormatFoundCheckSpec
daily_partition_invalid_ip6_address_format_found Verifies that the number of invalid IP6 addresses in a text column does not exceed the maximum accepted count. ColumnInvalidIp6AddressFormatFoundCheckSpec
daily_partition_invalid_usa_phone_format_found Verifies that the number of invalid USA phone numbers in a text column does not exceed the maximum accepted count. ColumnInvalidUsaPhoneFoundCheckSpec
daily_partition_invalid_usa_zipcode_format_found Verifies that the number of invalid zip codes in a text column does not exceed the maximum accepted count. ColumnInvalidUsaZipcodeFoundCheckSpec
daily_partition_invalid_usa_phone_format_percent Verifies that the percentage of invalid USA phone numbers in a text column does not exceed the maximum accepted percentage. ColumnInvalidUsaPhonePercentCheckSpec
daily_partition_invalid_usa_zipcode_format_percent Verifies that the percentage of invalid USA zip codes in a text column does not exceed the maximum accepted percentage. ColumnInvalidUsaZipcodePercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
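
For the built-in format checks, which need no custom pattern, a sketch of a `patterns` node could look as follows. It assumes that the count checks use a `max_count` threshold and the percent checks a `max_percent` threshold, matching the "maximum accepted count" and "maximum accepted percentage" wording above.

```yaml
# Sketch of a patterns category node; threshold names are assumptions.
patterns:
  daily_partition_invalid_email_format_found:
    error:
      max_count: 0          # no invalid emails are accepted
  daily_partition_invalid_uuid_format_percent:
    warning:
      max_percent: 0.0
    error:
      max_percent: 1.0
```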

ColumnPiiDailyPartitionedChecksSpec

Container of PII data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_contains_usa_phone_percent Detects USA phone numbers in text columns. Verifies that the percentage of rows that contain a USA phone number in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnPiiContainsUsaPhonePercentCheckSpec
daily_partition_contains_email_percent Detects emails in text columns. Verifies that the percentage of rows that contain emails in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnPiiContainsEmailPercentCheckSpec
daily_partition_contains_usa_zipcode_percent Detects USA zip codes in text columns. Verifies that the percentage of rows that contain a USA zip code in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnPiiContainsUsaZipcodePercentCheckSpec
daily_partition_contains_ip4_percent Detects IP4 addresses in text columns. Verifies that the percentage of rows that contain IP4 address values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnPiiContainsIp4PercentCheckSpec
daily_partition_contains_ip6_percent Detects IP6 addresses in text columns. Verifies that the percentage of rows that contain valid IP6 address values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnPiiContainsIp6PercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
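
A minimal sketch of how one check from this category might be configured. The fragment is assumed to sit under the column's daily partitioned checks node in a table YAML file. The `max_percent` rule parameter name and the threshold values are assumptions for illustration and are not taken from this reference.

```yaml
# Sketch only: key nesting above "pii" is omitted and assumed.
pii:
  daily_partition_contains_email_percent:
    warning:
      max_percent: 0.0   # assumed rule parameter name; warn when any row contains an email
    error:
      max_percent: 1.0   # assumed rule parameter name; illustrative threshold
```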

ColumnNumericDailyPartitionedChecksSpec

Container of numeric data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_number_below_min_value The check counts the number of values in the column that are below the value defined by the user as a parameter. Stores a separate data quality check result for each daily partition. ColumnNumberBelowMinValueCheckSpec
daily_partition_number_above_max_value The check counts the number of values in the column that are above the value defined by the user as a parameter. Stores a separate data quality check result for each daily partition. ColumnNumberAboveMaxValueCheckSpec
daily_partition_negative_values Verifies that the number of negative values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each daily partition. ColumnNegativeCountCheckSpec
daily_partition_negative_values_percent Verifies that the percentage of negative values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnNegativePercentCheckSpec
daily_partition_number_below_min_value_percent The check counts the percentage of values in the column that are below the value defined by the user as a parameter. Stores a separate data quality check result for each daily partition. ColumnNumberBelowMinValuePercentCheckSpec
daily_partition_number_above_max_value_percent The check counts the percentage of values in the column that are above the value defined by the user as a parameter. Stores a separate data quality check result for each daily partition. ColumnNumberAboveMaxValuePercentCheckSpec
daily_partition_number_in_range_percent Verifies that the percentage of values within the range in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnNumberInRangePercentCheckSpec
daily_partition_integer_in_range_percent Verifies that the percentage of integer values within the range in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnIntegerInRangePercentCheckSpec
daily_partition_min_in_range Verifies that the minimum value in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnMinInRangeCheckSpec
daily_partition_max_in_range Verifies that the maximum value in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnMaxInRangeCheckSpec
daily_partition_sum_in_range Verifies that the sum of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnSumInRangeCheckSpec
daily_partition_mean_in_range Verifies that the average (mean) of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnMeanInRangeCheckSpec
daily_partition_median_in_range Verifies that the median of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnMedianInRangeCheckSpec
daily_partition_percentile_in_range Verifies that the percentile of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPercentileInRangeCheckSpec
daily_partition_percentile_10_in_range Verifies that the percentile 10 of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPercentile10InRangeCheckSpec
daily_partition_percentile_25_in_range Verifies that the percentile 25 of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPercentile25InRangeCheckSpec
daily_partition_percentile_75_in_range Verifies that the percentile 75 of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPercentile75InRangeCheckSpec
daily_partition_percentile_90_in_range Verifies that the percentile 90 of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPercentile90InRangeCheckSpec
daily_partition_sample_stddev_in_range Verifies that the sample standard deviation of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnSampleStddevInRangeCheckSpec
daily_partition_population_stddev_in_range Verifies that the population standard deviation of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPopulationStddevInRangeCheckSpec
daily_partition_sample_variance_in_range Verifies that the sample variance of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnSampleVarianceInRangeCheckSpec
daily_partition_population_variance_in_range Verifies that the population variance of all values in a column is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnPopulationVarianceInRangeCheckSpec
daily_partition_invalid_latitude Verifies that the number of invalid latitude values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each daily partition. ColumnInvalidLatitudeCountCheckSpec
daily_partition_valid_latitude_percent Verifies that the percentage of valid latitude values in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnValidLatitudePercentCheckSpec
daily_partition_invalid_longitude Verifies that the number of invalid longitude values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each daily partition. ColumnInvalidLongitudeCountCheckSpec
daily_partition_valid_longitude_percent Verifies that the percentage of valid longitude values in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnValidLongitudePercentCheckSpec
daily_partition_non_negative_values Verifies that the number of non-negative values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each daily partition. ColumnNonNegativeCountCheckSpec
daily_partition_non_negative_values_percent Verifies that the percentage of non-negative values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each daily partition. ColumnNonNegativePercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
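
The numeric checks combine a sensor parameter (where the check needs one) with a rule threshold. The fragment below is a hedged sketch of that shape. The parameter names `min_value`, `max_count`, `from`, and `to` are assumptions, as are all values, and the fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: parameter names are assumed, not confirmed by this reference.
numeric:
  daily_partition_number_below_min_value:
    parameters:
      min_value: 0      # assumed sensor parameter: the user-defined lower bound
    error:
      max_count: 0      # assumed rule parameter: no values below the bound are accepted
  daily_partition_mean_in_range:
    warning:
      from: 10.0        # assumed rule parameters: expected range of the mean
      to: 20.0
```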

ColumnAnomalyDailyPartitionedChecksSpec

Container of built-in preconfigured data quality checks on a column level for detecting anomalies.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_sum_anomaly Verifies that the sum in a column is within a percentile from measurements made during the last 90 days. Calculates the sum of each daily partition and detects anomalies between daily partitions. ColumnSumAnomalyStationaryPartitionCheckSpec
daily_partition_mean_anomaly Verifies that the mean value in a column is within a percentile from measurements made during the last 90 days. Calculates the mean (average) of each daily partition and detects anomalies between daily partitions. ColumnMeanAnomalyStationaryCheckSpec
daily_partition_median_anomaly Verifies that the median in a column is within a percentile from measurements made during the last 90 days. Calculates the median of each daily partition and detects anomalies between daily partitions. ColumnMedianAnomalyStationaryCheckSpec
daily_partition_min_anomaly Detects new outliers, which are new minimum values, much below the last known minimum value. If the minimum value is constantly changing, detects outliers as the biggest change of the minimum value during the last 90 days. Finds the minimum value of each daily partition and detects anomalies between daily partitions. ColumnMinAnomalyStationaryCheckSpec
daily_partition_max_anomaly Detects new outliers, which are new maximum values, much above the last known maximum value. If the maximum value is constantly changing, detects outliers as the biggest change of the maximum value during the last 90 days. Finds the maximum value of each daily partition and detects anomalies between daily partitions. ColumnMaxAnomalyStationaryCheckSpec
daily_partition_mean_change Verifies that the mean value in a column changed at a fixed rate since the last readout. ColumnMeanChangeCheckSpec
daily_partition_mean_change_1_day Verifies that the mean value in a column changed at a fixed rate since the last readout from yesterday. ColumnMeanChange1DayCheckSpec
daily_partition_mean_change_7_days Verifies that the mean value in a column changed at a fixed rate since the last readout from the last week. ColumnMeanChange7DaysCheckSpec
daily_partition_mean_change_30_days Verifies that the mean value in a column changed at a fixed rate since the last readout from the last month. ColumnMeanChange30DaysCheckSpec
daily_partition_median_change Verifies that the median in a column changed at a fixed rate since the last readout. ColumnMedianChangeCheckSpec
daily_partition_median_change_1_day Verifies that the median in a column changed at a fixed rate since the last readout from yesterday. ColumnMedianChange1DayCheckSpec
daily_partition_median_change_7_days Verifies that the median in a column changed at a fixed rate since the last readout from the last week. ColumnMedianChange7DaysCheckSpec
daily_partition_median_change_30_days Verifies that the median in a column changed at a fixed rate since the last readout from the last month. ColumnMedianChange30DaysCheckSpec
daily_partition_sum_change Verifies that the sum in a column changed at a fixed rate since the last readout. ColumnSumChangeCheckSpec
daily_partition_sum_change_1_day Verifies that the sum in a column changed at a fixed rate since the last readout from yesterday. ColumnSumChange1DayCheckSpec
daily_partition_sum_change_7_days Verifies that the sum in a column changed at a fixed rate since the last readout from the last week. ColumnSumChange7DaysCheckSpec
daily_partition_sum_change_30_days Verifies that the sum in a column changed at a fixed rate since the last readout from the last month. ColumnSumChange30DaysCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
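
As an illustration of the change checks in this category, the sketch below limits the day-over-day change of the mean. The `max_percent` rule parameter name and its value are assumptions, and the fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: the rule parameter name is assumed, not confirmed by this reference.
anomaly:
  daily_partition_mean_change_1_day:
    warning:
      max_percent: 10.0   # assumed: largest accepted change since yesterday's readout, in percent
```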

ColumnSumAnomalyStationaryPartitionCheckSpec

This check calculates a sum of values in a numeric column and detects anomalies in a time series of previous sums. It raises a data quality issue when the sum is in the top anomaly_percent percentage of the most outstanding values in the time series. This data quality check uses a 90-day time window and requires a history of at least 30 days.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
parameters Data quality check parameters ColumnNumericSumSensorParametersSpec
warning Alerting threshold that raises a data quality warning that is considered as a passed data quality check AnomalyStationaryPercentileMovingAverageRuleWarning1PctParametersSpec
error Default alerting threshold that raises a data quality issue at an error severity level when an anomaly is detected AnomalyStationaryPercentileMovingAverageRuleError05PctParametersSpec
fatal Alerting threshold that raises a fatal data quality issue which indicates a serious data quality problem AnomalyStationaryPercentileMovingAverageRuleFatal01PctParametersSpec
schedule_override Run check scheduling configuration. Specifies the schedule (a cron expression) when the data quality checks are executed by the scheduler. CronScheduleSpec
comments Comments for change tracking. Please put comments in this collection because YAML comments may be removed when the YAML file is modified by the tool (serialization and deserialization will remove non tracked comments). CommentsListSpec
disabled Disables the data quality check. Only enabled data quality checks and monitorings are executed. The check should be disabled if it should not work, but the configuration of the sensor and rules should be preserved in the configuration. boolean
exclude_from_kpi Data quality check results (alerts) are included in the data quality KPI calculation by default. Set this field to true in order to exclude this data quality check from the data quality KPI calculation. boolean
include_in_sla Marks the data quality check as part of a data quality SLA (Data Contract). The data quality SLA is a set of critical data quality checks that must always pass and are considered as a Data Contract for the dataset. boolean
quality_dimension Configures a custom data quality dimension name that is different than the built-in dimensions (Timeliness, Validity, etc.). string
display_name Data quality check display name that can be assigned to the check, otherwise the check_display_name stored in the parquet result files is the check_name. string
data_grouping Data grouping configuration name that should be applied to this data quality check. The data grouping is used to group the check's result by a GROUP BY clause in SQL, evaluating the data quality check for each group of rows. Use the name of one of data grouping configurations defined on the parent table. string
always_collect_error_samples Forces collecting error samples for this check whenever it fails, even if it is a monitoring check that is run by a scheduler, and running an additional query to collect error samples will impose additional load on the data source. boolean
do_not_schedule Disables running this check by a DQOps CRON scheduler. When a check is disabled from scheduling, it can only be triggered from the user interface or by submitting a "run checks" job. boolean
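
A minimal sketch of this check with all three severity levels configured, using the `anomaly_percent` property and the check-level fields listed above. The threshold values are assumptions inferred from the rule type names (1Pct, 05Pct, 01Pct), and the display name is hypothetical. The fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: threshold values are illustrative assumptions.
anomaly:
  daily_partition_sum_anomaly:
    warning:
      anomaly_percent: 1.0    # flag the top 1% of the most outstanding sums
    error:
      anomaly_percent: 0.5
    fatal:
      anomaly_percent: 0.1
    display_name: Daily sum anomaly   # hypothetical display name
    include_in_sla: true
    exclude_from_kpi: false
```

A smaller `anomaly_percent` flags only the more extreme readouts, which is why the fatal threshold in this sketch is the lowest of the three.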

AnomalyStationaryPercentileMovingAverageRuleWarning1PctParametersSpec

Data quality rule that detects anomalies in time series of data quality measures that are stationary over time, such as a percentage of null values. Stationary measures stay within a well-known range of values. The rule identifies the top X% of anomalous values, based on the distribution of the changes using a standard deviation. The rule uses the time window of the last 90 days, but at least 30 historical measures must be present to run the calculation.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
anomaly_percent The probability (in percent) that the current sensor readout (measure) is an anomaly, because the value is outside the regular range of previous readouts. The default time window of 90 time periods (days, etc.) is used, but at least 30 readouts must exist to run the calculation. double
use_ai Use an AI model to predict anomalies. WARNING: anomaly detection by AI models is not supported in a trial distribution of DQOps. Please contact DQOps support to upgrade your instance to a full DQOps instance. boolean

AnomalyStationaryPercentileMovingAverageRuleFatal01PctParametersSpec

Data quality rule that detects anomalies in time series of data quality measures that are stationary over time, such as a percentage of null values. Stationary measures stay within a well-known range of values. The rule identifies the top X% of anomalous values, based on the distribution of the changes using a standard deviation. The rule uses the time window of the last 90 days, but at least 30 historical measures must be present to run the calculation.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
anomaly_percent The probability (in percent) that the current sensor readout (measure) is an anomaly, because the value is outside the regular range of previous readouts. The default time window of 90 time periods (days, etc.) is used, but at least 30 readouts must exist to run the calculation. double
use_ai Use an AI model to predict anomalies. WARNING: anomaly detection by AI models is not supported in a trial distribution of DQOps. Please contact DQOps support to upgrade your instance to a full DQOps instance. boolean

ColumnMinAnomalyStationaryCheckSpec

This check finds a minimum value in a numeric column and detects anomalies in a time series of previous minimum values. It raises a data quality issue when the current minimum value is in the top anomaly_percent percentage of the most outstanding values in the time series (it is a new minimum value, far from the previous one). This data quality check uses a 90-day time window and requires a history of at least 30 days.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
parameters Data quality check parameters ColumnNumericMinSensorParametersSpec
warning Alerting threshold that raises a data quality warning that is considered as a passed data quality check AnomalyStationaryPercentileMovingAverageRuleWarning1PctParametersSpec
error Default alerting threshold that raises a data quality issue at an error severity level when an anomaly is detected AnomalyStationaryPercentileMovingAverageRuleError05PctParametersSpec
fatal Alerting threshold that raises a fatal data quality issue which indicates a serious data quality problem AnomalyStationaryPercentileMovingAverageRuleFatal01PctParametersSpec
schedule_override Run check scheduling configuration. Specifies the schedule (a cron expression) when the data quality checks are executed by the scheduler. CronScheduleSpec
comments Comments for change tracking. Please put comments in this collection because YAML comments may be removed when the YAML file is modified by the tool (serialization and deserialization will remove non tracked comments). CommentsListSpec
disabled Disables the data quality check. Only enabled data quality checks and monitorings are executed. The check should be disabled if it should not work, but the configuration of the sensor and rules should be preserved in the configuration. boolean
exclude_from_kpi Data quality check results (alerts) are included in the data quality KPI calculation by default. Set this field to true in order to exclude this data quality check from the data quality KPI calculation. boolean
include_in_sla Marks the data quality check as part of a data quality SLA (Data Contract). The data quality SLA is a set of critical data quality checks that must always pass and are considered as a Data Contract for the dataset. boolean
quality_dimension Configures a custom data quality dimension name that is different than the built-in dimensions (Timeliness, Validity, etc.). string
display_name Data quality check display name that can be assigned to the check, otherwise the check_display_name stored in the parquet result files is the check_name. string
data_grouping Data grouping configuration name that should be applied to this data quality check. The data grouping is used to group the check's result by a GROUP BY clause in SQL, evaluating the data quality check for each group of rows. Use the name of one of data grouping configurations defined on the parent table. string
always_collect_error_samples Forces collecting error samples for this check whenever it fails, even if it is a monitoring check that is run by a scheduler, and running an additional query to collect error samples will impose additional load on the data source. boolean
do_not_schedule Disables running this check by a DQOps CRON scheduler. When a check is disabled from scheduling, it can only be triggered from the user interface or by submitting a "run checks" job. boolean
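
The operational fields in the table above can be combined with a threshold. The sketch below keeps the check configured but out of the scheduler and evaluates it per data group. The grouping name `by_country` is hypothetical and would have to match a data grouping configuration defined on the parent table. The fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: values and the grouping name are illustrative assumptions.
anomaly:
  daily_partition_min_anomaly:
    warning:
      anomaly_percent: 1.0
    data_grouping: by_country            # hypothetical grouping name from the parent table
    always_collect_error_samples: false
    do_not_schedule: true                # run only from the UI or a "run checks" job
    disabled: false
```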

ColumnMaxAnomalyStationaryCheckSpec

This check finds a maximum value in a numeric column and detects anomalies in a time series of previous maximum values. It raises a data quality issue when the current maximum value is in the top anomaly_percent percentage of the most outstanding values in the time series (it is a new maximum value, far from the previous one). This data quality check uses a 90-day time window and requires a history of at least 30 days.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
parameters Data quality check parameters ColumnNumericMaxSensorParametersSpec
warning Alerting threshold that raises a data quality warning that is considered as a passed data quality check AnomalyStationaryPercentileMovingAverageRuleWarning1PctParametersSpec
error Default alerting threshold that raises a data quality issue at an error severity level when an anomaly is detected AnomalyStationaryPercentileMovingAverageRuleError05PctParametersSpec
fatal Alerting threshold that raises a fatal data quality issue which indicates a serious data quality problem AnomalyStationaryPercentileMovingAverageRuleFatal01PctParametersSpec
schedule_override Run check scheduling configuration. Specifies the schedule (a cron expression) when the data quality checks are executed by the scheduler. CronScheduleSpec
comments Comments for change tracking. Please put comments in this collection because YAML comments may be removed when the YAML file is modified by the tool (serialization and deserialization will remove non tracked comments). CommentsListSpec
disabled Disables the data quality check. Only enabled data quality checks and monitorings are executed. The check should be disabled if it should not work, but the configuration of the sensor and rules should be preserved in the configuration. boolean
exclude_from_kpi Data quality check results (alerts) are included in the data quality KPI calculation by default. Set this field to true in order to exclude this data quality check from the data quality KPI calculation. boolean
include_in_sla Marks the data quality check as part of a data quality SLA (Data Contract). The data quality SLA is a set of critical data quality checks that must always pass and are considered as a Data Contract for the dataset. boolean
quality_dimension Configures a custom data quality dimension name that is different than the built-in dimensions (Timeliness, Validity, etc.). string
display_name Data quality check display name that can be assigned to the check, otherwise the check_display_name stored in the parquet result files is the check_name. string
data_grouping Data grouping configuration name that should be applied to this data quality check. The data grouping is used to group the check's result by a GROUP BY clause in SQL, evaluating the data quality check for each group of rows. Use the name of one of data grouping configurations defined on the parent table. string
always_collect_error_samples Forces collecting error samples for this check whenever it fails, even if it is a monitoring check that is run by a scheduler, and running an additional query to collect error samples will impose additional load on the data source. boolean
do_not_schedule Disables running this check by a DQOps CRON scheduler. When a check is disabled from scheduling, it can only be triggered from the user interface or by submitting a "run checks" job. boolean

ColumnDatetimeDailyPartitionedChecksSpec

Container of date-time data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_date_values_in_future_percent Detects dates in the future in date, datetime and timestamp columns. Measures a percentage of dates in the future. Raises a data quality issue when too many future dates are found. Stores a separate data quality check result for each daily partition. ColumnDateValuesInFuturePercentCheckSpec
daily_partition_date_in_range_percent Verifies that the dates in date, datetime, or timestamp columns are within a reasonable range of dates. The default configuration detects fake dates such as 1900-01-01 and 2099-12-31. Measures the percentage of valid dates and raises a data quality issue when too many dates fall outside the range. Stores a separate data quality check result for each daily partition. ColumnDateInRangePercentCheckSpec
daily_partition_text_match_date_format_percent Verifies that the values in text columns match one of the predefined date formats, such as an ISO 8601 date. Measures the percentage of valid date strings and raises a data quality issue when too many invalid date strings are found. Stores a separate data quality check result for each daily partition. ColumnTextMatchDateFormatPercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
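
A hedged sketch of the future-dates check from this category. The `max_percent` rule parameter name and the threshold are assumptions, and the fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: the rule parameter name is assumed, not confirmed by this reference.
datetime:
  daily_partition_date_values_in_future_percent:
    warning:
      max_percent: 0.0   # assumed: warn when any date in the partition is in the future
```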

ColumnBoolDailyPartitionedChecksSpec

Container of boolean data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_true_percent Measures the percentage of true values in a boolean column and verifies that it is within the accepted range. Stores a separate data quality check result for each daily partition. ColumnTruePercentCheckSpec
daily_partition_false_percent Measures the percentage of false values in a boolean column and verifies that it is within the accepted range. Stores a separate data quality check result for each daily partition. ColumnFalsePercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
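
Because these checks verify that a percentage is within an accepted range, a configuration would need both ends of the range. The sketch below assumes the rule parameters are named `min_percent` and `max_percent`. Both names and values are assumptions, and the fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: rule parameter names are assumed, not confirmed by this reference.
bool:
  daily_partition_true_percent:
    warning:
      min_percent: 40.0   # assumed lower end of the accepted range
      max_percent: 60.0   # assumed upper end of the accepted range
```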

ColumnIntegrityDailyPartitionedChecksSpec

Container of integrity data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_lookup_key_not_found Detects invalid values that are not present in a dictionary table using an outer join query. Counts the number of invalid keys. Stores a separate data quality check result for each daily partition. ColumnIntegrityLookupKeyNotFoundCountCheckSpec
daily_partition_lookup_key_found_percent Measures the percentage of valid values that are present in a dictionary table. Joins this table to a dictionary table using an outer join. Stores a separate data quality check result for each daily partition. ColumnIntegrityForeignKeyMatchPercentCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
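
The lookup checks need to know which dictionary table and column to join to. A minimal sketch, assuming sensor parameters named `foreign_table` and `foreign_column` and a `max_count` rule parameter. The table and column names are hypothetical, and the fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: parameter names are assumed; table and column names are hypothetical.
integrity:
  daily_partition_lookup_key_not_found:
    parameters:
      foreign_table: dim_customer     # hypothetical dictionary table
      foreign_column: customer_id     # hypothetical key column in the dictionary table
    error:
      max_count: 0                    # assumed rule parameter: no missing keys are accepted
```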

ColumnCustomSqlDailyPartitionedChecksSpec

Container of built-in preconfigured data quality checks on a column level that are using custom SQL expressions (conditions).

The structure of this object is described below

 Property name   Description                       Data type   Enum values   Default value   Sample values 
daily_partition_sql_condition_failed_on_column Verifies that a custom SQL expression is met for each row. Counts the number of rows where the expression is not satisfied, and raises an issue if too many failures were detected. This check is also used to compare values between the current column and another column: `{alias}.{column} > {alias}.col_tax`. Stores a separate data quality check result for each daily partition. ColumnSqlConditionFailedCheckSpec
daily_partition_sql_condition_passed_percent_on_column Verifies that a minimum percentage of rows passed a custom SQL condition (expression). Reference the current column by using tokens, for example: `{alias}.{column} > {alias}.col_tax`. Stores a separate data quality check result for each daily partition. ColumnSqlConditionPassedPercentCheckSpec
daily_partition_sql_aggregate_expression_on_column Verifies that a custom aggregated SQL expression (MIN, MAX, etc.) is not outside the expected range. Stores a separate data quality check result for each daily partition. ColumnSqlAggregateExpressionCheckSpec
daily_partition_import_custom_result_on_column Runs a custom query that retrieves the result of a data quality check performed in a data engineering pipeline, whose result (the severity level) is pulled from a separate table. ColumnSqlImportCustomResultCheckSpec
custom_checks Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. CustomCategoryCheckSpecMap
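
The token expression quoted in the table can be placed in the check's parameters. The sketch below assumes the sensor parameter is named `sql_condition` and the rule parameter is named `max_count`. Both names are assumptions, and the fragment is assumed to sit under the column's daily partitioned checks node.

```yaml
# Sketch only: parameter names are assumed, not confirmed by this reference.
custom_sql:
  daily_partition_sql_condition_failed_on_column:
    parameters:
      sql_condition: "{alias}.{column} > {alias}.col_tax"   # expression from the description above
    error:
      max_count: 0   # assumed rule parameter: no failing rows are accepted
```

The expression is quoted so that the curly braces are read as part of a string rather than as a YAML flow mapping.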

ColumnDatatypeDailyPartitionedChecksSpec

Container of datatype data quality partitioned checks on a column level that are checking at a daily level.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| daily_partition_detected_datatype_in_text | Detects the data type of text values stored in the column. The sensor returns the code of the detected type of column data: 1 - integers, 2 - floats, 3 - dates, 4 - datetimes, 5 - timestamps, 6 - booleans, 7 - strings, 8 - mixed data types. Raises a data quality issue when the detected data type does not match the expected data type. Stores a separate data quality check result for each daily partition. | ColumnDetectedDatatypeInTextCheckSpec | | | |
| daily_partition_detected_datatype_in_text_changed | Detects that the data type of text values stored in a text column has changed compared to an earlier non-empty partition. The sensor returns the code of the detected type of column data: 1 - integers, 2 - floats, 3 - dates, 4 - datetimes, 5 - timestamps, 6 - booleans, 7 - strings, 8 - mixed data types. Stores a separate data quality check result for each daily partition. | ColumnDatatypeDetectedDatatypeInTextChangedCheckSpec | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
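
The detected data type codes listed above can be read with a small lookup table. The following Python sketch is illustrative only: the code-to-name mapping comes from the check descriptions, while the helper names and the convention of using `None` for an empty partition are assumptions made for this example, not part of DQOps.

```python
# Codes returned by the sensor, as listed in the check descriptions.
DETECTED_DATATYPE_CODES = {
    1: "integers",
    2: "floats",
    3: "dates",
    4: "datetimes",
    5: "timestamps",
    6: "booleans",
    7: "strings",
    8: "mixed data types",
}


def matches_expected(detected_code: int, expected: str) -> bool:
    """Return True when the detected code maps to the expected data type name."""
    return DETECTED_DATATYPE_CODES.get(detected_code) == expected


def datatype_changed(earlier_codes, current_code: int) -> bool:
    """Compare the current code with the most recent earlier non-empty partition.

    earlier_codes holds one code per earlier partition, oldest first.
    None marks an empty partition (an assumption of this sketch).
    """
    for code in reversed(earlier_codes):
        if code is not None:
            return code != current_code
    return False  # no earlier non-empty partition to compare against


# Yesterday's partition was empty, so the comparison falls back to the day before.
changed = datatype_changed([1, None], 7)
```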

ColumnComparisonDailyPartitionedChecksSpecMap

Container of comparison checks for each defined data comparison. Each key in this dictionary must match the name of a table comparison that is defined on the parent table. Contains the configuration of column-level comparison checks. Each column-level check container also defines the name of the reference column to which the current column is compared.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| self | | Dict[string, ColumnComparisonDailyPartitionedChecksSpec] | | | |
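
The key-matching requirement for this dictionary can be expressed as a simple set difference. The Python sketch below is a hypothetical validation helper, not a DQOps function, and the comparison names `compare_to_source` and `compare_to_backup` are invented for the example.

```python
def unmatched_comparison_keys(column_comparisons, table_comparison_names):
    """Return column-level comparison keys with no table comparison of the same name."""
    return sorted(set(column_comparisons) - set(table_comparison_names))


# Hypothetical configuration: one key matches a table comparison, one does not.
column_comparisons = {
    "compare_to_source": {"reference_column": "amount"},
    "compare_to_backup": {"reference_column": "amount"},
}
table_comparison_names = ["compare_to_source"]

orphans = unmatched_comparison_keys(column_comparisons, table_comparison_names)
```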

ColumnComparisonDailyPartitionedChecksSpec

Container of built-in preconfigured column level comparison checks that compare min/max/sum/mean/nulls measures between the column in the tested (parent) table and a matching reference column in the reference table (the source of truth). This is the configuration for daily partitioned checks that are counted in KPIs.

The structure of this object is described below

| Property name | Description | Data type | Enum values | Default value | Sample values |
|---|---|---|---|---|---|
| daily_partition_sum_match | Verifies that the percentage difference between the sum of values in the tested column of the parent table and the sum of values in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonSumMatchCheckSpec | | | |
| daily_partition_min_match | Verifies that the percentage difference between the minimum value in the tested column of the parent table and the minimum value in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonMinMatchCheckSpec | | | |
| daily_partition_max_match | Verifies that the percentage difference between the maximum value in the tested column of the parent table and the maximum value in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonMaxMatchCheckSpec | | | |
| daily_partition_mean_match | Verifies that the percentage difference between the mean (average) value in the tested column of the parent table and the mean (average) value in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonMeanMatchCheckSpec | | | |
| daily_partition_not_null_count_match | Verifies that the percentage difference between the count of not null values in the tested column of the parent table and the count of not null values in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonNotNullCountMatchCheckSpec | | | |
| daily_partition_null_count_match | Verifies that the percentage difference between the count of null values in the tested column of the parent table and the count of null values in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonNullCountMatchCheckSpec | | | |
| daily_partition_distinct_count_match | Verifies that the percentage difference between the count of distinct values in the tested column of the parent table and the count of distinct values in the matching column of the reference table is below the defined percentage thresholds. Compares each daily partition (each day of data) between the compared table and the reference table (the source of truth). | ColumnComparisonDistinctCountMatchCheckSpec | | | |
| reference_column | The name of the reference column in the reference table. It is the column to which the current column is compared. | string | | | |
| custom_checks | Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap | | | |
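
The per-partition comparison described by these checks can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the DQOps implementation: the difference is assumed to be measured relative to the reference value, a day that is missing on either side is assumed to count as a mismatch, and the function and parameter names are hypothetical.

```python
def diff_percent(tested: float, reference: float) -> float:
    """Percentage difference relative to the reference value (assumed formula)."""
    if reference == 0:
        return 0.0 if tested == 0 else float("inf")
    return abs(tested - reference) / abs(reference) * 100.0


def mismatched_days(tested_by_day, reference_by_day, max_diff_percent):
    """Return the days whose measures differ by more than the threshold.

    Both arguments map a day (for example "2025-07-20") to one aggregated
    measure such as a sum, minimum, or not null count.
    """
    mismatches = []
    for day in sorted(set(tested_by_day) | set(reference_by_day)):
        if day not in tested_by_day or day not in reference_by_day:
            mismatches.append(day)  # assumption: a missing partition is a mismatch
        elif diff_percent(tested_by_day[day], reference_by_day[day]) > max_diff_percent:
            mismatches.append(day)
    return mismatches


# Hypothetical daily sums from the tested table and the reference table.
tested = {"2025-07-20": 101.0, "2025-07-21": 90.0}
reference = {"2025-07-20": 100.0, "2025-07-21": 100.0}
days_over_threshold = mismatched_days(tested, reference, max_diff_percent=5.0)
```

The same helper applies to any of the measures listed in the table, because each check reduces a daily partition to a single number on both sides before comparing.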