Last updated: July 22, 2025
DQOps YAML file definitions
The definition of YAML files used by DQOps to configure the data sources, monitored tables, and activated data quality checks.
ColumnMonthlyPartitionedCheckCategoriesSpec
Container of column-level partitioned data quality checks, grouped by check category, that are evaluated at a monthly partition level.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
nulls |
Monthly partitioned checks of nulls in the column | ColumnNullsMonthlyPartitionedChecksSpec | |||
uniqueness |
Monthly partitioned checks of uniqueness in the column | ColumnUniquenessMonthlyPartitionedChecksSpec | |||
accepted_values |
Configuration of accepted values checks on a column level | ColumnAcceptedValuesMonthlyPartitionedChecksSpec | |||
text |
Monthly partitioned checks of text values in the column | ColumnTextMonthlyPartitionedChecksSpec | |||
whitespace |
Configuration of column level checks that detect blank and whitespace values | ColumnWhitespaceMonthlyPartitionedChecksSpec | |||
conversions |
Configuration of conversion testing checks on a column level. | ColumnConversionsMonthlyPartitionedChecksSpec | |||
patterns |
Monthly partitioned pattern match checks on a column level | ColumnPatternsMonthlyPartitionedChecksSpec | |||
pii |
Monthly partitioned checks of Personally Identifiable Information (PII) in the column | ColumnPiiMonthlyPartitionedChecksSpec | |||
numeric |
Monthly partitioned checks of numeric values in the column | ColumnNumericMonthlyPartitionedChecksSpec | |||
datetime |
Monthly partitioned checks of datetime in the column | ColumnDatetimeMonthlyPartitionedChecksSpec | |||
bool |
Monthly partitioned checks for booleans in the column | ColumnBoolMonthlyPartitionedChecksSpec | |||
integrity |
Monthly partitioned checks for integrity in the column | ColumnIntegrityMonthlyPartitionedChecksSpec | |||
custom_sql |
Monthly partitioned checks using custom SQL expressions evaluated on the column | ColumnCustomSqlMonthlyPartitionedChecksSpec | |||
datatype |
Monthly partitioned checks for datatype in the column | ColumnDatatypeMonthlyPartitionedChecksSpec | |||
comparisons |
Dictionary of configuration of checks for table comparisons at a column level. The key that identifies each comparison must match the name of a data comparison that is configured on the parent table. | ColumnComparisonMonthlyPartitionedChecksSpecMap | |||
custom |
Dictionary of custom checks. The keys are check names within this category. | CustomCheckSpecMap |
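
The category names in the table above are the keys under which monthly partitioned checks are configured for a column. The fragment below is a minimal sketch of where this object could sit in a table YAML file. The surrounding layout (`apiVersion`, `kind`, `spec`, `timestamp_columns`, `columns`, `partitioned_checks`, `monthly`) and the `error` severity key are assumptions based on the usual DQOps table file structure and are not defined on this page. The column names `created_at` and `customer_email` are hypothetical, and the `max_count` parameter of the duplicate count check is assumed.

```yaml
# Sketch only: the envelope keys and column names are assumptions, not taken from this page.
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: created_at    # hypothetical date column that assigns rows to partitions
  columns:
    customer_email:                    # hypothetical monitored column
      partitioned_checks:
        monthly:                       # ColumnMonthlyPartitionedCheckCategoriesSpec
          nulls:                       # ColumnNullsMonthlyPartitionedChecksSpec
            monthly_partition_nulls_count:
              error:
                max_count: 0
          uniqueness:                  # ColumnUniquenessMonthlyPartitionedChecksSpec
            monthly_partition_duplicate_count:
              error:
                max_count: 0           # assumed rule parameter name
```

Each category key is optional, so a column would list only the categories whose checks are activated.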
ColumnNullsMonthlyPartitionedChecksSpec
Container of nulls data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_nulls_count |
Detects incomplete columns that contain any null values. Counts the number of rows having a null value. Raises a data quality issue when the count of null values is above a max_count threshold. Stores a separate data quality check result for each monthly partition. | ColumnNullsCountCheckSpec | |||
monthly_partition_nulls_percent |
Detects incomplete columns that contain any null values. Measures the percentage of rows having a null value. Raises a data quality issue when the percentage of null values is above a max_percent threshold. Stores a separate data quality check result for each monthly partition. | ColumnNullsPercentCheckSpec | |||
monthly_partition_not_nulls_count |
Verifies that a column contains a minimum number of non-null values. The default value of the min_count parameter is 1 to detect at least one value in a monitored column. Raises a data quality issue when the count of non-null values is below min_count. Stores a separate data quality check result for each monthly partition. | ColumnNotNullsCountCheckSpec | |||
monthly_partition_not_nulls_percent |
Detects columns that contain too many non-null values. Measures the percentage of rows that have non-null values. Raises a data quality issue when the percentage of non-null values is above a max_percent threshold. Stores a separate data quality check result for each monthly partition. | ColumnNotNullsPercentCheckSpec | |||
monthly_partition_empty_column_found |
Detects empty columns that contain only null values. Counts the number of rows that have non-null values. Raises a data quality issue when the column is empty. Stores a separate data quality check result for each monthly partition. | ColumnEmptyColumnFoundCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
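
The thresholds named in the descriptions above (`max_count`, `max_percent`, `min_count`) are rule parameters. As a hedged illustration, the fragment below shows how they might be set at more than one severity level. The `warning`, `error`, and `fatal` keys are assumed from the usual DQOps rule layout, the threshold values are purely illustrative, and the fragment is assumed to be nested under the `monthly` node of a column's partitioned checks.

```yaml
# Fragment of the "nulls" category; severity keys and values are illustrative assumptions.
nulls:
  monthly_partition_nulls_percent:
    warning:
      max_percent: 1.0     # raise a warning when more than 1% of rows in a monthly partition are null
    error:
      max_percent: 5.0
    fatal:
      max_percent: 20.0
  monthly_partition_not_nulls_count:
    error:
      min_count: 1         # the default described above: at least one non-null value
```

Because these are partitioned checks, each threshold is evaluated separately for every monthly partition.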
ColumnUniquenessMonthlyPartitionedChecksSpec
Container of uniqueness data quality partitioned checks on a column level that are checking at a monthly level.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_distinct_count |
Verifies that the number of distinct values stays within an accepted range. Stores a separate data quality check result for each monthly partition. | ColumnDistinctCountCheckSpec | |||
monthly_partition_distinct_percent |
Verifies that the percentage of distinct values in a column does not fall below the minimum accepted percent. Stores a separate data quality check result for each monthly partition. | ColumnDistinctPercentCheckSpec | |||
monthly_partition_duplicate_count |
Verifies that the number of duplicate values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each monthly partition. | ColumnDuplicateCountCheckSpec | |||
monthly_partition_duplicate_percent |
Verifies that the percent of duplicate values in a column does not exceed the maximum accepted percent. Stores a separate data quality check result for each monthly partition. | ColumnDuplicatePercentCheckSpec | |||
monthly_partition_distinct_count_change |
Verifies that the distinct count in a monitored column has not changed by more than a fixed rate since the last readout. | ColumnDistinctCountChangeCheckSpec | |||
monthly_partition_distinct_percent_change |
Verifies that the distinct percent in a monitored column has not changed by more than a fixed rate since the last readout. | ColumnDistinctPercentChangeCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
ColumnAcceptedValuesMonthlyPartitionedChecksSpec
Container of accepted values data quality partitioned checks on a column level that are checking at a monthly level.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_text_found_in_set_percent |
The check measures the percentage of rows whose value in a tested column is one of the values from a list of expected values or the column value is null. Verifies that the percentage of rows having a valid column value does not fall below the minimum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnTextFoundInSetPercentCheckSpec | |||
monthly_partition_number_found_in_set_percent |
The check measures the percentage of rows whose value in a tested column is one of the values from a list of expected values or the column value is null. Verifies that the percentage of rows having a valid column value does not fall below the minimum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnNumberFoundInSetPercentCheckSpec | |||
monthly_partition_expected_text_values_in_use_count |
Verifies that the expected string values were found in the column. Raises a data quality issue when too many expected values were not found (were missing). Stores a separate data quality check result for each monthly partition. | ColumnExpectedTextValuesInUseCountCheckSpec | |||
monthly_partition_expected_texts_in_top_values_count |
Verifies that the top X most popular column values contain all values from a list of expected values. Stores a separate data quality check result for each monthly partition. | ColumnExpectedTextsInTopValuesCountCheckSpec | |||
monthly_partition_expected_numbers_in_use_count |
Verifies that the expected numeric values were found in the column. Raises a data quality issue when too many expected values were not found (were missing). Stores a separate data quality check result for each monthly partition. | ColumnExpectedNumbersInUseCountCheckSpec | |||
monthly_partition_text_valid_country_code_percent |
Verifies that the percentage of valid country codes in a text column does not fall below the minimum accepted percentage. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextValidCountryCodePercentCheckSpec | |||
monthly_partition_text_valid_currency_code_percent |
Verifies that the percentage of valid currency codes in a text column does not fall below the minimum accepted percentage. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextValidCurrencyCodePercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
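
Checks in this category need a list of expected values in addition to a rule threshold. The fragment below is a sketch of one possible configuration. The `parameters` node, the `expected_values` list, and the `min_percent` rule parameter are assumptions not defined on this page, and the currency codes are illustrative sample data.

```yaml
# Fragment of the "accepted_values" category; parameter names are assumptions.
accepted_values:
  monthly_partition_text_found_in_set_percent:
    parameters:
      expected_values:     # assumed name of the sensor parameter holding the accepted values
        - USD
        - EUR
        - GBP
    error:
      min_percent: 100.0   # every non-null value in a monthly partition must be in the list
```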
ColumnTextMonthlyPartitionedChecksSpec
Container of text data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_text_min_length |
This check finds the length of the shortest text in a column. Then, it verifies that the minimum length is within an accepted range. It detects that the shortest text is too short. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextMinLengthCheckSpec | |||
monthly_partition_text_max_length |
This check finds the length of the longest text in a column. Then, it verifies that the maximum length is within an accepted range. It detects that the texts are too long or not long enough. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextMaxLengthCheckSpec | |||
monthly_partition_text_mean_length |
Verifies that the mean (average) length of texts in a column is within an accepted range. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextMeanLengthCheckSpec | |||
monthly_partition_text_length_below_min_length |
The check counts the number of text values in the column that are shorter than the minimum length defined by the user as a parameter. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextLengthBelowMinLengthCheckSpec | |||
monthly_partition_text_length_below_min_length_percent |
The check measures the percentage of text values in the column that are shorter than the minimum length defined by the user as a parameter. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextLengthBelowMinLengthPercentCheckSpec | |||
monthly_partition_text_length_above_max_length |
The check counts the number of text values in the column that are longer than the maximum length defined by the user as a parameter. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextLengthAboveMaxLengthCheckSpec | |||
monthly_partition_text_length_above_max_length_percent |
The check measures the percentage of text values in the column that are longer than the maximum length defined by the user as a parameter. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextLengthAboveMaxLengthPercentCheckSpec | |||
monthly_partition_text_length_in_range_percent |
The check measures the percentage of text values in the column whose length is within the range provided by the user. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextLengthInRangePercentCheckSpec | |||
monthly_partition_min_word_count |
This check finds the lowest word count of text in a column. Then, it verifies that the minimum word count is within an accepted range. It detects that the text contains too few words. | ColumnTextMinWordCountCheckSpec | |||
monthly_partition_max_word_count |
This check finds the highest word count of text in a column. Then, it verifies that the maximum word count is within an accepted range. It detects that the text contains too many words. | ColumnTextMaxWordCountCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
ColumnWhitespaceMonthlyPartitionedChecksSpec
Container of whitespace values detection data quality partitioned checks on a column level that are checking at a monthly level.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_empty_text_found |
Detects empty texts (not null, zero-length texts). This check counts empty texts and raises a data quality issue when their count exceeds a max_count parameter value. Stores a separate data quality check result for each monthly partition. | ColumnWhitespaceEmptyTextFoundCheckSpec | |||
monthly_partition_whitespace_text_found |
Detects texts that contain only spaces and other whitespace characters. It raises a data quality issue when their count exceeds a max_count parameter value. Stores a separate data quality check result for each monthly partition. | ColumnWhitespaceWhitespaceTextFoundCheckSpec | |||
monthly_partition_null_placeholder_text_found |
Detects texts that are well-known placeholders of null values, such as None, null, n/a. It counts null placeholders and raises a data quality issue when their count exceeds a max_count parameter value. Stores a separate data quality check result for each monthly partition. | ColumnWhitespaceNullPlaceholderTextFoundCheckSpec | |||
monthly_partition_empty_text_percent |
Detects empty texts (not null, zero-length texts) and measures their percentage in the column. This check verifies that the rate of empty strings in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnWhitespaceEmptyTextPercentCheckSpec | |||
monthly_partition_whitespace_text_percent |
Detects texts that contain only spaces and other whitespace characters and measures their percentage in the column. It raises a data quality issue when their rate exceeds a max_percent parameter value. Stores a separate data quality check result for each monthly partition. | ColumnWhitespaceWhitespaceTextPercentCheckSpec | |||
monthly_partition_null_placeholder_text_percent |
Detects texts that are well-known placeholders of null values, such as None, null, n/a, and measures their percentage in the column. It raises a data quality issue when their rate exceeds a max_percent parameter value. Stores a separate data quality check result for each monthly partition. | ColumnWhitespaceNullPlaceholderTextPercentCheckSpec | |||
monthly_partition_text_surrounded_by_whitespace_found |
Detects text values that are surrounded by whitespace characters on any side. This check counts whitespace-surrounded texts and raises a data quality issue when their count exceeds the max_count parameter value. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnWhitespaceTextSurroundedByWhitespaceFoundCheckSpec | |||
monthly_partition_text_surrounded_by_whitespace_percent |
This check detects text values that are surrounded by whitespace characters on any side and measures their percentage. This check raises a data quality issue when their percentage exceeds the max_percent parameter value. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnWhitespaceTextSurroundedByWhitespacePercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
ColumnConversionsMonthlyPartitionedChecksSpec
Container of conversion test checks that are monitoring if text values are convertible to a target data type at a monthly partition level.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_text_parsable_to_boolean_percent |
Verifies that the percentage of text values that are parsable to a boolean value does not fall below the minimum accepted percentage. Text values identified as boolean placeholders are: 0, 1, true, false, t, f, yes, no, y, n. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextParsableToBooleanPercentCheckSpec | |||
monthly_partition_text_parsable_to_integer_percent |
Verifies that the percentage of text values that are parsable to an integer value in a column does not fall below the minimum accepted percentage. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextParsableToIntegerPercentCheckSpec | |||
monthly_partition_text_parsable_to_float_percent |
Verifies that the percentage of text values that are parsable to a float value in a column does not fall below the minimum accepted percentage. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextParsableToFloatPercentCheckSpec | |||
monthly_partition_text_parsable_to_date_percent |
Verifies that the percentage of text values that are parsable to a date value in a column does not fall below the minimum accepted percentage. DQOps uses a safe_cast when possible, otherwise the text is verified using a regular expression. Analyzes every monthly partition and creates a separate data quality check result with the time period value that identifies the monthly partition. | ColumnTextParsableToDatePercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
ColumnPatternsMonthlyPartitionedChecksSpec
Container of built-in preconfigured monthly partition checks on a column level that are checking for values matching patterns (regular expressions) in text columns.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_text_not_matching_regex_found |
Verifies that the number of text values not matching the custom regular expression pattern does not exceed the maximum accepted count. | ColumnTextNotMatchingRegexFoundCheckSpec | |||
monthly_partition_texts_not_matching_regex_percent |
Verifies that the percentage of strings not matching the custom regular expression pattern does not exceed the maximum accepted percentage. | ColumnTextsNotMatchingRegexPercentCheckSpec | |||
monthly_partition_invalid_email_format_found |
Verifies that the number of invalid emails in a text column does not exceed the maximum accepted count. | ColumnInvalidEmailFormatFoundCheckSpec | |||
monthly_partition_invalid_email_format_percent |
Verifies that the percentage of invalid emails in a text column does not exceed the maximum accepted percentage. | ColumnInvalidEmailFormatPercentCheckSpec | |||
monthly_partition_text_not_matching_date_pattern_found |
Verifies that the number of texts not matching the date format regular expression does not exceed the maximum accepted count. | ColumnTextNotMatchingDatePatternFoundCheckSpec | |||
monthly_partition_text_not_matching_date_pattern_percent |
Verifies that the percentage of texts not matching the date format regular expression in a column does not exceed the maximum accepted percentage. | ColumnTextNotMatchingDatePatternPercentCheckSpec | |||
monthly_partition_text_not_matching_name_pattern_percent |
Verifies that the percentage of texts not matching the name regular expression does not exceed the maximum accepted percentage. | ColumnTextNotMatchingNamePatternPercentCheckSpec | |||
monthly_partition_invalid_uuid_format_found |
Verifies that the number of invalid UUIDs in a text column does not exceed the maximum accepted count. | ColumnInvalidUuidFormatFoundCheckSpec | |||
monthly_partition_invalid_uuid_format_percent |
Verifies that the percentage of invalid UUIDs in a text column does not exceed the maximum accepted percentage. | ColumnInvalidUuidFormatPercentCheckSpec | |||
monthly_partition_invalid_ip4_address_format_found |
Verifies that the number of invalid IP4 addresses in a text column does not exceed the maximum accepted count. | ColumnInvalidIp4AddressFormatFoundCheckSpec | |||
monthly_partition_invalid_ip6_address_format_found |
Verifies that the number of invalid IP6 addresses in a text column does not exceed the maximum accepted count. | ColumnInvalidIp6AddressFormatFoundCheckSpec | |||
monthly_partition_invalid_usa_phone_format_found |
Verifies that the number of invalid USA phone numbers in a text column does not exceed the maximum accepted count. | ColumnInvalidUsaPhoneFoundCheckSpec | |||
monthly_partition_invalid_usa_zipcode_format_found |
Verifies that the number of invalid USA zip codes in a text column does not exceed the maximum accepted count. | ColumnInvalidUsaZipcodeFoundCheckSpec | |||
monthly_partition_invalid_usa_phone_format_percent |
Verifies that the percentage of invalid USA phone numbers in a text column does not exceed the maximum accepted percentage. | ColumnInvalidUsaPhonePercentCheckSpec | |||
monthly_partition_invalid_usa_zipcode_format_percent |
Verifies that the percentage of invalid USA zip codes in a text column does not exceed the maximum accepted percentage. | ColumnInvalidUsaZipcodePercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
ColumnPiiMonthlyPartitionedChecksSpec
Container of PII data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_contains_usa_phone_percent |
Detects USA phone numbers in text columns. Verifies that the percentage of rows that contain a USA phone number in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnPiiContainsUsaPhonePercentCheckSpec | |||
monthly_partition_contains_email_percent |
Detects emails in text columns. Verifies that the percentage of rows that contain emails in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnPiiContainsEmailPercentCheckSpec | |||
monthly_partition_contains_usa_zipcode_percent |
Detects USA zip codes in text columns. Verifies that the percentage of rows that contain a USA zip code in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnPiiContainsUsaZipcodePercentCheckSpec | |||
monthly_partition_contains_ip4_percent |
Detects IP4 addresses in text columns. Verifies that the percentage of rows that contain IP4 address values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnPiiContainsIp4PercentCheckSpec | |||
monthly_partition_contains_ip6_percent |
Detects IP6 addresses in text columns. Verifies that the percentage of rows that contain valid IP6 address values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnPiiContainsIp6PercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
ColumnNumericMonthlyPartitionedChecksSpec
Container of numeric data quality partitioned checks on a column level that are checking at a monthly level.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_number_below_min_value |
The check counts the number of values in the column that are below the value defined by the user as a parameter. Stores a separate data quality check result for each monthly partition. | ColumnNumberBelowMinValueCheckSpec | |||
monthly_partition_number_above_max_value |
The check counts the number of values in the column that are above the value defined by the user as a parameter. Stores a separate data quality check result for each monthly partition. | ColumnNumberAboveMaxValueCheckSpec | |||
monthly_partition_negative_values |
Verifies that the number of negative values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each monthly partition. | ColumnNegativeCountCheckSpec | |||
monthly_partition_negative_values_percent |
Verifies that the percentage of negative values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnNegativePercentCheckSpec | |||
monthly_partition_number_below_min_value_percent |
The check measures the percentage of values in the column that are below the value defined by the user as a parameter. Stores a separate data quality check result for each monthly partition. | ColumnNumberBelowMinValuePercentCheckSpec | |||
monthly_partition_number_above_max_value_percent |
The check measures the percentage of values in the column that are above the value defined by the user as a parameter. Stores a separate data quality check result for each monthly partition. | ColumnNumberAboveMaxValuePercentCheckSpec | |||
monthly_partition_number_in_range_percent |
Verifies that the percentage of values within the range in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnNumberInRangePercentCheckSpec | |||
monthly_partition_integer_in_range_percent |
Verifies that the percentage of integer values within the range in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnIntegerInRangePercentCheckSpec | |||
monthly_partition_min_in_range |
Verifies that the minimum value in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnMinInRangeCheckSpec | |||
monthly_partition_max_in_range |
Verifies that the maximum value in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnMaxInRangeCheckSpec | |||
monthly_partition_sum_in_range |
Verifies that the sum of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnSumInRangeCheckSpec | |||
monthly_partition_mean_in_range |
Verifies that the average (mean) of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnMeanInRangeCheckSpec | |||
monthly_partition_median_in_range |
Verifies that the median of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnMedianInRangeCheckSpec | |||
monthly_partition_percentile_in_range |
Verifies that the percentile of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPercentileInRangeCheckSpec | |||
monthly_partition_percentile_10_in_range |
Verifies that the 10th percentile of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPercentile10InRangeCheckSpec | |||
monthly_partition_percentile_25_in_range |
Verifies that the 25th percentile of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPercentile25InRangeCheckSpec | |||
monthly_partition_percentile_75_in_range |
Verifies that the 75th percentile of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPercentile75InRangeCheckSpec | |||
monthly_partition_percentile_90_in_range |
Verifies that the 90th percentile of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPercentile90InRangeCheckSpec | |||
monthly_partition_sample_stddev_in_range |
Verifies that the sample standard deviation of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnSampleStddevInRangeCheckSpec | |||
monthly_partition_population_stddev_in_range |
Verifies that the population standard deviation of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPopulationStddevInRangeCheckSpec | |||
monthly_partition_sample_variance_in_range |
Verifies that the sample variance of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnSampleVarianceInRangeCheckSpec | |||
monthly_partition_population_variance_in_range |
Verifies that the population variance of all values in a column is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnPopulationVarianceInRangeCheckSpec | |||
monthly_partition_invalid_latitude |
Verifies that the number of invalid latitude values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each monthly partition. | ColumnInvalidLatitudeCountCheckSpec | |||
monthly_partition_valid_latitude_percent |
Verifies that the percentage of valid latitude values in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnValidLatitudePercentCheckSpec | |||
monthly_partition_invalid_longitude |
Verifies that the number of invalid longitude values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each monthly partition. | ColumnInvalidLongitudeCountCheckSpec | |||
monthly_partition_valid_longitude_percent |
Verifies that the percentage of valid longitude values in a column does not fall below the minimum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnValidLongitudePercentCheckSpec | |||
monthly_partition_non_negative_values |
Verifies that the number of non-negative values in a column does not exceed the maximum accepted count. Stores a separate data quality check result for each monthly partition. | ColumnNonNegativeCountCheckSpec | |||
monthly_partition_non_negative_values_percent |
Verifies that the percentage of non-negative values in a column does not exceed the maximum accepted percentage. Stores a separate data quality check result for each monthly partition. | ColumnNonNegativePercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
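
As a rough illustration of how checks from this category might be activated, the fragment below configures a few of the numeric checks listed above. It is a sketch under stated assumptions, not a verified configuration: the surrounding layout (`spec`, `columns`, `partitioned_checks`, `monthly`), the column names `sale_date`, `unit_price` and `store_latitude`, all threshold values, and the parameter names `from`, `to`, `max_count` and `percentile_value` do not come from this table and should be checked against the DQOps check reference.

```yaml
# Sketch only: layout, column names, thresholds and parameter names are assumptions.
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: sale_date        # hypothetical column that assigns rows to monthly partitions
  columns:
    unit_price:                           # hypothetical numeric column
      partitioned_checks:
        monthly:
          numeric:
            monthly_partition_median_in_range:
              warning:                    # severity level; the range below is an example
                from: 10.0
                to: 20.0
            monthly_partition_percentile_in_range:
              parameters:
                percentile_value: 0.95    # assumed sensor parameter: which percentile to measure
              error:
                from: 0.0
                to: 500.0
    store_latitude:                       # hypothetical column holding latitude values
      partitioned_checks:
        monthly:
          numeric:
            monthly_partition_invalid_latitude:
              error:
                max_count: 0              # no invalid latitude values tolerated in any month
```

Because these are partitioned checks, each rule is evaluated once per monthly partition, so a single out-of-range month is reported on its own rather than being averaged away across the whole table.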
ColumnDatetimeMonthlyPartitionedChecksSpec
Container of date-time data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_date_values_in_future_percent |
Detects dates in the future in date, datetime and timestamp columns. Measures a percentage of dates in the future. Raises a data quality issue when too many future dates are found. Stores a separate data quality check result for each monthly partition. | ColumnDateValuesInFuturePercentCheckSpec | |||
monthly_partition_date_in_range_percent |
Verifies that the dates in date, datetime, or timestamp columns are within a reasonable range of dates. The default configuration detects fake dates such as 1900-01-01 and 2099-12-31. Measures the percentage of valid dates and raises a data quality issue when too many dates fall outside the range. Stores a separate data quality check result for each monthly partition. | ColumnDateInRangePercentCheckSpec | |||
monthly_partition_text_match_date_format_percent |
Verifies that the values in text columns match one of the predefined date formats, such as an ISO 8601 date. Measures the percentage of valid date strings and raises a data quality issue when too many invalid date strings are found. Stores a separate data quality check result for each monthly partition. | ColumnTextMatchDateFormatPercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
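
A minimal sketch of how the date-time checks above might look in a table file follows. The column name `delivered_at`, the thresholds, the rule parameters `max_percent` and `min_percent`, and the sensor parameters `min_date` and `max_date` are assumptions made for illustration and are not confirmed by this table.

```yaml
# Sketch only: column name, thresholds and parameter names are assumptions.
spec:
  columns:
    delivered_at:                         # hypothetical date, datetime or timestamp column
      partitioned_checks:
        monthly:
          datetime:
            monthly_partition_date_values_in_future_percent:
              warning:
                max_percent: 1.0          # tolerate up to 1% of future dates per month
            monthly_partition_date_in_range_percent:
              parameters:
                min_date: 1900-01-02      # assumed bounds that exclude the fake dates
                max_date: 2099-12-30      # 1900-01-01 and 2099-12-31
              warning:
                min_percent: 99.0         # at least 99% of dates must be inside the range
```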
ColumnBoolMonthlyPartitionedChecksSpec
Container of boolean data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_true_percent |
Measures the percentage of true values in a boolean column and verifies that it is within the accepted range. Stores a separate data quality check result for each monthly partition. | ColumnTruePercentCheckSpec | |||
monthly_partition_false_percent |
Measures the percentage of false values in a boolean column and verifies that it is within the accepted range. Stores a separate data quality check result for each monthly partition. | ColumnFalsePercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
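
The boolean checks could be activated along the following lines. This is a hedged sketch: the column name `is_active`, the threshold values and the rule parameter `min_percent` are assumptions, and the rule may accept further parameters that are not shown here.

```yaml
# Sketch only: column name, thresholds and parameter names are assumptions.
spec:
  columns:
    is_active:                            # hypothetical boolean column
      partitioned_checks:
        monthly:
          bool:
            monthly_partition_true_percent:
              warning:
                min_percent: 95.0         # warn when fewer than 95% of values in a month are true
              error:
                min_percent: 90.0         # escalate to an error below 90%
```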
ColumnIntegrityMonthlyPartitionedChecksSpec
Container of integrity data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_lookup_key_not_found |
Detects invalid values that are not present in a dictionary table using an outer join query. Counts the number of invalid keys. Stores a separate data quality check result for each monthly partition. | ColumnIntegrityLookupKeyNotFoundCountCheckSpec | |||
monthly_partition_lookup_key_found_percent |
Measures the percentage of valid values that are present in a dictionary table. Joins this table to a dictionary table using an outer join. Stores a separate data quality check result for each monthly partition. | ColumnIntegrityForeignKeyMatchPercentCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
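
The following fragment sketches how a lookup against a dictionary table might be configured. The column name `customer_id`, the dictionary table `dim_customer`, the thresholds, and the parameter names `foreign_table`, `foreign_column`, `max_count` and `min_percent` are all assumptions used for illustration.

```yaml
# Sketch only: table and column names, thresholds and parameter names are assumptions.
spec:
  columns:
    customer_id:                          # hypothetical foreign key column
      partitioned_checks:
        monthly:
          integrity:
            monthly_partition_lookup_key_not_found:
              parameters:
                foreign_table: dim_customer     # hypothetical dictionary table
                foreign_column: customer_id     # hypothetical key column in the dictionary table
              error:
                max_count: 0              # any key missing from the dictionary is an error
            monthly_partition_lookup_key_found_percent:
              parameters:
                foreign_table: dim_customer
                foreign_column: customer_id
              warning:
                min_percent: 99.0         # at least 99% of keys must be found
```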
ColumnCustomSqlMonthlyPartitionedChecksSpec
Container of built-in preconfigured data quality checks on a column level that are using custom SQL expressions (conditions).
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_sql_condition_failed_on_column |
Verifies that a custom SQL expression is met for each row. Counts the number of rows where the expression is not satisfied, and raises an issue if too many failures are detected. This check is also used to compare values between the current column and another column: `{alias}.{column} > {alias}.col_tax`. Stores a separate data quality check result for each monthly partition. | ColumnSqlConditionFailedCheckSpec | |||
monthly_partition_sql_condition_passed_percent_on_column |
Verifies that a minimum percentage of rows passed a custom SQL condition (expression). Reference the current column by using tokens, for example: `{alias}.{column} > {alias}.col_tax`. Stores a separate data quality check result for each monthly partition. | ColumnSqlConditionPassedPercentCheckSpec | |||
monthly_partition_sql_aggregate_expression_on_column |
Verifies that a custom aggregated SQL expression (MIN, MAX, etc.) is not outside the expected range. Stores a separate data quality check result for each monthly partition. | ColumnSqlAggregateExpressionCheckSpec | |||
monthly_partition_import_custom_result_on_column |
Runs a custom query that retrieves the result of a data quality check performed in the data engineering process. The result (the severity level) is pulled from a separate table. | ColumnSqlImportCustomResultCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
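
To make the token syntax concrete, the sketch below reuses the `{alias}.{column} > {alias}.col_tax` expression from the descriptions above. The column name `col_total`, the thresholds, and the parameter names `sql_condition`, `sql_expression`, `max_count`, `from` and `to` are assumptions rather than names confirmed by this table.

```yaml
# Sketch only: column name, thresholds and parameter names are assumptions.
spec:
  columns:
    col_total:                            # hypothetical column compared against col_tax
      partitioned_checks:
        monthly:
          custom_sql:
            monthly_partition_sql_condition_failed_on_column:
              parameters:
                # Quoted because a YAML scalar that starts with "{" would otherwise
                # be parsed as a flow mapping.
                sql_condition: "{alias}.{column} > {alias}.col_tax"
              error:
                max_count: 0              # no row in a month may violate the condition
            monthly_partition_sql_aggregate_expression_on_column:
              parameters:
                sql_expression: "MAX({alias}.{column})"
              warning:
                from: 0.0
                to: 100000.0
```

The `{alias}` and `{column}` tokens stand for the table alias and the current column, so the same condition text can be reused on other columns without editing the expression.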
ColumnDatatypeMonthlyPartitionedChecksSpec
Container of datatype data quality partitioned checks on a column level that are checking monthly partitions or rows for each month of data.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_detected_datatype_in_text |
Detects the data type of text values stored in the column. The sensor returns the code of the detected type of column data: 1 - integers, 2 - floats, 3 - dates, 4 - datetimes, 5 - timestamps, 6 - booleans, 7 - strings, 8 - mixed data types. Raises a data quality issue when the detected data type does not match the expected data type. Stores a separate data quality check result for each monthly partition. | ColumnDetectedDatatypeInTextCheckSpec | |||
monthly_partition_detected_datatype_in_text_changed |
Detects that the data type of text values stored in a text column has changed when compared to an earlier non-empty partition. The sensor returns the detected type of column data: 1 - integers, 2 - floats, 3 - dates, 4 - datetimes, 5 - timestamps, 6 - booleans, 7 - strings, 8 - mixed data types. Stores a separate data quality check result for each monthly partition. | ColumnDatatypeDetectedDatatypeInTextChangedCheckSpec | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
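
A possible configuration of the data type detection check is sketched below. The category key `datatype`, the column name `order_number`, the rule parameter `expected_datatype` and its value `integers` are assumptions; only the numeric codes in the comment come from the descriptions above.

```yaml
# Sketch only: category key, column name and parameter names are assumptions.
# Detected type codes returned by the sensor, as listed in this section:
#   1 integers, 2 floats, 3 dates, 4 datetimes, 5 timestamps,
#   6 booleans, 7 strings, 8 mixed data types
spec:
  columns:
    order_number:                         # hypothetical text column expected to hold integers
      partitioned_checks:
        monthly:
          datatype:
            monthly_partition_detected_datatype_in_text:
              error:
                expected_datatype: integers   # assumed name for the type with code 1
```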
ColumnComparisonMonthlyPartitionedChecksSpecMap
Container of comparison checks for each defined data comparison. The name of the key in this dictionary must match the name of a table comparison that is defined on the parent table. Contains the configuration of column level comparison checks. Each column level check container also defines the name of the reference column to which the column is compared.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
self |
Dict[string, ColumnComparisonMonthlyPartitionedChecksSpec] |
ColumnComparisonMonthlyPartitionedChecksSpec
Container of built-in preconfigured column level comparison checks that compare min/max/sum/mean/nulls measures between the column in the tested (parent) table and a matching reference column in the reference table (the source of truth). This is the configuration for monthly partitioned checks that are counted in KPIs.
The structure of this object is described below
Property name | Description | Data type | Enum values | Default value | Sample values |
---|---|---|---|---|---|
monthly_partition_sum_match |
Verifies that the percentage difference between the sum of values in a tested column in a parent table and the sum of values in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonSumMatchCheckSpec | |||
monthly_partition_min_match |
Verifies that the percentage difference between the minimum value in a tested column in a parent table and the minimum value in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonMinMatchCheckSpec | |||
monthly_partition_max_match |
Verifies that the percentage difference between the maximum value in a tested column in a parent table and the maximum value in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonMaxMatchCheckSpec | |||
monthly_partition_mean_match |
Verifies that the percentage difference between the mean (average) value in a tested column in a parent table and the mean (average) value in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonMeanMatchCheckSpec | |||
monthly_partition_not_null_count_match |
Verifies that the percentage difference between the count of not null values in a tested column in a parent table and the count of not null values in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonNotNullCountMatchCheckSpec | |||
monthly_partition_null_count_match |
Verifies that the percentage difference between the count of null values in a tested column in a parent table and the count of null values in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonNullCountMatchCheckSpec | |||
monthly_partition_distinct_count_match |
Verifies that the percentage difference between the count of distinct values in a tested column in a parent table and the count of distinct values in a column in the reference table is below the defined percentage thresholds. Compares each monthly partition (each month of data) between the compared table and the reference table (the source of truth). | ColumnComparisonDistinctCountMatchCheckSpec | |||
reference_column |
The name of the reference column in the reference table. It is the column to which the current column is compared. | string | |||
custom_checks |
Dictionary of additional custom checks within this category. The keys are check names defined in the definition section. The sensor parameters and rules should match the type of the configured sensor and rule for the custom check. | CustomCategoryCheckSpecMap |
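
The relationship between the comparison map, its key, and `reference_column` can be sketched as follows. The category key `comparisons`, the comparison name `compare_to_source_of_truth`, the column names, the thresholds and the rule parameter `max_diff_percent` are assumptions. As described above, the key under the map must match the name of a table comparison defined on the parent table, which is not shown here.

```yaml
# Sketch only: category key, comparison name, column names and parameter names are assumptions.
spec:
  columns:
    net_amount:                           # hypothetical tested column in the parent table
      partitioned_checks:
        monthly:
          comparisons:
            compare_to_source_of_truth:   # must match a table comparison defined on the parent table
              reference_column: net_amount_src   # hypothetical column in the reference table
              monthly_partition_sum_match:
                warning:
                  max_diff_percent: 0.0   # any difference in the monthly sums raises a warning
                error:
                  max_diff_percent: 1.0   # a difference above 1% is an error
              monthly_partition_null_count_match:
                warning:
                  max_diff_percent: 0.0
```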