Last updated: July 22, 2025
Total not null count match percent data quality checks, SQL examples
A column-level check that ensures that the difference between the count of null values in the tested column and the count of null values in another column in a referenced table is below a maximum accepted percentage of difference. This check runs an SQL query with an INNER JOIN clause to join another (referenced) table that must be defined in the same database.
The total not null count match percent data quality check has the following variants for each type of data quality checks supported by DQOps.
profile total not null count match percent
Check description
Verifies that the percentage of difference in total not null count of a column in a table and total not null count of a column of another table does not exceed the set number. Stores the most recent captured value for each day when the data quality check was evaluated.
Data quality check name | Friendly name | Category | Check type | Time scale | Quality dimension | Sensor definition | Quality rule | Standard |
---|---|---|---|---|---|---|---|---|
profile_total_not_null_count_match_percent |
The maximum difference in percent between the count of not null values of this column and the reference column | accuracy | profiling | Accuracy | total_not_null_count_match_percent | diff_percent |
Command-line examples
Please expand the section below to see the DQOps command-line examples to run or activate the profile total not null count match percent data quality check.
Managing profile total not null count match percent check from DQOps shell
Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.
dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=profile_total_not_null_count_match_percent --enable-warning
You can also use patterns to activate the check on all matching tables and columns.
dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=profile_total_not_null_count_match_percent --enable-warning
Additional rule parameters are passed using the -Wrule_parameter_name=value.
Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.
dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=profile_total_not_null_count_match_percent --enable-error
You can also use patterns to activate the check on all matching tables and columns.
dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=profile_total_not_null_count_match_percent --enable-error
Additional rule parameters are passed using the -Erule_parameter_name=value.
Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the profile_total_not_null_count_match_percent check on all tables and columns on a single data source.
It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.
dqo> check run -c=connection_name -t=schema_name.table_name -ch=profile_total_not_null_count_match_percent
You can also run this check on all tables (and columns) on which the profile_total_not_null_count_match_percent check is enabled using patterns to find tables.
YAML configuration
The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.
# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
columns:
target_column:
profiling_checks:
accuracy:
profile_total_not_null_count_match_percent:
parameters:
referenced_table: landing_zone.customer_raw
referenced_column: customer_id
warning:
max_diff_percent: 0.0
error:
max_diff_percent: 1.0
fatal:
max_diff_percent: 5.0
labels:
- This is the column that is analyzed for data quality issues
Samples of generated SQL queries for each data source type
Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the total_not_null_count_match_percent data quality sensor.
BigQuery
{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_project_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
ClickHouse
{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Databricks
{% import '/dialects/databricks.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
DB2
{% import '/dialects/db2.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT({{ lib.render_target_column('original_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} original_table
{{- lib.render_where_clause() -}} ) analyzed_table
GROUP BY actual_value
SELECT
(SELECT
COUNT(referenced_table."customer_id")
FROM landing_zone.customer_raw referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT(original_table."target_column") AS actual_value
FROM "<target_schema>"."<target_table>" original_table) analyzed_table
GROUP BY actual_value
DuckDB
{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
HANA
{% import '/dialects/hana.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
MariaDB
{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
MySQL
{% import '/dialects/mysql.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Oracle
{% import '/dialects/oracle.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT({{ lib.render_target_column('original_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} original_table
{{- lib.render_where_clause() -}} ) analyzed_table
GROUP BY actual_value
SELECT
(SELECT
COUNT(referenced_table."customer_id")
FROM landing_zone.customer_raw referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT(original_table."target_column") AS actual_value
FROM "<target_schema>"."<target_table>" original_table) analyzed_table
GROUP BY actual_value
PostgreSQL
{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Presto
{% import '/dialects/presto.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
QuestDB
{% import '/dialects/questdb.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
WITH referenced_data AS (
SELECT COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }}) AS expected_value
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
)
SELECT referenced_data.expected_value AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
CROSS JOIN referenced_data
{{- lib.render_where_clause() -}}
WITH referenced_data AS (
SELECT COUNT(referenced_table."customer_id") AS expected_value
FROM landing_zone.customer_raw AS referenced_table
)
SELECT referenced_data.expected_value AS expected_value,
COUNT(analyzed_table."target_column") AS actual_value
FROM "<target_table>" AS analyzed_table
CROSS JOIN referenced_data
Redshift
{% import '/dialects/redshift.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Snowflake
{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Spark
{% import '/dialects/spark.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
SQL Server
{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT_BIG(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT_BIG({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Teradata
{% import '/dialects/teradata.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Trino
{% import '/dialects/trino.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_catalog_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
daily total not null count match percent
Check description
Verifies that the percentage of difference in total not null count of a column in a table and total not null count of a column of another table does not exceed the set number. Stores the most recent captured value for each day when the data quality check was evaluated.
Data quality check name | Friendly name | Category | Check type | Time scale | Quality dimension | Sensor definition | Quality rule | Standard |
---|---|---|---|---|---|---|---|---|
daily_total_not_null_count_match_percent |
The maximum difference in percent between the count of not null values of this column and the reference column | accuracy | monitoring | daily | Accuracy | total_not_null_count_match_percent | diff_percent |
Command-line examples
Please expand the section below to see the DQOps command-line examples to run or activate the daily total not null count match percent data quality check.
Managing daily total not null count match percent check from DQOps shell
Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.
dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=daily_total_not_null_count_match_percent --enable-warning
You can also use patterns to activate the check on all matching tables and columns.
dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_total_not_null_count_match_percent --enable-warning
Additional rule parameters are passed using the -Wrule_parameter_name=value.
Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.
dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=daily_total_not_null_count_match_percent --enable-error
You can also use patterns to activate the check on all matching tables and columns.
dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_total_not_null_count_match_percent --enable-error
Additional rule parameters are passed using the -Erule_parameter_name=value.
Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the daily_total_not_null_count_match_percent check on all tables and columns on a single data source.
It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.
dqo> check run -c=connection_name -t=schema_name.table_name -ch=daily_total_not_null_count_match_percent
You can also run this check on all tables (and columns) on which the daily_total_not_null_count_match_percent check is enabled using patterns to find tables.
YAML configuration
The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.
# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
columns:
target_column:
monitoring_checks:
daily:
accuracy:
daily_total_not_null_count_match_percent:
parameters:
referenced_table: landing_zone.customer_raw
referenced_column: customer_id
warning:
max_diff_percent: 0.0
error:
max_diff_percent: 1.0
fatal:
max_diff_percent: 5.0
labels:
- This is the column that is analyzed for data quality issues
Samples of generated SQL queries for each data source type
Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the total_not_null_count_match_percent data quality sensor.
BigQuery
{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_project_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
ClickHouse
{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Databricks
{% import '/dialects/databricks.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
DB2
{% import '/dialects/db2.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT({{ lib.render_target_column('original_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} original_table
{{- lib.render_where_clause() -}} ) analyzed_table
GROUP BY actual_value
SELECT
(SELECT
COUNT(referenced_table."customer_id")
FROM landing_zone.customer_raw referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT(original_table."target_column") AS actual_value
FROM "<target_schema>"."<target_table>" original_table) analyzed_table
GROUP BY actual_value
DuckDB
{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
HANA
{% import '/dialects/hana.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
MariaDB
{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
MySQL
{% import '/dialects/mysql.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Oracle
{% import '/dialects/oracle.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT({{ lib.render_target_column('original_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} original_table
{{- lib.render_where_clause() -}} ) analyzed_table
GROUP BY actual_value
SELECT
(SELECT
COUNT(referenced_table."customer_id")
FROM landing_zone.customer_raw referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT(original_table."target_column") AS actual_value
FROM "<target_schema>"."<target_table>" original_table) analyzed_table
GROUP BY actual_value
PostgreSQL
{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Presto
{% import '/dialects/presto.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
QuestDB
{% import '/dialects/questdb.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
WITH referenced_data AS (
SELECT COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }}) AS expected_value
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
)
SELECT referenced_data.expected_value AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
CROSS JOIN referenced_data
{{- lib.render_where_clause() -}}
WITH referenced_data AS (
SELECT COUNT(referenced_table."customer_id") AS expected_value
FROM landing_zone.customer_raw AS referenced_table
)
SELECT referenced_data.expected_value AS expected_value,
COUNT(analyzed_table."target_column") AS actual_value
FROM "<target_table>" AS analyzed_table
CROSS JOIN referenced_data
Redshift
{% import '/dialects/redshift.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Snowflake
{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Spark
{% import '/dialects/spark.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
SQL Server
{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT_BIG(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT_BIG({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Teradata
{% import '/dialects/teradata.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Trino
{% import '/dialects/trino.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_catalog_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
monthly total not null count match percent
Check description
Verifies that the percentage of difference in total not null count of a column in a table and total not null count of a column of another table does not exceed the set number. Stores the most recent check result for each month when the data quality check was evaluated.
Data quality check name | Friendly name | Category | Check type | Time scale | Quality dimension | Sensor definition | Quality rule | Standard |
---|---|---|---|---|---|---|---|---|
monthly_total_not_null_count_match_percent |
The maximum difference in percent between the count of not null values of this column and the reference column | accuracy | monitoring | monthly | Accuracy | total_not_null_count_match_percent | diff_percent |
Command-line examples
Please expand the section below to see the DQOps command-line examples to run or activate the monthly total not null count match percent data quality check.
Managing monthly total not null count match percent check from DQOps shell
Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.
dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=monthly_total_not_null_count_match_percent --enable-warning
You can also use patterns to activate the check on all matching tables and columns.
dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_total_not_null_count_match_percent --enable-warning
Additional rule parameters are passed using the -Wrule_parameter_name=value.
Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.
dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=monthly_total_not_null_count_match_percent --enable-error
You can also use patterns to activate the check on all matching tables and columns.
dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_total_not_null_count_match_percent --enable-error
Additional rule parameters are passed using the -Erule_parameter_name=value.
Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the monthly_total_not_null_count_match_percent check on all tables and columns on a single data source.
It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.
dqo> check run -c=connection_name -t=schema_name.table_name -ch=monthly_total_not_null_count_match_percent
You can also run this check on all tables (and columns) on which the monthly_total_not_null_count_match_percent check is enabled using patterns to find tables.
YAML configuration
The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.
# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
columns:
target_column:
monitoring_checks:
monthly:
accuracy:
monthly_total_not_null_count_match_percent:
parameters:
referenced_table: landing_zone.customer_raw
referenced_column: customer_id
warning:
max_diff_percent: 0.0
error:
max_diff_percent: 1.0
fatal:
max_diff_percent: 5.0
labels:
- This is the column that is analyzed for data quality issues
Samples of generated SQL queries for each data source type
Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the total_not_null_count_match_percent data quality sensor.
BigQuery
{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_project_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
ClickHouse
{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Databricks
{% import '/dialects/databricks.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
DB2
{% import '/dialects/db2.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT({{ lib.render_target_column('original_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} original_table
{{- lib.render_where_clause() -}} ) analyzed_table
GROUP BY actual_value
SELECT
(SELECT
COUNT(referenced_table."customer_id")
FROM landing_zone.customer_raw referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT(original_table."target_column") AS actual_value
FROM "<target_schema>"."<target_table>" original_table) analyzed_table
GROUP BY actual_value
DuckDB
{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
HANA
{% import '/dialects/hana.sql.jinja2' as lib with context -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ lib.render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
MariaDB
{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
MySQL
{% import '/dialects/mysql.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Oracle
{% import '/dialects/oracle.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT({{ lib.render_target_column('original_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} original_table
{{- lib.render_where_clause() -}} ) analyzed_table
GROUP BY actual_value
SELECT
(SELECT
COUNT(referenced_table."customer_id")
FROM landing_zone.customer_raw referenced_table
) AS expected_value,
analyzed_table.actual_value
FROM (SELECT
COUNT(original_table."target_column") AS actual_value
FROM "<target_schema>"."<target_table>" original_table) analyzed_table
GROUP BY actual_value
PostgreSQL
{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Presto
{% import '/dialects/presto.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
QuestDB
{% import '/dialects/questdb.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
WITH referenced_data AS (
SELECT COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }}) AS expected_value
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
)
SELECT referenced_data.expected_value AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
CROSS JOIN referenced_data
{{- lib.render_where_clause() -}}
WITH referenced_data AS (
SELECT COUNT(referenced_table."customer_id") AS expected_value
FROM landing_zone.customer_raw AS referenced_table
)
SELECT referenced_data.expected_value AS expected_value,
COUNT(analyzed_table."target_column") AS actual_value
FROM "<target_table>" AS analyzed_table
CROSS JOIN referenced_data
Redshift
{% import '/dialects/redshift.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Snowflake
{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Spark
{% import '/dialects/spark.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
SQL Server
{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_database_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT_BIG(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT_BIG({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Teradata
{% import '/dialects/teradata.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
Trino
{% import '/dialects/trino.sql.jinja2' as lib with context -%}
{%- macro render_referenced_table(referenced_table) -%}
{%- if referenced_table.find(".") < 0 -%}
{{ lib.quote_identifier(lib.macro_catalog_name) }}.{{ lib.quote_identifier(lib.macro_schema_name) }}.{{- lib.quote_identifier(referenced_table) -}}
{%- else -%}
{{ referenced_table }}
{%- endif -%}
{%- endmacro -%}
SELECT
(SELECT
COUNT(referenced_table.{{ lib.quote_identifier(parameters.referenced_column) }})
FROM {{ render_referenced_table(parameters.referenced_table) }} AS referenced_table
) AS expected_value,
COUNT({{ lib.render_target_column('analyzed_table')}}) AS actual_value
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
What's next
- Learn how to configure data quality checks in DQOps
- Look at the examples of running data quality checks, targeting tables and columns