Last updated: July 22, 2025

Texts not matching regex percent data quality checks, SQL examples

This check validates text values using a pattern defined as a regular expression. It measures the percentage of invalid values and raises a data quality issue when the rate is above a threshold.

The texts not matching regex percent data quality check has the following variants for each type of data quality checks supported by DQOps.

profile texts not matching regex percent

Check description

Verifies that the percentage of strings not matching the custom regular expression pattern does not exceed the maximum accepted percentage.

Data quality check name	Friendly name	Category	Check type	Time scale	Quality dimension	Sensor definition	Quality rule	Standard
`profile_texts_not_matching_regex_percent`	Maximum percent of rows containing texts values not matching regex	patterns	profiling		Validity	texts_not_matching_regex_percent	max_percent

Command-line examples

Please expand the section below to see the DQOps command-line examples to run or activate the profile texts not matching regex percent data quality check.

Managing profile texts not matching regex percent check from DQOps shell

Activate the check with a warning ruleActivate the check with an error ruleRun all configured checks

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=profile_texts_not_matching_regex_percent --enable-warning

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=profile_texts_not_matching_regex_percent --enable-warning

Additional rule parameters are passed using the -Wrule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=profile_texts_not_matching_regex_percent --enable-warning
                    -Wmax_percent=value

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=profile_texts_not_matching_regex_percent --enable-error

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=profile_texts_not_matching_regex_percent --enable-error

Additional rule parameters are passed using the -Erule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=profile_texts_not_matching_regex_percent --enable-error
                    -Emax_percent=value

Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the profile_texts_not_matching_regex_percent check on all tables and columns on a single data source.

dqo> check run -c=data_source_name -ch=profile_texts_not_matching_regex_percent

It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.

dqo> check run -c=connection_name -t=schema_name.table_name -ch=profile_texts_not_matching_regex_percent

You can also run this check on all tables (and columns) on which the profile_texts_not_matching_regex_percent check is enabled using patterns to find tables.

dqo> check run -c=connection_name -t=schema_prefix*.fact_* -col=column_name_* -ch=profile_texts_not_matching_regex_percent

YAML configuration

The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  columns:
    target_column:
      profiling_checks:
        patterns:
          profile_texts_not_matching_regex_percent:
            parameters:
              regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
            warning:
              max_percent: 0.0
            error:
              max_percent: 1.0
            fatal:
              max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues

Samples of generated SQL queries for each data source type

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent data quality sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "<target_schema>"."<target_table>" AS analyzed_table

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_schema>`.`<target_table>` AS analyzed_table

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM  AS analyzed_table

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_table>` AS analyzed_table

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_table>` AS analyzed_table

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value
FROM(
    SELECT
        original_table.*
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_schema>`.`<target_table>` AS analyzed_table

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "<target_schema>"."<target_table>" AS analyzed_table

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table

Expand the Configure with data grouping section to see additional examples for configuring this data quality checks to use data grouping (GROUP BY).

Configuration with data grouping

Sample configuration with data grouping enabled (YAML) The sample below shows how to configure the data grouping and how it affects the generated SQL query.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  default_grouping_name: group_by_country_and_state
  groupings:
    group_by_country_and_state:
      level_1:
        source: column_value
        column: country
      level_2:
        source: column_value
        column: state
  columns:
    target_column:
      profiling_checks:
        patterns:
          profile_texts_not_matching_regex_percent:
            parameters:
              regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
            warning:
              max_percent: 0.0
            error:
              max_percent: 1.0
            fatal:
              max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    country:
      labels:
      - column used as the first grouping key
    state:
      labels:
      - column used as the second grouping key

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM  AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM(
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    analyzed_table.[country] AS grouping_level_1,
    analyzed_table.[state] AS grouping_level_2
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY analyzed_table.[country], analyzed_table.[state]
ORDER BY level_1, level_2
        ,

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

daily texts not matching regex percent

Check description

Verifies that the percentage of strings not matching the custom regular expression pattern does not exceed the maximum accepted percentage.

Data quality check name	Friendly name	Category	Check type	Time scale	Quality dimension	Sensor definition	Quality rule	Standard
`daily_texts_not_matching_regex_percent`	Maximum percent of rows containing texts values not matching regex	patterns	monitoring	daily	Validity	texts_not_matching_regex_percent	max_percent

Command-line examples

Please expand the section below to see the DQOps command-line examples to run or activate the daily texts not matching regex percent data quality check.

Managing daily texts not matching regex percent check from DQOps shell

Activate the check with a warning ruleActivate the check with an error ruleRun all configured checks

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=daily_texts_not_matching_regex_percent --enable-warning

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_texts_not_matching_regex_percent --enable-warning

Additional rule parameters are passed using the -Wrule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_texts_not_matching_regex_percent --enable-warning
                    -Wmax_percent=value

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=daily_texts_not_matching_regex_percent --enable-error

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_texts_not_matching_regex_percent --enable-error

Additional rule parameters are passed using the -Erule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_texts_not_matching_regex_percent --enable-error
                    -Emax_percent=value

Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the daily_texts_not_matching_regex_percent check on all tables and columns on a single data source.

dqo> check run -c=data_source_name -ch=daily_texts_not_matching_regex_percent

It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.

dqo> check run -c=connection_name -t=schema_name.table_name -ch=daily_texts_not_matching_regex_percent

You can also run this check on all tables (and columns) on which the daily_texts_not_matching_regex_percent check is enabled using patterns to find tables.

dqo> check run -c=connection_name -t=schema_prefix*.fact_* -col=column_name_* -ch=daily_texts_not_matching_regex_percent

YAML configuration

The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  columns:
    target_column:
      monitoring_checks:
        daily:
          patterns:
            daily_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues

Samples of generated SQL queries for each data source type

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent data quality sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "<target_schema>"."<target_table>" AS analyzed_table

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_schema>`.`<target_table>` AS analyzed_table

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM  AS analyzed_table

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_table>` AS analyzed_table

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_table>` AS analyzed_table

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value
FROM(
    SELECT
        original_table.*
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_schema>`.`<target_table>` AS analyzed_table

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "<target_schema>"."<target_table>" AS analyzed_table

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table

Expand the Configure with data grouping section to see additional examples for configuring this data quality checks to use data grouping (GROUP BY).

Configuration with data grouping

Sample configuration with data grouping enabled (YAML) The sample below shows how to configure the data grouping and how it affects the generated SQL query.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  default_grouping_name: group_by_country_and_state
  groupings:
    group_by_country_and_state:
      level_1:
        source: column_value
        column: country
      level_2:
        source: column_value
        column: state
  columns:
    target_column:
      monitoring_checks:
        daily:
          patterns:
            daily_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    country:
      labels:
      - column used as the first grouping key
    state:
      labels:
      - column used as the second grouping key

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM  AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM(
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    analyzed_table.[country] AS grouping_level_1,
    analyzed_table.[state] AS grouping_level_2
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY analyzed_table.[country], analyzed_table.[state]
ORDER BY level_1, level_2
        ,

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

monthly texts not matching regex percent

Check description

Verifies that the percentage of strings not matching the custom regular expression pattern does not exceed the maximum accepted percentage.

Data quality check name	Friendly name	Category	Check type	Time scale	Quality dimension	Sensor definition	Quality rule	Standard
`monthly_texts_not_matching_regex_percent`	Maximum percent of rows containing texts values not matching regex	patterns	monitoring	monthly	Validity	texts_not_matching_regex_percent	max_percent

Command-line examples

Please expand the section below to see the DQOps command-line examples to run or activate the monthly texts not matching regex percent data quality check.

Managing monthly texts not matching regex percent check from DQOps shell

Activate the check with a warning ruleActivate the check with an error ruleRun all configured checks

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=monthly_texts_not_matching_regex_percent --enable-warning

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_texts_not_matching_regex_percent --enable-warning

Additional rule parameters are passed using the -Wrule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_texts_not_matching_regex_percent --enable-warning
                    -Wmax_percent=value

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=monthly_texts_not_matching_regex_percent --enable-error

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_texts_not_matching_regex_percent --enable-error

Additional rule parameters are passed using the -Erule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_texts_not_matching_regex_percent --enable-error
                    -Emax_percent=value

Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the monthly_texts_not_matching_regex_percent check on all tables and columns on a single data source.

dqo> check run -c=data_source_name -ch=monthly_texts_not_matching_regex_percent

It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.

dqo> check run -c=connection_name -t=schema_name.table_name -ch=monthly_texts_not_matching_regex_percent

You can also run this check on all tables (and columns) on which the monthly_texts_not_matching_regex_percent check is enabled using patterns to find tables.

dqo> check run -c=connection_name -t=schema_prefix*.fact_* -col=column_name_* -ch=monthly_texts_not_matching_regex_percent

YAML configuration

The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  columns:
    target_column:
      monitoring_checks:
        monthly:
          patterns:
            monthly_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues

Samples of generated SQL queries for each data source type

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent data quality sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "<target_schema>"."<target_table>" AS analyzed_table

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_schema>`.`<target_table>` AS analyzed_table

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM  AS analyzed_table

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_table>` AS analyzed_table

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_table>` AS analyzed_table

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value
FROM(
    SELECT
        original_table.*
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value
FROM `<target_schema>`.`<target_table>` AS analyzed_table

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM "<target_schema>"."<target_table>" AS analyzed_table

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value
FROM (
    SELECT
        original_table.*
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table

Expand the Configure with data grouping section to see additional examples for configuring this data quality checks to use data grouping (GROUP BY).

Configuration with data grouping

Sample configuration with data grouping enabled (YAML) The sample below shows how to configure the data grouping and how it affects the generated SQL query.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  default_grouping_name: group_by_country_and_state
  groupings:
    group_by_country_and_state:
      level_1:
        source: column_value
        column: country
      level_2:
        source: column_value
        column: state
  columns:
    target_column:
      monitoring_checks:
        monthly:
          patterns:
            monthly_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    country:
      labels:
      - column used as the first grouping key
    state:
      labels:
      - column used as the second grouping key

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM  AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
FROM(
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    analyzed_table.[country] AS grouping_level_1,
    analyzed_table.[state] AS grouping_level_2
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY analyzed_table.[country], analyzed_table.[state]
ORDER BY level_1, level_2
        ,

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2

FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2
ORDER BY grouping_level_1, grouping_level_2

daily partition texts not matching regex percent

Check description

Verifies that the percentage of strings matching the custom regular expression pattern does not exceed the maximum accepted percentage.

Data quality check name	Friendly name	Category	Check type	Time scale	Quality dimension	Sensor definition	Quality rule	Standard
`daily_partition_texts_not_matching_regex_percent`	Maximum percent of rows containing texts values not matching regex	patterns	partitioned	daily	Validity	texts_not_matching_regex_percent	max_percent

Command-line examples

Please expand the section below to see the DQOps command-line examples to run or activate the daily partition texts not matching regex percent data quality check.

Managing daily partition texts not matching regex percent check from DQOps shell

Activate the check with a warning ruleActivate the check with an error ruleRun all configured checks

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=daily_partition_texts_not_matching_regex_percent --enable-warning

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_partition_texts_not_matching_regex_percent --enable-warning

Additional rule parameters are passed using the -Wrule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_partition_texts_not_matching_regex_percent --enable-warning
                    -Wmax_percent=value

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=daily_partition_texts_not_matching_regex_percent --enable-error

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_partition_texts_not_matching_regex_percent --enable-error

Additional rule parameters are passed using the -Erule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=daily_partition_texts_not_matching_regex_percent --enable-error
                    -Emax_percent=value

Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the daily_partition_texts_not_matching_regex_percent check on all tables and columns on a single data source.

dqo> check run -c=data_source_name -ch=daily_partition_texts_not_matching_regex_percent

It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.

dqo> check run -c=connection_name -t=schema_name.table_name -ch=daily_partition_texts_not_matching_regex_percent

You can also run this check on all tables (and columns) on which the daily_partition_texts_not_matching_regex_percent check is enabled using patterns to find tables.

dqo> check run -c=connection_name -t=schema_prefix*.fact_* -col=column_name_* -ch=daily_partition_texts_not_matching_regex_percent

YAML configuration

The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: date_column
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  columns:
    target_column:
      partitioned_checks:
        daily:
          patterns:
            daily_partition_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    date_column:
      labels:
      - "date or datetime column used as a daily or monthly partitioning key, dates\
        \ (and times) are truncated to a day or a month by the sensor's query for\
        \ partitioned checks"

Samples of generated SQL queries for each data source type

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent data quality sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    CAST(analyzed_table.`date_column` AS DATE) AS time_period,
    TIMESTAMP(CAST(analyzed_table.`date_column` AS DATE)) AS time_period_utc
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    CAST(analyzed_table."date_column" AS DATE) AS time_period,
    toDateTime64(CAST(analyzed_table."date_column" AS DATE), 3) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    CAST(analyzed_table.`date_column` AS DATE) AS time_period,
    TIMESTAMP(CAST(analyzed_table.`date_column` AS DATE)) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    CAST(original_table."date_column" AS DATE) AS time_period,
    TIMESTAMP(CAST(original_table."date_column" AS DATE)) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    CAST((CAST(analyzed_table."date_column" AS date)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM  AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    CAST(original_table."date_column" AS DATE) AS time_period,
    TO_TIMESTAMP(CAST(original_table."date_column" AS DATE)) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    TRUNC(CAST(original_table."date_column" AS DATE)) AS time_period,
    CAST(TRUNC(CAST(original_table."date_column" AS DATE)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    CAST((CAST(analyzed_table."date_column" AS date)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    CAST(original_table."date_column" AS date) AS time_period,
    CAST(CAST(original_table."date_column" AS date) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,
    time_period,
    time_period_utc
FROM(
    SELECT
        original_table.*,
    CAST(DATE_TRUNC('day', original_table."date_column") AS DATE) AS time_period,
    CAST((CAST(DATE_TRUNC('day', original_table."date_column") AS DATE)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    CAST((CAST(analyzed_table."date_column" AS date)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    TO_TIMESTAMP(CAST(analyzed_table."date_column" AS date)) AS time_period_utc
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    CAST(analyzed_table.`date_column` AS DATE) AS time_period,
    TIMESTAMP(CAST(analyzed_table.`date_column` AS DATE)) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    CAST(analyzed_table.[date_column] AS date) AS time_period,
    CAST((CAST(analyzed_table.[date_column] AS date)) AS DATETIME) AS time_period_utc
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY CAST(analyzed_table.[date_column] AS date), CAST(analyzed_table.[date_column] AS date)
ORDER BY CAST(analyzed_table.[date_column] AS date)

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    CAST(analyzed_table."date_column" AS DATE) AS time_period,
    CAST(CAST(analyzed_table."date_column" AS DATE) AS TIMESTAMP) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    CAST(original_table."date_column" AS date) AS time_period,
    CAST(CAST(original_table."date_column" AS date) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Expand the Configure with data grouping section to see additional examples for configuring this data quality checks to use data grouping (GROUP BY).

Configuration with data grouping

Sample configuration with data grouping enabled (YAML) The sample below shows how to configure the data grouping and how it affects the generated SQL query.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: date_column
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  default_grouping_name: group_by_country_and_state
  groupings:
    group_by_country_and_state:
      level_1:
        source: column_value
        column: country
      level_2:
        source: column_value
        column: state
  columns:
    target_column:
      partitioned_checks:
        daily:
          patterns:
            daily_partition_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    date_column:
      labels:
      - "date or datetime column used as a daily or monthly partitioning key, dates\
        \ (and times) are truncated to a day or a month by the sensor's query for\
        \ partitioned checks"
    country:
      labels:
      - column used as the first grouping key
    state:
      labels:
      - column used as the second grouping key

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    CAST(analyzed_table.`date_column` AS DATE) AS time_period,
    TIMESTAMP(CAST(analyzed_table.`date_column` AS DATE)) AS time_period_utc
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    CAST(analyzed_table."date_column" AS DATE) AS time_period,
    toDateTime64(CAST(analyzed_table."date_column" AS DATE), 3) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    CAST(analyzed_table.`date_column` AS DATE) AS time_period,
    TIMESTAMP(CAST(analyzed_table.`date_column` AS DATE)) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    CAST(original_table."date_column" AS DATE) AS time_period,
    TIMESTAMP(CAST(original_table."date_column" AS DATE)) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    CAST((CAST(analyzed_table."date_column" AS date)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM  AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    CAST(original_table."date_column" AS DATE) AS time_period,
    TO_TIMESTAMP(CAST(original_table."date_column" AS DATE)) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-%d 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    TRUNC(CAST(original_table."date_column" AS DATE)) AS time_period,
    CAST(TRUNC(CAST(original_table."date_column" AS DATE)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    CAST((CAST(analyzed_table."date_column" AS date)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    CAST(original_table."date_column" AS date) AS time_period,
    CAST(CAST(original_table."date_column" AS date) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2,
    time_period,
    time_period_utc
FROM(
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    CAST(DATE_TRUNC('day', original_table."date_column") AS DATE) AS time_period,
    CAST((CAST(DATE_TRUNC('day', original_table."date_column") AS DATE)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    CAST((CAST(analyzed_table."date_column" AS date)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    CAST(analyzed_table."date_column" AS date) AS time_period,
    TO_TIMESTAMP(CAST(analyzed_table."date_column" AS date)) AS time_period_utc
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    CAST(analyzed_table.`date_column` AS DATE) AS time_period,
    TIMESTAMP(CAST(analyzed_table.`date_column` AS DATE)) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    analyzed_table.[country] AS grouping_level_1,
    analyzed_table.[state] AS grouping_level_2,
    CAST(analyzed_table.[date_column] AS date) AS time_period,
    CAST((CAST(analyzed_table.[date_column] AS date)) AS DATETIME) AS time_period_utc
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY analyzed_table.[country], analyzed_table.[state], CAST(analyzed_table.[date_column] AS date), CAST(analyzed_table.[date_column] AS date)
ORDER BY level_1, level_2CAST(analyzed_table.[date_column] AS date)

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    CAST(analyzed_table."date_column" AS DATE) AS time_period,
    CAST(CAST(analyzed_table."date_column" AS DATE) AS TIMESTAMP) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    CAST(original_table."date_column" AS date) AS time_period,
    CAST(CAST(original_table."date_column" AS date) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

monthly partition texts not matching regex percent

Check description

Verifies that the percentage of strings matching the custom regular expression pattern does not exceed the maximum accepted percentage.

Data quality check name	Friendly name	Category	Check type	Time scale	Quality dimension	Sensor definition	Quality rule	Standard
`monthly_partition_texts_not_matching_regex_percent`	Maximum percent of rows containing texts values not matching regex	patterns	partitioned	monthly	Validity	texts_not_matching_regex_percent	max_percent

Command-line examples

Please expand the section below to see the DQOps command-line examples to run or activate the monthly partition texts not matching regex percent data quality check.

Managing monthly partition texts not matching regex percent check from DQOps shell

Activate the check with a warning ruleActivate the check with an error ruleRun all configured checks

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the warning rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=monthly_partition_texts_not_matching_regex_percent --enable-warning

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_partition_texts_not_matching_regex_percent --enable-warning

Additional rule parameters are passed using the -Wrule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_partition_texts_not_matching_regex_percent --enable-warning
                    -Wmax_percent=value

Activate this data quality using the check activate CLI command, providing the connection name, table name, check name, and all other filters. Activates the error rule with the default parameters.

dqo> check activate -c=connection_name -t=schema_name.table_name -col=column_name -ch=monthly_partition_texts_not_matching_regex_percent --enable-error

You can also use patterns to activate the check on all matching tables and columns.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_partition_texts_not_matching_regex_percent --enable-error

Additional rule parameters are passed using the -Erule_parameter_name=value.

dqo> check activate -c=connection_name -t=schema_prefix*.fact_* -col=column_name -ch=monthly_partition_texts_not_matching_regex_percent --enable-error
                    -Emax_percent=value

Run this data quality check using the check run CLI command by providing the check name and all other targeting filters. The following example shows how to run the monthly_partition_texts_not_matching_regex_percent check on all tables and columns on a single data source.

dqo> check run -c=data_source_name -ch=monthly_partition_texts_not_matching_regex_percent

It is also possible to run this check on a specific connection and table. In order to do this, use the connection name and the full table name parameters.

dqo> check run -c=connection_name -t=schema_name.table_name -ch=monthly_partition_texts_not_matching_regex_percent

You can also run this check on all tables (and columns) on which the monthly_partition_texts_not_matching_regex_percent check is enabled using patterns to find tables.

dqo> check run -c=connection_name -t=schema_prefix*.fact_* -col=column_name_* -ch=monthly_partition_texts_not_matching_regex_percent

YAML configuration

The sample schema_name.table_name.dqotable.yaml file with the check configured is shown below.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: date_column
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  columns:
    target_column:
      partitioned_checks:
        monthly:
          patterns:
            monthly_partition_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    date_column:
      labels:
      - "date or datetime column used as a daily or monthly partitioning key, dates\
        \ (and times) are truncated to a day or a month by the sensor's query for\
        \ partitioned checks"

Samples of generated SQL queries for each data source type

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent data quality sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_TRUNC(CAST(analyzed_table.`date_column` AS DATE), MONTH) AS time_period,
    TIMESTAMP(DATE_TRUNC(CAST(analyzed_table.`date_column` AS DATE), MONTH)) AS time_period_utc
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    DATE_TRUNC('month', CAST(analyzed_table."date_column" AS DATE)) AS time_period,
    toDateTime64(DATE_TRUNC('month', CAST(analyzed_table."date_column" AS DATE)), 3) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE)) AS time_period,
    TIMESTAMP(DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE))) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    DATE_TRUNC('MONTH', CAST(original_table."date_column" AS DATE)) AS time_period,
    TIMESTAMP(DATE_TRUNC('MONTH', CAST(original_table."date_column" AS DATE))) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    CAST((DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM  AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    SERIES_ROUND(CAST(original_table."date_column" AS DATE), 'INTERVAL 1 MONTH', ROUND_DOWN) AS time_period,
    TO_TIMESTAMP(SERIES_ROUND(CAST(original_table."date_column" AS DATE), 'INTERVAL 1 MONTH', ROUND_DOWN)) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    TRUNC(CAST(original_table."date_column" AS DATE), 'MONTH') AS time_period,
    CAST(TRUNC(CAST(original_table."date_column" AS DATE), 'MONTH') AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    CAST((DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS time_period,
    CAST(DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,
    time_period,
    time_period_utc
FROM(
    SELECT
        original_table.*,
    CAST(DATE_TRUNC('month', original_table."date_column") AS DATE) AS time_period,
    CAST((CAST(DATE_TRUNC('month', original_table."date_column") AS DATE)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    CAST((DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    TO_TIMESTAMP(DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS time_period_utc
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE)) AS time_period,
    TIMESTAMP(DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE))) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1) AS time_period,
    CAST((DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1)) AS DATETIME) AS time_period_utc
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1), DATEADD(month, DATEDIFF(month, 0, analyzed_table.[date_column]), 0)
ORDER BY DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1)

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    TRUNC(CAST(analyzed_table."date_column" AS DATE), 'MM') AS time_period,
    CAST(TRUNC(CAST(analyzed_table."date_column" AS DATE), 'MM') AS TIMESTAMP) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS time_period,
    CAST(DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY time_period, time_period_utc
ORDER BY time_period, time_period_utc

Expand the Configure with data grouping section to see additional examples for configuring this data quality checks to use data grouping (GROUP BY).

Configuration with data grouping

Sample configuration with data grouping enabled (YAML) The sample below shows how to configure the data grouping and how it affects the generated SQL query.

# yaml-language-server: $schema=https://cloud.dqops.com/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    partition_by_column: date_column
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  default_grouping_name: group_by_country_and_state
  groupings:
    group_by_country_and_state:
      level_1:
        source: column_value
        column: country
      level_2:
        source: column_value
        column: state
  columns:
    target_column:
      partitioned_checks:
        monthly:
          patterns:
            monthly_partition_texts_not_matching_regex_percent:
              parameters:
                regex: "^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$"
              warning:
                max_percent: 0.0
              error:
                max_percent: 1.0
              fatal:
                max_percent: 5.0
      labels:
      - This is the column that is analyzed for data quality issues
    date_column:
      labels:
      - "date or datetime column used as a daily or monthly partitioning key, dates\
        \ (and times) are truncated to a day or a month by the sensor's query for\
        \ partitioned checks"
    country:
      labels:
      - column used as the first grouping key
    state:
      labels:
      - column used as the second grouping key

Please expand the database engine name section to see the SQL query rendered by a Jinja2 template for the texts_not_matching_regex_percent sensor.

BigQuery

Sensor template for BigQueryRendered SQL for BigQuery

{% import '/dialects/bigquery.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_CONTAINS(analyzed_table.`target_column`, r'^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_TRUNC(CAST(analyzed_table.`date_column` AS DATE), MONTH) AS time_period,
    TIMESTAMP(DATE_TRUNC(CAST(analyzed_table.`date_column` AS DATE), MONTH)) AS time_period_utc
FROM `your-google-project-id`.`<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

ClickHouse

Sensor template for ClickHouseRendered SQL for ClickHouse

{% import '/dialects/clickhouse.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT match(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    DATE_TRUNC('month', CAST(analyzed_table."date_column" AS DATE)) AS time_period,
    toDateTime64(DATE_TRUNC('month', CAST(analyzed_table."date_column" AS DATE)), 3) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Databricks

Sensor template for DatabricksRendered SQL for Databricks

{% import '/dialects/databricks.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE)) AS time_period,
    TIMESTAMP(DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE))) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

DB2

Sensor template for DB2Rendered SQL for DB2

{% import '/dialects/db2.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(original_table."date_column" AS DATE)) AS time_period,
    TIMESTAMP(DATE_TRUNC('MONTH', CAST(original_table."date_column" AS DATE))) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

DuckDB

Sensor template for DuckDBRendered SQL for DuckDB

{% import '/dialects/duckdb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN REGEXP_MATCHES(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    CAST((DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM  AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

HANA

Sensor template for HANARendered SQL for HANA

{% import '/dialects/hana.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} NOT LIKE_REGEXPR {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" NOT LIKE_REGEXPR '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    SERIES_ROUND(CAST(original_table."date_column" AS DATE), 'INTERVAL 1 MONTH', ROUND_DOWN) AS time_period,
    TO_TIMESTAMP(SERIES_ROUND(CAST(original_table."date_column" AS DATE), 'INTERVAL 1 MONTH', ROUND_DOWN)) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

MariaDB

Sensor template for MariaDBRendered SQL for MariaDB

{% import '/dialects/mariadb.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT 
    analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'

                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

MySQL

Sensor template for MySQLRendered SQL for MySQL

{% import '/dialects/mysql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_regex(lib.render_target_column('analyzed_table'), parameters.regex ) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table.`target_column`, '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00') AS time_period,
    FROM_UNIXTIME(UNIX_TIMESTAMP(DATE_FORMAT(analyzed_table.`date_column`, '%Y-%m-01 00:00:00'))) AS time_period_utc
FROM `<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Oracle

Sensor template for OracleRendered SQL for Oracle

{% import '/dialects/oracle.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    TRUNC(CAST(original_table."date_column" AS DATE), 'MONTH') AS time_period,
    CAST(TRUNC(CAST(original_table."date_column" AS DATE), 'MONTH') AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

PostgreSQL

Sensor template for PostgreSQLRendered SQL for PostgreSQL

{% import '/dialects/postgresql.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' IS FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    CAST((DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_postgresql_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Presto

Sensor template for PrestoRendered SQL for Presto

{% import '/dialects/presto.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS time_period,
    CAST(DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_database"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

QuestDB

Sensor template for QuestDBRendered SQL for QuestDB

{% import '/dialects/questdb.sql.jinja2' as lib with context -%}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }} = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }}), 100.0)
    AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM(
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
    WHERE {{ lib.render_target_column('original_table') }} IS NOT NULL
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    COALESCE(100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$' = FALSE
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column"), 100.0)
    AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2,
    time_period,
    time_period_utc
FROM(
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    CAST(DATE_TRUNC('month', original_table."date_column") AS DATE) AS time_period,
    CAST((CAST(DATE_TRUNC('month', original_table."date_column") AS DATE)) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
    FROM "<target_table>" original_table
    WHERE original_table."target_column" IS NOT NULL
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Redshift

Sensor template for RedshiftRendered SQL for Redshift

{% import '/dialects/redshift.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} ~ {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" ~ '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    CAST((DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS TIMESTAMP WITH TIME ZONE) AS time_period_utc
FROM "your_redshift_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Snowflake

Sensor template for SnowflakeRendered SQL for Snowflake

{% import '/dialects/snowflake.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} REGEXP {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table."target_column" REGEXP '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date)) AS time_period,
    TO_TIMESTAMP(DATE_TRUNC('MONTH', CAST(analyzed_table."date_column" AS date))) AS time_period_utc
FROM "your_snowflake_database"."<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Spark

Sensor template for SparkRendered SQL for Spark

{% import '/dialects/spark.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} RLIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table.`target_column`) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.`target_column` RLIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table.`target_column`)
    END AS actual_value,
    analyzed_table.`country` AS grouping_level_1,
    analyzed_table.`state` AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE)) AS time_period,
    TIMESTAMP(DATE_TRUNC('MONTH', CAST(analyzed_table.`date_column` AS DATE))) AS time_period_utc
FROM `<target_schema>`.`<target_table>` AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

SQL Server

Sensor template for SQL ServerRendered SQL for SQL Server

{% import '/dialects/sqlserver.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT_BIG({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT {{ lib.render_target_column('analyzed_table') }} LIKE {{ lib.render_regex(parameters.regex) }}
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT_BIG(analyzed_table.[target_column]) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN NOT analyzed_table.[target_column] LIKE '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$'
                    THEN 1
                ELSE 0
            END
        ) / COUNT_BIG(analyzed_table.[target_column])
    END AS actual_value,
    analyzed_table.[country] AS grouping_level_1,
    analyzed_table.[state] AS grouping_level_2,
    DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1) AS time_period,
    CAST((DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1)) AS DATETIME) AS time_period_utc
FROM [your_sql_server_database].[<target_schema>].[<target_table>] AS analyzed_table
GROUP BY analyzed_table.[country], analyzed_table.[state], DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1), DATEADD(month, DATEDIFF(month, 0, analyzed_table.[date_column]), 0)
ORDER BY level_1, level_2DATEFROMPARTS(YEAR(CAST(analyzed_table.[date_column] AS date)), MONTH(CAST(analyzed_table.[date_column] AS date)), 1)

Teradata

Sensor template for TeradataRendered SQL for Teradata

{% import '/dialects/teradata.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN {{ lib.render_target_column('analyzed_table') }} IS NOT NULL AND
                        REGEXP_SUBSTR({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }}) IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections('analyzed_table') }}
    {{- lib.render_time_dimension_projection('analyzed_table') }}
FROM {{ lib.render_target_table() }} AS analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE 100.0 * SUM(
            CASE
                WHEN analyzed_table."target_column" IS NOT NULL AND
                        REGEXP_SUBSTR(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$') IS NULL
                    THEN 1
                ELSE 0
            END
        ) / COUNT(analyzed_table."target_column")
    END AS actual_value,
    analyzed_table."country" AS grouping_level_1,
    analyzed_table."state" AS grouping_level_2,
    TRUNC(CAST(analyzed_table."date_column" AS DATE), 'MM') AS time_period,
    CAST(TRUNC(CAST(analyzed_table."date_column" AS DATE), 'MM') AS TIMESTAMP) AS time_period_utc
FROM "<target_schema>"."<target_table>" AS analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

Trino

Sensor template for TrinoRendered SQL for Trino

{% import '/dialects/trino.sql.jinja2' as lib with context -%}

SELECT
    CASE
        WHEN COUNT({{ lib.render_target_column('analyzed_table') }}) = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE({{ lib.render_target_column('analyzed_table') }}, {{ lib.render_regex(parameters.regex) }})
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT({{ lib.render_target_column('analyzed_table') }})
    END AS actual_value
    {{- lib.render_data_grouping_projections_reference('analyzed_table') }}
    {{- lib.render_time_dimension_projection_reference('analyzed_table') }}
FROM (
    SELECT
        original_table.*
        {{- lib.render_data_grouping_projections('original_table') }}
        {{- lib.render_time_dimension_projection('original_table') }}
    FROM {{ lib.render_target_table() }} original_table
) analyzed_table
{{- lib.render_where_clause() -}}
{{- lib.render_group_by() -}}
{{- lib.render_order_by() -}}

SELECT
    CASE
        WHEN COUNT(analyzed_table."target_column") = 0 THEN 100.0
        ELSE CAST(100.0 * SUM(
            CASE
                WHEN NOT REGEXP_LIKE(analyzed_table."target_column", '^((25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])[.]){3}(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[0-9][0-9]|[0-9])$')
                    THEN 1
                ELSE 0
            END
        ) AS DOUBLE) / COUNT(analyzed_table."target_column")
    END AS actual_value,

                analyzed_table.grouping_level_1,

                analyzed_table.grouping_level_2
,
    time_period,
    time_period_utc
FROM (
    SELECT
        original_table.*,
    original_table."country" AS grouping_level_1,
    original_table."state" AS grouping_level_2,
    DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS time_period,
    CAST(DATE_TRUNC('MONTH', CAST(original_table."date_column" AS date)) AS TIMESTAMP) AS time_period_utc
    FROM "your_trino_catalog"."<target_schema>"."<target_table>" original_table
) analyzed_table
GROUP BY grouping_level_1, grouping_level_2, time_period, time_period_utc
ORDER BY grouping_level_1, grouping_level_2, time_period, time_period_utc

What's next

Learn how to configure data quality checks in DQOps
Look at the examples of running data quality checks, targeting tables and columns