column list or order changed checks
Description
Table-level check that detects whether the list of columns or their order has changed since the last time this check was run.
The check retrieves the metadata of the tested table and calculates a hash of the column names. The hash depends on the order of the columns.
A data quality issue is detected if new columns were added, columns that existed during the previous test were dropped, or the columns were reordered.
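The order-dependent hash described above can be illustrated with a minimal Python sketch. The function name `ordered_column_hash` and the use of SHA-256 are assumptions made for illustration only; the source does not state which hash function the `column_list_ordered_hash` sensor uses.

```python
import hashlib


def ordered_column_hash(column_names):
    """Return a hex digest that depends on the column names and on their order.

    Hypothetical helper for illustration, not the DQO sensor implementation.
    """
    digest = hashlib.sha256()
    for name in column_names:
        # Prefix each name with its length so that ["ab", "c"] and ["a", "bc"]
        # do not feed the same byte sequence into the hash.
        digest.update(f"{len(name)}:{name}".encode("utf-8"))
    return digest.hexdigest()


baseline = ordered_column_hash(["id", "name", "created_at"])
added = ordered_column_hash(["id", "name", "created_at", "email"])
dropped = ordered_column_hash(["id", "name"])
reordered = ordered_column_hash(["name", "id", "created_at"])

# Each kind of schema change yields a hash different from the baseline.
print(baseline != added, baseline != dropped, baseline != reordered)
```

Because the names are fed into the hash in table order, a reordering alone is enough to change the result, which matches the behavior the description attributes to this check.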
profile column list or order changed
Check description
Detects if new columns were added, existing columns were removed or the columns were reordered. Retrieves the metadata of the monitored table and calculates an ordered hash of the column names. Compares the current hash to the previously known hash to detect any changes to the list of columns or their order.
Check name | Check type | Time scale | Sensor definition | Quality rule |
---|---|---|---|---|
profile_column_list_or_order_changed | profiling | | column_list_ordered_hash | value_changed |
Enable check (Shell)
To enable this check, provide the connection name and the check name in the check enable command.
To run this check, provide the check name in the check run command. You can also limit the run to a specific connection, table, or column by adding the connection name, the table name, or the column name. The command below combines all of these filters:
dqo> check run -c=connection_name -t=table_name -col=column_name -ch=profile_column_list_or_order_changed
# yaml-language-server: $schema=https://cloud.dqo.ai/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    event_timestamp_column: col_event_timestamp
    ingestion_timestamp_column: col_inserted_at
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  profiling_checks:
    schema:
      profile_column_list_or_order_changed:
        warning: {}
        error: {}
        fatal: {}
  columns:
    col_event_timestamp:
      labels:
      - optional column that stores the timestamp when the event/transaction happened
    col_inserted_at:
      labels:
      - optional column that stores the timestamp when row was ingested
daily column list or order changed
Check description
Detects if new columns were added, existing columns were removed or the columns were reordered since the most recent day. Retrieves the metadata of the monitored table and calculates an ordered hash of the column names. Compares the current hash to the previously known hash to detect any changes to the list of columns or their order.
Check name | Check type | Time scale | Sensor definition | Quality rule |
---|---|---|---|---|
daily_column_list_or_order_changed | recurring | daily | column_list_ordered_hash | value_changed |
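The comparison of the current hash with the previously known hash can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual `value_changed` rule: the `history` dictionary, the function name `hash_changed_since_last_reading`, and the choice to report no change when there is no earlier reading are all hypothetical.

```python
from datetime import date


def hash_changed_since_last_reading(current_hash, history, today):
    """Compare today's hash with the most recent earlier daily reading.

    history maps a date to the hash recorded on that day (hypothetical store).
    Returns False when there is no earlier reading; this is an assumption,
    the source does not describe the first-run behavior.
    """
    earlier_days = [day for day in history if day < today]
    if not earlier_days:
        return False
    most_recent_day = max(earlier_days)
    return history[most_recent_day] != current_hash


# Placeholder strings stand in for real column-list hashes.
history = {
    date(2023, 5, 1): "hash-of-id-name",
    date(2023, 5, 2): "hash-of-id-name",
}
today = date(2023, 5, 3)

unchanged = hash_changed_since_last_reading("hash-of-id-name", history, today)
changed = hash_changed_since_last_reading("hash-of-id-name-email", history, today)
first_run = hash_changed_since_last_reading("hash-of-id-name", {}, today)
print(unchanged, changed, first_run)
```

The same sketch would apply to a monthly time scale by keying `history` on the month instead of the day.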
Enable check (Shell)
To enable this check, provide the connection name and the check name in the check enable command.
To run this check, provide the check name in the check run command. You can also limit the run to a specific connection, table, or column by adding the connection name, the table name, or the column name. The command below combines all of these filters:
dqo> check run -c=connection_name -t=table_name -col=column_name -ch=daily_column_list_or_order_changed
recurring_checks:
  daily:
    schema:
      daily_column_list_or_order_changed:
        warning: {}
        error: {}
        fatal: {}

# yaml-language-server: $schema=https://cloud.dqo.ai/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    event_timestamp_column: col_event_timestamp
    ingestion_timestamp_column: col_inserted_at
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  recurring_checks:
    daily:
      schema:
        daily_column_list_or_order_changed:
          warning: {}
          error: {}
          fatal: {}
  columns:
    col_event_timestamp:
      labels:
      - optional column that stores the timestamp when the event/transaction happened
    col_inserted_at:
      labels:
      - optional column that stores the timestamp when row was ingested
monthly column list or order changed
Check description
Detects if new columns were added, existing columns were removed or the columns were reordered since the last month. Retrieves the metadata of the monitored table and calculates an ordered hash of the column names. Compares the current hash to the previously known hash to detect any changes to the list of columns or their order.
Check name | Check type | Time scale | Sensor definition | Quality rule |
---|---|---|---|---|
monthly_column_list_or_order_changed | recurring | monthly | column_list_ordered_hash | value_changed |
Enable check (Shell)
To enable this check, provide the connection name and the check name in the check enable command.
To run this check, provide the check name in the check run command. You can also limit the run to a specific connection, table, or column by adding the connection name, the table name, or the column name. The command below combines all of these filters:
dqo> check run -c=connection_name -t=table_name -col=column_name -ch=monthly_column_list_or_order_changed
recurring_checks:
  monthly:
    schema:
      monthly_column_list_or_order_changed:
        warning: {}
        error: {}
        fatal: {}

# yaml-language-server: $schema=https://cloud.dqo.ai/dqo-yaml-schema/TableYaml-schema.json
apiVersion: dqo/v1
kind: table
spec:
  timestamp_columns:
    event_timestamp_column: col_event_timestamp
    ingestion_timestamp_column: col_inserted_at
  incremental_time_window:
    daily_partitioning_recent_days: 7
    monthly_partitioning_recent_months: 1
  recurring_checks:
    monthly:
      schema:
        monthly_column_list_or_order_changed:
          warning: {}
          error: {}
          fatal: {}
  columns:
    col_event_timestamp:
      labels:
      - optional column that stores the timestamp when the event/transaction happened
    col_inserted_at:
      labels:
      - optional column that stores the timestamp when row was ingested