models

CheckTarget

Enumeration of targets where the check is applied. It is one of "table" or "column".

The structure of this object is described below

 Data type   Enum values 
string column
table

CheckType

Enumeration of data quality check types: profiling, monitoring, partitioned.

The structure of this object is described below

 Data type   Enum values 
string profiling
partitioned
monitoring

CheckTimeScale

Enumeration of time scale of monitoring and partitioned data quality checks (daily, monthly, etc.)

The structure of this object is described below

 Data type   Enum values 
string daily
monthly
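
As a rough sketch, the three enumerations above could be mirrored in client code as Python `Enum` classes. The class names and values follow this page; the `str` mixin and the `needs_time_scale` helper are assumptions made for illustration and are not part of DQOps.

```python
from enum import Enum


class CheckTarget(str, Enum):
    TABLE = "table"
    COLUMN = "column"


class CheckType(str, Enum):
    PROFILING = "profiling"
    MONITORING = "monitoring"
    PARTITIONED = "partitioned"


class CheckTimeScale(str, Enum):
    DAILY = "daily"
    MONTHLY = "monthly"


def needs_time_scale(check_type: CheckType) -> bool:
    """Hypothetical helper: only monitoring and partitioned checks use a time scale."""
    return check_type in (CheckType.MONITORING, CheckType.PARTITIONED)
```

Parsing a raw string with `CheckType("monitoring")` raises `ValueError` for values outside the enumeration, which can catch typos before a request is sent.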

FieldModel

Model of a single field that is used to edit a parameter value for a sensor or a rule. Describes the type of the field and the current value.

The structure of this object is described below

 Property name   Description                       Data type 
definition Field name that matches the field name (snake_case) used in the YAML specification. ParameterDefinitionSpec
optional When true, the field value is optional and may be null. When false, the field is required and must be filled. boolean
string_value Field value for a string field. string
boolean_value Field value for a boolean field. boolean
integer_value Field value for an integer (32-bit) field. integer
long_value Field value for a long (64-bit) field. long
double_value Field value for a double field. double
datetime_value Field value for a date time field. datetime
column_name_value Field value for a column name field. string
enum_value Field value for an enum (choice) field. string
string_list_value Field value for an array (list) of strings. List[string]
integer_list_value Field value for an array (list) of integers, using 64 bit integers. List[integer]
date_value Field value for a date. date
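
The sketch below shows one way a client might read a FieldModel that arrives as a plain dictionary. It assumes that only one of the typed `*_value` properties is set at a time. The helper names, the `field_name` key inside `definition`, and the `max_percent` parameter name are hypothetical.

```python
from typing import Any, Optional

# Property names copied from the FieldModel table above.
VALUE_KEYS = (
    "string_value", "boolean_value", "integer_value", "long_value",
    "double_value", "datetime_value", "column_name_value", "enum_value",
    "string_list_value", "integer_list_value", "date_value",
)


def field_value(field: dict) -> Optional[Any]:
    """Return the first typed value that is set, or None when the field is empty."""
    for key in VALUE_KEYS:
        if field.get(key) is not None:
            return field[key]
    return None


def is_missing_required(field: dict) -> bool:
    """A field is invalid when it is required (optional is false) and has no value."""
    return not field.get("optional", False) and field_value(field) is None


# Illustrative payload; the contents of "definition" are an assumption.
field = {
    "definition": {"field_name": "max_percent"},
    "optional": False,
    "double_value": 5.0,
}
```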

RuleParametersModel

Model that returns the form definition and the form data to edit parameters (thresholds) for a rule at a single severity level (warning, error, fatal).

The structure of this object is described below

 Property name   Description                       Data type 
rule_name Full rule name. This field is for information purposes and could be used to create additional custom checks that are reusing the same data quality rule. string
rule_parameters List of fields for editing the rule parameters like thresholds. List[FieldModel]
disabled Disable the rule. The rule will not be evaluated. The sensor will also not be executed if it has no enabled rules. boolean
configured Returns true when the rule is configured (is not null), so it should be shown in the UI as configured (having values). boolean

CheckConfigurationModel

Model containing fundamental configuration of a single data quality check.

The structure of this object is described below

 Property name   Description                       Data type 
connection_name Connection name. string
schema_name Schema name. string
table_name Table name. string
column_name Column name, if the check is set up on a column. string
check_target Check target (table or column). CheckTarget
check_type Check type (profiling, monitoring, partitioned). CheckType
check_time_scale Check timescale (for monitoring and partitioned checks). CheckTimeScale
category_name Category to which this check belongs. string
check_name Check name that is used in YAML file. string
sensor_parameters List of fields for editing the sensor parameters. List[FieldModel]
table_level_filter SQL WHERE clause added to the sensor query for every check on this table. string
sensor_level_filter SQL WHERE clause added to the sensor query for this check. string
warning Rule parameters for the warning severity rule. RuleParametersModel
error Rule parameters for the error severity rule. RuleParametersModel
fatal Rule parameters for the fatal severity rule. RuleParametersModel
disabled Whether the check has been disabled. boolean
configured Whether the check is configured (not null). boolean

CheckListModel

Simplistic model that returns a single data quality check, its name and "configured" flag.

The structure of this object is described below

 Property name   Description                       Data type 
check_category Check category. string
check_name Data quality check name that is used in YAML. string
help_text Help text that describes the data quality check. string
configured True if the data quality check is configured (not null). When saving the data quality check configuration, set the flag to true for storing the check. boolean

CheckContainerListModel

Simplistic model that returns the list of data quality checks, their names, categories and "configured" flag.

The structure of this object is described below

 Property name   Description                       Data type 
checks Simplistic list of all data quality checks. List[CheckListModel]
can_edit Boolean flag that decides if the current user can edit the check. boolean
can_run_checks Boolean flag that decides if the current user can run checks. boolean
can_delete_data Boolean flag that decides if the current user can delete data (results). boolean

RuleThresholdsModel

Model that returns the form definition and the form data to edit a single rule with all three severity levels (warning, error, fatal).

The structure of this object is described below

 Property name   Description                       Data type 
error Rule parameters for the error severity rule. RuleParametersModel
warning Rule parameters for the warning severity rule. RuleParametersModel
fatal Rule parameters for the fatal severity rule. RuleParametersModel
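
To illustrate how the three severity slots fit together, the following sketch lists the severities of a RuleThresholdsModel-shaped dictionary that would actually be evaluated. It assumes a rule is evaluated only when it is configured and not disabled. The helper name and the `example/min_count` rule name are made up for this example.

```python
SEVERITIES = ("warning", "error", "fatal")


def active_severities(rule: dict) -> list:
    """Return the severity levels whose rule is configured and not disabled."""
    active = []
    for severity in SEVERITIES:
        params = rule.get(severity)
        if params and params.get("configured") and not params.get("disabled"):
            active.append(severity)
    return active


# Illustrative payload; "example/min_count" is a hypothetical rule name.
rule = {
    "warning": {"rule_name": "example/min_count", "configured": True, "disabled": False},
    "error": {"rule_name": "example/min_count", "configured": True, "disabled": True},
    "fatal": None,
}
```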

MonitoringScheduleSpec

Monitoring job schedule specification.

The structure of this object is described below

 Property name   Description                       Data type 
cron_expression Unix style cron expression that specifies when to execute scheduled operations like running data quality checks or synchronizing the configuration with the cloud. string
disabled Disables the schedule. When the value of this 'disabled' field is true, the schedule is stored in the metadata but it is not activated to run data quality checks. boolean
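
A minimal sketch of building a MonitoringScheduleSpec-shaped dictionary is shown below. The five-field check reflects the classic Unix cron layout (minute, hour, day of month, month, day of week) and is only a shape check, not a full cron validator. The `make_schedule` helper is hypothetical.

```python
def make_schedule(cron_expression: str, disabled: bool = False) -> dict:
    """Build a schedule dict after a basic five-field shape check."""
    if len(cron_expression.split()) != 5:
        raise ValueError("expected 5 whitespace-separated cron fields")
    return {"cron_expression": cron_expression, "disabled": disabled}


# Every day at 06:00.
daily = make_schedule("0 6 * * *")
```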

CheckRunScheduleGroup

The run check scheduling group (profiling, daily checks, monthly checks, etc.), which identifies the configuration of a schedule (cron expression) used to schedule these checks on the job scheduler.

The structure of this object is described below

 Data type   Enum values 
string monitoring_monthly
profiling
partitioned_daily
monitoring_daily
partitioned_monthly

EffectiveScheduleLevelModel

Enumeration of possible levels at which a schedule could be configured.

The structure of this object is described below

 Data type   Enum values 
string check_override
connection
table_override

EffectiveScheduleModel

Model of a configured schedule (on connection or table) or schedule override (on check). Describes the CRON expression and the time of the upcoming execution, as well as the duration until this time.

The structure of this object is described below

 Property name   Description                       Data type 
schedule_group Field value for a schedule group to which this schedule belongs. CheckRunScheduleGroup
schedule_level Field value for the level at which the schedule has been configured. EffectiveScheduleLevelModel
cron_expression Field value for a CRON expression defining the scheduling. string
disabled Field value stating if the schedule has been explicitly disabled. boolean

ScheduleEnabledStatusModel

Enumeration of possible ways a schedule can be configured.

The structure of this object is described below

 Data type   Enum values 
string not_configured
disabled
overridden_by_checks
enabled

CommentSpec

Comment entry. Comments are added when a change was made and the change should be recorded in a persisted format.

The structure of this object is described below

 Property name   Description                       Data type 
date Comment date and time. datetime
comment_by Commented by. string
comment Comment text. string

CommentsListSpec

List of comments.

The structure of this object is described below

 Property name   Description                       Data type 
self List[CommentSpec]

CheckSearchFilters

Target data quality checks filter, identifies which checks on which tables and columns should be executed.

The structure of this object is described below

 Property name   Description                       Data type 
column The column name. This field accepts search patterns in the format: 'fk_*', '*_id', 'prefix*suffix'. string
column_data_type The column data type that was imported from the data source and is stored in the columns -> column_name -> type_snapshot -> column_type field in the .dqotable.yaml file. string
column_nullable Optional filter to find only nullable (when the value is true) or not nullable (when the value is false) columns, based on the value of the columns -> column_name -> type_snapshot -> nullable field in the .dqotable.yaml file. boolean
check_target The target type of object to run checks. Supported values are: table to run only table level checks or column to run only column level checks. CheckTarget
check_type The target type of checks to run. Supported values are profiling, monitoring and partitioned. CheckType
time_scale The time scale of monitoring or partitioned checks to run. Supports running only daily or monthly checks. Daily monitoring checks will replace today's value for all captured check results. CheckTimeScale
check_category The target check category, for example: nulls, volume, anomaly. string
table_comparison_name The name of a configured table comparison. When the table comparison is provided, DQOps will only perform table comparison checks that compare data between tables. string
check_name The target check name to run only this named check. Uses the short check name which is the name of the deepest folder in the checks folder. This field supports search patterns such as: 'profiling_*', '*count', 'profiling*_percent'. string
sensor_name The target sensor name to run only data quality checks that are using this sensor. Uses the full sensor name which is the full folder path within the sensors folder. This field supports search patterns such as: 'table/volume/row_*', '*count', 'table/volume/prefix*_suffix'. string
connection The connection (data source) name. Supports search patterns in the format: 'source*', '*_prod', 'prefix*suffix'. string
full_table_name The schema and table name. It is provided as <schema_name>.<table_name>, for example public.fact_sales. The schema and table name accept patterns both in the schema name and table name parts. Sample patterns are: 'schema_name.tab_prefix_*', 'schema_name.*', '*.*', 'schema_name.*customer', 'schema_name.tab*_suffix'. string
enabled A boolean flag to target enabled tables, columns or checks. When the value of this field is not set, the default value of this field is true, targeting only tables, columns and checks that are not implicitly disabled. boolean
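
The wildcard patterns accepted by these filters can be approximated locally with Python's `fnmatch` module, for example to preview which columns a filter would select. This is only an approximation: the matcher inside DQOps may behave differently, and `fnmatch` also interprets `?` and `[...]`, which this page does not document. The helper names and the `sales_prod` connection name are hypothetical.

```python
from fnmatch import fnmatchcase

# A CheckSearchFilters-shaped dict using property names from the table above.
filters = {
    "connection": "*_prod",
    "full_table_name": "public.fact_*",
    "column": "*_id",
    "check_type": "monitoring",
    "time_scale": "daily",
}


def matches(pattern, value: str) -> bool:
    """Unset filters (None) match everything; otherwise apply the '*' wildcard."""
    return pattern is None or fnmatchcase(value, pattern)


def column_selected(filters: dict, connection: str, table: str, column: str) -> bool:
    """Check a connection, table and column against the name filters."""
    return (
        matches(filters.get("connection"), connection)
        and matches(filters.get("full_table_name"), table)
        and matches(filters.get("column"), column)
    )
```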

DeleteStoredDataQueueJobParameters

Parameters for the "delete stored data" queue job that deletes data from Parquet files stored in the .data directory of the DQOps user home.

The structure of this object is described below

 Property name   Description                       Data type 
connection The connection name. string
full_table_name The schema and table name. It is provided as <schema_name>.<table_name>, for example public.fact_sales. This filter does not support patterns. string
date_start The start date (inclusive) to delete the data, based on the time_period column in Parquet files. date
date_end The end date (inclusive) to delete the data, based on the time_period column in Parquet files. date
delete_errors Delete the data from the errors table. Because the default value is false, this parameter must be set to true to delete the errors. boolean
delete_statistics Delete the data from the statistics table. Because the default value is false, this parameter must be set to true to delete the statistics. boolean
delete_check_results Delete the data from the check_results table. Because the default value is false, this parameter must be set to true to delete the check results. boolean
delete_sensor_readouts Delete the data from the sensor_readouts table. Because the default value is false, this parameter must be set to true to delete the sensor readouts. boolean
column_names The list of column names to delete the data for column level results or errors only for selected columns. List[string]
check_category The check category name, for example volume or anomaly. string
table_comparison_name The name of a table comparison configuration. Deletes only table comparison results (and errors) for a given comparison. string
check_name The name of a data quality check. Uses the short check name, for example daily_row_count. string
check_type The type of checks whose results and errors should be deleted. For example, use monitoring to delete only monitoring checks data. string
sensor_name The full sensor name whose results, checks based on the sensor, statistics and errors generated by the sensor should be deleted. Uses a full sensor name, for example: table/volume/row_count. string
data_group_tag The names of data groups in any of the grouping_level_1...grouping_level_9 columns in the Parquet tables. Enables deleting data tagged for one data source or a subset of results when the group level is captured from a column in a monitored table. string
quality_dimension The data quality dimension name, for example Timeliness or Completeness. string
time_gradient The time gradient (time scale) of the sensor and check results that are captured. string
collector_category The statistics collector category when statistics should be deleted. A statistics category is a group of statistics, for example sampling for the column value samples. string
collector_name The statistics collector name when only statistics are deleted for a selected collector, for example sample_values. string
collector_target The type of the target object for which the basic statistics are deleted. Supported values are table and column. string
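
Because every `delete_*` flag defaults to false, a request that sets none of them deletes nothing. The sketch below builds a parameters dictionary that makes this explicit. The `delete_params` helper, the `sales_prod` connection name, and the ISO 8601 date serialization are assumptions for illustration.

```python
from datetime import date

DELETE_FLAGS = (
    "delete_errors", "delete_statistics",
    "delete_check_results", "delete_sensor_readouts",
)


def delete_params(connection: str, full_table_name: str,
                  date_start: date, date_end: date, **flags: bool) -> dict:
    """Build a parameters dict; each delete_* flag stays false unless passed."""
    if date_start > date_end:
        raise ValueError("date_start must not be after date_end")
    unknown = set(flags) - set(DELETE_FLAGS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    params = {
        "connection": connection,
        "full_table_name": full_table_name,  # exact name, patterns are not supported
        "date_start": date_start.isoformat(),
        "date_end": date_end.isoformat(),
    }
    params.update({name: False for name in DELETE_FLAGS})
    params.update(flags)
    return params


# Delete only check results for January 2024 (both dates are inclusive).
params = delete_params("sales_prod", "public.fact_sales",
                       date(2024, 1, 1), date(2024, 1, 31),
                       delete_check_results=True)
```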

CheckTargetModel

Enumeration of possible targets for check model request result.

The structure of this object is described below

 Data type   Enum values 
string column
table

SimilarCheckModel

Describes a single check that is similar to other checks in other check types.

The structure of this object is described below

 Property name   Description                       Data type 
check_target The check target (table or column). CheckTarget
check_type The check type. CheckType
time_scale The time scale (daily, monthly). The time scale is optional and could be null (for profiling checks). CheckTimeScale
category The check's category. string
check_name The similar check name in another category. string

CheckModel

Model that returns the form definition and the form data to edit a single data quality check.

The structure of this object is described below

 Property name   Description                       Data type 
check_name Data quality check name that is used in YAML. string
help_text Help text that describes the data quality check. string
sensor_parameters List of fields for editing the sensor parameters. List[FieldModel]
sensor_name Full sensor name. This field is for information purposes and could be used to create additional custom checks that are reusing the same data quality sensor. string
quality_dimension Data quality dimension used for tagging the results of this data quality check. string
rule Threshold (alerting) rules defined for a check. RuleThresholdsModel
supports_grouping The data quality check supports a custom data grouping configuration. boolean
data_grouping_override Data grouping configuration for this check. When a data grouping configuration is assigned at a check level, it overrides the data grouping configuration from the table level. Data grouping is configured in two cases: (1) the data in the table should be analyzed with a GROUP BY condition, to analyze different groups of rows using separate time series, for example a table contains data from multiple countries and there is a 'country' column used for partitioning. (2) a static data grouping configuration is assigned to a table, when the data is partitioned at a table level (similar tables store the same information, but for different countries, etc.). DataGroupingConfigurationSpec
schedule_override Run check scheduling configuration. Specifies the schedule (a cron expression) when the data quality checks are executed by the scheduler. MonitoringScheduleSpec
effective_schedule Model of configured schedule enabled on the check level. EffectiveScheduleModel
schedule_enabled_status State of the scheduling override for this check. ScheduleEnabledStatusModel
comments Comments for change tracking. Please put comments in this collection because YAML comments may be removed when the YAML file is modified by the tool (serialization and deserialization will remove non tracked comments). CommentsListSpec
disabled Disables the data quality check. Only enabled checks are executed. The sensor should be disabled if it should not work, but the configuration of the sensor and rules should be preserved in the configuration. boolean
exclude_from_kpi Data quality check results (alerts) are included in the data quality KPI calculation by default. Set this field to true in order to exclude this data quality check from the data quality KPI calculation. boolean
include_in_sla Marks the data quality check as part of a data quality SLA. The data quality SLA is a set of critical data quality checks that must always pass and are considered as a data contract for the dataset. boolean
configured True if the data quality check is configured (not null). When saving the data quality check configuration, set the flag to true for storing the check. boolean
filter SQL WHERE clause added to the sensor query. Both the table level filter and a sensor query filter are added, separated by an AND operator. string
run_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to start the job. CheckSearchFilters
data_clean_job_template Configured parameters for the "data clean" job that after being supplied with a time range should be pushed to the job queue in order to remove stored results connected with this check. DeleteStoredDataQueueJobParameters
data_grouping_configuration The name of a data grouping configuration defined at a table that should be used for this check. string
check_target Type of the check's target (column, table). CheckTargetModel
configuration_requirements_errors List of configuration errors that must be fixed before the data quality check could be executed. List[string]
similar_checks List of similar checks in other check types or in other time scales. List[SimilarCheckModel]
can_edit Boolean flag that decides if the current user can edit the check. boolean
can_run_checks Boolean flag that decides if the current user can run checks. boolean
can_delete_data Boolean flag that decides if the current user can delete data (results). boolean
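
The `filter` property states that the table level filter and the check level filter are joined with an AND operator. A minimal sketch of that combination follows. The function name is hypothetical, and wrapping each filter in parentheses is an assumption made here to keep operator precedence safe; the SQL that DQOps generates may differ.

```python
from typing import Optional


def combined_where(table_filter: Optional[str], check_filter: Optional[str]) -> Optional[str]:
    """Join the non-empty filters with AND, wrapping each in parentheses."""
    parts = [f"({f.strip()})" for f in (table_filter, check_filter) if f and f.strip()]
    return " AND ".join(parts) if parts else None
```

For example, `combined_where("country = 'US'", "amount > 0")` yields `(country = 'US') AND (amount > 0)`.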

QualityCategoryModel

Model that returns the form definition and the form data to edit all checks within a single category.

The structure of this object is described below

 Property name   Description                       Data type 
category Data quality check category name. string
comparison_name The name of the reference table configuration used for a cross table data comparison (when the category is 'comparisons'). string
compare_to_column The name of the column in the reference table that is compared. string
help_text Help text that describes the category. string
checks List of data quality checks within the category. List[CheckModel]
run_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to start the job. CheckSearchFilters
data_clean_job_template Configured parameters for the "data clean" job that after being supplied with a time range should be pushed to the job queue in order to remove stored results connected with this quality category. DeleteStoredDataQueueJobParameters

CheckContainerModel

Model that returns the form definition and the form data to edit all data quality checks divided by categories.

The structure of this object is described below

 Property name   Description                       Data type 
categories List of all data quality categories that contain data quality checks inside. List[QualityCategoryModel]
effective_schedule Model of configured schedule enabled on the check container. EffectiveScheduleModel
effective_schedule_enabled_status State of the effective scheduling on the check container. ScheduleEnabledStatusModel
partition_by_column The name of the column that partitioned checks will use for the time period partitioning. Important only for partitioned checks. string
run_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to start the job. CheckSearchFilters
data_clean_job_template Configured parameters for the "data clean" job that after being supplied with a time range should be pushed to the job queue in order to remove stored results connected with this check container. DeleteStoredDataQueueJobParameters
can_edit Boolean flag that decides if the current user can edit the check. boolean
can_run_checks Boolean flag that decides if the current user can run checks. boolean
can_delete_data Boolean flag that decides if the current user can delete data (results). boolean

CheckContainerTypeModel

Model identifying the check type and timescale of checks belonging to a container.

The structure of this object is described below

 Property name   Description                       Data type 
check_type Check type. CheckType
check_time_scale Check timescale. CheckTimeScale

CheckTemplate

Model depicting a named data quality check that can potentially be enabled, regardless of its position in the hierarchy tree.

The structure of this object is described below

 Property name   Description                       Data type 
check_target Check target (table, column). CheckTarget
check_category Data quality check category. string
check_name Data quality check name that is used in YAML. string
help_text Help text that describes the data quality check. string
check_container_type Check type with time-scale. CheckContainerTypeModel
sensor_name Full sensor name. string
sensor_parameters_definitions List of sensor parameter fields definitions. List[ParameterDefinitionSpec]
rule_parameters_definitions List of threshold (alerting) rule's parameters definitions (for a single rule, regardless of severity). List[ParameterDefinitionSpec]

ProviderType

Data source provider type (dialect type). We will use lower case names to avoid issues with parsing, even if the enum names are not named following the Java naming convention.

The structure of this object is described below

 Data type   Enum values 
string snowflake
oracle
postgresql
redshift
sqlserver
mysql
bigquery

StatisticsCollectorTarget

The structure of this object is described below

 Data type   Enum values 
string column
table

StatisticsCollectorSearchFilters

Hierarchy node search filters for finding enabled statistics collectors (basic profilers) to be started.

The structure of this object is described below

 Property name   Description                       Data type 
collector_name The target statistics collector name to capture only selected statistics. Uses the short collector name. This field supports search patterns such as: 'prefix*', '*suffix', 'prefix_*_suffix'. To collect only the top 10 most common column samples, use 'column_samples'. string
sensor_name The target sensor name to run only data quality checks that are using this sensor. Uses the full sensor name which is the full folder path within the sensors folder. This field supports search patterns such as: 'table/volume/row_*', '*count', 'table/volume/prefix*_suffix'. string
collector_category The target statistics collector category, for example: nulls, volume, sampling. string
target The target type of object to collect statistics from. Supported values are: table to collect only table level statistics or column to collect only column level statistics. StatisticsCollectorTarget
connection The connection (data source) name. Supports search patterns in the format: 'source*', '*_prod', 'prefix*suffix'. string
full_table_name The schema and table name. It is provided as <schema_name>.<table_name>, for example public.fact_sales. The schema and table name accept patterns both in the schema name and table name parts. Sample patterns are: 'schema_name.tab_prefix_*', 'schema_name.*', '*.*', 'schema_name.*customer', 'schema_name.tab*_suffix'. string
enabled A boolean flag to target enabled tables, columns or checks. When the value of this field is not set, the default value of this field is true, targeting only tables, columns and checks that are not implicitly disabled. boolean

ConnectionModel

Connection model returned by the REST API that is limited only to the basic fields, excluding nested nodes.

The structure of this object is described below

 Property name   Description                       Data type 
connection_name Connection name. string
connection_hash Connection hash that identifies the connection using a unique hash code. long
parallel_runs_limit The concurrency limit for the maximum number of parallel SQL queries executed on this connection. integer
provider_type Database provider type (required). Accepts: bigquery, snowflake, etc. ProviderType
bigquery BigQuery connection parameters. Specify parameters in the bigquery section. BigQueryParametersSpec
snowflake Snowflake connection parameters. SnowflakeParametersSpec
postgresql PostgreSQL connection parameters. PostgresqlParametersSpec
redshift Redshift connection parameters. RedshiftParametersSpec
sqlserver SQL Server connection parameters. SqlServerParametersSpec
mysql MySQL connection parameters. MysqlParametersSpec
oracle Oracle connection parameters. OracleParametersSpec
run_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to run all checks within this connection. CheckSearchFilters
run_profiling_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to run profiling checks within this connection. CheckSearchFilters
run_monitoring_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to run monitoring checks within this connection. CheckSearchFilters
run_partition_checks_job_template Configured parameters for the "check run" job that should be pushed to the job queue in order to run partitioned checks within this connection. CheckSearchFilters
collect_statistics_job_template Configured parameters for the "collect statistics" job that should be pushed to the job queue in order to run all statistics collectors within this connection. StatisticsCollectorSearchFilters
data_clean_job_template Configured parameters for the "data clean" job that after being supplied with a time range should be pushed to the job queue in order to remove stored results connected with this connection. DeleteStoredDataQueueJobParameters
can_edit Boolean flag that decides if the current user can update or delete the connection to the data source. boolean
can_collect_statistics Boolean flag that decides if the current user can collect statistics. boolean
can_run_checks Boolean flag that decides if the current user can run checks. boolean
can_delete_data Boolean flag that decides if the current user can delete data (results). boolean
yaml_parsing_error Optional parsing error that was captured when parsing the YAML file. This field is null when the YAML file is valid. If an error was captured, this field returns the file parsing error message and the file location. string
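
In the table above, each provider-specific parameters property has the same name as a ProviderType value. Assuming that correspondence holds, a client could pick the matching section as sketched below. The `provider_parameters` helper, the `sales_prod` connection name, and the `host` key inside the `postgresql` section are hypothetical.

```python
PROVIDER_TYPES = (
    "bigquery", "snowflake", "postgresql", "redshift",
    "sqlserver", "mysql", "oracle",
)


def provider_parameters(connection: dict) -> dict:
    """Return the parameters section named after the connection's provider_type."""
    provider = connection["provider_type"]
    if provider not in PROVIDER_TYPES:
        raise ValueError(f"unsupported provider_type: {provider}")
    return connection.get(provider) or {}


# Illustrative payload; the keys inside "postgresql" are an assumption.
connection = {
    "connection_name": "sales_prod",
    "provider_type": "postgresql",
    "parallel_runs_limit": 4,
    "postgresql": {"host": "localhost"},
}
```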

DqoQueueJobId

Identifies a single job.

The structure of this object is described below

 Property name   Description                       Data type 
job_id Job id. long
job_business_key Optional job business key that was assigned to the job. A business key is an alternative, user-assigned unique job identifier that can be used to look up the status of a job. string
parent_job_id Parent job id. Filled only for nested jobs, for example a sub-job that runs data quality checks on a single table. DqoQueueJobId
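
Since `parent_job_id` is itself a DqoQueueJobId, job identifiers form a chain from a nested job up to its top-level job. The sketch below models that with a dataclass. The `root_job` helper and the `nightly-run` business key are hypothetical, and the Python `int` stands in for the 64-bit `long` type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DqoQueueJobId:
    job_id: int
    job_business_key: Optional[str] = None
    parent_job_id: Optional["DqoQueueJobId"] = None


def root_job(job: DqoQueueJobId) -> DqoQueueJobId:
    """Follow parent_job_id links up to the top-level job."""
    while job.parent_job_id is not None:
        job = job.parent_job_id
    return job


# A parent job and a nested job that runs checks on a single table.
parent = DqoQueueJobId(job_id=100, job_business_key="nightly-run")
child = DqoQueueJobId(job_id=101, parent_job_id=parent)
```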