Last updated: July 22, 2025
DQOps incidents parquet table schema
The parquet file schema for the incidents table stored in the $DQO_USER_HOME/.data/incidents folder in DQOps.
Table description
The data quality incidents table that tracks open incidents. An incident groups multiple failed data quality checks (stored in the check_results table). The check results that are part of an incident can be matched to the incident by the incident_hash column. The incidents table is located in the $DQO_USER_HOME/.data/incidents folder, which contains uncompressed parquet files. The table is partitioned using a Hive-compatible partitioning folder structure. When $DQO_USER_HOME is not configured, it defaults to the folder where DQOps was started (the DQOps user's home folder).
The folder partitioning structure for this table is: c=[connection_name]/m=[first_day_of_month]/, for example: c=myconnection/m=2023-01-01/.
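The folder layout above can be sketched in Python. This is only an illustration: the helper name `incidents_partition_path` is hypothetical and not part of DQOps, the fallback to the current working directory only approximates "the folder where DQOps was started", and the sketch does not apply any escaping to the connection name.

```python
import os
from datetime import date
from pathlib import Path
from typing import Optional


def incidents_partition_path(connection_name: str, day: date,
                             user_home: Optional[str] = None) -> Path:
    """Build the partition folder for the month that contains `day`.

    Hypothetical helper: it only mirrors the documented
    c=[connection_name]/m=[first_day_of_month]/ layout.
    """
    # Assumption: when DQO_USER_HOME is not set, use the current directory.
    home = Path(user_home or os.environ.get("DQO_USER_HOME", "."))
    # The month partition is named after the first day of the month.
    first_day_of_month = day.replace(day=1)
    return (home / ".data" / "incidents"
            / f"c={connection_name}"
            / f"m={first_day_of_month.isoformat()}")
```

For example, calling `incidents_partition_path("myconnection", date(2023, 1, 17), "/tmp/dqo")` yields a path ending in `c=myconnection/m=2023-01-01`, matching the example above.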
Parquet table schema
The columns of this table are described below.
Column name | Description | Hive data type |
---|---|---|
id | The incident id (primary key). It is a UUID created from a hash of the target affected by the incident (target_hash) and first_seen_utc. This value identifies a single row. | STRING |
incident_hash | The hash of the incident. | BIGINT |
schema_name | The table schema. | STRING |
table_name | The table name. | STRING |
table_priority | The table priority. | INTEGER |
data_group_name | The data group name. It is a concatenated name of the data group dimension values, created from [grouping_level_1] / [grouping_level_2] / ... | STRING |
quality_dimension | The data quality dimension. | STRING |
check_category | The check category. | STRING |
check_type | The check type (profiling, checkpoint, partitioned). | STRING |
check_name | The check name. | STRING |
highest_severity | The highest data quality check result severity detected as part of this incident. The values are 0, 1, 2, 3 for none, warning, error and fatal severity alerts. | INTEGER |
minimum_severity | The minimum severity of data quality issues (data quality check results) that are included in the incident. It is copied from the incident configuration at a connection or table level at the time when the incident is first seen. The values are 0, 1, 2, 3 for none, warning, error and fatal severity alerts. | INTEGER |
first_seen | Stores the exact time when the incident was raised (seen) for the first time, as a UTC timestamp. | TIMESTAMP |
last_seen | Stores the exact time when the incident was raised (seen) for the last time, as a UTC timestamp. | TIMESTAMP |
incident_until | Stores the timestamp of the end of the incident, after which new issues will not be appended to this incident, as a UTC timestamp. | TIMESTAMP |
failed_checks_count | Stores the number of checks that failed. | INTEGER |
issue_url | Stores the user-provided URL to an external ticket management platform that is tracking this incident. | STRING |
resolved_by | Stores the login of the user who resolved the incident. | STRING |
status | Stores the current status of the incident. The statuses are described in the IncidentStatus enumeration. | STRING |
created_at | The timestamp when the row was created. | TIMESTAMP |
updated_at | The timestamp when the row was updated. | TIMESTAMP |
created_by | Stores the login of the user who created the incident by running a check. | STRING |
updated_by | The login of the user who updated the row. | STRING |
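
As a rough illustration of how these columns might be interpreted once the rows are loaded, the sketch below decodes the severity values and matches check results to an incident by incident_hash. It works on plain Python dictionaries. The names `Severity`, `severity_name` and `check_results_for_incident` are hypothetical helpers, not part of DQOps, and loading the parquet files themselves is left out.

```python
from enum import IntEnum
from typing import Dict, List


class Severity(IntEnum):
    """Severity codes as documented for highest_severity and minimum_severity."""
    NONE = 0
    WARNING = 1
    ERROR = 2
    FATAL = 3


def severity_name(value: int) -> str:
    """Translate a stored severity code (0..3) into its lowercase name."""
    return Severity(value).name.lower()


def check_results_for_incident(incident: Dict, check_results: List[Dict]) -> List[Dict]:
    """Return the check result rows that share the incident's incident_hash.

    Assumes each check result row is a dict carrying an 'incident_hash' key.
    """
    return [row for row in check_results
            if row.get("incident_hash") == incident["incident_hash"]]
```

A row whose highest_severity is 3 would be reported as `fatal` by `severity_name`, and passing that incident together with rows read from the check_results table to `check_results_for_incident` returns only the results that belong to it.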
What's more
- You can find more information on how the Parquet files are partitioned in the data quality results storage concept.