Run data quality check
After adding your first connection in the previous step, you are ready to run your first checks.
In our example, which uses the BigQuery public dataset Austin Crime Data, you will enable and run a table-level row_count check and a column-level nulls_percent check using the graphical interface.
For more information about checks, see the DQO concepts section.
Run table-level advanced profiling check
-
In the DQO User Interface Console, go to the Profiling section.
Click Profiling in the navigation bar at the top of the screen.
Another option is to expand the tree view of the newly added connection on the left side, click on the "crime" table and use the "Advanced Profiling" link.
-
Enable the row_count table-level data quality check on the "crime" table.
The row_count check verifies that the number of rows in the table does not fall below the minimum accepted count set as the threshold level.
To enable the row_count check, make sure that you are on the "crime" table in the tree view on the left side. In the list of checks on the right, enable the row count data quality check by clicking the switch button. Leave the default value of the error threshold level as "0". You can read more about threshold severity levels in the DQO concepts section.
-
Run the row_count data quality check by clicking the Run Check icon.
A green square should appear next to the name of the check, indicating that the result of the check run is valid. You can view the details by placing the mouse cursor on the green square.
-
Click the "Results" icon to view more details of the results.
A table will appear with more details about the run check.
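The rule behind the table-level steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not DQO's implementation: the function name `evaluate_row_count` and its parameters are hypothetical, and it assumes that the check passes when the row count is at least the configured minimum.

```python
def evaluate_row_count(actual_rows: int, min_count: int = 0) -> str:
    """Hypothetical helper: compare a table's row count with a minimum threshold.

    Returns "valid" when the table holds at least min_count rows,
    otherwise "error".
    """
    return "valid" if actual_rows >= min_count else "error"


if __name__ == "__main__":
    # With the default threshold of 0, any row count passes in this sketch.
    print(evaluate_row_count(1000))
    # Raising the threshold above the actual count makes the check fail.
    print(evaluate_row_count(5, min_count=10))
```

Under this assumption, leaving the error threshold at "0" means the check passes for any table, which is consistent with the green square shown after the run.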
Run column-level advanced profiling check
-
In the tree view on the left, navigate to the "clearance_status" column.
-
Enable the nulls_percent column-level check on the "clearance_status" column.
The nulls_percent check ensures that the percentage of null values in the monitored column does not exceed a set threshold.
Add Warning and Fatal thresholds. Leave the default options (1 for Warning, 2 for Error and 5 for Fatal).
-
Run the check by clicking the Run Check icon.
This time an orange square should appear, indicating that the check detected an Error in the data.
-
Click the "Results" icon to view more details of the results.
The screen with the results should look like the one below.
-
Synchronize locally stored results with your DQO Cloud account to be able to view the results on the dashboards.
To synchronize all the data, click on the Synchronize button in the upper right corner or just run the "cloud sync all" command in DQO Shell. You can read more about the "cloud" command in the Command-line specification section.
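The way the nulls_percent check from the column-level steps above maps a measured value to a severity level can be sketched as follows. This is a minimal illustration, not DQO's implementation: the function names `nulls_percent` and `severity` and the sample values are hypothetical, and the sketch assumes that a threshold is breached only when the percentage is strictly greater than it.

```python
from typing import Optional, Sequence


def nulls_percent(values: Sequence[Optional[str]]) -> float:
    """Hypothetical helper: percentage of None entries in a column sample."""
    if not values:
        return 0.0
    null_count = sum(1 for value in values if value is None)
    return 100.0 * null_count / len(values)


def severity(percent: float,
             warning: float = 1.0,
             error: float = 2.0,
             fatal: float = 5.0) -> str:
    """Map a nulls percentage to the highest breached severity level.

    The defaults mirror the thresholds used in this guide
    (1 for Warning, 2 for Error, 5 for Fatal).
    """
    if percent > fatal:
        return "fatal"
    if percent > error:
        return "error"
    if percent > warning:
        return "warning"
    return "valid"


if __name__ == "__main__":
    # Made-up sample: 3 nulls out of 100 values, not real Austin Crime Data.
    sample = [None] * 3 + ["Cleared by Arrest"] * 97
    print(severity(nulls_percent(sample)))
```

In this sketch, a column with a nulls percentage above 2 but not above 5 is reported at the Error level, which is the level the orange square indicates in the steps above.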
Next step
Now that you have run the checks, you can review the results on the dashboards.