
Adding data source connection

After installing and starting DQO, the next step is to connect a data source. This guide describes how to add a connection to the BigQuery public dataset Austin Crime Data using the graphical interface.

For a full description of how to add a data source connection to other providers, or how to add a connection using the CLI, see the Working with DQO section. You can find more information about navigating the DQO graphical interface here.

Prerequisite credentials

To add a BigQuery data source connection to DQO, you need the following:

Adding BigQuery connection using the graphical interface

  1. Go to the Data Sources section and click the + Add connection button in the upper left corner.

    Adding connection

  2. Select the BigQuery database type.

    Selecting BigQuery database type

  3. Add connection settings.

    Adding connection settings

    | BigQuery connection settings | Description |
    |---|---|
    | Connection name | The name of the connection that will be created in DQO. This will also be the name of the folder where the connection configuration files are stored. The name must be unique and consist only of alphanumeric characters, hyphens and underscores. For example, "testconnection". |
    | Source GCP project ID | Name of the project that contains the datasets to be imported. In our example, it is "bigquery-public-data". |
    | Billing GCP project ID | Name of the project used as the default GCP project. The calling user must have the necessary permissions in this project. |
    | Authentication mode to the Google Cloud | Type of authentication used to connect to Google Cloud. You can select one of 3 options: Google Application Credentials, JSON Key Content, or JSON Key Path. |
    | Quota GCP project ID | The Google Cloud Platform project ID which is used for invocation. |

  4. After filling in the connection settings, click the Test Connection button to test the connection.

  5. If the test is successful, click the Save connection button. Otherwise, check the details of what went wrong.

  6. Import the "austin_crime" schema by clicking on the Import Tables button.

    Importing schemas

  7. There is only one table in the dataset. Import it by clicking the Import all tables button in the upper right corner.

    Importing tables

  8. You can check the details of the imported table by expanding the tree view on the left and selecting the "crime" table.

    Viewing table details

    There are several tabs to explore:

    • Table - provides details about the table and allows you to add filters or stage names (for example, "Ingestion")
    • Schedule - allows setting a schedule for running checks. Learn how to configure schedules
    • Comments - allows adding comments to your tables
    • Labels - allows adding labels to your tables
    • Data streams - allows configuring columns for data stream segmentation. Learn more about data stream segmentation in the Concept section.
    • Date and time columns - allows setting the date and time columns used by the partition checks type and the table timeliness checks subcategory.
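
The connection settings entered in step 3 can be sanity-checked before they are typed into the form. The following is a minimal Python sketch, not part of DQO: the class name, field names and the `problems` helper are hypothetical and chosen only for illustration. It assumes that "alphanumeric" means ASCII letters and digits, and it uses a placeholder billing project ID.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical names for illustration only; these are not DQO identifiers.
# The three authentication modes offered in the connection form.
AUTH_MODES = {
    "Google Application Credentials",
    "JSON Key Content",
    "JSON Key Path",
}

# Assumption: "alphanumeric characters, hyphens and underscores" is read as ASCII only.
CONNECTION_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


@dataclass
class BigQueryConnectionSettings:
    connection_name: str
    source_project_id: str
    billing_project_id: str
    authentication_mode: str
    quota_project_id: str = ""

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the values look plausible."""
        found = []
        if not CONNECTION_NAME_PATTERN.fullmatch(self.connection_name):
            found.append("connection name may contain only letters, digits, hyphens and underscores")
        if not self.source_project_id:
            found.append("source GCP project ID is empty")
        if self.authentication_mode not in AUTH_MODES:
            found.append("unknown authentication mode")
        return found


settings = BigQueryConnectionSettings(
    connection_name="testconnection",
    source_project_id="bigquery-public-data",  # project that hosts the public datasets
    billing_project_id="my-billing-project",   # placeholder, replace with your own project
    authentication_mode="Google Application Credentials",
)
print(settings.problems())  # prints [] when no problems are found
```

This check only covers the format rules stated in the settings table. Whether the credentials actually work is still determined by the Test Connection button.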

Next step

Now that you have connected a data source, it is time to run data quality checks.