Last updated: June 26, 2024
Run DQOps in Docker
This guide shows how to pull the DQOps Docker image from Docker Hub and how to pass the right parameters to the container to start it in production mode.
Overview
DQOps can be run as a Docker container in server mode or in Shell mode. You can also build a custom DQOps container image.
Note
Running DQOps as a Docker container is the preferred method for starting it in a long-running production mode.
Prerequisites
To run DQOps as a Docker container you need:
- Docker running locally. Follow the instructions to download and install Docker.
- A DQOps Cloud account and a DQOps Cloud API Key, if you want to use all DQOps features, such as storing data quality definitions and results in the cloud or data quality dashboards. Create a new DQOps Cloud account here.
- A DQOps User Home folder created locally, which will be mounted to your container. Volumes are the preferred mechanism for persisting data generated by and used by Docker containers. The DQOps User Home folder stores local data such as sensor readouts, data quality check results, and the data source configuration.
Start DQOps in interactive Shell mode
To start DQOps in Shell mode, follow the steps below.
- Download the DQOps image from Docker Hub by running the following command in a terminal:
  docker pull dqops/dqo
- Create an empty folder where you want to create your DQOps User Home. The DQOps User Home is a folder where DQOps will store the metadata of imported data sources, the configuration of activated data quality checks, and the data quality results.
- Run the DQOps Docker image:
  docker run -v [path to local DQOps user home folder]:/dqo/userhome -it -p 8888:8888 dqops/dqo [--dqo.cloud.api-key=here-your-DQOps-Cloud-API-key]
  - The -v flag mounts your locally created DQOps User Home folder into the container. You need to provide the path to your local DQOps User Home folder.
  - The -i flag keeps STDIN open even if not attached.
  - The -t flag allocates a pseudo-TTY.
  - The -p flag maps the host’s port 8888 to the container’s port 8888. Without the port mapping, you would not be able to access the application.
  - The --dqo.cloud.api-key argument specifies the API Key of your DQOps Cloud registration. When the DQOps Cloud API Key is not specified and you are starting DQOps using an empty DQOps User Home folder, DQOps will not be able to open the browser. In that case, copy the DQOps Cloud login URL that is displayed into a browser, then create a DQOps Cloud account or log in to an existing one.
  If you want to use the current folder as your DQOps User Home, you can bind this folder to the /dqo/userhome mount point in the DQOps Docker image. Keep in mind that the DQOps User Home folder should either be empty (so that it is initialized on startup) or already be a valid DQOps User Home folder. Read the DQOps user home folder concept to learn more.
- After a few seconds, you can use the DQOps terminal or open the user interface at http://localhost:8888 in a web browser.
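The steps above can be collected into a small shell helper. This is a minimal sketch, not part of DQOps: the function name `start_dqops_shell` and the `DOCKER` override variable are hypothetical, introduced only for illustration, and the script assumes a POSIX-compatible shell.

```shell
# DOCKER is an illustrative override (not a DQOps setting): set DOCKER=echo
# to print the docker commands instead of executing them.
DOCKER="${DOCKER:-docker}"

# start_dqops_shell USERHOME_DIR [API_KEY]
# Hypothetical helper that follows the three steps of the Shell mode guide.
start_dqops_shell() {
    userhome="$1"
    api_key="${2:-}"

    # Step 1: download the DQOps image from Docker Hub.
    "$DOCKER" pull dqops/dqo || return 1

    # Step 2: create the DQOps User Home folder if it does not exist yet,
    # then resolve it to an absolute path for the -v mount.
    mkdir -p "$userhome" || return 1
    userhome="$(CDPATH= cd -- "$userhome" && pwd)" || return 1

    # Step 3: run the container interactively with the port mapping.
    set -- run -v "$userhome:/dqo/userhome" -it -p 8888:8888 dqops/dqo
    if [ -n "$api_key" ]; then
        # The API key is optional, as in the command shown in the guide.
        set -- "$@" "--dqo.cloud.api-key=$api_key"
    fi
    "$DOCKER" "$@"
}
```

For example, `start_dqops_shell ./dqops-userhome` would start DQOps with a folder under the current directory as the DQOps User Home. Running it with `DOCKER=echo` set beforehand only prints the two docker commands, which is a cheap way to check the arguments first.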
Start DQOps in server mode
To start DQOps in server mode, follow the steps below.
- Download the dqops/dqo image from Docker Hub by running the following command in a terminal:
  docker pull dqops/dqo
- Run the DQOps Docker image with docker run, passing the following parameters:
  - The -v flag mounts your locally created DQOps User Home folder into the container. You need to provide the path to your local DQOps User Home folder.
  - The -p flag maps the host’s port 8888 to the container’s port 8888. Without the port mapping, you would not be able to access the application.
  - The -d flag runs the container in the background (detached, or daemon, mode).
  - The -m parameter configures the memory size for the container. We recommend allocating at least 2 GB of memory for the DQOps container, which is configured by -m=2g. The DQOps container runs one JVM process and several small Python processes (two per core) that run the rules. The DQOps runtime allocates 80% of the container memory for the JVM heap. This memory is used for caching YAML and Parquet files. The share of memory given to the JVM heap can be changed by passing the DQO_JAVA_OPTS environment variable to the container using the following docker run parameter: -e DQO_JAVA_OPTS=-XX:MaxRAMPercentage=60.0
  - The --dqo.cloud.api-key argument specifies the API Key of your DQOps Cloud account.
  - The run command at the end executes the DQOps run CLI command, which activates server mode without the DQOps Shell.
- After a few seconds, open http://localhost:8888/ in your web browser. You should see the DQOps user interface.
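The parameters described above can be assembled into a single docker run invocation. The sketch below is reconstructed from that parameter list and from the Shell mode command, so the exact argument order is an assumption rather than the official command. The function name `start_dqops_server` and the `DOCKER` override variable are hypothetical, and a POSIX-compatible shell is assumed.

```shell
# DOCKER is an illustrative override (not a DQOps setting): set DOCKER=echo
# to print the docker command instead of executing it.
DOCKER="${DOCKER:-docker}"

# start_dqops_server USERHOME_DIR API_KEY [MEMORY] [MAX_RAM_PERCENTAGE]
# Hypothetical helper that starts DQOps in server mode.
start_dqops_server() {
    userhome="$1"
    api_key="$2"
    memory="${3:-2g}"        # 2 GB is the minimum advised in this guide
    max_ram_pct="${4:-}"     # empty keeps the default JVM heap share

    # -v mounts the user home, -p maps the port, -d detaches the container,
    # -m limits the container memory.
    set -- run -v "$userhome:/dqo/userhome" -p 8888:8888 -d "-m=$memory"
    if [ -n "$max_ram_pct" ]; then
        # Optional: change the share of container memory used for the JVM heap.
        set -- "$@" -e "DQO_JAVA_OPTS=-XX:MaxRAMPercentage=$max_ram_pct"
    fi

    # Arguments after the image name are passed to DQOps itself; the trailing
    # "run" is the DQOps CLI command that activates server mode.
    "$DOCKER" "$@" dqops/dqo "--dqo.cloud.api-key=$api_key" run
}
```

As a usage example, `start_dqops_server /data/dqops-userhome here-your-DQOps-Cloud-API-key 4g 60.0` would request a 4 GB container with 60% of that memory available to the JVM heap. The path `/data/dqops-userhome` is only a placeholder for your own DQOps User Home folder.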
Build a custom DQOps container image
- Create an empty folder.
- Open a terminal, navigate to the created directory, and clone the DQOps repository from GitHub.
- Modify the DQOps Dockerfile located in the main directory.
- Run the following command to build a DQOps container image using the Dockerfile:
  docker build -t your_dqo_image_name .
  The -t parameter specifies the name for the container image, in this case "your_dqo_image_name".
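The clone and build steps can be sketched as two small shell functions, leaving room to edit the Dockerfile between them. The repository URL `https://github.com/dqops/dqo.git` is an assumption, since this guide does not state it. The function names and the `GIT` and `DOCKER` override variables are hypothetical as well.

```shell
# GIT and DOCKER are illustrative overrides (not DQOps settings): set them
# to "echo" to print the commands instead of executing them.
GIT="${GIT:-git}"
DOCKER="${DOCKER:-docker}"

# clone_dqops WORK_DIR
# Creates the folder if needed and clones the DQOps repository into it.
clone_dqops() {
    mkdir -p "$1" || return 1
    # The repository URL is an assumption; verify it before use.
    ( cd "$1" && "$GIT" clone https://github.com/dqops/dqo.git )
}

# build_dqops_image REPO_DIR IMAGE_NAME
# Builds a container image from the Dockerfile in the repository root.
build_dqops_image() {
    # -t names the resulting image; "." uses REPO_DIR as the build context.
    ( cd "$1" && "$DOCKER" build -t "$2" . )
}
```

A typical sequence would be `clone_dqops ./dqops-src`, then editing the Dockerfile in the cloned repository, then `build_dqops_image ./dqops-src/dqo your_dqo_image_name`. The `dqo` subfolder name follows from the assumed repository URL.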
