dqo

Root command that provides control of the application in CLI mode

Description

A root command that gives the user access to all features and functionality of the application from the command-line interface (CLI). It is the entry point invoked before any other command of the application.

Summary (CLI)

$ dqo [root_level_parameter] [command]
Example
$ dqo --dqo.cloud.api-key=3242424324242 check run -c=connection_name
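
Every root-level parameter listed below can also be supplied through its corresponding environment variable. For example, in a POSIX shell the command above could equivalently be written as follows (the API key value is a placeholder):
$ export DQO_CLOUD_API_KEY=3242424324242
$ dqo check run -c=connection_name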

Options

Each command argument is described below, together with its accepted values where applicable.
--dqo.cache.enabled
Enables or disables the specification cache.
This parameter could be also configured by setting DQO_CACHE_ENABLED environment variable.
--dqo.cache.expire-after-seconds
The time in seconds to expire the cache entries since they were added to the cache.
This parameter could be also configured by setting DQO_CACHE_EXPIRE_AFTER_SECONDS environment variable.
--dqo.cache.file-lists-limit
The maximum number of folders for which the list of files is cached, to avoid repeated file listing.
This parameter could be also configured by setting DQO_CACHE_FILE_LISTS_LIMIT environment variable.
--dqo.cache.parquet-cache-memory-fraction
The maximum fraction of the JVM heap memory (configured using the -Xmx java parameter) that is used to cache parquet files in memory. The default value 0.6 means that up to 60% of the JVM heap memory could be used for caching files. The value of reserved-heap-memory-bytes is subtracted from the total memory size (the -Xmx parameter value) before the memory fraction is calculated.
This parameter could be also configured by setting DQO_CACHE_PARQUET_CACHE_MEMORY_FRACTION environment variable.
--dqo.cache.process-file-changes-delay-millis
The delay in milliseconds between processing file changes that would invalidate the cache.
This parameter could be also configured by setting DQO_CACHE_PROCESS_FILE_CHANGES_DELAY_MILLIS environment variable.
--dqo.cache.reserved-heap-memory-bytes
The memory size (in bytes) that is subtracted from the total JVM heap memory before the memory fraction dedicated to the parquet cache is calculated.
This parameter could be also configured by setting DQO_CACHE_RESERVED_HEAP_MEMORY_BYTES environment variable.
--dqo.cache.watch-file-system-changes
Use a file watcher to detect file system changes and invalidate the in-memory file cache.
This parameter could be also configured by setting DQO_CACHE_WATCH_FILE_SYSTEM_CHANGES environment variable.
--dqo.cache.yaml-files-limit
The maximum number of specification files to cache.
This parameter could be also configured by setting DQO_CACHE_YAML_FILES_LIMIT environment variable.
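
As an illustration, several cache parameters could be combined at startup; the values below are arbitrary examples, not recommended defaults:
$ dqo --dqo.cache.expire-after-seconds=120 --dqo.cache.yaml-files-limit=5000 --dqo.cache.watch-file-system-changes=true [command]
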
--dqo.cli.terminal.width
Width of the terminal when no terminal window is available, e.g. in one-shot running mode.
This parameter could be also configured by setting DQO_CLI_TERMINAL_WIDTH environment variable.
--dqo.cloud.api-key
DQO Cloud API key. Log in to https://cloud.dqo.ai/ to get the key.
This parameter could be also configured by setting DQO_CLOUD_API_KEY environment variable.
--dqo.cloud.authenticate-with-dqo-cloud
Turns on user authentication by using DQO Cloud credentials. Users will be redirected to the DQO Cloud login screen to login and will be returned back to the local DQO instance.
This parameter could be also configured by setting DQO_CLOUD_AUTHENTICATE_WITH_DQO_CLOUD environment variable.
--dqo.cloud.parallel-file-downloads
The number of files that are downloaded from DQO Cloud in parallel using HTTP/2 multiplexing.
This parameter could be also configured by setting DQO_CLOUD_PARALLEL_FILE_DOWNLOADS environment variable.
--dqo.cloud.parallel-file-uploads
The number of files that are uploaded to DQO Cloud in parallel using HTTP/2 multiplexing.
This parameter could be also configured by setting DQO_CLOUD_PARALLEL_FILE_UPLOADS environment variable.
--dqo.cloud.start-without-api-key
Allow starting DQO without a DQO Cloud API Key and without prompting to log in to DQO Cloud.
This parameter could be also configured by setting DQO_CLOUD_START_WITHOUT_API_KEY environment variable.
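
For example, a local instance could be started without prompting for a DQO Cloud API key (the boolean value is shown for illustration):
$ dqo --dqo.cloud.start-without-api-key=true [command]
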
--dqo.core.lock-wait-timeout-seconds
Sets the maximum wait timeout in seconds to obtain a lock to read or write files.
This parameter could be also configured by setting DQO_CORE_LOCK_WAIT_TIMEOUT_SECONDS environment variable.
--dqo.core.print-stack-trace
Prints a full stack trace for errors on the console.
This parameter could be also configured by setting DQO_CORE_PRINT_STACK_TRACE environment variable.
--dqo.default-time-zone
Default time zone name used to convert the server's local dates to a local time in a time zone that is relevant for the user. Use official IANA time zone names. When the parameter is not configured, DQO uses the local time zone of the host running the application. The time zone could be reconfigured at a user settings level.
This parameter could be also configured by setting DQO_DEFAULT_TIME_ZONE environment variable.
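
For example, to convert the server's local dates to a specific IANA time zone (the zone name below is only an example):
$ dqo --dqo.default-time-zone=Europe/Warsaw [command]
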
--dqo.docker.user-home.allow-unmounted
When running DQO in a docker container, allow DQO user home folder to be initialized inside the container's filesystem if the folder hasn't been mounted to an external volume.
This parameter could be also configured by setting DQO_DOCKER_USER_HOME_ALLOW_UNMOUNTED environment variable.
--dqo.home
Overrides the path to the DQO system home (DQO_HOME). The default DQO_HOME contains the definition of built-in data quality sensors, rules and libraries.
This parameter could be also configured by setting DQO_HOME environment variable.
--dqo.incidents.check-histogram-size
The size of the data quality check histogram that is generated for a preview of a data quality incident.
This parameter could be also configured by setting DQO_INCIDENTS_CHECK_HISTOGRAM_SIZE environment variable.
--dqo.incidents.column-histogram-size
The size of the column histogram that is generated for a preview of a data quality incident.
This parameter could be also configured by setting DQO_INCIDENTS_COLUMN_HISTOGRAM_SIZE environment variable.
--dqo.incidents.count-open-incidents-days
The number of days back from today that are scanned when counting open incidents; only incidents first seen within this number of days are counted.
This parameter could be also configured by setting DQO_INCIDENTS_COUNT_OPEN_INCIDENTS_DAYS environment variable.
--dqo.instance.signature-key
DQO local instance signature key that is used to issue and verify digital signatures on API keys. It is a base64 encoded byte array (32 bytes). When not configured, DQO will generate a secure random key and store it in the .localsettings.dqosettings.yaml file.
This parameter could be also configured by setting DQO_INSTANCE_SIGNATURE_KEY environment variable.
--dqo.jdbc.expire-after-access-seconds
Sets the number of seconds when a connection in a JDBC pool is expired after the last access.
This parameter could be also configured by setting DQO_JDBC_EXPIRE_AFTER_ACCESS_SECONDS environment variable.
--dqo.jdbc.max-connection-in-pool
Sets the maximum number of connections in the JDBC connection pool, shared across all data sources using JDBC drivers.
This parameter could be also configured by setting DQO_JDBC_MAX_CONNECTION_IN_POOL environment variable.
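
As a sketch, the JDBC pool could be resized and its idle-connection expiration shortened in one run (the values are illustrative):
$ dqo --dqo.jdbc.max-connection-in-pool=50 --dqo.jdbc.expire-after-access-seconds=600 [command]
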
--dqo.logging.enable-user-home-logging
Enables file logging inside the DQO User Home's .logs folder.
This parameter could be also configured by setting DQO_LOGGING_ENABLE_USER_HOME_LOGGING environment variable.
--dqo.logging.max-history
Sets the maximum number of log files that could be stored (archived) in the .logs folder.
This parameter could be also configured by setting DQO_LOGGING_MAX_HISTORY environment variable.
--dqo.logging.pattern
Log entry pattern for logback used for writing log entries.
This parameter could be also configured by setting DQO_LOGGING_PATTERN environment variable.
--dqo.logging.total-size-cap
Total log file size cap.
This parameter could be also configured by setting DQO_LOGGING_TOTAL_SIZE_CAP environment variable.
--dqo.python.interpreter-name
A list of python interpreter executable names, separated by a comma, containing possible python interpreter names such as 'python', 'python3', 'python3.exe' or an absolute path to the python interpreter. DQO will try to find the first python interpreter executable in directories defined in the PATH when a list of python interpreter names (not an absolute path) is used.
This parameter could be also configured by setting DQO_PYTHON_INTERPRETER_NAME environment variable.
--dqo.python.python-script-timeout-seconds
Python script execution time limit in seconds for running jinja2 and rule evaluation scripts.
This parameter could be also configured by setting DQO_PYTHON_PYTHON_SCRIPT_TIMEOUT_SECONDS environment variable.
--dqo.python.use-host-python
Disable creating a python virtual environment by DQO on startup. Instead, use the system python interpreter. DQO will not install any required python packages on startup and will use the packages from the user's python installation.
This parameter could be also configured by setting DQO_PYTHON_USE_HOST_PYTHON environment variable.
--dqo.queue.max-concurrent-jobs
Sets the maximum number of concurrent jobs that the job queue can process at once (running data quality checks, importing metadata, etc.). The maximum number of threads is also limited by the DQO license.
This parameter could be also configured by setting DQO_QUEUE_MAX_CONCURRENT_JOBS environment variable.
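
For example (the value is illustrative; the effective limit is still capped by the DQO license):
$ dqo --dqo.queue.max-concurrent-jobs=4 [command]
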
--dqo.queue.wait-timeouts.default-wait-timeout
Sets the default wait timeout (in seconds) for waiting for a job when the "waitTimeout" parameter is not given in the call to the "waitForJob" operation from the DQO client.
This parameter could be also configured by setting DQO_QUEUE_WAIT_TIMEOUTS_DEFAULT_WAIT_TIMEOUT environment variable.
--dqo.queue.wait-timeouts.run-checks
Sets the default timeout (in seconds) for the "run checks" REST API operation called from the DQO client when the "wait" parameter is true and the timeout is not provided by the client.
This parameter could be also configured by setting DQO_QUEUE_WAIT_TIMEOUTS_RUN_CHECKS environment variable.
--dqo.scheduler.check-run-mode
Configures the console logging mode for the "check run" jobs performed by the job scheduler in the background.
This parameter could be also configured by setting DQO_SCHEDULER_CHECK_RUN_MODE environment variable.
Accepted values: silent, summary, info, debug
--dqo.scheduler.default-schedules.partitioned-daily
Sets the default schedule for running daily partitioned checks that is copied to the configuration of new data source connections that are registered in DQO. The default schedule runs checks once a day at 12 PM (noon). This parameter is used only once, during the first initialization of DQO user home. The value is copied to the .localsettings.dqosettings.yaml settings file.
This parameter could be also configured by setting DQO_SCHEDULER_DEFAULT_SCHEDULES_PARTITIONED_DAILY environment variable.
--dqo.scheduler.default-schedules.partitioned-monthly
Sets the default schedule for running monthly partitioned checks that is copied to the configuration of new data source connections that are registered in DQO. The default schedule runs checks once a day at 12 PM (noon). This parameter is used only once, during the first initialization of DQO user home. The value is copied to the .localsettings.dqosettings.yaml settings file.
This parameter could be also configured by setting DQO_SCHEDULER_DEFAULT_SCHEDULES_PARTITIONED_MONTHLY environment variable.
--dqo.scheduler.default-schedules.profiling
Sets the default schedule for running advanced profiling checks that is copied to the configuration of new data source connections that are registered in DQO. The default schedule runs checks once a day at 12 PM (noon). This parameter is used only once, during the first initialization of DQO user home. The value is copied to the .localsettings.dqosettings.yaml settings file.
This parameter could be also configured by setting DQO_SCHEDULER_DEFAULT_SCHEDULES_PROFILING environment variable.
--dqo.scheduler.default-schedules.recurring-daily
Sets the default schedule for running daily recurring checks that is copied to the configuration of new data source connections that are registered in DQO. The default schedule runs checks once a day at 12 PM (noon). This parameter is used only once, during the first initialization of DQO user home. The value is copied to the .localsettings.dqosettings.yaml settings file.
This parameter could be also configured by setting DQO_SCHEDULER_DEFAULT_SCHEDULES_RECURRING_DAILY environment variable.
--dqo.scheduler.default-schedules.recurring-monthly
Sets the default schedule for running monthly recurring checks that is copied to the configuration of new data source connections that are registered in DQO. The default schedule runs checks once a day at 12 PM (noon). This parameter is used only once, during the first initialization of DQO user home. The value is copied to the .localsettings.dqosettings.yaml settings file.
This parameter could be also configured by setting DQO_SCHEDULER_DEFAULT_SCHEDULES_RECURRING_MONTHLY environment variable.
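
Assuming these default schedules accept Unix cron expressions (as the --dqo.scheduler.synchronize-cron-schedule parameter below does), a hypothetical override of the profiling schedule to 6 AM instead of noon could look like this:
$ dqo --dqo.scheduler.default-schedules.profiling="0 6 * * *" [command]
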
--dqo.scheduler.enable-cloud-sync
Enable synchronization of metadata and results with DQO Cloud in the job scheduler.
This parameter could be also configured by setting DQO_SCHEDULER_ENABLE_CLOUD_SYNC environment variable.
--dqo.scheduler.start
Starts the job scheduler on startup (true) or disables the job scheduler (false).
This parameter could be also configured by setting DQO_SCHEDULER_START environment variable.
--dqo.scheduler.synchronization-mode
Configures the console logging mode for the "cloud sync all" operations performed by the job scheduler in the background.
This parameter could be also configured by setting DQO_SCHEDULER_SYNCHRONIZATION_MODE environment variable.
Accepted values: silent, summary, debug
--dqo.scheduler.synchronize-cron-schedule
Unix cron expression that configures how often the scheduler synchronizes the local copy of the metadata with DQO Cloud and detects new cron schedules. Synchronization with DQO Cloud could be disabled by setting --dqo.scheduler.enable-cloud-sync=false.
This parameter could be also configured by setting DQO_SCHEDULER_SYNCHRONIZE_CRON_SCHEDULE environment variable.
--dqo.scheduler.synchronized-folders
Configures which folders from the DQO user home folder are synchronized to DQO Cloud during a recurring synchronization (triggered by a cron schedule configured by --dqo.scheduler.synchronize-cron-schedule). By default, DQO synchronizes (pushes) only changes from folders that have local changes.
This parameter could be also configured by setting DQO_SCHEDULER_SYNCHRONIZED_FOLDERS environment variable.
Accepted values: all, locally_changed
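
For example, the background synchronization could be limited to pushing only locally changed folders on a custom cron schedule (values shown for illustration):
$ dqo --dqo.scheduler.synchronized-folders=locally_changed --dqo.scheduler.synchronize-cron-schedule="*/30 * * * *" [command]
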
--dqo.secrets.enable-gcp-secret-manager
Enables the GCP Secret Manager to resolve secret references used as parameter values in the YAML files.
This parameter could be also configured by setting DQO_SECRETS_ENABLE_GCP_SECRET_MANAGER environment variable.
--dqo.secrets.gcp-project-id
GCP project name with a GCP secret manager enabled to pull the secrets.
This parameter could be also configured by setting DQO_SECRETS_GCP_PROJECT_ID environment variable.
--dqo.sensor.limit.fail-on-sensor-readout-limit-exceeded
Configures the behavior when the number of rows returned from a data quality sensor exceeds the limit configured in the 'sensor-readout-limit' parameter. When true, the whole check execution is failed. When false, only results up to the limit are analyzed.
This parameter could be also configured by setting DQO_SENSOR_LIMIT_FAIL_ON_SENSOR_READOUT_LIMIT_EXCEEDED environment variable.
--dqo.sensor.limit.max-merged-queries
The maximum number of queries that are merged into a bigger query, to calculate multiple sensors on the same table and to analyze multiple columns from the same table.
This parameter could be also configured by setting DQO_SENSOR_LIMIT_MAX_MERGED_QUERIES environment variable.
--dqo.sensor.limit.sensor-readout-limit
Default row count limit retrieved by a data quality sensor from the results of an SQL query for non-partitioned checks (profiling and recurring). This is the row count limit applied when querying the data source. When the data grouping configuration sets up a GROUP BY with too many columns, or columns with too many distinct values, the data source will return too many results to store them as data quality check results and sensor readouts. DQO will discard the additional values returned from the data source or raise an error.
This parameter could be also configured by setting DQO_SENSOR_LIMIT_SENSOR_READOUT_LIMIT environment variable.
--dqo.sensor.limit.sensor-readout-limit-partitioned
Default row count limit retrieved by a data quality sensor from the results of an SQL query for partitioned checks. This is the row count limit applied when querying the data source. When the data grouping configuration sets up a GROUP BY with too many columns, or columns with too many distinct values, the data source will return too many results to store them as data quality check results and sensor readouts. DQO will discard the additional values returned from the data source or raise an error. The default value is 7x bigger than the sensor-readout-limit, to allow analyzing the last 7 daily partitions.
This parameter could be also configured by setting DQO_SENSOR_LIMIT_SENSOR_READOUT_LIMIT_PARTITIONED environment variable.
--dqo.user.home
Overrides the path to the DQO user home. The default user home is created in the current folder (.).
This parameter could be also configured by setting DQO_USER_HOME environment variable.
--dqo.user.initialize-user-home
Initializes an empty DQO user home (identified by the DQO_USER_HOME environment variable) without asking the user for confirmation.
This parameter could be also configured by setting DQO_USER_INITIALIZE_USER_HOME environment variable.
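
For example, a fresh user home could be created in a custom folder without a confirmation prompt (the path is hypothetical):
$ dqo --dqo.user.home=/opt/dqo-user-home --dqo.user.initialize-user-home=true [command]
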
-fw
--file-write
Write command response to a file
-hl
--headless
Run the command in headless mode (no user input allowed)
-h
--help
Show the help for the command and parameters
--logging.level.com.dqops
Default logging level for the DQO runtime.
This parameter could be also configured by setting LOGGING_LEVEL_COM_DQOPS environment variable.
Accepted values: ERROR, WARN, INFO, DEBUG, TRACE
--logging.level.root
Default logging level at the root level of the logging hierarchy.
This parameter could be also configured by setting LOGGING_LEVEL_ROOT environment variable.
Accepted values: ERROR, WARN, INFO, DEBUG, TRACE
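
For example, to raise the logging verbosity of the DQO runtime for a single run (DEBUG is one of the accepted values listed above):
$ dqo --logging.level.com.dqops=DEBUG [command]
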
-of
--output-format
Output format for tabular responses
Accepted values: TABLE, CSV, JSON
--server.port
Sets the web server port to host the DQO local web UI.
This parameter could be also configured by setting SERVER_PORT environment variable.
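
For example, to expose the local web UI on a non-default port (the port number is arbitrary):
$ dqo --server.port=8888
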
--spring.config.location
Sets a path to the folder that has the spring configuration files (application.properties or application.yml) or directly to an application.properties or application.yml file. The format of this value is: --spring.config.location=file:./foldername/,file:./alternativeapplication.yml
This parameter could be also configured by setting SPRING_CONFIG_LOCATION environment variable.
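
For example, to point DQO at an external application.yml following the format above (the path is only an example):
$ dqo --spring.config.location=file:./config/application.yml [command]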