Schedules overview
With DQO, you can easily customize when checks are run by setting schedules. You can set schedules for an entire connection, table, or individual check.
To set up a schedule, you can use the graphical interface as described below or manually modify the YAML configuration file.
Different types of checks, such as Profiling, Recurring, and Partitioned, have their own schedules. For more information on these different check types, please refer to the DQO Concepts section.
Configuring a schedule at connection and table level
To set up a schedule for the entire connection, follow these steps:
1. Navigate to the Data Source section.
2. Choose the connection you want to schedule from the tree view on the left.
3. Click on the Schedule tab.
4. Select the check type:
    - Profiling
    - Recurring Daily
    - Recurring Monthly
    - Partitioned Daily
    - Partitioned Monthly
5. Specify the schedule using a Unix cron expression or select one of the options provided.
6. Once you have set the schedule, click on the Save button to save your changes.
Once a schedule is set up for a connection, it runs all the checks that have been configured across all tables associated with that connection. To disable the schedule for a specific table, select the "Disable schedule" checkbox for that table.
To set up a schedule for a specific table, select the desired table from the tree view on the left, then follow the steps above beginning at step 3. Please note that any changes made to the schedule at the table level will override the schedule set for the entire connection.
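As a rough illustration of the schedule format used in the steps above, the sketch below splits a standard five-field Unix cron expression into its named fields. The helper name `describe_cron` and the example expressions are illustrative assumptions, not part of DQO, and the exact cron dialect DQO accepts may differ.

```python
# Field names of a standard five-field Unix cron expression, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> dict:
    """Split a five-field cron expression into a {field name: value} mapping.

    Hypothetical helper for illustration only; it checks the number of
    fields but does not validate the value ranges.
    """
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


# Illustrative schedules (assumed examples, not DQO defaults).
EXAMPLES = {
    "every day at 12:00": "0 12 * * *",
    "every hour, on the hour": "0 * * * *",
    "first day of each month at 00:00": "0 0 1 * *",
}

if __name__ == "__main__":
    for label, expr in EXAMPLES.items():
        print(f"{label}: {describe_cron(expr)}")
```

Reading an expression field by field this way can help confirm that a schedule means what you intend before saving it.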
Configuring a schedule at check level
To set up a schedule for a specific check, follow these steps:
1. Navigate to the section with the check type of interest (Profiling, Recurring Checks or Partition Checks).
2. Choose the table or column of interest from the tree view on the left.
3. Enable the check of interest, then click the "Setting" button and go to the "Schedule Override" tab.
4. Specify the schedule using a Unix cron expression or select one of the options provided.
5. Once you have set the schedule, click the Save button to save your changes.
Please note that any changes made to the schedule at the check level will override the schedule set for the entire connection or table.
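The override order described above (check over table over connection) can be pictured with a small sketch. It is only a model of the documented precedence, not DQO's implementation; the function name `effective_schedule` is hypothetical, and the "Disable schedule" option is not modeled.

```python
from typing import Optional


def effective_schedule(
    connection_cron: Optional[str],
    table_cron: Optional[str] = None,
    check_cron: Optional[str] = None,
) -> Optional[str]:
    """Return the most specific schedule that is set.

    A check-level schedule overrides a table-level one, which in turn
    overrides the connection-level one. None means "not set at this level".
    """
    for cron in (check_cron, table_cron, connection_cron):
        if cron is not None:
            return cron
    return None


if __name__ == "__main__":
    # Only the connection has a schedule: every check inherits it.
    print(effective_schedule("0 12 * * *"))
    # The table overrides the connection.
    print(effective_schedule("0 12 * * *", table_cron="0 6 * * *"))
    # The check overrides both.
    print(effective_schedule("0 12 * * *", "0 6 * * *", check_cron="*/30 * * * *"))
```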
Starting a scheduler
To start the scheduler in the DQO Shell, enter the scheduler start command. To stop the scheduler, use the scheduler stop command.
You can also use the graphical interface to start the scheduler. Enable the Jobs scheduler option located in Notifications on the right side of the navigation bar.
For further information on the scheduler commands, please refer to the Command-line interface section.
The scheduler can also be started in server mode, which continuously runs a job scheduler and synchronizes the data every 10 minutes. To do this, start dqo with the run command from your terminal.
To terminate dqo running in the background, press Ctrl+C. For more information on the run command, please refer to the Command-line interface section.
Synchronizing data
All the YAML configuration files with data source metadata and schedule configuration are stored in the /sources folder.
You can read more about the data storage in DQO here.
DQO allows you to modify the frequency of data synchronization when the scheduler is run in server mode.
In order to configure how often the scheduler will synchronize the local copy of the metadata with DQO Cloud and detect new schedules, start dqo with the following parameter:
Please use quotation marks when defining a frequency in cron format. You can also configure this parameter by setting the DQO_SCHEDULER_SCAN_METADATA_CRON_SCHEDULE=<Unix cron expression> environment variable.
To modify whether the job scheduler will sync configuration files and results with DQO Cloud, simply launch dqo with the following parameter:
To enable synchronization, type true; to disable it, type false.
You can also configure this parameter by setting the DQO_SCHEDULER_ENABLE_CLOUD_SYNC=<TRUE/FALSE> environment variable.
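As a minimal sketch of the environment-variable approach, the snippet below builds an environment that carries both scheduler settings named above. The variable names come from this page; the helper name `scheduler_environment`, the example cron value, and the commented-out launch of a `dqo` executable with the run command are assumptions made for illustration.

```python
import os


def scheduler_environment(scan_cron: str, cloud_sync: bool) -> dict:
    """Return a copy of the current environment with both scheduler variables set.

    Hypothetical helper: it only prepares the variables and starts nothing.
    """
    env = dict(os.environ)
    # How often the scheduler rescans metadata, as a Unix cron expression.
    env["DQO_SCHEDULER_SCAN_METADATA_CRON_SCHEDULE"] = scan_cron
    # Whether the scheduler synchronizes with DQO Cloud.
    env["DQO_SCHEDULER_ENABLE_CLOUD_SYNC"] = "TRUE" if cloud_sync else "FALSE"
    return env


if __name__ == "__main__":
    # Example value (an assumption): rescan every 5 minutes, cloud sync disabled.
    env = scheduler_environment("*/5 * * * *", cloud_sync=False)
    for name in (
        "DQO_SCHEDULER_SCAN_METADATA_CRON_SCHEDULE",
        "DQO_SCHEDULER_ENABLE_CLOUD_SYNC",
    ):
        print(f"{name}={env[name]}")
    # To launch dqo with these settings, the environment could be passed to a
    # subprocess. This assumes a dqo executable on PATH that accepts "run":
    # import subprocess
    # subprocess.run(["dqo", "run"], env=env)
```

Passing the cron expression as a single Python string avoids shell quoting concerns; when setting the variable directly in a shell, the quotation marks mentioned above still apply.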