GBDI 4.1

Advanced Dispatcher Configuration

This section details advanced dispatcher options configured in /opt/sonarfinder/sonarFinder/dispatcher.conf.

RDBMS pull configuration – See Integrating External RDBMS Data in GBDI.

LDAP pull configuration – See Data Level Security (DLS).

UTF-8 Support

Note

GBDI supports UTF-8 character encoding.

Web Service Pull

Use the Web Service field to specify that you want a job to invoke a Web service and insert/update data within GBDI. The Web service response can be either JSON or CSV, as specified in the section of dispatcher.conf that matches the alias entered in the Web Service field. When the response is a JSON document or array, it is processed/parsed using SonarGateway; therefore, when using this feature you must also add a config section to SonarGateway that accepts the data elements. If the response is a CSV, it is placed into sonargd/incoming and you can control its parsing through sonargd.conf and the misc section.

You can use multi-runs and generate multiple pulls as you would for standard API calls. For example, if you have a collection that contains:

{
  "_id": "sn2",
  "name": "ServiceNow2",
  "ticket": "INC0010002"
},
{
  "_id": "sn1",
  "name": "ServiceNow2",
  "ticket": "INC0010001"
}

you can enter a URL in the job named ServiceNow2:

  https://dev74625.service-now.com/incident.do?JSONv2&sysparm_action=getRecords&sysparm_query=number="$$ticket"&displayvalue=true

This causes the following two REST APIs to be called:

  1. https://dev74625.service-now.com/incident.do?JSONv2&sysparm_action=getRecords&sysparm_query=number=INC0010001&displayvalue=true

  2. https://dev74625.service-now.com/incident.do?JSONv2&sysparm_action=getRecords&sysparm_query=number=INC0010002&displayvalue=true

Setting the web-service definition in dispatcher.conf

The web-service config section contains the following parameters:

  • [web_service_name] – name to be used from the scheduler UI

  • download_type – can be JSON or CSV

  • username – username to be used for downloading from the provided URL

  • password – password to be used for downloading from the provided URL

  • authURL – used instead of username/password. When a value is assigned to this variable, the dispatcher will append the provided value to the end of the job's URL and will not try to authenticate using username/password

  • SonarGateway_Address – IP address/domain name of the server running sonar syslog (to be used with JSON). In most cases this will be “localhost”

  • SonarGateway_Port – port configured on SonarGateway to handle the specific JSON input

Example:

[MyWebService]
download_type = JSON
username = joesmith
password = password
authURL = 
SonarGateway_Address = localhost
SonarGateway_Port = 10532
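
When a service authenticates via a token in the URL instead of username/password, you can use authURL instead. The following is a hypothetical sketch (the section name, query-string parameter, and token value are assumptions; the dispatcher simply appends the authURL value to the job's URL):

[MyTokenService]
download_type = JSON
username = 
password = 
authURL = &sysparm_token=abc123
SonarGateway_Address = localhost
SonarGateway_Port = 10533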

When importing CSV files, the following pattern is used for file names when they are saved to the “incoming” folder:

sonar-web-service-<web_service_name>-<timestamp>-<hash>.csv

Legend:

  • sonar-web-service: This is a fixed string used to identify that the source of the file is the web-service

  • web_service_name: This is the web service section name in dispatcher.conf (used for this download)

  • timestamp: in "yyyymmddhhmmss" format (for example: 20170919234915)

  • hash: ten-character hashed UUID (for example: 1617e8d157)
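
Putting the legend together, a CSV downloaded via a section named [MyWebService] (with download_type = CSV) would be saved, for example, as:

sonar-web-service-MyWebService-20170919234915-1617e8d157.csv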

Upload to Web Service

You can use dispatcher to upload data from sonarw collection documents to a Web service, e.g. ServiceNow®.

To do this, follow the steps below.

  1. Create a table in ServiceNow; see Create a table on the ServiceNow product documentation site.

    Note

    When creating the table, map field names in the sonarw collection documents to column names in the ServiceNow table.

  2. Configure dispatcher service (dispatcher.conf) as per the following:

    [upload_service_now]
    username = <Service Now username>
    password = <Service Now Password>
    wsbody_format = JSON

    Note

    wsbody_format is the format of the web service body to be processed.

  3. Restart dispatcher for the changes to take effect.

  4. Create a pipeline and run the job with the following parameters:

    Webservice: upload_service_now
    Web Service Type: Invoke API
    Web Service Method: Post
    Web Service URL: the ServiceNow table URL, for example: https://dev39337.service-now.com/api/now/table/u_aditya_test

Example of a Post Body:

{"u_name": "$$name", "u_post": "$$post"}

where u_name and u_post are column names in the ServiceNow table, and name and post are fields in the sonarw collection documents.
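
To illustrate the substitution, a hypothetical sonarw document such as:

{
  "name": "Jane Doe",
  "post": "Security Analyst"
}

would cause the dispatcher to send the following POST body to the ServiceNow table:

{"u_name": "Jane Doe", "u_post": "Security Analyst"}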

Archiving and Purging Reports

Reports generated by the dispatcher are stored on a local/mounted drive as defined during the system setup.

To manage the disk space used by reports, set the “delete_days_interval” parameter in dispatcher.conf. Any report that is older than the configured number of days will be deleted. To avoid deleting any reports, set delete_days_interval to “-1”.
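
For example, the following sketch keeps reports for 30 days (placing the parameter in the [dispatch] section is an assumption; use whatever section your existing dispatcher.conf defines it in):

[dispatch]
delete_days_interval = 30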

Archiving reports:

The dispatcher provides an option to archive reports to an SCP destination.

There are 2 config parameters that govern this functionality:

  • archive_copy_dest

  • archive_after_hours

archive_copy_dest refers to an SCP section in dispatcher.conf where the SCP parameters are defined, for example:

[section_name]
copy_host = 192.168.1.1
copy_port = 22
copy_username = user1
copy_password = password1
copy_keyfile =   
copy_dest = /home/example

archive_after_hours sets the number of hours a report must be older than in order to be archived. To disable archiving, set this value to “-1”.

The archiving is done on an incremental basis; hence, only reports that are younger than the last archive time (i.e. have not been archived previously) and older than archive_after_hours will be archived.
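
For example, the following sketch archives reports older than 24 hours to the [section_name] SCP destination above (placing these parameters in the [dispatch] section is an assumption; the value of archive_copy_dest must match the SCP section name):

[dispatch]
archive_copy_dest = section_name
archive_after_hours = 24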

Cloud backup of Reports

When cloud storage is defined for GBDI, an automatic backup of all reports is executed on a daily basis, using a cron job that is created during the system setup.

All reports generated by the system are uploaded to the cloud storage.

In order to purge reports from the cloud storage a purge policy needs to be defined on the lmrm__scheduler.lmrm_dispatched_processed collection.

Reports older than the time defined in that policy will be deleted from the cloud storage once a day.

Concurrency Model

Dispatcher has the capability of running jobs concurrently. This option can be turned on or off using the config variable multi_process. This variable accepts two options: True or False.

When the option is set to True, Dispatcher will run jobs concurrently. Once it is turned on, all Dispatcher jobs will be executed using this mechanism.
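
For example (a sketch; placing multi_process in the [dispatch] section is an assumption, so follow your existing dispatcher.conf layout):

[dispatch]
multi_process = True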

The number and operation of the jobs is managed using the definition in the sonar_resource_pool section. Note: sonar_resource_pool is a reserved name; it should not be changed or used for other sections.

The sonar_resource_pool section contains the following parameters:

[sonar_resource_pool]
section_type = resource_pool
pool_size = 4                                
check_pool_period = 5                
max_execution_time = 60

If the default pool is not defined, or has definition errors, Dispatcher will use the following values:

  • pool_size = 1

  • check_pool_period = 5 – In minutes

  • max_execution_time = 60 – In minutes

Overriding the max execution time

There may be cases where the user would like to limit a specific job to a lower limit than the maximum time set in the configuration. This can be done by specifying the “autoCancel” parameter in the job’s URL.

Note: if the value set for autoCancel is larger than the one in the configuration, Dispatcher will override it and use the value from the configuration.

For example:

https://localhost:8443/Gateway?name=timeout test&col=full_sql&type=agg&output=csv&autoCancel=30&published_by=admin&mongoUri=bW9uZ29kYjovLyR7dXNlcm5hbWV9OiR7cGFzc3dvcmR9QGxvY2FsaG9zdDoyNzExNy9hZG1pbg==&host=localhost&port=27117&db=sonargd&sdb=lmrm__sonarg

Automatic incrementation of the allowed number of concurrent jobs

There is a “check” cycle for Dispatcher to monitor the number of running jobs. The “check_pool_period” parameter is used by the mechanism that allows Dispatcher to increment the number of jobs on top of the “pool_size” when the pool is fully utilized.

The mechanism works as follows: Every cycle (for example check_pool_period = 5 minutes), Dispatcher reviews the existing jobs. If the number of concurrent jobs running is equal to or larger than the pool_size, Dispatcher checks if any of the running jobs have been started during the past cycle. If none of the jobs started during the last cycle (i.e. all jobs have been running for more than 5 minutes), Dispatcher is allowed to increment the actual pool-size by one (i.e. if there’s a job in the queue, it will be started in another concurrent process). Therefore, the actual pool_size is derived from the combination of all three config parameters.
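
The rule can be summarized in the following sketch (an illustrative Python sketch of the behavior described above, not the Dispatcher source; all names and types are assumptions):

from dataclasses import dataclass

@dataclass
class Job:
    start_time: float  # job start time, epoch seconds

def should_increment_pool(running_jobs, pool_size, check_pool_period_secs, now):
    """Return True when Dispatcher may add one slot beyond pool_size."""
    # Only consider incrementing when the pool is fully utilized.
    if len(running_jobs) < pool_size:
        return False
    # If any job started during the last check cycle, the pool is still
    # turning over, so no extra slot is needed.
    cycle_start = now - check_pool_period_secs
    return not any(job.start_time > cycle_start for job in running_jobs)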

Advanced Handling of Reports and CSVs

Dispatcher provides a set of config parameters (in the dispatcher.conf file) that enable the system to control the output of jobs that are generating reports.

max_CSV_line_limit – A limit, set to prevent huge PDF files from being generated. When a CSV file, returned as the report result, exceeds this limit, the output will be a CSV file, even if the job definition is set to PDF.

max_CSV_line_split – A value that controls the size of CSV files. Since some applications have limits on the number of CSV lines, this parameter is used to ensure those limits are not exceeded. When the downloaded CSV report exceeds this value, it will be split into multiple CSV files, named <name>_1.csv, <name>_2.csv, etc.

The files will be delivered in a zip archive, regardless of the bulk_email_report settings (see below).

bulk_email_report – Used when multiple jobs with the same name are scheduled at the same time. In this case, the dispatcher gathers the results of all reports, and handles them according to the settings shown below:

  • separate – Each report will be handled separately, regardless of the other jobs/reports.

  • zipped – Each report will be generated separately, and then all the reports will be added to a single zip archive for delivery.

  • merged – This applies only to PDF reports; Dispatcher will combine the outputs from individual reports into a single PDF report for delivery. Note: the max_CSV_line_limit still applies to the combination of all reports.

Mixed report types for zipped & merged:

The dispatcher can handle a mixed type of CSV/PDF reports. In both cases, a zip archive will be delivered containing all the reports; in the case of merged, the PDFs will be merged to a single file.

Clarifications to merged/zipped outcome:

Case #1:
   Config: bulk_email_report = zipped
   Scenario: there is a mix of job types (i.e. PDF in some jobs and CSV in others).
   Output:
       All files will be added to a single zip archive.

Case #2:
   Config: bulk_email_report = zipped
   Scenario: several jobs are set to produce a PDF; some job(s) may hit the max_CSV_line_limit and will not be converted to PDFs.
   Output:
       All files (PDF and/or CSV) will be added to a single zip archive.

Case #3:
   Config: bulk_email_report = zipped
   Scenario: Some jobs are set to produce either CSV or PDF, other job/s are set to BOTH.
   Output:
       BOTH cases are switched to CSV and implemented as if the user selected "CSV".
       All files will be added to a single zip archive.

Case #4:
   Config: bulk_email_report = merged
   Scenario: there is a mix of job types (i.e. PDF in some jobs and CSV in others).
   Output:
        PDF files will be merged into a single PDF.
       All files (including the merged PDF) will be added to a single zip archive.

Case #5:
   Config: bulk_email_report = merged
   Scenario: all jobs are set to produce PDF; one or more job(s) may hit the max_CSV_line_limit and will not be converted to PDFs.
   Output:
        PDF files will be merged into one PDF.
       All files (including the merged PDF) will be added to a single zip archive.

Case #6:
   Config: bulk_email_report = merged
   Scenario: some jobs are set to produce CSV or PDF, other job/s are set to BOTH.
   Output:
       BOTH cases are switched to CSV, and implemented as if the user selected "CSV".
       All files will be added to a single zip archive.

Enhanced Password Encryption

For a more secure system, there is a simple mechanism to encrypt passwords in dispatcher.conf, or similar config files.

The password encryption should be run from the virtualenv path:

/usr/lib/sonar/sonarfinder/virtualenv/bin/python /opt/sonarfinder/sonarFinder/encrypt_password.py

The encryption script can take a few flags. The -h flag will show you the options:

/usr/lib/sonar/sonarfinder/virtualenv/bin/python encrypt_password.py -h

usage: encrypt_password.py (--encryptonly | section.field | --conf_file [CONF_FILE])

Encrypt password.

Usage 1: python encrypt_password.py --encryptonly: Just displays the encrypted value without editing dispatcher.conf

Usage 2: python encrypt_password.py SECTION.FIELD: edit the field FIELD in section SECTION in dispatcher.conf, with the given encrypted value

Usage 3: python encrypt_password.py --conf_file <path of dispatcher.conf>: Encrypts every encryptable field, in every unencrypted section in dispatcher.conf.

Encryptable fields:

section[copy]:
        copy_password
section[dispatch]:
        archive_encryption_password
section[LDAP]:
        ldap_password
section[RDBMS]:
        Password
section[archive_signoff]:
        archive_signoff_pwd
section[Web-Service]:
        password,
        apikey,
        authURL,
        access_key,
        secret_key,
        client_secret

Arguments:

positional arguments:
   edit                  Replace the supplied field in dispatcher.conf
                         (SECTION.FIELD)

Working example:

/usr/lib/sonar/sonarfinder/virtualenv/bin/python encrypt_password.py ldap.ldap_password

The above command will use the positional argument ldap.ldap_password to encrypt the field ldap_password in the section ldap in /opt/sonarfinder/sonarFinder/dispatcher.conf

optional arguments:
   -h, --help            show this help message and exit
   --encryptonly         Displays the encrypted value and doesn't edit
                         dispatcher.conf; takes precedence over edit
   --conf_file [CONF_FILE]
                         Encrypt the entire given conf file, if this flag is
                         used on its own, it will default to
                         /opt/sonarfinder/sonarFinder/dispatcher.conf

Example:

sudo vi /opt/sonarfinder/sonarFinder/dispatcher.conf

[section_1]
useradmin_password = pass1
section_password = ldappass1

run:

/usr/lib/sonar/sonarfinder/virtualenv/bin/python encrypt_password.py section_1.useradmin_password
/usr/lib/sonar/sonarfinder/virtualenv/bin/python encrypt_password.py section_1.section_password

outcome in dispatcher.conf:

[section_1]
password_encrypted = true
useradmin_password = h+83As60QZR+wYL3QYSpAYfvNwLOtEGUfsGC90GEqQGH7zcCzrRBlH7BgvdBhKkBpLGFQYAUwpScWD37YuNMLg==
section_password = h+83As60QZR+wYL3QYSpAYfvNwLOtEGUfsGC90GEqQGH7zcCzrRBlH7BgvdBhKkBEtvfBQwoJ3UIgMaSiMYmZw==

Notes:

  • The very first action the script does is make a copy of the conf file in the same folder, suffixed with a date and timestamp.

  • Before making any actual changes to the original conf file, the script checks to make sure the number of lines changed is as expected.

  • If you try to encrypt a specific section, the section must exist in dispatcher.conf.

  • If the specified field does not currently exist, the script will prompt the user for a value, which it will then encrypt and add to the dispatcher.conf file.

  • The encryption key is specific to the product installation.

Switching back to clear text passwords is a manual process.

You will need to do this for each relevant section:

  • Change the password values to clear text.

  • Set the password_encrypted flag to ‘false’.

Email Template

By defining an email template, you can customize the format of emailed reports. The template uses standard basic HTML formatting, including three optional placeholders:

  • $$header

  • $$footer

  • $$message

The template must be inserted into the lmrm__scheduler.lmrm__email_template collection; after this is done, you can select the template via the dropdown button in the scheduler form. The identifier field for this process is “Email Template”, which must hold a unique name. If the identifier field exists, dispatcher will use the template to customize the email; otherwise, it will use the default template (the regular format for emailing).
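
For example, a minimal template body might look like this (the HTML itself is a hypothetical sketch; only the three placeholders above are defined by the product):

<html>
  <body>
    <p>$$header</p>
    <p>$$message</p>
    <hr>
    <p><i>$$footer</i></p>
  </body>
</html>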

Chain_jobs

The chain_jobs feature in dispatcher allows you to group jobs that share the same name and cron string.

This feature is different from Bulk jobs (merged/zipped) in the following ways:

  • Once jobs are "linked", they will be executed in a serial order based on their "jobLinkedOrder" values, starting from the lowest value, and

  • they will be considered separate jobs that create separate reports. 

When the dispatcher identifies more than one job with the same name scheduled at the same time, it will check if at least one of the jobs has the "linkedJob" set to true. If so, it will regard them as linked jobs and not as Bulk.

If two or more jobs have the same "jobLinkedOrder" value, they will be executed one after the other arbitrarily.

If any of the jobs in the group fails, dispatcher will not continue processing the remaining jobs.
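
For example, two jobs, both named “daily_audit” and sharing the same cron string, could be linked as follows (a hypothetical illustration of the fields named above; the rest of each job document is omitted):

{ "name": "daily_audit", "linkedJob": true, "jobLinkedOrder": 1 }
{ "name": "daily_audit", "linkedJob": true, "jobLinkedOrder": 2 }

Dispatcher will run the jobLinkedOrder = 1 job first, and the second job will run only if the first one succeeds.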

Python Plugin

You can add your own python programs and run them using the scheduler and dispatcher. To do this, a defined section needs to be added to dispatcher.conf file:

[section_name]
section_type = python-plugin
python_executable_path = /custom_path/venv/bin/python
plugin = /path1/script.py -p1 -p2

  • “section_type” is the identifier in the code which triggers the run of the python program.

  • “python_executable_path” defines the path for the virtual env. If this field is missing or empty, the default python binary defined in the [dispatch] section will be used: sonar_python_executable_path = /usr/lib/sonar/sonarfinder/virtualenv/bin/python

  • If “sonar_python_executable_path” is not defined, dispatcher will use the hard-coded path: DEFAULT_PYTHON_EXEC_PATH = "/usr/lib/sonar/sonarfinder/virtualenv/bin/python"

  • “plugin” specifies the path to the python script. If any parameters need to be passed to the script, they should be defined in this field (e.g. -p1 and -p2 are arguments passed to the python script).

To run the python program, “section_name” needs to be passed to the Plugin field in the scheduler form. This job will not interfere with regular dispatcher jobs, as it only triggers the python program to run and does not create reports.
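
For illustration, a minimal plugin script matching the example section above might look like this (a hypothetical sketch; the script can be any python program runnable by the configured binary):

#!/usr/bin/env python
# Example dispatcher plugin: parses the flags passed via the "plugin" field.
import argparse

def main():
    parser = argparse.ArgumentParser(description="Example dispatcher plugin")
    # -p1/-p2 mirror the example arguments in the config section above.
    parser.add_argument("-p1", action="store_true", help="example flag 1")
    parser.add_argument("-p2", action="store_true", help="example flag 2")
    args = parser.parse_args()
    print("plugin ran with p1=%s p2=%s" % (args.p1, args.p2))

if __name__ == "__main__":
    main()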

PDF Enhancement

The dispatcher provides a set of parameters in the config file, which allow users to customize their PDF reports.

pdf_default_document_height_inches - Default page height in inches; a different number of rows may fit on a page, depending on the input.

Calculated Page Height

A dynamic value, calculated based on the following two parameters, when relevant.

pdf_minimum_lines_per_page - Relevant only when pdf_expand_page_on_big_cell is true.

pdf_expand_page_on_big_cell - Boolean. If true, the page height will be calculated based on the minimum lines per page. The final page height will be the larger of the default or the calculated page height.
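
For example (a sketch; the values shown are illustrative assumptions, and the parameters go in dispatcher.conf alongside the other report settings):

pdf_default_document_height_inches = 11
pdf_expand_page_on_big_cell = true
pdf_minimum_lines_per_page = 20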