GBDI 4.1

Ingesting Additional Guardium Data Domains

GBDI receives data extract files from Guardium collectors (a.k.a. CMs) as part of the built-in GBDI ETL. This data includes session data, query data, Full SQL data, extraction logs, buffer usage monitor data, vulnerability assessment (VA) data, exceptions, and more. These data extract files are part of the standard Guardium-to-GBDI interface.

Additional data from Guardium can be brought over to the GBDI system with a simple configuration on the Guardium appliances. This is based on the Data Mart (DM) mechanism available in Guardium V9 and later. No new code is required, only configuration. The GBDI ETL can already ingest any data brought over from the Guardium infrastructure without additional changes. This means that any data for which a report can be defined on the Guardium system can be brought over and ingested into the GBDI warehouse.

There are three stages to the configuration:

  1. Defining a data mart: Define the data mart based on a query built with the query builder or on any existing report. Refer to the Guardium documentation for more information about Data Mart definitions. When you define the DM specification, typically set it to run on an hourly basis, set the extraction result to a file, and set up a schedule. IMPORTANT: The file name must start with EXP_; this prefix tells the GBDI ETL to process the file as a Guardium DM extract.

  2. Defining the export: For every DM stream that you want sent over, you need to specify two activation parameters using grdapi: one to include headers in the extract files and one to tell the collector to copy files to the GBDI host as they are generated. The two commands to issue (per DM) are:

    grdapi datamart_include_file_header Name="<your DM name>" includeFileHeader="Yes"
    grdapi datamart_update_copy_file_info destination=…


    Refer to Guardium documentation for more information on grdapi.
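    As an illustrative sketch only, a per-DM configuration might look like the lines below. The destination-related parameter names on the second command are assumptions for illustration, not a verified signature; the parameters of datamart_update_copy_file_info vary by Guardium version, so treat the grdapi reference for your version as authoritative.

        grdapi datamart_include_file_header Name="GBDI Custom Export" includeFileHeader="Yes"
        # Parameter names below are assumed -- verify against your grdapi reference:
        grdapi datamart_update_copy_file_info Name="GBDI Custom Export" destinationHost=<gbdi-host> destinationUser=<user> destinationPassword=<password> destinationPath=<directory>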

  3. Configuring ingestion: As long as the DM file name uses the EXP_ prefix, no changes are needed, and all data will be ingested into the appropriate collections within the warehouse. For data type casting, edit sonargd.conf to specify which fields are numeric and which are dates.
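As a hedged illustration of the type-casting step, the sketch below shows the kind of per-collection mapping sonargd.conf can carry: the custom extract's numeric and date columns are named so they are stored as numbers and dates rather than strings. The syntax, key names, and field names here are assumptions for illustration only; the exact sonargd.conf format is documented in the GBDI administration guide.

    # Illustrative sketch only -- not verified sonargd.conf syntax.
    # Cast selected columns of the custom DM extract on ingestion:
    exp_my_custom_dm:
      numeric_fields: ["Total Records", "Session Id"]
      date_fields:    ["Timestamp", "Last Login"]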

From a mechanism perspective, all DMs defined as above use from-to dates to provide incremental extracts to GBDI. Each run of the DM picks up records with a timestamp greater than or equal to that of the last DM run. The timestamp used is whatever is defined in the relevant Guardium domain, so understand what this field is for each domain to avoid confusion about which data will be sent. As an example, Guardium user data carries a last-login timestamp; each extract therefore resends the definitions of all users who have logged in since the last time the DM ran. All user data will still be recorded in GBDI, but the collection will include duplicates, so GBDI reports on such data should use a grouping clause.
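The duplicate-handling point above can be sketched with plain shell tools: given duplicated user rows (one row per extract in which the user appeared), collapse them to the newest row per user, which is in effect what a grouping clause in a GBDI report does. The file name, fields, and values below are invented for illustration and are not a real GBDI schema.

```shell
# Simulated collection with duplicate user rows.
# Field 1 = username, field 2 = last-login timestamp.
cat > users.csv <<'EOF'
alice,2023-01-05
bob,2023-01-04
alice,2023-01-09
bob,2023-01-10
EOF

# Group by user and keep only the newest row per user:
# sort by user then timestamp, let awk retain the last row seen per key.
sort -t, -k1,1 -k2,2 users.csv \
  | awk -F, '{latest[$1]=$0} END {for (u in latest) print latest[u]}' \
  | sort > deduped.csv

cat deduped.csv
```

Running this leaves one row per user in deduped.csv, each carrying that user's most recent login timestamp.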

One more note on the mechanism: in a Central Manager (CM) environment you only need to define the DM once, on the CM or on any collector, but you need to run the grdapi scheduling commands on each collector.