Extract Transform Load (ETL)

Overview

CLMA uses an Extract, Transform, Load (ETL) process to take template variables, stored as key-value pairs within the exarianswers database, and transform them into normalised reporting tables. This makes it easier to report on answer values committed during the interview process.
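
Conceptually, the ETL pivots the key-value answer pairs saved for each interview into one row per contract in a reporting table. The Python sketch below illustrates the idea only; the variable names and column list are hypothetical, and the real transformation is built in Pentaho Data Integration against the actual exarianswers schema.

    # Minimal sketch of the key-value to normalised-row transformation.
    # Variable and column names are illustrative, not the real exarianswers schema.
    from typing import Dict, List, Tuple

    # Answer values as they might be stored for one interview: (variable, value)
    raw_answers: List[Tuple[str, str]] = [
        ("ContractTitle", "Master Services Agreement"),
        ("EffectiveDate", "2024-01-01"),
        ("CounterpartyName", "Acme Ltd"),
    ]

    # Columns of a hypothetical normalised reporting table
    REPORTING_COLUMNS = ["ContractTitle", "EffectiveDate", "CounterpartyName"]

    def pivot_to_row(answers: List[Tuple[str, str]]) -> Dict[str, str]:
        """Pivot key-value pairs into one reporting row; unmapped columns stay empty."""
        lookup = dict(answers)
        return {column: lookup.get(column, "") for column in REPORTING_COLUMNS}

    print(pivot_to_row(raw_answers))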

The ETL software used is Pentaho Data Integration (PDI). Out of the box, CLMA ships with one ETL configured against the Standard Data Capture template, which covers all of the Universal Contract Model data points. For other templates and variables, a custom ETL is required to make the data available to the reporting application. Because server and database access is required, this is a task that can only be performed by Coupa.

Scoping Questions

  • Are bespoke reports in scope / part of the solution?
  • Have custom fields been configured as part of the data model? Do these need to be available for bespoke reports?

Design Options

  • Will the data be stored within CLMA or pushed to an SFTP location from which a third-party application can import it? The answer determines whether all of the data needs to be ETL’ed, or only the additional custom fields that have been created.
  • The next design option relates to the database schema, i.e. designing the tables that will store the additional custom data points. Because variables can contain multiple values and can repeat in groups, it is not simply a case of adding extra columns to the existing tables; a rough schema sketch follows this list. (If the target system is the exarianswers database, the table design must be done in consultation with a Coupa Technical Architect, who will ultimately need to create or modify the tables in the relevant database.)
  • The last design option is whether the ETL does a full data dump or just the delta. The latter is strongly recommended to ensure quicker processing times.
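
To illustrate why repeating groups need their own tables rather than extra columns, the sketch below models single-value custom fields on a parent table and a repeating group on a child table keyed by contract and repeat index. It uses SQLite purely for demonstration, and every table and column name is hypothetical; the real design must be agreed with a Coupa Technical Architect.

    # Hypothetical reporting schema for custom data points, including a repeating group.
    import sqlite3

    conn = sqlite3.connect(":memory:")

    # One row per contract for single-value custom fields
    conn.execute("""
        CREATE TABLE rpt_contract_custom (
            contract_id   TEXT PRIMARY KEY,
            business_unit TEXT,
            risk_rating   TEXT
        )""")

    # Repeating groups (e.g. one row per milestone) belong in a child table,
    # not as extra columns on the parent table.
    conn.execute("""
        CREATE TABLE rpt_contract_milestone (
            contract_id    TEXT,
            repeat_index   INTEGER,
            milestone_name TEXT,
            milestone_date TEXT,
            PRIMARY KEY (contract_id, repeat_index)
        )""")
    conn.commit()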

Considerations

  • The OOTB ETL points to the Contract Summary template located at the following path in DocGen: /files/Demonstration/Contracts/Contract-Summary-CM.xml. If a customer would like to use the standard reports or build bespoke reports, the ETL must be modified to point to their specific template path(s) within DocGen. 
  • The standard ETL relies on template authors using the template variables exactly as they are named in the Contract Summary template (or in the specific templates used for their custom reports). Modifying or excluding variables may cause values not to populate in the normalised reporting tables.
  • By default, contract templates do not store the values of ‘derived’ or calculated data points. Whenever the template runs, it re-calculates the values for such data points. An example is the ‘Expiry date’, which is typically calculated as the Effective date plus the Contract Term/Duration.
  • If derived values need to be used in reporting, the template author must tag a relation of ‘alwayssave’ onto the data point. This ensures that the value is stored along with the rest of the data points.
  • Consider putting a limit on the number of variables that the customer wants to include in their reporting schema. Have the data dictionary and table schema defined and completed before ETL development starts; a simple coverage check is sketched after this list.
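
One practical way to catch derived data points that are not being stored is to compare the agreed data dictionary against the answers actually saved for a sample contract. The sketch below shows the idea with hypothetical variable names; in practice the stored answers would be read from the exarianswers database.

    # Pre-go-live check: does every variable in the data dictionary have a stored value?
    data_dictionary = {"ContractTitle", "EffectiveDate", "ContractTerm", "ExpiryDate"}

    # Answers saved for a sample contract; 'ExpiryDate' is missing here because it is a
    # derived value that was not tagged with 'alwayssave'.
    stored_answers = {
        "ContractTitle": "Master Services Agreement",
        "EffectiveDate": "2024-01-01",
        "ContractTerm": "24 months",
    }

    missing = sorted(data_dictionary - stored_answers.keys())
    if missing:
        print("Variables with no stored value (check 'alwayssave' tagging):", missing)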

Things to avoid

  • Avoid starting a custom ETL until the data dictionary has been finalised and the templates are in a stable state. Variable names often change during the authoring build phase, and the ETL must be mapped to the exact variable names in use.
  • Avoid authoring that uses multiple variables pointing to the same UCM or custom field. This will create additional development work within the ETL.
  • Do not go live until it has been confirmed that all reportable fields are available to the reporting engine. This is especially true for ‘derived’ data points: it is time-consuming to run an ETL over historic data to retrospectively update the reporting tables.