Compare Queries allow us to compare two data providers, that should contain similar data, and report on their differences.
Compare Queries can be used to validate that sources are identical, e.g. customer details in two systems or to reconcile aggregated data, e.g. aggregated sales per month from a sales system to similar data in billing.
In this tutorial we will create a Compare Query which compares two data sources containing customer information, and finds all discrepancies in the data.
When we are finished, we will have a Test that validates that customer information is equal in two different systems, and notifies an owner about all discrepancies.
- Create a basic Compare Query
- Data sets
- Column Mapping
- Preview Mode
- Comparing to sub-sets
- Compare Query Output
- Exception Management
Create a basic Compare Query
To create the Compare Query, right click the Tests folder and select: New > Compare Query. Give it a name and press OK.
A Compare Query consists of:
- Two data sets (DS1 and DS2), which both contain a Data Provider and a query to that provider that should return comparable data.
- Column mapping between the two data sets.
Data Provider is something we query to retrieve data on a structured format. This can, for example, be a SQL Server, Oracle server, PowerShell script, Excel document or an OLAP cube which is queried with MDX.
We select the data provider for each query and write a SQL query to retrieve customer information from the source systems.
We can test our SQL query by pressing F5 or pressing the lighting bolt icon.
It is recommended to name the data sets to describe to the end user the origin of the data. Below we have named DS1: ADW1 for Adventure Works 1, and DS2: ADW2 for Adventure Works 2, two separate instances of some business database.
It is also recommended to write a short description of the purpose of this compare query.
After querying our two data providers, we need to map the columns together to be able to compare the results.
We can do this manually by adding column by column, or we can use the Automap function. With Automap, exMon will guess how the data sets fit together.
For each column we map, we can choose between three mapping types:
- KEY: Key columns are one or many columns that create a unique identifier for the data. This can, for example, be a customer id in customer data, or a multi-part key consisting of product id and a year in data consisting of sales per product.
Each compare query must have at lease one key column mapped.
- VALUE: Value columns are columns we need to compare and are not a part of the unique key for the data. This comparison can be between strings, dates or numerical values. Variances in numerical values, for example amounts, are aggregated and summarized in the results. In our screenshot we are comparing string values: PersonType and Name columns.
- INFO: Info columns are columns that are informational only and are used to help the user to understand the results. For example, in a Compare Query comparing products, we could include the product name, even though we are not comparing those between data sources. We will not use an info column in this tutorial.
If a Value column is numeric, we can define a threshold for the acceptable variance; either as an absolute number (0.1) or a percentage (0.1%).
By settings a threshold, we are defining how much variance is acceptable between the columns before it counts as an error.
This can be helpful when comparing calculated values and need, for example, to allow difference of 0.1 or 0.5% because of rounding errors.
We can always test our Compare Query by using the Preview mode.
Click Preview to test the Compare Query. By default, it runs against Development Data Providers but you can also select Production.
To re-run the preview, click Start.
With preview mode you can see errors and variances created, see all emails generated and how it affects exception management.
In this example, exMon reconciled 19.772 rows. 12 rows have differences in the LastName and are marked with VARIANCE ERROR. 1 row exists in Data set 1 (ADW1) but does not exists in Data set 2 (ADW2) and results in a KEY ERROR.
Comparing to sub-sets
If we are comparing two data sets, were one is a sub-set of the other, we can toggle the "DS1 incl. DS2" and "DS2 incl. DS1" buttons.
For example, we could be comparing a master product list with a product list in a PoS system at a store, containing only the products sold at that location.
- When only DS2 incl. DS1 is checked, then all key combinations in Data set 1 must exist in Data set 2.
- When only DS1 incl. DS2 is checked, then all key combinations in Data set 2 must exist in Data set 1.
- When both buttons are checked, then key combinations in both Data sets must match.
- When neither buttons are checked, then exMon only reports on variance errors on rows that are matched.
In our case, we make sure only "DS2 incl. DS1" is toggled. When previewing the Compare Query again, we can see that the KEY ERROR is gone.
Compare Query Output
Compere Query Results can be used in three ways in exMon:
- A Compare Query can be configured to send out a Summary Email on each execution. The email includes a list of the results and an attached Excel documentation for easy manipulation. Summary emails are often used when something needs to be fixed right away.
- The failure of a Compare Query can be used to stop a process flow in exMon, for example stopping a process if some data does not reconcile.
- If a Compare Query is configured to use Exception management, then exMon will create exceptions for each error and variance row from the results and only send out notification when new exceptions are found. Visit Configuring Exception Management (Tutorial) for more information.
Summary email is configured in the properties pane and when enabled, will always send an email if the Compare Query finds any errors or variances when comparing the data sets.
We configure the recipient, subject and body.
When we test the Compare Query using preview, we now see our email under Query Emails. You can view the email and the Excel document attached with the email.
Looking at the results in Preview, we can see that they can be improved to make the data easier to understand.
With Column Formatting we can reorder columns, change captions, format results and hide columns.
Under Column Formatting in the property grid, select Edit....
After configuring the formatting, you can preview your changes using Preview.
Basic Compare Queries have no memory of past executions and in each execution it sends the full results of the Compare Query.
The Compare Query can be made more intelligent by enabling Exception Management. Exception Management tracks each row from the results as an exception, keeping track of it’s lifespan and only notifies users of new exceptions.
You can read more about Exception Management in Configuring Exception Management (Tutorial) and Using the exMon Continuous Monitoring Portal (Tutorial).