QuerySurge Finds Bad Data

These are the types of Data Bugs that QuerySurge can pinpoint in your critical data project

Home
Product Tour
Defects We Find

Finding Bad Data (also known as Data Bugs)

Below are the typical types of problems that QuerySurge™ will find in your data warehouse, big data, and BI projects. Using QuerySurge avoids the adverse impact any of these defects can have on your critical data and on the business decisions your team makes based on bad information in your Business Intelligence and Analytics reports.

Issue	Description	Possible Causes	Example(s)
Missing Data	Data that does not make it into the target database	- Invalid or incorrect lookup table in the transformation logic - Bad data from the source database (Needs cleansing) - Invalid joins	Lookup table should contain a field value of “High” which maps to “Critical”. However, Source data field contains “Hig” - missing the h and fails the lookup, resulting in the target data field containing null. If this occurs on a key field, a possible join would be missed and the entire row could fall out.
Truncation of Data	Data being lost by truncation of the data field	- Invalid field lengths on target database - Transformation logic not taking into account field lengths from source	Source field value “New Mexico City” is being truncated to “New Mexico C” since the source data field did not have the correct length to capture the entire field.
Data Type Mismatch	Data types not set up correctly on target database	Source data field not configured correctly	Source data field was required to be a date, however, when initially configured, was setup as a VarChar.
Null Translation	Null source values not being transformed to correct target values	Development team did not include the null translation in the transformation logic	A Source data field for null was supposed to be transformed to ‘None’ in the target data field. However, the logic was not implemented, resulting in the target data field containing null values.
Wrong Translation	Opposite of the Null Translation error. Field should be null but is populated with a non-null value or field should be populated, but with the wrong value	Development team incorrectly translated the source field for certain values	Ex. 1) Target field should only be populated when the source field contains certain values, otherwise should be set to null Ex. 2) Target field should be "Odd" if the source value is an odd number but target field is "Even" (This is a very basic example)
Misplaced Data	Source data fields not being transformed to the correct target data field	Development team inadvertently mapped the source data field to the wrong target data field	A source data field was supposed to be transformed to target data field 'Last_Name'. However, the development team inadvertently mapped the source data field to 'First_Name'
Extra Records	Records which should not be in the ETL are included in the ETL	Development team did not include filter in their code	If a case has the deleted field populated, the case and any data related to the case should not be in any ETL
Not Enough Records	Records which should be in the ETL are included in the ETL	Development team had a filter in their code which should not have been there	If a case was in a certain state, it should be ETL’d over to the data warehouse but not the data mart
Transformation Logic Errors/Holes	Testing sometimes can lead to finding “holes” in the transformation logic or realizing the logic is unclear	Development team did not take into account special cases. For example international cities that contain special language specific characters might need to be dealt with in the ETL code	Ex. 1) Most cases may fall into a certain branch of logic for a transformation but a small subset of cases (sometimes with unusual data) may not fall into any branches. How the testers code and the developers code handle these cases could be different (and possibly both end up being wrong) and the logic is changed to accommodate the cases. Ex. 2) Tester and developer have different interpretations of the transformation logic, which results in different values. This will lead to the logic being re-written to become more clear.
Simple/Small Errors	Capitalization, spacing and other small errors	Development team did not add an additional space after a comma for populating the target field.	Product names on a case should be separated by a comma and then a space but target field only has it separated by a comma
Sequence Generator	Ensuring that the sequence number of reports are in the correct order is very important when processing follow-up reports or answering to an audit	Development team did not configure the sequence generator correctly resulting in records with a duplicate sequence number	Duplicate records in the sales report was doubling up several sales transactions which skewed the report significantly
Undocumented Requirements	Find requirements that are “understood” but are not actually documented anywhere	Several of the members of the development team did not understand the “understood” undocumented requirements.	There was a restriction in the "where" clause that limited how certain reports were brought over. Used in mappings that were understood to be necessary, but were not actually in the requirements. Occasionally it turns out that the understood requirements are not what the business wanted.
Duplicate Records	Duplicate records are two or more records that contain the same data	Development team did not add the appropriate code to filter out duplicate records	Duplicate records in the sales report was doubling up several sales transactions which skewed the report significantly
Numeric Field Precision	Numbers that are not formatted to the correct decimal point or not rounded per specifications	Development team rounded the numbers to the wrong decimal point	The sales data did not contain the correct precision and all sales were being rounded to the whole dollar
Rejected Rows	Data rows that get rejected due to data issues	Development team did not take into account data conditions that could break the ETL for a particular row	Missing data rows on the sales table caused major issues with the end of year sales report

QuerySurge will help you:

Leverage artificial intelligence to quickly & easily increase test coverage
Continuously detect data issues in the delivery pipeline
Utilize analytics to optimize your critical data
Improve your data quality at speed
Provide a huge ROI

But don’t believe us (or our clients). Try it for yourself.
Check out our free trials and great tutorial