Why You Need It
Comprehensive testing of a data warehouse at every point throughout the ETL (extract, transform, and load) process is becoming increasingly important as more data is being collected and used for strategic decision-making. Data integration projects are initiated as a result of mergers and acquisitions, compliance and regulations, data consolidation, and the increased reliance on data-driven decision making (using Business Intelligence tools, etc.). Data validation is commonly implemented either manually or with the help of a tool (functional testing tool, ETL tool, proprietary utilities). There are several challenges associated with these methods:
- Data volume: Due to the vast amount of data, traditional testing methods often do not provide the desired amount of coverage. With manual comparison, small sample sets of data may not fully represent all data within the system.
- Data complexity: When extracting and integrating data from multiple systems, teams may find from profiling studies that the data may contain errors, missing values, consistency, and integrity problems.
- Data quality standards & regulations: Ensuring that standards are met requires substantial time and financial commitment.
- Data integrity: Data cannot support decision-making because fields may be duplicated, missing, incomplete, or incorrectly migrated/ETL’d.
- Resource allocation: Manual validation of ETL’d data is very resource-intensive, requiring a large number of skilled testers. Due to budget and time constraints, this may lead to large gaps in test coverage.
How QuerySurge Can Help
The solution to ensuring data quality within the data warehouse is simple - compare all of the data in the source system with all of the data loaded into the data warehouse. While 100% data validation may be impractical, getting as large a data sample as is possible is your best insurance of a successful delpoyment. As with any other type of testing, the use of an automation tool increases the testing coverage to achieve the optimal level of data quality within the data warehouse architecture.
QuerySurge™ was built specifically for implementing data warehouse testing projects. It streamlines and accelerates testing while drastically improving data coverage throughout your projects.
- Validates as much as 100% of all data in less than .01% of the time compared with manual testing
- Automates data verification using mapping-specific data-centric testing, not just row counts or lower-quality test methods
- Enables teams to pinpoint defects earlier in development, reducing risk and saving money
- Produces audit trail documentation for compliance with regulations and data quality standards
- Generates both summary and detailed reports to share among teams
- Allows you to quickly update tests with Reusable test code for far-reaching ETL changes, making maintenance much easier
- Provides scheduling and workflow management capabilities to improve resource efficiency
- Reduces ETL development cycle time to help avoid late releases
Our Team's Expertise
Our team at RTTS developed our first data warehouse testing utility while we were serving on a client engagement. The tool greatly simplified our testing process and produced reliable results. We soon realized that we could not achieve comparable coverage, speed, and flexibility with any other existing testing methods or software, so we expanded upon the original utility to produce QuerySurge.
The experts who developed QuerySurge have many years of both strategic planning and tactical implementation experience. RTTS offers consulting, mentoring, and educational services to give you access to our team’s knowledge and skills for your data warehouse QA projects.