ETL or Data warehouse Testing Tips , Challenges

Primary tabs

ETL or data warehouse is one of the offerings which are developing rapidly and successfully. ETL testing (Extract, Transform, and Load), this article is an intro to  ETL testing. Independent Verification and Validation is becoming more popular by business and has huge potential and many companies are now seeing this as prospective business gain. ETL is different from Database testing or backend testing both in terms of complexity and scope and hence requires a different approach as against conventional testing. The methodology primarily aims at identifying and mitigating existing data defects and general errors occurring prior to processing of data for analytics and reporting.

ETL Testing Methodology includes following steps:

1.    Understanding and review of data model and actual data

2.    Understanding and review of source  to target mappings (this is called transformations)

3.    Check the Quality of Source data

4.    Create Packages for testing

5.    Schema testing (source and target)

6.    Verify Data completeness -  Make sure that all projected data is loaded into the data warehouse without any data loss and truncation.

7.    Verify transformation rules - Verify that data is transformed correctly according to various business requirements and rules

8.    Sample data comparison between source and target -   Check that application appropriately rejects, replaces with default values and reports invalid data.

9.    Checking of referential integrities and relations (primary key foreign key)

10.  Data Quality checks on target warehouse

11.  Performance tests  - Data should load in timeframe required and must be scalable

 

Depending upon your project your ETL or Data warehouse testing can be of 4 different types

1.       Migration Testing –  Existing Datawarehouse migration to new one.

2.       New Data Warehouse Testing – With ETL tools you will verify the New data warehouse from scratch.

3.       Enhancing Existing DW – Addition of  new data or Business rule to an existing DW.

4.       Report Testing – Report must be tested by validating layout, data in the report and calculation. Reports are basic purpose of a DW.

 

After test cases are ready and are approved, testing team will perform pre-execution check and test data preparation for testing

Finally execution is performed till exit criteria is met. Upon successful completion summary report is prepared and closure process is done. It is necessary to define test strategy which should be mutually accepted by stakeholders before starting actual testing. A well-defined test strategy will make sure that correct approach has been followed meeting the testing aspiration. ETL testing might require writing SQL statements extensively by testing team or may be tailoring the SQL provided by development team. In any case testing team must be aware of the results they are trying to get using those SQL statements.

 

ETL Testing Challenges – As with any type of testing ETL testing has its own set of challenges and issues. Lets see some common ones.

-          Business rules are incomplete or not thought of at all. Client will clarify as you go.

-          Test data acquiring itself is painful .

-          Volume and complexity of data is very huge.

-          Data is either Duplicate or Incompatible

-          During Transformation or load the data is lost

-          Test bed may not be available with all resources

-          ETL jobs require access to system resources and not all testers may have required access.

-          Business flow or data flow in application is missing.

 

If you have any more information on ETL testing do feel free to comment here.