Wednesday, June 15, 2011

From ETL to T-ETL - Its Advantages

In June 2010 I started blogging with a post on ETL and ELT. Today I am taking it to the next level and looking at T-ETL: what it is and how it can benefit enterprises.
The federated approach can work alongside the traditional method of data consolidation. Consolidated data stores, which are typically maintained by extract, transform, load (ETL) or replication processes, have become a standard choice for information integration. ETL tools, in certain cases combined with data stores and streaming data, are becoming the best way to achieve fast, highly available, integrated access to related information. By combining data consolidation with federation, businesses achieve the flexibility and responsiveness required in today's fast-paced environment.

What do we achieve if we integrate InfoSphere DataStage and InfoSphere Federation Server to perform data consolidation? Once the two are integrated, InfoSphere Federation Server can act as a data pre-processor, performing initial transformations on the data either at the source or during extraction. In other words, we introduce a transformation step before the real ETL, hence the name T-ETL. The T-ETL architecture can use federation to join, aggregate, and filter data before it enters InfoSphere DataStage, which can then use its parallel engine to perform more complex transformations and maintain the target.
The architecture draws on the strengths of both products, producing a flexible and highly efficient solution for data consolidation: InfoSphere Federation Server for its joining and SQL processing capabilities, and InfoSphere DataStage for its parallel data flow and powerful transformation logic. The InfoSphere Federation Server cost-based optimizer also allows the T-ETL architecture to react dynamically to changes in data volumes and patterns, without the need to modify the job.
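
To make the flow concrete, here is a minimal sketch in Python. It is only an illustration, not InfoSphere code: two separate in-memory SQLite databases attached to one connection stand in for heterogeneous sources behind a federation layer, and plain Python functions stand in for the DataStage job. All table names, column names, the sample rows, and the 180 threshold are hypothetical.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Create two separate in-memory databases that play the role of
    heterogeneous sources (assumption: SQLite ATTACH imitates federation)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ':memory:' AS orders_db")
    conn.execute("ATTACH DATABASE ':memory:' AS crm_db")
    conn.execute(
        "CREATE TABLE orders_db.orders (customer_id INTEGER, amount REAL, status TEXT)"
    )
    conn.execute("CREATE TABLE crm_db.customers (customer_id INTEGER, region TEXT)")
    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO orders_db.orders VALUES (?, ?, ?)",
        [(1, 100.0, "shipped"), (1, 50.0, "cancelled"),
         (2, 200.0, "shipped"), (3, 75.0, "shipped")],
    )
    conn.executemany(
        "INSERT INTO crm_db.customers VALUES (?, ?)",
        [(1, "EMEA"), (2, "APAC"), (3, "EMEA")],
    )
    return conn


def pre_transform(conn: sqlite3.Connection) -> list:
    """The leading 'T': join, filter, and aggregate across both sources
    in a single SQL statement, before the ETL job sees any rows."""
    sql = """
        SELECT c.region, SUM(o.amount) AS total
        FROM orders_db.orders AS o
        JOIN crm_db.customers AS c ON c.customer_id = o.customer_id
        WHERE o.status = 'shipped'
        GROUP BY c.region
        ORDER BY c.region
    """
    return conn.execute(sql).fetchall()


def transform(rows: list) -> list:
    """ETL-stage transform: a stand-in for more complex job logic.
    The 180 threshold is an arbitrary illustrative value."""
    return [(region, total, "high" if total >= 180 else "low")
            for region, total in rows]


def load(conn: sqlite3.Connection, rows: list) -> None:
    """ETL-stage load: write the transformed rows to the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS region_sales (region TEXT, total REAL, tier TEXT)"
    )
    conn.executemany("INSERT INTO region_sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_t_etl() -> list:
    """Run the whole T-ETL flow and return the contents of the target."""
    conn = build_sources()
    reduced = pre_transform(conn)   # T: already joined, filtered, aggregated
    load(conn, transform(reduced))  # E-T-L on the much smaller result
    return conn.execute(
        "SELECT region, total, tier FROM region_sales ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    print(run_t_etl())
```

The point of the split is that the join, filter, and aggregation run where SQL processing is strongest, so the downstream job receives two summary rows here instead of every order and customer record.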

Transformation followed by ETL (T-ETL) is not a new concept; it is as old as ETL and ELT. Many ETL jobs already apply some form of transformation while extracting the data, such as filtering and aggregating data, or joining two source tables that reside in the same source database. The restriction that the source objects must exist on the same data source has severely limited the scope of T-ETL solutions to date. InfoSphere Federation Server removes this limitation and extends the initial transformation stage to any heterogeneous data sources that it supports.
-Ritesh
Disclaimer: The postings on this site are my own and don't necessarily represent IBM's positions, strategies or opinions
