Why ETL Tools
Let me try to explain importance of Extract – Transform – Load (ETL) tools in DW/BI space. To start with let’s think of an organization having tens of years of history and presence in almost all the geographies. You would agree that over these years company must have emerged from manual book keeping style to sophisticated ERP systems. During this process such an organization must have gone through various technological changes ranging from manual systems to simple in-house applications and data storages ranging from flat files to relational databases. This transformation resulted into different businesses, or to be precise different sub businesses within a business, running different applications, different hardware and different architecture.
Scenario like above results in having organization spread globally having different tools technologies used in different geographic locations. Example would be Organization’s unit in Europe using Mainframes like AS400 vs. US unit using SAP system to manage company operations. In such a setup if Organization’s top management needs a consolidated report of all the assets of the company, report writers struggle to gather all the data.
One of the ways for report writers is to ask for the reports from disparate systems. Consolidate these reports manually to get report in desired format. This is really cumbersome process and might take days to get to final report delivered to management.
Another, but efficient way would be to have a system that fetches data from these disparate sources, stores it in a data warehouse environment and generates a report whenever needed.
Ok, so we agree that second approach looks good. Most importantly because we are eliminating a lot of manual work needed and thus saving any delay to provide information.
Now the questions arises how to fetch the data from these different systems, make it coherent, and load it into a Data Warehouse? We need a methodology, a tool that can extract the data, cleanse it and load ultimately to data warehouse application. Note here that ETL stands for Extract, Transform and Load.
There are variety of ETL tools available, both commercial and open source. See chart below for few of the tool names that you should know. Obviously, Informatica is one of the best ETL tools available in the market.
|List of ETL Tools||
|Informatica PowerCenter||10.2||Informatica Corporation|
|Oracle Data Integrator||12c||Oracle|
|SAS Data Integration||9.4||SAS|
|Pentaho Data Integration||8.0||Hitachi Data Systems|
|Pervasive Data Integrator||11||Actian / Pervasive Software|