|
ETL Framework
The ETL Framework broadens the range of data sources available for intelligence analysis. Its objective is to integrate heterogeneous link support information (LSI) from multiple sources in preparation for effective information sharing and analysis. Data pumps carry out the extraction, transformation, and loading (ETL) of data in various types and formats. Data can be extracted from practically any format, including but not limited to SQL-based databases, XML files, flat files, MS Excel documents, MS Access databases, and legacy systems, so the data used in a specific solution can come from almost any source. The extracted data is transformed into canonical forms designed to optimize analysis. This step applies the business logic specific to each type of data in order to build the LSI. It also attempts to identify entities that appear in more than one data source, for instance a person who is identified by a driver's license in one data source and by a social security number in another. Finally, the transformed data is loaded into a central database that stores the entity data and LSI used for processing and analysis by other SN-Sphere components and/or external applications. The data fusion elements of a solution are optional: the other SN-Sphere components can instead use one or more existing normalized databases in which data has been collected by other systems.
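The extract, transform, and load steps described above can be illustrated with a minimal Python sketch. It is not SN-Sphere code: the two sources, the field names, the table layout, and the matching rule (normalized name plus date of birth) are all assumptions made for illustration, and a real solution would apply its own business logic and a more robust entity-matching strategy.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a flat file keyed by driver's license number.
DMV_CSV = """license_no,full_name,dob
D1234567,Jane Doe,1980-04-12
D7654321,John Roe,1975-09-30
"""

# Hypothetical source 2: records keyed by social security number.
BENEFITS_ROWS = [
    {"ssn": "000-00-0001", "name": "DOE, JANE", "birth_date": "1980-04-12"},
    {"ssn": "000-00-0002", "name": "Alex Poe", "birth_date": "1990-01-05"},
]


def normalize_name(raw):
    """Return a lowercase 'first last' form; also accepts 'LAST, FIRST'."""
    raw = raw.strip()
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
        raw = f"{first} {last}"
    return " ".join(raw.lower().split())


def extract_dmv(text):
    """Extract flat-file rows and transform them into the canonical form."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "name": normalize_name(row["full_name"]),
            "dob": row["dob"],
            "id_type": "drivers_license",
            "id_value": row["license_no"],
        }


def extract_benefits(rows):
    """Extract dictionary rows and transform them into the canonical form."""
    for row in rows:
        yield {
            "name": normalize_name(row["name"]),
            "dob": row["birth_date"],
            "id_type": "ssn",
            "id_value": row["ssn"],
        }


def run_etl(conn):
    """Load canonical records, merging entities that share name and dob."""
    conn.execute(
        "CREATE TABLE entity (entity_id INTEGER PRIMARY KEY, "
        "name TEXT, dob TEXT, UNIQUE(name, dob))"
    )
    conn.execute(
        "CREATE TABLE identifier (entity_id INTEGER, id_type TEXT, id_value TEXT)"
    )
    records = list(extract_dmv(DMV_CSV)) + list(extract_benefits(BENEFITS_ROWS))
    for rec in records:
        # Assumed matching rule: same normalized name and date of birth
        # means the same entity, so a second insert is ignored.
        conn.execute(
            "INSERT OR IGNORE INTO entity (name, dob) VALUES (?, ?)",
            (rec["name"], rec["dob"]),
        )
        (entity_id,) = conn.execute(
            "SELECT entity_id FROM entity WHERE name = ? AND dob = ?",
            (rec["name"], rec["dob"]),
        ).fetchone()
        conn.execute(
            "INSERT INTO identifier VALUES (?, ?, ?)",
            (entity_id, rec["id_type"], rec["id_value"]),
        )
    conn.commit()


# An in-memory SQLite database stands in for the central database.
conn = sqlite3.connect(":memory:")
run_etl(conn)
```

In this sketch the person who appears in both sources ends up as a single entity row with two identifier rows, one per source, which is the cross-source identification the paragraph describes. Matching on name and date of birth alone would produce false merges on real data; it is used here only to keep the example short.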
|