Posts

Key Benefits of a Data Warehouse

Data warehouses are centralized data repositories that integrate data from various transactional, legacy, or external systems, applications, and sources. The data warehouse provides an environment separate from the operational systems and is designed specifically for decision support, analytical reporting, ad hoc queries, and data mining. This isolation and optimization enables queries to run without any impact on the systems that support the business's primary transactions (i.e., the transactional and operational systems).

Data Warehouses (EDW vs DataMarts)

There are two fundamental types of data warehouses:
•  Enterprise Data Warehouses (EDW)
•  Data Marts
Overview of Enterprise Data Warehouse (EDW) and Data Marts

Enterprise Data Warehouse (EDW): The enterprise data warehouse is typically a large, organization-wide database repository that crosses every business function and includes data from every organizational unit, division, and department. In essence, an enterprise data warehouse is a large repository of the historical and current transaction data of an entire organization. Because implementing an enterprise data warehouse is usually a strategic, organization-wide effort, the volume of data tends to be quite large. Enterprise data warehouses can contain hundreds of gigabytes, terabytes, and sometimes even petabytes of data.

Data Mart: A data mart is a collection of subject areas organized for decision support based on the needs of a given department or office. A data mart often serves as the reporting and analytical solution for a particular department within an organization, such as accounting, sales, customer service, or marketing. For the most part, data marts are designed with just enough data entities, fields, and records to satisfy one department's requirements.

There are two kinds of data marts, dependent and independent:
•  A dependent data mart is one whose source is another data warehouse. All dependent data marts within an organization are typically fed by the same source: the enterprise data warehouse.
•  An independent data mart is one that is sourced directly from transactional systems, legacy applications, or external data feeds.
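
As a rough illustration of a dependent data mart, the sketch below derives a departmental table from a warehouse table using Python's built-in sqlite3 module. The table names (edw_transactions, sales_mart), the columns, and the sample rows are hypothetical and chosen only for the example; a real mart would normally be refreshed by scheduled ETL jobs rather than a single statement.

```python
import sqlite3

# In-memory database standing in for the enterprise data warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical EDW table holding transactions from every department.
conn.execute(
    "CREATE TABLE edw_transactions ("
    "txn_id INTEGER PRIMARY KEY, department TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO edw_transactions VALUES (?, ?, ?)",
    [
        (1, "sales", 100.0),
        (2, "marketing", 40.0),
        (3, "sales", 60.0),
        (4, "accounting", 25.0),
    ],
)

# Dependent data mart: its only source is the EDW, and it keeps just the
# rows and columns that one department (here, sales) needs.
conn.execute(
    "CREATE TABLE sales_mart AS "
    "SELECT txn_id, amount FROM edw_transactions WHERE department = 'sales'"
)

sales_mart_rows = conn.execute(
    "SELECT txn_id, amount FROM sales_mart ORDER BY txn_id"
).fetchall()
print(sales_mart_rows)
```

An independent data mart would look the same on the receiving side, but its rows would come straight from the source systems instead of from the warehouse table.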

Dimensional Modeling and Data Warehouses

Dimensional modeling is a specific discipline for modeling data that is an alternative to entity-relationship (E/R) modeling. A dimensional model contains the same information as an E/R model but packages the data in a symmetric format whose design goals are user understandability, query performance, and resilience to change.
Ralph Kimball, PhD, The Data Warehouse Lifecycle Toolkit, 1998

Basic Dimensional Model (Star Schema)

Dimensional modeling is a data modeling technique used to support on-line analytical processing (OLAP) systems and is implemented in databases that host either an enterprise data warehouse or data marts. The key design goal of a dimensional model is to answer questions of the form "measures by dimensions," for example, sales amount by product and month. Dimensional models are commonly referred to as star schemas because they consist of a central fact table surrounded by several dimension tables.

Within a dimensional model or star schema, there are two types of data entities or tables:
•  Facts (Measurements – Numerical Values)
•  Dimensions (Contexts and Attributes – Text, Strings, Dates, & Flags)
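
The fact and dimension split can be sketched as a tiny star schema, again using Python's built-in sqlite3 module. Everything named here (fact_sales, dim_product, dim_date, their columns, and the sample rows) is a hypothetical example rather than a prescribed design; the point is only to show one central fact table joined to its dimensions to answer a "measures by dimensions" question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two dimension tables (context) and one central fact table (measurements).
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    CREATE TABLE dim_date (
        date_key      INTEGER PRIMARY KEY,
        calendar_date TEXT,
        year          INTEGER
    );
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product(product_key),
        date_key     INTEGER REFERENCES dim_date(date_key),
        quantity     INTEGER,
        sales_amount REAL
    );
    """
)

conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20230101, "2023-01-01", 2023), (20240101, "2024-01-01", 2024)],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [
        (1, 20230101, 2, 20.0),
        (2, 20230101, 1, 15.0),
        (3, 20240101, 4, 40.0),
        (1, 20240101, 3, 30.0),
    ],
)

# "Measures by dimensions": total sales amount by product category and year.
sales_by_category_year = conn.execute(
    """
    SELECT p.category, d.year, SUM(f.sales_amount) AS total_sales
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date    AS d ON d.date_key = f.date_key
    GROUP BY p.category, d.year
    ORDER BY p.category, d.year
    """
).fetchall()

for category, year, total in sales_by_category_year:
    print(category, year, total)
```

The numeric columns (quantity, sales_amount) live only in the fact table, while the descriptive text and dates used for grouping and filtering live in the dimensions.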
Transactional (OLTP) Systems to Analytical (OLAP) Systems

Within an enterprise data warehouse or data mart, data is fundamentally static and non-volatile: it is not updated in place. Rather, data is inserted or loaded in bulk into the tables in the model using batch programs or extraction, transformation, and loading (ETL) routines. End users of dimensional models develop queries that only read or select data; they do not insert, update, or delete it. Data in dimensional databases must be extracted and converted from on-line transactional processing (OLTP) systems or from other OLAP systems.
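
A minimal sketch of such a batch load, assuming a hypothetical list of order records extracted from an OLTP system, might look like the following. The record layout, the transform rules (keep only shipped orders, tidy names, convert cents to currency units), and the fact_orders table are all assumptions made for illustration.

```python
import sqlite3

# Extract: hypothetical rows pulled from a transactional (OLTP) system.
oltp_orders = [
    {"order_id": 101, "customer": " alice ", "amount_cents": 1250, "status": "SHIPPED"},
    {"order_id": 102, "customer": "Bob", "amount_cents": 990, "status": "CANCELLED"},
    {"order_id": 103, "customer": "carol", "amount_cents": 4300, "status": "SHIPPED"},
]


def transform(rows):
    """Keep shipped orders, normalize customer names, convert cents to units."""
    return [
        (r["order_id"], r["customer"].strip().title(), r["amount_cents"] / 100)
        for r in rows
        if r["status"] == "SHIPPED"
    ]


def load(conn, rows):
    """Bulk-insert the transformed rows in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders ("
    "order_id INTEGER PRIMARY KEY, customer_name TEXT, amount REAL)"
)

load(warehouse, transform(oltp_orders))

# End users only read from the warehouse.
loaded = warehouse.execute(
    "SELECT order_id, customer_name, amount FROM fact_orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Keeping extract, transform, and load as separate steps mirrors the ETL pattern described above: the warehouse tables are written only by the batch routine, and everything else is a read.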

The key benefits of dimensional models and data warehouses include:
•  An environment separate from transactional systems
•  High performance for select/read queries
•  Insulation from changes in source systems
•  A structure that is intuitive to developers and to business users who write queries
•  Data integrated from multiple source systems
•  A format optimized for data warehouses and data marts

Data Warehouse Subject Areas

Subject areas within a data warehouse or data mart are groups of physical tables in a dimensional model or star schema that reflect general data or functional categories. Subject areas are therefore synonymous with functional areas: each subject area identifies and groups the data that relates to one logical area of the business.

Some of the subject areas that are common to most corporations and organizations include:
•  Sales
•  Product
•  Order
•  Shipment
•  Work Effort
•  Invoice
•  Accounting
•  Human Resources