Lecture2 - Rabieramadan.org

January 6, 2018 | Author: Anonymous | Category: Engineering & Technology, Computer Science, Databases
Share Embed Donate


Short Description

Download Lecture2 - Rabieramadan.org...

Description

Data Warehouse Fundamentals

Rabie A. Ramadan, PhD 2

Your Assignment 

For an airlines company, how can strategic information increase the number of frequent flyers? Discuss giving specific details.



You are a Senior Analyst in the IT department of a company manufacturing automobile parts. The marketing heads are complaining about the poor response by IT in providing strategic information. Draft a proposal to them explaining the reasons for the problems and why a data warehouse would be the only viable solution.

2

Lecture Objectives 

Review formal definitions of a data warehouse

 





Discuss the defining features Distinguish between data warehouses and data marts Study each component or building block that makes up a data warehouse

3

What is a Data Warehouse? (a practitioner’s viewpoint) 

“A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context” – Barry Devlin, IBM Consultant



“A data warehouse is a database of data gathered from many systems and intended to support management reporting and decision making” – Michael Corey et al, CTO of OneWarranty.com 4

What is a Data Warehouse? (a Classical viewpoint) 

According to W. H. Inmon (Building a Data Warehouse, 1992) “A DW is a subject oriented, integrated, time varying, non-volatile collection of data that is used primarily in organizational decision making.”

5

WHAT IS DATA WAREHOUSING 

A data warehouse is typically a dedicated database system for decision making that is separate from the production database(s) used operationally. It differs from production system in that:

• it covers a much longer time horizon than transaction • •

systems it includes multiple databases that have been processed so that the warehouse’s data are defined uniformly (i.e., ‘clean’ data) it is optimized for answering complex queries from managers and analysts. 6

Standard DB v. DW

7

CHARACTERISTICS

8

CHARACTERISTICS

9

Characteristics of a Data Warehouse

10

Characteristics of a Data Warehouse

11

SUBJECT ORIENTATION 

Data is organized around major subjects of the enterprise.

12

Subject Oriented 

Data warehouses are designed to help you analyze data.

For example, to learn more about your company's sales data, you can build a warehouse that concentrates on sales. 

Using this warehouse, you can answer questions like "Who was our best customer for this item last year?" This ability to define a data warehouse by subject matter, sales in this case, makes the data warehouse subject oriented.



E.g. claims data are organized around the subject of claims and not by individual applications of Auto Insurance and Workers’ Comp 13

Class Activity 



A data warehouse is a subject oriented. What would be the major critical business subject for : A local community bank as a business unit

Customer Profit Loans

14

Integrated 

Integration is closely related to subject orientation.



Data warehouses must put data from disparate sources into a consistent format. They must resolve such problems as naming conflicts and inconsistencies among units of measure.





When they achieve this, they are said to be integrated. 15

Non volatile 

Non-volatile means that, once entered into the warehouse, data are not changed/updated.



This is logical because the purpose of a warehouse is to enable you to analyze what has occurred.

16

Time Variant 







In order to discover trends in business, analysts need large amounts of data. This is very much in contrast to online transaction processing (OLTP) systems, where performance requirements demand that historical data be moved to an archive. The data are kept for many years so they can be used for trends, forecasting, and comparisons over time. A data warehouse's focus on change over time is what is meant by the term time variant. 17

Data Granularity

18

DATA MARTS 

Data Mart: A scaled-down version of the data warehouse

 

A data mart is a small warehouse designed for the Small Business Unit (SBU) or department level.



It is often a way to gain entry and provide an opportunity to learn



Major problem: if they differ from department to department, they can be difficult to integrate enterprise-wide 19

Data Warehouse and Data Mart

20

Data Mart and Data Warehouse

21

Data Warehouse COST 

Data warehouses are not cheap •

Median cost to create (does not include operating

cost) = $2.2M 

Multimillion dollar costs are common



Their design and implementation is still an art and they require considerable time to create. 22

Data Warehouse SIZE 

Being designed for the enterprise so that everyone has a common data set, they are large and increase in size with time.



Typical storage sizes run from 50 Gigabytes to several Terabytes

23

APPLICATION - DATA MINING 



Also known as Knowledge Data Discovery (KDD) Mining terminology refers to finding answers about a business from the data warehouse that the executive or analyst had not thought to ask

24

Data Warehouse Architectures

25

Data Warehouse Architectures: Basic

26

Data Warehouse Architectures: with a Staging Area

27

Data Warehouse Architectures: with a Staging Area and Data Marts

28

A General Architecture for Data Warehousing

29

30

Problems and Issues

31

Data Systems Supporting DW

32

Class Activity

33

Class Activity 

What are the main components of a data warehouse for your school system?

34

Project 

Egypt Election System



Governorates’ database system



Summarization System



Data Warehouse Server



Web page with query based system

• Multiple databases on Multiple Servers • Meta data

35

View more...

Comments

Copyright � 2017 NANOPDF Inc.
SUPPORT NANOPDF