Data warehousing Architecture
Data Warehouse Architecture
Data Warehouse server – almost always a relational DBMS,rarely flat files
OLAP servers – to support and operate on multi-dimensional data structures
Clients – Query and reporting tools – Analysis tools – Data mining tools
Difference between OLTP V/s Data warehouse???
Difference between OLTP V/s Data warehouse???
What is Data warehousing???
Definition:-
A data warehouse is collection of data marts.
Represents historical data. DW is a relational Database
which is specially designed for analysis Purpose rather then for transactional
purpose.
-subject-oriented,
-integrated,
-time-variant,
-nonvolatile
collection of data in support of management’s decision
making process.
Subject-oriented
Data warehouse is organized around subjects such as sales,
product, customer.
Integration
Data Warehouse is constructed by integrating multiple
heterogeneous sources.
Data Preprocessing are applied to ensure consistency.
Time-variant
Provides information from historical perspective
Eg. past 5-10 years.
Which supports business users in analyzing the business
and comparing the business with different time periods.
Ex:- Yearly, Half yearly, Quarterly, Monthly,weekly, Daily.
Nonvolatile
Data once recorded cannot be updated.
Data warehouse requires two operations in data
accessing– Initial loading of data – Access of data
Data mart
Data mart is a subset of data warehouse which concentrates
on a specific business
Ex:- Sales dept, Purchase dept, HR dept
Why a Data Warehouse?
For analysis and decision support, end users require
access to data captured and stored in an organization’s operational or
production systems
This data is stored in multiple formats, on multiple
platforms, in multiple data
structures, with multiple names, and probably created using
different business rules
Dimension Table
Contains descriptive or textual.
Relatively static(Occasionally data will be changed –
SCD’s I, II, III).
Hierarchical data.
Joined to another tables – Fact table(Are joined to a fact
table through foreign key reference).
Fact table
Fact table contain measures and the key of the dimension
table.
Can hold large volume of data
Are join to dimension table through foreign keys that
reference primary keys in the
dimension tables.
Star Schema
A Star schema is defined as a logical database design in
which there will be a centrally located fact table which is surrounded by at
least one or more dimension tables.
SnowFlake Schema
Variant of star schema model.
A single,large and central fact table and one or more
tables for each dimension.
Dimension tables are normalized i.e. split dimension table
data into additional tables
Data Warehousing Tools
Fact Constellation
Multiple fact tables share dimension tables.
This schema is viewed as collection of stars hence called
galaxy schema or fact constellation.
Sophisticated application requires such schema.
Refresh???
•
Propagate updates on source data to the
warehouse
•
Issues:
–
when to refresh
–
how to refresh -- refresh techniques
When to Refresh???
•
periodically (e.g., every night, every week) or
after significant events
•
on every update: not warranted unless warehouse
data require current data (up to the
minute stock quotes)
•
refresh policy set by administrator based on
user needs and traffic
•
possibly different policies for different sources
Data Warehousing Tools
- ETL Tools
- Informatica
- Data stage
- Ab initio
- SAS
- MSBI
- Reporting tools
- Cognos
- BO(Business objects)
- Micro
strategy
- MSBI
thanks for sharing
ReplyDeletemicro strategy online training
micro strategy courses
Thank you for sharing wonderful information with us to get some idea about that content.
ReplyDeleteMicrostrategy Online Course
Microstrategy Certification
I truly appreciate the time and work you put into sharing your knowledge. I found this topic to be quite effective and beneficial to me. Thank you very much for sharing. Continue to blog.
ReplyDeleteData Engineering Services
AI & ML Solutions
Data Analytics Services
Data Modernization Services