User Tools

Site Tools


concepts:star_schema
  • Bookmark "Star Schema" at del.icio.us
  • Bookmark "Star Schema" at Digg
  • Bookmark "Star Schema" at Furl
  • Bookmark "Star Schema" at Reddit
  • Bookmark "Star Schema" at Ask
  • Bookmark "Star Schema" at Google
  • Bookmark "Star Schema" at Netscape
  • Bookmark "Star Schema" at StumbleUpon
  • Bookmark "Star Schema" at Technorati
  • Bookmark "Star Schema" at Live Bookmarks
  • Bookmark "Star Schema" at Yahoo! Myweb
  • Bookmark "Star Schema" at Facebook
  • Bookmark "Star Schema" at Newsvine
  • Bookmark "Star Schema" at Yahoo! Bookmarks
  • Bookmark "Star Schema" at Twitter
  • Bookmark "Star Schema" at myAOL
  • Bookmark "Star Schema" at Slashdot
  • Bookmark "Star Schema" at Mister Wong

Star Schema

The star schema is a specific type of a multidimensional data model1). The target of that data model is not the normalization rather it is the optimization of efficient reading operations. The main areas of application of the star schema are data warehouses and OLAP-applications.

Definition

As a logical data base schema for data warehouse-applications the star schema get accepted as a standard model. This schema exist out of the fact chart and several dimension tables. The dimension spreadsheets are stellar arranged around the fact chart. Therefore the data model has its name star schema. The different dimension tables are coordinated around only one fact chart. Moreover the advantage of this schema is the clear arrangement of the various spreadsheets. Furthermore the star schema attempt to minimize the quantity of tables like they are normal in relational models 2). In the fact chart the main data or the derived quantities like business volume or costs get stored. In contrast the normally smaller dimension charts contain the quality data which help to visualize the dimensions and the dimension hierarchy 3). The individual lines of the dimension table will be identified through a minimal combination of attribute which is called the primary key. To get the connection between the dimension charts and the associated fact index the primary keys of the dimension spreadsheets will be assimilated in the fact table as a foreign key. Out of the several foreign keys the primary key of the fact chart will be composed. Consequential the fact index consists out of many relations in one table and has therefore a lot of redundancy. Only the fact spreadsheet is connected with all the other charts in the star schema. The dimension tables have no direct connection to each other and all of them are generally not normalized. The advantage of the disjunction of facts and dimensions is, that the facts can be analyzed independently after every dimension. An OLAP-application needs no knowledge about the meaning of a dimension. The interpretation is in the hand of the user. The size of dimension tables is important. Fact charts can exist out of 10 billion data sets. Dimension spreadsheets are smaller but they can still be very huge. A decrement of such big data sets can help to shorten the access time. Separate dimension tables with a huge size of data sets can be transferred through normalization into a snow flake schema.

A problem of the star schema is, that the data in the dimension tables has a reference to the data in the fact chart over a long period of time. After a while modifications of the dimension datas could be necessary. This changes shouldn't affect the data before the alterations. To avoid that trouble you can use the idea of the slowly changing dimensions.

Advantages of the star schema

  • A fast inquiry processing is possible. Furthermore a specific Join (star join 4)) can be optimized.
  • Changing anomalies 5) can be controlled easily because there are hardly no modifications in the classification data.
  • It is a simple and intuitive data model. The star schema has essential less relations than a converging Snow Flake Schema.
  • It has a high point of traceability and comprehensibility. An updating of the reporting because of the star schema is possible. Therefore data pools can be used for trend recognition and data mining.

Disadvantages of the star schema

  • A declined response time behavior by oftentimes query of very huge dimensional tables.
  • Redundancy 6) inside of a dimension chart because of the multiple storage of identical data.
  • Creation of aggregation is difficult.
1)
The Multidimensional Data Model, 2010.web.18.August 2010.http://download.oracle.com/docs/cd/B14117_01/olap.101/b10333/multimodel.htm
3)
Facts, Dimensions and Dimension Hierarchy, 2010.web.18.August 2010http://myobieeworld.blogspot.com/2009/02/facts-dimensions-and-dimension.html
4)
Using Star Join and few-Outer-Row Optimizations to improve Data Warehousing Queries, 2010.web.18.August 2010http://msdn.microsoft.com/en-us/library/ee410012(SQL.100).aspx
5)
wikipedia.anomaly, 2010.web. 18.August 2010http://en.wikipedia.org/wiki/Anomaly
6)
wikipedia.redundancy. 2010.web 18.August.2010.http://en.wikipedia.org/wiki/Redundancy
  • Bookmark "Star Schema" at del.icio.us
  • Bookmark "Star Schema" at Digg
  • Bookmark "Star Schema" at Furl
  • Bookmark "Star Schema" at Reddit
  • Bookmark "Star Schema" at Ask
  • Bookmark "Star Schema" at Google
  • Bookmark "Star Schema" at Netscape
  • Bookmark "Star Schema" at StumbleUpon
  • Bookmark "Star Schema" at Technorati
  • Bookmark "Star Schema" at Live Bookmarks
  • Bookmark "Star Schema" at Yahoo! Myweb
  • Bookmark "Star Schema" at Facebook
  • Bookmark "Star Schema" at Newsvine
  • Bookmark "Star Schema" at Yahoo! Bookmarks
  • Bookmark "Star Schema" at Twitter
  • Bookmark "Star Schema" at myAOL
  • Bookmark "Star Schema" at Slashdot
  • Bookmark "Star Schema" at Mister Wong
concepts/star_schema.txt · Last modified: 2014/10/19 13:38 (external edit)