In data warehousing, a fact table consists of the measurements, metrics of facts of a business process.
The two basic types of tables in a data warehous are the dimension table1) and the fact table. The fact table describes the facts (numeric measurements), which means the basic business processes2). While the volume of the dimension table, consisting of so called dimensions (describing data in text fields) is comparatively small, the fact table can become quite huge with a lot of information.
The fact table is a normalized table and contains relevant characteristics and information of the process as well as foreign keys (FK) which refer to the primary keys (PK) of the respective dimension tables. The primary key of the fact table itself is a combination of all foreign keys of the dimension tables.
Between facts and dimensions can be different kinds of relations - the cardinality. It explains how the tables are linked to each other3):
It is important to use indices or comment columns to reduce the time for scanning of all tables for an analysis.
If a fact-table has no measures or facts, but just collecting and bunch information, it is called a “factless fact table” or “function table”.
Fact tables are often defined by their grain. The grain of a fact table represents the most atomic level by which the facts may be defined. All fact tables can be grouped into just three types, but often there is a mixture of two to get a complete picture of a business 6):
A transactional table is the most basic and fundamental, it represents an atomic action that occurs at an instantaneous point of time. It is the only table that can answer detailed questions about timing (because of the accurate date/timestamp) and predictable, repetitive behavior (p. e. about customers). This kind of table can be enormous, with the largest containing many billions of records.
The periodic snapshot, as the name implies, takes a “picture of the moment”, where the moment could be any regular defined period of time. It corresponds to a predefined span of time. It is the only table that can easily generate a regular view of the important measurements (p. e. revenue and costs). This kind of fact table can also become very large.
This type of fact table is used to show the activity of a process that has a well-defined beginning and end. The accumulating snapshot often combines the most recent volatile status with measures that accumulate from the beginning of history. It is used when an item with a finite lifetime is tracked (p.e. insurance policy). Accumulated snapshot records are revisited and overwritten as the process progresses through its steps from beginning to end. Accumulating snapshot fact tables generally are much smaller than the other two types because of this overwriting strategy.
An example for designing a fact table 7):