Normally data warehouses are populated a star schema, for example the sales contains each order for each day. It is easy to see the total for each order, but often it is required to support a higher level of details. For example could someone be interested in knowing the sum of units sold aggregated by region, month or so on. With the star schema these types of aggregation, will need a high response time, if it computed at runtime. The target should be to get an acceptable response time. Essentially, we can either aggregate at runtime or pre-aggregate the data offline, making the sums available without real-time computation. There is a simple alternative to real-time aggregation: Just write a SQL Statement to pre-aggregate the data according to the dimension that the end-user wants to see. 1) When facts are aggregated, it is done by eliminating dimensionality or by associating the facts with a rolled up dimension. Rolled up dimensions should be shrunken versions of the dimensions associated with the granular base facts. This way, the aggregated dimension tables should conform to the base dimension tables. The reason why aggregates can make such a dramatic increase in the performance of the data warehouse is the reduction of the number of rows to be accessed when responding to a query. 2)
ROLLUP performs these aggregations across multiple dimensions, at different levels, within a single SQL query. ROLLUP creates n+1 levels of subtotals (including a grand total). n is the number of grouping columns.
ROLLUP(A, B, C) creates four groups: (A, B, C), (A, B), (A), ().
ROLLUP ( (A, B), C) treats (A, B) as a single entity and only creates three groups: (A, B, C), (A, B), ()
SELECT * FROM something ORDER BY Category, Year
For the something table, ROLLUP computes the subtotals in each category.
SELECT Category, Year, SUM(Amount) FROM something GROUP BY ROLLUP(Category, Year) ORDER BY 1,2, GROUPING_ID();
The following example shows how to use the HAVING clause with ROLLUP to restrict the GROUP BY results. The following query produces only those ROLLUP categories where year is subtotaled, based on the expression in the GROUPING function:
SELECT Category, Year, SUM(Amount) FROM something GROUP BY ROLLUP(Category,Year) HAVING GROUPING(Year)=1 ORDER BY 1, 2, GROUPING_ID();