SQL script for creating date dimension table for a data warehouse
Almost all data warehouses enable the user to analyse and summarize data in sectors of time. Reports are often based on the financial year, the last quarter, last month or last week etc. By building a date dimension table in our data warehouse we provide a reuseable date lookup table that can help ease the creation of date based reports. This article demonstrates a SQL script that can be used to create and populate such a date dimension table.
The following SQL script creates the table structure:
CREATE TABLE dbo.Dates (
DateID int NOT NULL IDENTITY(1, 1),
[Date] datetime NOT NULL,
[Year] int NOT NULL,
[Month] int NOT NULL,
[Day] int NOT NULL,
[QuarterNumber] int NOT NULL,
CONSTRAINT PK_Dates PRIMARY KEY CLUSTERED (DateID)
Now the table has been created we can fill it with data. The following script performs an INSERT for each day between the start and end date specified. This means there is one row in the table for each day.
-- declare variables to hold the start and end date
DECLARE @StartDate datetime
DECLARE @EndDate datetime
--- assign values to the start date and end date we
-- want our reports to cover (this should also take
-- into account any future reporting needs)
SET @StartDate = '01/01/2005'
SET @EndDate = '31/12/2007'
-- using a while loop increment from the start date
-- to the end date
DECLARE @LoopDate datetime
SET @LoopDate = @StartDate
WHILE @LoopDate <= @EndDate
-- add a record into the date dimension table for this date
INSERT INTO Dates VALUES (
CASE WHEN Month(@LoopDate) IN (1, 2, 3) THEN 1
WHEN Month(@LoopDate) IN (4, 5, 6) THEN 2
WHEN Month(@LoopDate) IN (7, 8, 9) THEN 3
WHEN Month(@LoopDate) IN (10, 11, 12) THEN 4
-- increment the LoopDate by 1 day before
-- we start the loop again
SET @LoopDate = DateAdd(d, 1, @LoopDate)
-- now we have inserted the data we can check how it appears in our table
SELECT TOP 100 * FROM Dates
The script above would insert 1095 rows into the date dimension table. Changing the start and end date will affect the number of rows inserted.
In SQL Server we can add the following snippet of code to the start of the script to enable the script to be rerun with different start and end dates without having to first DROP the date table. The entire script from this article could then be easily put into a stored procedure that accepted the start and end date as input parameters.
-- insert this code above the start of the above scripts
-- to enable easy rerunning of the script
IF EXISTS (
WHERE type = 'U'
AND ID = OBJECT_ID('[dbo].[Dates]')
DROP TABLE [dbo].[Dates]
Discuss this article in the forums