Data Warehouse is “a system used for reporting and data analysis”. In general it is built by dim (dimension) and fact tables (structure and data tables). Here I create a stored procedure to build Date Dimension. Now it builds only calendar data and will be expanded to include “fiscal” data and “offset” from a specific date (today or the last business day).
Let’s say I have a server, dedicated for ETL. I have a schema “etl” for storing objects during the run of the ETL processes and “dw” for storing the actual Data Warehouse. In a production environment I’ll have much more objects (Detailed Log table, Job Agent Schedules, SSIS packages, etc.), but for this example I will simplify the scenario and will focus on dimDate.
Create database.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
USE [master]; GO IF (DB_ID('PL') IS NOT NULL) BEGIN ALTER DATABASE PL SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE PL; END GO CREATE DATABASE PL; GO USE PL; GO CREATE SCHEMA dw; GO CREATE SCHEMA etl; GO CREATE TABLE PL.etl.[Log] ( [DateTime] DATETIME , ETLSource SYSNAME , [Action] NVARCHAR(9) -- Start, End, Execution, Error , ExecutionID UNIQUEIDENTIFIER , ExecutedBy NVARCHAR(128) ); GO |
Create stored procedure to build Date Dimension.
The parameters I pass are:
- FirstDayOfWeek –> the definition of a “week” differs in different cultures. Sunday in North America, Monday in Europe etc.
- YearsPast –> how many full past years will be included in dw.dimDate table
- YearsFuture –> how many full future years will be included in dw.dimDate table
The (innermost) level is day or one line represents a data for one day.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 |
SET ANSI_NULLS ON GO SET QUOTED_IDENTIFIER ON GO -- ============================================= -- Author: Peter Lalovsky -- Create date: 2015-02-05 -- Description: Create Date Dimension in Data Warehouse (version 1) -- ============================================= CREATE PROCEDURE etl.spDimDate1 @FirstDayOfWeek VARCHAR(9) = 'Monday' -- Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday , @YearsPast TINYINT = 5 -- Full past years from the current year , @YearsFuture TINYINT = 2 -- Full future years from the current year AS BEGIN -- etl.dimDate1 SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; SET NOCOUNT ON; DECLARE @ExecutionID UNIQUEIDENTIFIER SET @ExecutionID = NEWID(); -- Log --> Start INSERT PL.etl.[Log] ([DateTime], ETLSource, [Action], ExecutionID, ExecutedBy) SELECT GETDATE(), @@PROCID, 'Start', @ExecutionID, SUSER_SNAME(); -- @FirstDayOfWeek to @DateFirst - Start DECLARE @DateFirst TINYINT; SELECT @DateFirst = CASE @FirstDayOfWeek WHEN 'Monday' THEN 1 WHEN 'Tuesday' THEN 2 WHEN 'Wednesday' THEN 3 WHEN 'Thursday' THEN 4 WHEN 'Friday' THEN 5 WHEN 'Saturday' THEN 6 WHEN 'Sunday' THEN 7 END; SET DATEFIRST @DateFirst; -- @FirstDayOfWeek to @DateFirst - End --** Variables - Start **-- DECLARE @DateStart DATETIME , @DateEnd DATETIME , @DateLoop DATETIME; SELECT @DateStart = CAST((YEAR(CONVERT(CHAR(10), GETDATE(), 120)) - @YearsPast) AS CHAR(4)) + '-01-01' , @DateEnd = CAST((YEAR(CONVERT(CHAR(10), GETDATE(), 120)) + @YearsFuture) AS CHAR(4)) + '-12-31' , @DateLoop = @DateStart; --** Variables - End **-- --** PL.etl.dimDate1 - Start **-- IF (OBJECT_ID('PL.etl.dimDate1', 'U') IS NOT NULL) BEGIN TRUNCATE TABLE PL.etl.dimDate1; END ELSE BEGIN CREATE TABLE PL.etl.dimDate1 ( [Key] INT NOT NULL , [Date] DATETIME , IsWeekend BIT , DayOfWeekLiteral VARCHAR (9) COLLATE Latin1_General_CI_AS , DayOfWeekShortLiteral CHAR (3) COLLATE Latin1_General_CI_AS , YearCalendar SMALLINT , QuarterCalendar TINYINT , MonthCalendar TINYINT , MonthLiteralCalendar VARCHAR (9) COLLATE Latin1_General_CI_AS , MonthLiteralShortCalendar CHAR (3) COLLATE Latin1_General_CI_AS , WeekOfYearCalendar TINYINT , WeekOfQuarterCalendar TINYINT , WeekOfMonthCalendar TINYINT , DayOfYearCalendar SMALLINT , DayOfQuarterCalendar TINYINT , DayOfMonthCalendar TINYINT , DayOfWeekCalendar TINYINT , BusinessDayOfYearCalendar SMALLINT , BusinessDayOfQuarterCalendar TINYINT , BusinessDayOfMonthCalendar TINYINT , BusinessDayOfWeekCalendar TINYINT ); ALTER TABLE PL.etl.dimDate1 ADD CONSTRAINT PK_etlDimDate1 PRIMARY KEY ([Key]); END WHILE (@DateLoop <= @DateEnd) BEGIN INSERT PL.etl.dimDate1 ( [Key] , [Date] , DayOfWeekLiteral , DayOfWeekShortLiteral , IsWeekend , YearCalendar , QuarterCalendar , MonthCalendar , MonthLiteralCalendar , MonthLiteralShortCalendar , WeekOfYearCalendar , DayOfYearCalendar , DayOfMonthCalendar , DayOfWeekCalendar ) SELECT CAST(CONVERT(CHAR(8), @DateLoop, 112) AS INT) -- [Key] , @DateLoop -- [Date] , DATENAME(DW, @DateLoop) -- DayOfWeekLiteral , LEFT(DATENAME(DW, @DateLoop), 3) -- DayOfWeekShortLiteral , CASE WHEN DATENAME(DW, @DateLoop) IN ('Saturday', 'Sunday') THEN 1 END -- IsWeekend , YEAR(@DateLoop) -- YearCalendar , DATEPART(QQ, @DateLoop) -- QuarterCalendar , MONTH(@DateLoop) -- MonthCalendar , DATENAME(MM, @DateLoop) -- MonthLiteralCalendar , LEFT(DATENAME(MM, @DateLoop), 3) -- MonthLiteralShortCalendar , DATEPART(WK, @DateLoop) -- WeekOfYearCalendar , DATEPART (DY, @DateLoop) -- DayOfYearCalendar , DAY(@DateLoop) -- DayOfMonthCalendar , DATEPART(DW, @DateLoop) -- DayOfWeek SET @DateLoop = DATEADD(DD, 1, @DateLoop); END -- Update Calendar Data - Start UPDATE PL.etl.dimDate1 SET WeekOfQuarterCalendar = H.WQC , WeekOfMonthCalendar = H.WMC , DayOfQuarterCalendar = H.DQC FROM ( SELECT [Key] AS K , DENSE_RANK() OVER (PARTITION BY YearCalendar, QuarterCalendar ORDER BY WeekOfYearCalendar) AS WQC , DENSE_RANK() OVER (PARTITION BY YearCalendar, MonthCalendar ORDER BY WeekOfYearCalendar) AS WMC , ROW_NUMBER() OVER (PARTITION BY YearCalendar, QuarterCalendar ORDER BY DayOfYearCalendar) AS DQC FROM PL.etl.dimDate1 ) AS H WHERE [Key] = H.K; -- Update Calendar Data - End -- Update Calendar Business days Data - Start UPDATE PL.etl.dimDate1 SET BusinessDayOfYearCalendar = H.BDYC , BusinessDayOfQuarterCalendar = H.BDQC , BusinessDayOfMonthCalendar = H.BDMC , BusinessDayOfWeekCalendar = H.BDWC FROM ( SELECT [Key] AS K , DENSE_RANK() OVER (PARTITION BY YearCalendar ORDER BY DayOfYearCalendar) AS BDYC , DENSE_RANK() OVER (PARTITION BY YearCalendar, QuarterCalendar ORDER BY DayOfYearCalendar) AS BDQC , ROW_NUMBER() OVER (PARTITION BY YearCalendar, MonthCalendar ORDER BY DayOfYearCalendar) AS BDMC , ROW_NUMBER() OVER (PARTITION BY YearCalendar, WeekOfYearCalendar ORDER BY DayOfYearCalendar) AS BDWC FROM PL.etl.dimDate1 WHERE IsWeekend IS NULL ) AS H WHERE [Key] = H.K; -- Update Calendar Business days Data - End --** PL.etl.dimDate1 - Start **-- --** Insert into dw.dimDate1 - Start **-- IF (OBJECT_ID('PL.dw.dimDate1', 'U') IS NOT NULL) BEGIN TRUNCATE TABLE PL.dw.dimDate1; END ELSE BEGIN CREATE TABLE PL.dw.dimDate1 ( ETLDateTime DATETIME , [Key] INT NOT NULL , [Date] DATETIME , IsWeekend BIT , DayOfWeekLiteral VARCHAR (9) COLLATE Latin1_General_CI_AS , DayOfWeekShortLiteral CHAR (3) COLLATE Latin1_General_CI_AS , YearCalendar SMALLINT , QuarterCalendar TINYINT , MonthCalendar TINYINT , MonthLiteralCalendar VARCHAR (9) COLLATE Latin1_General_CI_AS , MonthLiteralShortCalendar CHAR (3) COLLATE Latin1_General_CI_AS , WeekOfYearCalendar TINYINT , WeekOfQuarterCalendar TINYINT , WeekOfMonthCalendar TINYINT , DayOfYearCalendar SMALLINT , DayOfQuarterCalendar TINYINT , DayOfMonthCalendar TINYINT , DayOfWeekCalendar TINYINT , BusinessDayOfYearCalendar SMALLINT , BusinessDayOfQuarterCalendar TINYINT , BusinessDayOfMonthCalendar TINYINT , BusinessDayOfWeekCalendar TINYINT ); ALTER TABLE PL.dw.dimDate1 ADD CONSTRAINT PK_dwDimDate1 PRIMARY KEY ([Key]); END INSERT PL.dw.dimDate1 ( ETLDateTime , [Key] , [Date] , IsWeekend , DayOfWeekLiteral , DayOfWeekShortLiteral , YearCalendar , QuarterCalendar , MonthCalendar , MonthLiteralCalendar , MonthLiteralShortCalendar , WeekOfYearCalendar , WeekOfQuarterCalendar , WeekOfMonthCalendar , DayOfYearCalendar , DayOfQuarterCalendar , DayOfMonthCalendar , DayOfWeekCalendar , BusinessDayOfYearCalendar , BusinessDayOfQuarterCalendar , BusinessDayOfMonthCalendar , BusinessDayOfWeekCalendar ) SELECT GETDATE() , [Key] , [Date] , IsWeekend , DayOfWeekLiteral , DayOfWeekShortLiteral , YearCalendar , QuarterCalendar , MonthCalendar , MonthLiteralCalendar , MonthLiteralShortCalendar , WeekOfYearCalendar , WeekOfQuarterCalendar , WeekOfMonthCalendar , DayOfYearCalendar , DayOfQuarterCalendar , DayOfMonthCalendar , DayOfWeekCalendar , BusinessDayOfYearCalendar , BusinessDayOfQuarterCalendar , BusinessDayOfMonthCalendar , BusinessDayOfWeekCalendar FROM PL.etl.dimDate1; --** Insert into dw.dimDate1 - End **-- --** Clean up - Start **-- IF (OBJECT_ID('PL.etl.dimDate1', 'U') IS NOT NULL) BEGIN TRUNCATE TABLE PL.etl.dimDate1; END --** Clean up - End **-- -- Log --> End INSERT PL.etl.[Log] ([DateTime], ETLSource, [Action], ExecutionID, ExecutedBy) SELECT GETDATE(), @@PROCID, 'End', @ExecutionID, SUSER_SNAME(); END -- etl.dimDate1 GO |
Test the stored procedure.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
EXEC etl.spDimDate1 @FirstDayOfWeek = 'Monday'; SELECT * FROM PL.dw.dimDate1; GO EXEC etl.spDimDate1 @FirstDayOfWeek = 'Tuesday'; SELECT * FROM PL.dw.dimDate1; GO EXEC etl.spDimDate1 @FirstDayOfWeek = 'Wednesday'; SELECT * FROM PL.dw.dimDate1; GO EXEC etl.spDimDate1 @FirstDayOfWeek = 'Thursday'; SELECT * FROM PL.dw.dimDate1; GO EXEC etl.spDimDate1 @FirstDayOfWeek = 'Friday'; SELECT * FROM PL.dw.dimDate1; GO EXEC etl.spDimDate1 @FirstDayOfWeek = 'Saturday'; SELECT * FROM PL.dw.dimDate1; GO EXEC etl.spDimDate1 @FirstDayOfWeek = 'Sunday'; SELECT * FROM PL.dw.dimDate1; GO |
Check the Log.
1 2 3 4 |
SELECT * FROM PL.etl.[Log] ORDER BY [DateTime]; GO |
I ran Mon twice to cache a plan and again in the end to run the next reports for @FirstDayOfWeek = ‘Monday’.
The max execution time is 1 second.
Quick reporting, based on Date Dimension.
1 2 3 4 5 6 7 8 |
-- Business Days In Year SELECT YearCalendar , MAX(BusinessDayOfYearCalendar) AS BusinessDaysInYear FROM dw.dimDate1 GROUP BY YearCalendar ORDER BY YearCalendar; GO |
1 2 3 4 5 6 7 8 |
-- Business Days In Quarter SELECT YearCalendar, QuarterCalendar , MAX(BusinessDayOfQuarterCalendar) AS BusinessDaysInQuarter FROM dw.dimDate1 GROUP BY YearCalendar, QuarterCalendar ORDER BY YearCalendar, QuarterCalendar; GO |
1 2 3 4 5 6 7 8 |
-- Business Days In Month SELECT YearCalendar, MonthCalendar , MAX(BusinessDayOfMonthCalendar) AS BusinessDaysInMonth FROM dw.dimDate1 GROUP BY YearCalendar, MonthCalendar ORDER BY YearCalendar, MonthCalendar; GO |
Preview –> dimDate (1).xlsx
Delete database.
1 2 3 4 5 6 7 8 9 |
USE [master]; GO IF (DB_ID('PL') IS NOT NULL) BEGIN ALTER DATABASE PL SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE PL; END GO |
Keep it simple :-)