Understanding Slowly Changing Dimensions (SCD)

By | December 26, 2016

This article is focused on the beginners who are confused with the term Slowly Changing Dimensions(SCD) . Working on a ETL or any Data Management tool Slowly Changing Dimension is the term which we come across every now and then, in this article I will try to explain what are SCD types and how we can implement them using SAP BODS.

First lets understand what is Slowly Changing Dimension,  as the name suggests it involves the dimensions which change over a period of time and we as a Data Consultant have to take care of these dimensions as per the business requirements. Depending on how we want to preserve or manage these changing dimensions we have three Slowly Changing Dimension types.

  1. SCD Type I,
  2. SCD Type II, and
  3. SCD Type III

SCD Type I

In this type of SCD the dimensions which are changing are not required to be used in future and hence we overwrite the data, in other terms we can say that no history is preserved in this type of SCD. Below is the example for the same.

Initial Data
EMP_ID EMP_NAME EMP_CITY EMP_DEPT
123 Shubham Bangalore Insights and Data
456 Manish Patna SAP
Final Data
EMP_ID EMP_NAME EMP_CITY EMP_DEPT
123 Shubham Bangalore BODS
456 Manish Patna BW

In the above example we can observe that EMP_DEPT for both the records is changed, but in the final table the department name is overwritten and we cannot determine what was there earlier and the new value will be considered.

SCD Type II

In this type of SCD implementation history is maintained and at any given point of time we can see the historic records. For each update a new record is inserted here and a flag or effective date is used for identifying the current or the active record.

Initial Data
EMP_ID EMP_NAME EMP_CITY isActive
123 Shubham Bangalore 1
456 Manish Patna 1
SCD Type II using flag indicator
EMP_ID EMP_NAME EMP_CITY isActive
123 Shubham Bangalore 0
123 Shubham Pune 1
456 Manish Patna 0
456 Manish Bangalore 1

In the above example you can observe that the EMP_CITY is changed for both the employees but instead of updating the records a new record is inserted with the new city value and also the previous record is updated with isActive as ‘0’. In this manner the historic data is maintained and can be used later if required. In some cases people prefer using effective dates instead of flag which is sometimes more useful in manipulating the data. Below is the example with effective dates:

Initial Data
EMP_ID EMP_NAME EMP_CITY ValidFrom ValidTo
123 Shubham Bangalore 01/01/1900 01/01/9999
456 Manish Patna 01/01/1900 01/01/9999
SCD Type II using effective date
EMP_ID EMP_NAME EMP_CITY ValidFrom ValidTo
123 Shubham Bangalore 01/01/1900 18/12/2016
123 Shubham Pune 19/12/2016 01/01/9999
456 Manish Bangalore 01/01/1900 18/12/2016
456 Manish Patna 19/12/2016 01/01/9999

From the above example gives the more clear picture of the data. It clearly explains till what time a particular employee was in a particular city and when it changed. When in some cases evaluation is done on historic data also it is always advised to use effective dates.

SCD Type III

This is the type of SCD implementation where only limited history is preserved and for preserving history addition columns are added. Taking the initial data from the above SCD Type II example in SCD Type III the data will something like the below table.

SCD Type III using effective date
EMP_ID EMP_NAME EMP_CITY_Current EffectiveDate EMP_CITY_Previous
123 Shubham Pune 19/12/2016 Bangalore
456 Manish Bangalore 19/12/2016 Patna

The limitation of Slowly Changing Dimension Type III over Type II is that in Type III we have the immediate previous data only, whereas in Type II we have the complete history. For example if an employee changes 3 cities there will be 3 records in the table with proper effective dates whereas in Type III implementation only the current and the previous city will be stored.

Slowly Changing Dimension Type II is the most commonly used, but since it inserts records for every update, database performance tuning steps should be taken to avoid querying huge database.

I hope I was clear in explaining the concept above, in the next article we will see how to implement them using SAP BODS.

Leave a Reply

Your email address will not be published. Required fields are marked *