
When faced with the situation described above it is pretty easy to see how a project can quickly get off track. If things really get off track, then faith in the entire data warehousing and dimensional modeling discipline will break down. Before long people begin to believe that the data cannot be warehoused, and that alternative technologies such as NoSQL, mass storage/processing, and data virtualization should be used instead. Sadly, neither the lack of ACID compliance nor the ability to process limitless amounts of data will solve complex data integration challenges such as external data mashups. The problem is much less about technology than it is about logic, which can only be created by humans (for now).

Good news! External data can be integrated into a data warehouse. Dimensional modeling is up to the challenge, as proven many times over. Every situation is unique, but here are a few thoughts that you may want to consider.

Avoid Refactoring Caused by Late Arriving Dimension Data – When you have a source that consistently provides data out of chronological order, you will soon be in a late arriving dimension situation requiring constant refactoring (fact foreign key value updates and dimension SCD tracking updates). This makes traditional SCD tracking (effective, expired, current) less than ideal. One solution is to pull the attributes that require historical values to be tracked (SCD2) into their own dimension and put a factless fact table between the original dimension and the new dimension. The factless fact table will include a date/time foreign key. This allows us to record factless facts without the need to refactor when events are reported out of chronological order. The downside is that this complicates the model significantly in cases where there are many attributes requiring history to be tracked.
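To make the factless approach to late arriving dimension data more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. Every table, column, and function name (`dim_customer`, `dim_customer_tier`, `factless_customer_tier`, `record_tier_event`, `tier_as_of`) is hypothetical and chosen only for illustration, and a plain `yyyymmdd` integer stands in for the date/time foreign key. A late event is simply inserted as one more row; no existing row is updated, and the value in effect on a given date is resolved at query time.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT
);
-- History-tracked attribute pulled out into its own dimension.
CREATE TABLE dim_customer_tier (
    tier_key  INTEGER PRIMARY KEY,
    tier_name TEXT
);
-- Factless fact table linking the two dimensions at a point in time.
-- effective_date_key (yyyymmdd) stands in for a date/time dimension key.
CREATE TABLE factless_customer_tier (
    customer_key       INTEGER REFERENCES dim_customer(customer_key),
    tier_key           INTEGER REFERENCES dim_customer_tier(tier_key),
    effective_date_key INTEGER
);
""")

conn.execute("INSERT INTO dim_customer VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO dim_customer_tier VALUES (?, ?)",
    [(1, "Bronze"), (2, "Silver"), (3, "Gold")],
)


def record_tier_event(customer_key, tier_key, effective_date_key):
    """Insert-only: a late event never forces an update to existing rows."""
    conn.execute(
        "INSERT INTO factless_customer_tier VALUES (?, ?, ?)",
        (customer_key, tier_key, effective_date_key),
    )


def tier_as_of(customer_key, date_key):
    """Return the tier in effect on date_key, or None if no event precedes it."""
    row = conn.execute(
        """
        SELECT t.tier_name
        FROM factless_customer_tier AS f
        JOIN dim_customer_tier AS t ON t.tier_key = f.tier_key
        WHERE f.customer_key = ? AND f.effective_date_key <= ?
        ORDER BY f.effective_date_key DESC
        LIMIT 1
        """,
        (customer_key, date_key),
    ).fetchone()
    return row[0] if row else None


# Events arrive out of chronological order: February shows up after March.
record_tier_event(1, 1, 20240101)  # Bronze
record_tier_event(1, 3, 20240301)  # Gold
record_tier_event(1, 2, 20240201)  # Silver, late arriving
```

With these rows loaded, `tier_as_of(1, 20240215)` resolves to the late arriving Silver row without any effective/expired/current columns having been rewritten. The trade-off is the one noted above: each history-tracked attribute (or group of attributes) needs its own dimension and factless table, so the model grows quickly.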
