A referenced dimension relationship allows us to create a relationship between a measure group and a dimension that is indirectly related to the measure group using an intermediate dimension. This is great, because we can now create relationships between our facts and dimensions that may not be directly related. Unfortunately, there are a couple of gotchas that you may not be aware of, which I read about here and here. I’m just offering my take on these issues, so thanks to Alberto and Jesse for the great content.
You can see several examples of referenced relationship types in the Adventure Works cube. One of those examples is the relationship between the Sales Target measure group and the Sales Territory dimension.
If you’ve ever created a referenced dimension relationship, you know that the Materialize option is checked by default. So one might assume this is the best option to go with.
But before we can decide if the option to materialize the relationship is best, we must first understand what the materialize option really does. When we check the materialize option, the link between the referenced dimension and the facts is materialized. During processing of the measure group, a join is made to the dimension to look up the referenced dimension’s key value. This leads to slower cube processing but better cube query performance.
There is, however, an important thing to realize: The materialized relationship link between the facts and the correct attribute members is created during the processing of the partition, not the dimension! Why is this important? Imagine we need to update the relationships between the facts and the referenced dimension. We can no longer simply process the dimension. The link to the older referenced dimension attribute member still exists in the partition! The partition must be processed! This could be a huge issue if your cube is very large and you’re only planning to process the most recent partition. If the relationships changed in the referenced dimension, you could find yourself forced to process more than just the most recent partition in order to ensure the relationships between the facts and the referenced dimension are correct.
So what’s the other option? We could leave the option to materialize the relationship unchecked. The downside to this is that query performance will suffer since the member will need to be looked up on the fly. The upside is that processing will be faster and we no longer need to process the partition to update the relationship since its looked up at query time.
But there is also a gotcha with using un-materialized referenced dimension relationships. Aggregations can not be created for a referenced dimension when the relationship is not materialized. If you run the Aggregation wizard, you’ll notice that any referenced dimensions with un-materialized relationships will be left out. This is by design. An aggregation should not be created (which is possible to do manually) for an un-materialized referenced dimension relationship because the relationship is resolved at query time.
What About the TSQL Query?
But what happens with the TSQL query during processing of the measure group? Here’s an example in the Adventure Works cube that we can look at:
This referenced relationship between the Reseller Sales measure group is materialized. When the Reseller Sales measure group is processed, a join is made to the intermediate dimension, to look up the Geography Key, which you can see here:
This join allows SSAS to add the referenced key to the measure group providing better MDX query performance for the users. But if we’re dealing with a significant amount of data in the fact table and a large intermediate dimension, the processing performance impact could be very significant.
But what if I de-materialize the relationship? The good news is that the TSQL query used to load the measure group will perform better because we’re no longer making a join to the intermediate dimension:
The bad news is that during executions of MDX queries, SSAS will need to look up the key value for the referenced dimension leading to slower MDX query performance.
My personal recommendation would be to add the referenced key to the fact table so that way you can eliminate the dilemma altogether. I’ve personally seen referenced relationships in a multi TB cube destroy processing performance. The resolution for this particular client’s issue was to add the referenced key to the fact tables and create the regular relationships between the measure groups and the referenced dimensions. After implementing the fix, processing for the partitions was very fast.
If for some reason you can’t add the referenced key to the fact table and you must decide between materializing the relationship or not, of course the answer is “It depends”. If your cube is extremely large featuring an incremental processing strategy, I would recommend to not materialize the relationship. This will prevent you from having to process the entire cube should something in the referenced dimension change. You’ll just have to live with the somewhat slower MDX query performance.
If, however, your cube is relatively small or your data warehouse is completely reloaded each night, I would probably recommend materializing the relationships. The increased time to process won’t be that big of a deal since the cube is small and your query times will benefit because of the materialized relationship. You’ll also be able to create aggregations since the relationship is materialized. Of course, the “it depends” applies differently to every situation.
I hope you found this useful! If you did, please share it!
Also, if you have any questions or feedback, please leave a comment below! Thanks for reading!