Do You Know Why Your MDX Query Is Slow?

Performance tuning MDX queries can often be a daunting and challenging task. But the first step in deciding where to begin your efforts to improve the performance of your query is to diagnose the source of the problem. There are two areas that could be causing our performance issue: 1. The design of our SSAS solution or 2. The design of our query. We need to figure out if we’ve written a bad query or designed a bad cube :).

How Do We Test The Query?

Usually an issue is discovered when a user comes to the BI team with a report that appears to be running slowly. So for our example, I have a pivot table in an excel workbook that is running a little slow. Let’s walk through this together and diagnose what the problem could be. Below here you can see the pivot table.

 image 

The first thing you’ll need to do is test the query and the best way to do this is to execute the query in isolation so that we can eliminate outside factors as part of the problem. So in this example I would execute the report outside of business hours. I want to make sure I do this outside of peak use time because one of the things I’ll need to do is clear the cache. In order to prevent SSAS from satisfying our query by leveraging the formula cache and the storage engine cache, we need to execute a ClearCache command to prevent our results from being corrupted. To do this, I’ll execute the following XMLA script in SSMS.

 

<ClearCache xmlns=”http://schemas.microsoft.com/analysisservices/2003/engine>
<Object>
<DatabaseID>AdventureWorksDW2012MutliDimensional-EE</DatabaseID>
<CubeID>Adventure Works</CubeID>
</Object>
</ClearCache>

 

Also, I’ll execute the following MDX to initialize my cube’s MDX script.

 

Select {} on 0 
From [Adventure Works]

For us to figure out how long our query is taking to execute, we’re going to fire up SQL Server Profiler and execute a trace against SSAS while we execute the query so we can gather all the nitty gritty details of our query execution. By running a trace, we will be able to see all kinds of really helpful details like the total duration of the query, the partitions being queries, if aggregations are being used to answer the query, which attributes are being used, and much more.

To begin our trace, go to Tool > SQL Server Profilers in SSMS.

image

Flip the Server type to Analysis Services and set the Server name to your SSAS instance that is home to the SSAS database your report is pointed at.

image

Next you’ll see the Trace Properties window. Go to the Events Selection tab, and check the check box near the bottom right of the window to Show all events. Scroll down a little more than half way and find the events “Get Data From Aggregation” and “Query Subcube Verbose.” The “Get Data From Aggregation” event is fire when an aggregation is used to satisfy a query. This event is also especially helpful when trying to determine if the aggregations you have designed are actually useful. The “Query Subcube Verbose” event will give you very detailed information on which members from which dimension attributes are being queried to satisfy the query. Click Run when you’re done.

image

Now that the trace is running, its time to conduct our test.

1. The first thing I’ll do is execute the Clear Cache command.

 

<ClearCache xmlns=”http://schemas.microsoft.com/analysisservices/2003/engine>
<Object>
<DatabaseID>AdventureWorksDW2012MutliDimensional-EE</DatabaseID>
<CubeID>Adventure Works</CubeID>
</Object>
</ClearCache>

 

2. Initialize the Calculation script in your cube.

 

Select {} on 0 
From [Adventure Works]
 
3. Execute the report/query. In my case, my report is an Excel pivot table, so I’ll simply click the refresh button in Excel to execute my query.

The report may take a few seconds or minutes to run depending on the query, but it should probably take longer than you’re used to simply because the query is running against an empty cache and SSAS will have to retrieve all of the data from storage.

When the query is finished executing, pause your Trace in SQL Server Profiler by clicking the pause button at the top of the Trace window. Now its time to take a look at the results. Find the Query End event in the Trace results. Find the Duration column. This number displays the total query duration in milliseconds. In my example, my query took just over 13 seconds to execute. So its not unbelievably slow, but certainly slower than we’d like.

image

In order to take a more in depth look at our query’s performance, lets save the Trace results to a SQL Server table so we can query it. Go to File > Save As > Trace Table.

image

Specify where you’d like to save the results and click OK.

image

Below you’ll see a very useful query that will break down where your query is experiencing a slow down. If you’re using this query for your testing, don’t forget to alter the From clause to query your trace table.

 

SELECT x.ConnectionID,
x.QueryID,
x.QueryDuration,
p.SEDuration,
CASE
WHEN p.SEDuration > x.QueryDuration THEN NULL ELSE x.QueryDuration - p.SEDuration
END AS FEDuration,
y.[Number of SE Queries],
y.[Thread Duration of SE Queries],
w.[Aggregations Read],
x.TextData
FROM (SELECT a.ConnectionID,
a.Duration AS QueryDuration,
a.TextData,
CAST (HashBytes('SHA1', CAST (reverse(CAST (TextData AS VARCHAR (MAX))) AS NVARCHAR (4000))) AS INT) AS QueryID
FROM MyTraceTable AS a
WHERE a.EventClass = 10) AS x -- Query End Event
LEFT OUTER JOIN
(/* Determine Query Subcube Verbose of Non-cache data */
SELECT ConnectionID,
COUNT(*) AS [Number of SE Queries],
SUM(Duration) AS [Thread Duration of SE Queries]
FROM MyTraceTable
WHERE EventClass = 12 -- Query Subcube Verbose
AND EventSubclass = 22 -- Non-cache data
GROUP BY ConnectionID) AS y
ON y.ConnectionID = x.ConnectionID
LEFT OUTER JOIN
(/* Determine Aggregations that are ready from */
SELECT ConnectionID,
COUNT(*) AS [Aggregations Read]
FROM MyTraceTable
WHERE EventClass = 60 -- Read from Aggregations
GROUP BY ConnectionID) AS w
ON w.ConnectionID = x.ConnectionID
LEFT OUTER JOIN
(/* Determine SE time */
SELECT ConnectionID,
SUM(Duration) AS SEDuration
FROM MyTraceTable
WHERE EventClass = 11 -- Query SubCube
GROUP BY ConnectionID) AS p
ON p.ConnectionID = x.ConnectionID;

 

The results here are very telling. The column “QueryDuration” shows us the total execution time of the query. The column “SEDuration” shows us the amount of time SSAS spent pulling the data from storage (Storage Engine). The column “FEDuration” shows how long SSAS spent calculating our queries results (Formula Engine).

image

In this particular example, the vast majority of our query’s execution time is spent in the Formula Engine. Of the 13+ seconds spent executing the query, the query spends more than 12 seconds in the Formula Engine and only 297 milliseconds pulling the data from storage. This tells us that the problem is probably not with the design our SSAS solution, but rather a poorly written query. Unfortunately, this being an MDX query generated by Excel, there’s not a lot we can do about altering the query.

How Do We Fix The Query?

Typically when deciding where to spend your performance tuning efforts you want to start in the area where your query spends more than 30% of its time (If its a 50/50 split make an educated decision). In the previous example, we’ve determined our problem is with the query.

What can I do to improve my MDX query?

If you determine your query’s problem is the query itself, ensure SSAS is utilizing subspace computation instead of cell by cell computation. SSAS will usually evaluate groups of cells in your cube at a time but in certain situations SSAS will evaluate cube space one cell at a time. We want to to avoid that. You can get a hint that SSAS is calculation your results one cell at a time if the query on our trace table shows a large amount of Storage Engine queries. Certain SSAS/MDX functions can disable subspace computation.

1. Late binding functions (ex. StrtoMember, StrtoSet functions)
2. Set aliases
3. LookupCube function

Also, check out this blog for more info on ways to improve your MDX query.

But what if the problem is my SSAS solution’s design?

If you conduct your test and determine the majority of the query duration is spent pulling the data from storage, there’s a lot to consider when discussing cube design best practices. But here are some brief highlights of things to consider.

1. Can we design aggregations to help our query? Look at your test results and see if aggregations are being used to satisfy the query.
2. Can we implement a partition design strategy that keeps SSAS from having to query larger partitions?
3. Are the right partitions being queried? For example, if your query is asking for data for 2010 and you notice in your trace that the partitions for all the other years are being queried, this could indicate that SSAS is having a hard time figuring out which partition has the correct data. You may need to set the Slice property on your partition.
4. Create Attribute relationships
5. Leverage Natural Hierarchies

There are many more best practices for cube design, but this is probably a good starting point.

If you’ve found this helpful, share it with a colleague or a friend and leave me a comment. Feel free to leave me a question or feedback in the comments here or send me a note on Twitter! I love discussing new ideas and learning so don’t hesitate!

SSAS Static Named Sets Vs. Dynamic Named Sets

So I’m 95% sure that I blogged about this topic at some point over the last couple years, but every time I try to find the link to show a class I’m teaching or to show a client, I can never find the darn thing. This is why I’m writing this blog. That and its also nice to have a good example of this on hand, which is what we have here.

In SSAS we have the ability to create named sets. An named set is basically an aliased set expression that we can use within our MDX queries. This is very useful if we have a set that is commonly used in our organization’s reporting solution.

But there are two types of named sets: static and dynamic. Static and dynamic sets appear very similar but they actually behave very differently, which is why I present to you the following example.

Below you will see a snippet of MDX from my cube script that creates a named set called Top 10 Customers – Static. This is the basic syntax for creating a named set in your cube’s MDX script. You’ll notice the static keyword, highlighted in blue. This specifies that we wish this named set to be static. The static keyword is actually optional, because if we leave the static keyword out of the create set statement, the set will still be created as a static named set.

CREATE STATIC SET CURRENTCUBE.[Top 10 Customers - Static]

AS topcount(

[Customer].[Customer].children,

10,

[Measures].[Internet Sales Amount]

) ;

The next create set statement creates a dynamic named set called Top 10 Customers – Dynamic, the big difference here being obviously the keyword dynamic, highlighted in blue. This specifies that this named set should be created as a dynamic named set.

CREATE DYNAMIC SET CURRENTCUBE.[Top 10 Customers - Dynamic]

AS topcount(

[Customer].[Customer].children,

10,

[Measures].[Internet Sales Amount]

) ;

Here you can see the create set statements in my Adventure Works cube script.

image 

And here we can see the two sets as they appear in our cube’s metadata tab in SQL Server Management Studio.

image

Now this is where it gets interesting. Below we have an example of our static named set being used in a query on the row axis. And we can see that it works fine.

image

But what happens if add a constraint in the Where clause? Uh oh, we run into an issue. The static named set does not respect the Where clause (or a subselect statement in the From clause for that matter). The named set displays the same members instead of displaying the top 10 customers from the year 2006. This could be a problem for our users depending on the requirements of the reporting solution.

image

image

This could be where a dynamic named set may be more useful. Here you can see an example of  a query that uses our dynamic named set.

image

Except when we provide a constraint in the Where clause, the named set listens to the Where clause and displays the correct data. I know the Internet Sales Amount numbers are all the same but that’s just the nature of the Adventure Works data.

image

I think this perfectly demonstrates the differences between static and dynamic named sets. Static named sets behave exactly as their name suggests: They are static and do not respect the Where clause or a subselect statement in the From clause. The dynamic named set is dynamic and will listen to a Where clause slicer or a subselect in the From clause.

If this is all a little overwhelming to you and anytime someone mentions using MDX you curl up into the fetal position, suck your thumb, and sob uncontrollably, I would suggest taking a look at the BI xPress calculation builder. BI xPress has a nifty little wizard that will help you create MDX calculations and named sets without you having to do any of the tough MDX writing on your own.

To create a named set with the BI xPress calculation builder, click the little calculate icon in the Calculations tab of the cube designer in BIDS or SSDT. This will open up the MDX Calculation Builder Wizard part of BI xPress.

image

Choose the Top 10 Count template under the Sets folder and click Next.

image

On the next screen we can pick the attribute required for our set. In this case, I’ll select the Customer attribute of the Customer dimension in order to create the Top 10 Customers set we were playing with earlier.

image 

Then we select the measure we want to use to rank our customers. I’m selecting the Internet Sales Amount measure.

image 

Lastly we give our named set a name and click finish. On this screen we can preview the MDX the BI xPress MDX Calculation Builder wrote for us.

image 

And we’re done!

image 

The BI xPress MDX Calculation Builder wrote all the MDX for us without us having to know a lick of MDX! Pretty nifty if I do say so myself. For more information on BI xPress or the BI xPress MDX Calculation Builder, head over to PragmaticWorks.com and download the free trial of BI xPress.

And if you have any questions or comments, please feel free to leave a comment or shout out on Twitter @SQLDusty! Thanks!

Recording Now Available For The Webinar, Choosing The Right Analysis Services: MOLAP Vs. Tabular

image

Thanks to everyone that attended Devin’s and my webinar called Choosing The Right Analysis Services: MOLAP vs. Tabular. I’m pleased to announce that the recording is now available to watch for free over at PragmaticWorks.com, so please go check it out. It’s a little less than an hour so you can watch it during your lunch break.

Also, the PowerPoint slide deck Devin and I used during the webinar is also available for viewing now! Please visit this link to download the slide deck.

Now for the questions! Many of you asked some great questions but unfortunately we ran out of time to answer all of the questions during the webinar. So here are a few of the questions we didn’t get to.

Q: How do i link if column have more than one column is key column in tabular?
A: If you need to create a composite key in a Tabular model table, you will need to create a calculated column that concatenate the columns that make up your composite key. You’ll need to do this in both tables you wish to relate. Once you’ve done that, then you can create the relationship between the two tables using your new columns.

Q: Can DAX be used to access cubes?
A: In the SQL 2012 SP1 CU4 release, DAX support for multidimensional cubes was added, so as long as you are running on SQL 2012 SP1 CU4 or later, you should be able to query cubes with DAX expressions. On a side note, MDX can also be used to query a Tabular model.

Q: Since tablular solution is many ways better than Muti Dimensional..then my question is when to go for Multi dimensional solution
A: This is one we covered extensively during the webinar. Here are some of the things to consider:

  1. How much data are you dealing with? If its too much to fit into memory for your Tabular model, then MOLAP is the way to go.
  2. Do you have a need for complex relationships? If so, MOLAP may be the answer. Role playing dimensions and many-to-many relationships are possible to create in a Tabular model, but they’re easier to create and manage in a MOLAP cube.
  3. Do you need to perform many complex calculations involving complex Scope assignments? If so, MOLAP is the answer here.

Q: Can you use a Multidimensional database as the source for a Tabular model and improve performance when creating low level granular reports?? This goes back to the performance differences between Multidimensional vs Tabular when creating granular reports.
A: You can use a Multidimensional database as a data source for a Tabular model, but I would suggest getting the data from the original source for the tabular model. If granular type queries are slow against your cube, those same queries are still going to be slow when you execute them to process your Tabular model.

Thanks to everyone that attending Devin’s and my webinar! If you have any other questions, please feel free to leave a comment or send me a message on Twitter!

Adding/Copying SSIS XML Package Configurations

SSIS Package Configurations allow us to modify SSIS package properties from outside of the packages. This gives us a seamless way to dynamically point our packages to our development, QA, and production environments without needing to manually open and modify our connection managers within each SSIS package. Package Configurations can be created using a SQL Server table, environment variable, XML files, registry entry, or parent SSIS package, but this post will focus on creating an package configuration using an XML file.

To create the Package Configuration, open an SSIS package in SSDT or BIDS and right-click in the blank area of the control flow and select Package Configurations.

image

Check the checkbox to enable package configurations and then click Add to add a package configuration to the package.

image

In the drop down list next to Configuration type select XML configuration file. Browse to the location where you wish to store the XML configuration file and type in the name which you’d like to give your configuration file and hit enter.

image

And now time for a quick note on best practices and my thoughts on using XML configuration files. SSIS package configurations are most often used to configure Connection Managers from outside the SSIS packages. With this in mind, I prefer to create one package configuration per Connection Manager. I also prefer to name the configuration file the same name as the Connection Manager. In my experience, the 1:1 ratio of XML configuration files to Connection Managers eases maintenance and management of the XML configuration files. If you’re creating XML configuration files to configure other properties of the package such as package variables or task properties, you may find another solution works better for you. That’s just my .02 cents on the matter.

Now we need to specify the property which we wish to configure. In this example, I’m selecting the ConnectionString property of a Connection Manager. Check the checkbox next to the ConnectionString property under the Connection Manager. Click Next. On the final screen you’ll need to give the Package Configuration a name. I give the same name as the XML Configuration File to help things stay nice and neat.

image

We can see our package configuration has been created. Click close.

image

Yay! Now your package has an XML configuration file set up to dynamically configure your connection string of the connection manager you selected at run time. All that is left to do is to copy the XML configuration file to each environment the package will be run on: developer machines, development, QA, and production environments. Then you must also update the XML configuration file with the environment’s connection string. Just open the XML file in note pad and update the Connection String to point to the appropriate server and database.

image

So we’ve set up the package configuration on a package. But what if we need to copy the package configuration to many other packages? Sure, we could complete the previous steps manually on each package, but that would take a lot of time. We can, however, easily accomplish this in a few clicks with BI xPress. In order to complete the following steps you will need to have BI xPress installed. You can download and install a free trial version of BI xPress here.

image

Shift select all of the packages within your SSIS project including the package that has the package configuration applied as well as the packages you wish to modify. Then right-click the packages in the Solution Explorer and select Package Configurations (BI xPress) to open the BI xPress Configuration Wizard.

image

Click Next pass the first screen of the Configuration Wizard.

image

You should see all of the packages highlighted and in the Select Items window. Click Next.

image

In the following screen you should see your packages listed and the lone package configuration displayed beneath the SSIS package we added the configuration to.

image

Right-click the package configuration and select copy or simply highlight the package configuration and hit Ctrl+c to copy the package configuration.

image

Then select each package in the list individually and hit Ctrl+p to add the package configuration to each package.

image

Click the disk icon to save your changes and we’re done! Our package configuration has been added to each package with much less work than we initially went though.

image

Just be aware that if package configurations have not been enabled for the packages you added the package configuration to you will still need to complete that step for each package.

For more information on BI xPress just head over to PragmaticWorks.com or head here to download the free trial version of BI xPress. Enjoy!

Comparing SSIS Packages with BI Compare

Comparing SSIS packages isn’t always easy, as I’m sure you’ve probably discovered if you’ve ever tried to compare one package to another. Of course SSIS packages are simply XML files underneath the shiny GUI, but scrolling through pages of XML in order to discover the slightest differences between two versions of a package is not how I prefer to spend my afternoons. Thankfully, there’s an easier solution.

Pragmatic Works has developed a tool called BI Compare that allows you to easily compare two SSIS packages and easily view the properties of each package side by side with easy to spot indicators making it very easy to spot differences in the packages. BI Compare is a tool included with Pragmatic Works fabulous BI development and administration tool BI xPress.

With BI Compare we can set up a new comparison scenario, identifying the packages we wish to display on the left and the right.

Once we have executed the comparison between the two package, the package comparison displays a image for component and properties in the two packages that match, a image next to components that do not match, and a image next to components that are missing in the other package.

So its very easy to skim through the comparison results and view the differences between the two packages by simply looking for the X icons or <- icons. You also have the ability to filter which types of objects you wish to compare. For instance, if you don’t need to compare the connection strings simply because you’re comparing a package in development against a package in production, you can exclude the Connection String property from the comparison.

As an SSIS developer and BI Consultant, I’ve found this tool very useful during SSIS package upgrades and migrations. To get more information on BI Compare, head over to Pragmatic Works and download the free trial version of BI xPress.

Are You Using SSIS Best Practices?

imageI’ve done a lot of consulting work over the last several years but one thing I’ve noticed that every single client is interested in is using best practices. Whether we’re talking about building SSRS reports, designing a data warehouse, building a 2 TB cube, or designing an ETL solution for a large data warehouse, everyone wants to use best practices. But its easier said than done. How do you enforce best practice rules across your organization? BI xPress attempts to remedy that problem with its SSIS Best Practices Analyzer.

The great part of the BI xPress SSIS Best Practices Analyzer is that you have the option to create your very own custom rules to enforce across your organization. Of course the tool comes packed with a list of rules put together by our team of ETL developers, but you also have the option to create your own rules to add to and improve upon our list of best practices. Very cool stuff!

Basically the way the tool works is very simple. BI xPress reads the properties of the packages you wish to enforce your organization’s best practices on and tests the properties based on the rules you’ve defined. For example, imagine your organization follows a specific naming convention for each and every component in your SSIS package. Simply create the rule that checks for proper naming convention using Rule Create dialogue.

After you’ve created your rules based on your teams best practices, its very easy to check your existing SSIS package to verify that they are utilizing best practices.

You can use your custom rules your team has developed or you can use the included rules put together by our team of consultants for those of you that are looking for best practice recommendations. If you’re looking to standardize yours SSIS packages and development life cycle, I’d encourage you to take a look at the BI xPress SSIS Best Practices Analyzer. Head over to PragmaticWorks.com for a free trial download.

SQL Server BI Consultant, MCITP

Follow

Get every new post delivered to your Inbox.