Power Pivot is an amazing, flexible and powerful business intelligence tool (among other things), and there is no doubt about that. As a feature included with Excel 2013 and 2016 (and an add-on for Excel 2010), Power Pivot allows users with a little technical expertise to integrate disparate data sources within a flexible data model. Once the data is loaded into Power Pivot, we can easily create powerful calculated measures, key performance indicators…
Since the release of Power BI Desktop this past week, I’ve been really spending my extra time digging into the application focusing on learning and experimenting as much as I can. When my wife has been watching Law and Order: SVU reruns at night after the rug rats are in bed, I’ve been right there next to her designing Power BI dashboards like the total data nerd that I am. When my kids have been taking their naps during the weekend, I’ve been writing calculations in the model for my test dashboards. Or when I’ve been riding in the car back and forth to work I’ve been thinking of new things to do with Power BI Desktop.
Since I’ve been spending a decent amount of time with Power BI Desktop, I thought I’d take a moment to share three things to know and remember when designing your Power BI models and dashboards that I think will help you make the most of this tool and be effective at providing the data your business needs to succeed.
1. Optimize your Power BI Semantic Model
It probably hasn’t taken you long to figure this one out if you’ve built Power Pivot/Tabular models, or at least it won’t when you do start developing Power BI dashboards. The visualizations in Power BI and Power View are heavily metadata driven, which means that column names, table or query names, formatting and more are surfaced to the user in the dashboard. So if you’re using a really wacky naming convention in your data warehouse for your tables like “dim_Product_scd2_v2” and the column names aren’t much better, these names are going to be shown to the users in the report visualizations and field list.
For example, take a look at the following report.
Notice anything wonky about it? Check the field names, report titles and number formatting. Not very pretty, is it? Now take a look at this report.
See the difference a little cleaned-up metadata makes? All I did was spend a few minutes giving the fields user-friendly names and formatting the data types. This obviously makes a huge difference in the way the dashboard appears to the users. By the way, I should get into the movie production business. 😉
My point is that the names of columns, formatting, data types, data categories and relationships are all super important to creating clean, meaningful and user-friendly dashboards. The importance of a well-defined semantic model cannot be overstated in my opinion. A good rule of thumb is to spend 80% to 90% of your time on the data model (besides, designing the reports is the easy part).
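To make the renaming idea concrete, here is a minimal Python sketch of turning warehouse-style names into user-friendly ones. It is only an illustration of the mapping; in Power BI Desktop you would do the renaming in the model itself. The prefixes and suffixes it strips (dim_, fact_, _scd2, _v2) and the friendly_name helper are assumptions made for this example, not a standard convention.

```python
import re

# Hypothetical warehouse prefixes and technical suffixes (assumptions for this sketch).
PREFIXES = ("dim_", "fact_")
SUFFIX_PATTERN = re.compile(r"(_scd\d+|_v\d+)+$", re.IGNORECASE)


def friendly_name(raw):
    """Turn a warehouse-style object name into a user-friendly label."""
    name = raw
    lowered = name.lower()
    # Drop a leading warehouse prefix such as dim_ or fact_.
    for prefix in PREFIXES:
        if lowered.startswith(prefix):
            name = name[len(prefix):]
            break
    # Drop trailing technical suffixes such as _scd2 or _v2.
    name = SUFFIX_PATTERN.sub("", name)
    # Underscores become spaces, and CamelCase words are split apart.
    name = name.replace("_", " ")
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Capitalize each word, leaving all-caps abbreviations (e.g. YTD) alone.
    words = [w if w.isupper() else w.capitalize() for w in name.split()]
    return " ".join(words)


if __name__ == "__main__":
    for raw in ("dim_Product_scd2_v2", "fact_internet_sales", "OrderQuantity"):
        print(raw, "->", friendly_name(raw))
```

A helper like this could be used to draft a rename list for review, but the final names should still be checked with the people who will read the reports.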
I’d also like to mention the importance of the relationships between the objects in the semantic model. Chances are you will have a small group of power users that will want to design their own dashboards to meet their job’s requirements, and that’s one of the beauties of Power BI. But when users begin developing reports, they may query your model in unexpected ways that will generate unexpected behaviors and results. I only mention this because the relationships between the objects in the model will impact the results your users will see in their reports. Double check your relationships and ensure that they are correct, especially after you add new objects to the model, since Power BI Desktop will sometimes make an incorrect guess when creating a relationship.
2. Choose the Right Visualizations
The best dashboards are those that tell a clear story within seconds. Your data should tell a story that is easy to read and can communicate the tale of the data to the users without a lot of extra work on their part. If your users have to look at the report for a long time in an attempt to decipher the visualizations plastered across their screen, chances are they won’t want to use your dashboard.
Let’s look at two different charts that I think will illustrate my point on the importance of choosing the right visualization for the story. The chart below shows a comparison of Domestic Sales and International Sales for different movie genres. If the purpose of this chart is to determine which market most of the money comes from for the various film genres, then this chart isn’t doing that great of a job because we can’t clearly see the difference between the markets for Westerns.
Is there a better way to tell the data’s story? What about the pie/donut chart?
Goodness, no. Stay away from pie and donut charts. The problem with pie/donut charts is that even with only a few categories it can be very difficult to compare the slices in the pie. And if the purpose of our dashboard is for the users to quickly gain insights into the successes and failures of the business, I recommend you stay away from the pie/donut charts.
Now that’s what I’m talking about! With a clustered bar chart, we can clearly see which market most of the money comes from for each genre. This is a much better visualization choice for the data. We don’t have to stare and squint in order to determine the differences between the bars.
Visualization choice is critical to designing an effective and useful dashboard, so always make sure you choose the best visualization for the job.
3. Remember the User!
We as developers can oftentimes find ourselves lost in the minutiae of data processing times, ETL performance, writing code, documenting the solution and all the other things that go along with designing and building a business intelligence solution. In the midst of all that awesome and glorious development work, it can be easy to forget that the whole purpose of this solution is to make the user’s job easier, faster, better, etc.
I only mention this because too many times I’ve encountered solutions that did not make the user’s job easier. Users are crafty and resourceful people. They’re (mostly) good at their job and will find a way to do their job without having to use your crappy dashboards and reports that are confusing and difficult to use. And once you start down the path of having your users work around your solution instead of with your solution, your solution has failed because at that point it’s not a solution; it’s an impediment.
Meet with the users as frequently as necessary to constantly gather feedback. During the requirements gathering phase it’s important to ask lots of questions, especially if you’re unfamiliar with the data. And once it’s time to start designing reports, you may meet with the users as frequently as daily, since the reports will be the user’s primary way to interact with your solution. I’ve been on projects where my team and I worked in a conference room with a few power users. This was excellent as we were able to get immediate feedback on any reports developed and make the required changes right away.
So in a nutshell, here are my three best practices for designing and building a killer Power BI reporting solution:
- Optimize the data model by doing the following:
  - Set data types correctly.
  - Apply user-friendly formatting to the data, including explicit measures.
  - Rename fields, measures, and tables with user-friendly naming conventions.
  - Validate that relationships between tables are created correctly.
- Use the right visualization that communicates the story of the data as clearly as possible.
- Remember the user and their experience with your solution! If the user likes to use your solution, then it’s a success!
So what do you think? What best practices did I leave out that you thought I should have included in this list? Leave a comment down below and let me know! And as always, thanks for reading. 🙂
Thanks to everyone that attended Devin’s and my webinar called Choosing The Right Analysis Services: MOLAP vs. Tabular. I’m pleased to announce that the recording is now available to watch for free over at PragmaticWorks.com, so please go check it out. It’s a little less than an hour so you can watch it during your lunch break.
The PowerPoint slide deck Devin and I used during the webinar is also available for viewing now! Please visit this link to download the slide deck.
Now for the questions! Many of you asked some great questions but unfortunately we ran out of time to answer all of the questions during the webinar. So here are a few of the questions we didn’t get to.
Q: How do I link tables in Tabular if the key is made up of more than one column?
A: If you need to create a composite key in a Tabular model table, you will need to create a calculated column that concatenates the columns that make up your composite key. You’ll need to do this in both tables you wish to relate. Once you’ve done that, you can create the relationship between the two tables using your new columns.
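The concatenation idea can be sketched outside of the model in a few lines of Python. In a Tabular model you would write this as a calculated column in each table; the column names, the sample rows, and the "|" delimiter below are assumptions made purely for illustration. The delimiter matters: without it, the key parts (1, 23) and (12, 3) would both concatenate to "123" and collide.

```python
DELIMITER = "|"  # assumed separator; pick one that cannot appear in the key values


def composite_key(row, key_columns):
    """Concatenate the key columns of one row into a single key value."""
    return DELIMITER.join(str(row[col]) for col in key_columns)


def relate(fact_rows, dim_rows, key_columns):
    """Pair each fact row with the dimension row that shares its composite key."""
    dim_by_key = {composite_key(row, key_columns): row for row in dim_rows}
    return [(fact, dim_by_key.get(composite_key(fact, key_columns)))
            for fact in fact_rows]


if __name__ == "__main__":
    # Hypothetical sample data keyed on two columns.
    keys = ["OrderNumber", "LineNumber"]
    facts = [{"OrderNumber": "SO1", "LineNumber": 1, "Amount": 10}]
    dims = [{"OrderNumber": "SO1", "LineNumber": 1, "Product": "Bike"}]
    for fact, dim in relate(facts, dims, keys):
        print(composite_key(fact, keys), fact["Amount"], dim["Product"])
```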
Q: Can DAX be used to access cubes?
A: In the SQL 2012 SP1 CU4 release, DAX support for multidimensional cubes was added, so as long as you are running on SQL 2012 SP1 CU4 or later, you should be able to query cubes with DAX expressions. On a side note, MDX can also be used to query a Tabular model.
Q: Since a Tabular solution is in many ways better than Multidimensional, when should I go for a Multidimensional solution?
A: This is one we covered extensively during the webinar. Here are some of the things to consider:
- How much data are you dealing with? If it’s too much to fit into memory for your Tabular model, then MOLAP is the way to go.
- Do you have a need for complex relationships? If so, MOLAP may be the answer. Role playing dimensions and many-to-many relationships are possible to create in a Tabular model, but they’re easier to create and manage in a MOLAP cube.
- Do you need to perform many complex calculations involving complex Scope assignments? If so, MOLAP is the answer here.
Q: Can you use a Multidimensional database as the source for a Tabular model and improve performance when creating low level granular reports?? This goes back to the performance differences between Multidimensional vs Tabular when creating granular reports.
A: You can use a Multidimensional database as a data source for a Tabular model, but I would suggest getting the data from the original source for the tabular model. If granular type queries are slow against your cube, those same queries are still going to be slow when you execute them to process your Tabular model.
Via MSDN, there’s now a great whitepaper called Performance Tuning of Tabular Models in SSAS 2012 available for your viewing pleasure. It contains a treasure trove of information, and I highly recommend that anyone developing or reporting on Tabular Models take a moment to download and read it.
While you download this historic piece of literature, here are three little tidbits of knowledge I picked up during my initial skim-through:
Partitions Don’t Help Query Performance
Partitions do not improve query time, and on their own they do not improve processing time either. In Tabular Models, partitioning a table only allows the administrator to selectively refresh smaller subsets of data, as is the case with an incremental load of a fact table. If your fact table is incrementally loaded, you can save processing time and stay within your processing window by processing only the affected partitions, but partitioning alone will not improve processing time if you’re still processing all partitions.
Partitions in a table are processed serially, unlike partitions in a measure group of a Multidimensional Database, which are processed in parallel. In a Tabular Model, however, since tables are independent of one another, tables can be processed in parallel even if partitions in a given table aren’t.
Unlike in a Multidimensional Database, dimension tables in a Tabular Model can be partitioned. This opens the door for incremental processing of those dimensions as well as some unique partitioning strategies for those dimensions.
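As a rough illustration of processing only the affected partitions, the Python sketch below works out which partitions an incremental load touched. It assumes a hypothetical scheme of one partition per calendar month with names like FactInternetSales_201301; the whitepaper does not prescribe this scheme, and the actual processing call against SSAS is left out.

```python
from datetime import date


def partition_name(d, table="FactInternetSales"):
    """Name of the (hypothetical) monthly partition that holds the given date."""
    return "{}_{}{:02d}".format(table, d.year, d.month)


def partitions_to_process(changed_dates):
    """Return the distinct partitions touched by an incremental load, in order."""
    return sorted({partition_name(d) for d in changed_dates})


if __name__ == "__main__":
    # Dates of the rows that arrived in the latest incremental load (sample data).
    loaded = [date(2013, 1, 5), date(2013, 1, 20), date(2013, 2, 1)]
    for name in partitions_to_process(loaded):
        print("process", name)
```

Only the partitions in that list would need to be refreshed; every other partition keeps the data it already holds.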
Memory Is Crucial, Duh!
Obviously with a Tabular Model you need to be able to fit the model in memory to fully utilize the magical power that is Tabular. But if you have a 10 GB model, how much memory do you need? The correct answer is about 30 GB of memory. Why 30 GB, you ask? During a Process Full of your Tabular Model, the database is kept online until the transaction for the processing operation is committed. That means that for the given 10 GB model, you need to be able to hold two copies in memory: 10 GB for the old data and 10 GB for the new data. Then you’ll likely need around 5 GB to 10 GB for various processing overheads. So keep in mind that you could need significantly more memory than you might think is necessary to support a single Tabular Model.
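The arithmetic above is simple enough to capture in a small Python sketch: two copies of the model plus a processing overhead. The 5 GB to 10 GB overhead is the figure quoted for the 10 GB example; treating it as a fixed range for models of other sizes is an assumption, so adjust it for your own workload.

```python
def process_full_memory_gb(model_gb, overhead_gb=(5.0, 10.0)):
    """Estimate the memory range (low, high) in GB needed during a Process Full.

    Two copies of the model are held until the processing transaction commits
    (old data + new data), plus a processing overhead. The default overhead
    range comes from the 10 GB example and is an assumption for other sizes.
    """
    low_overhead, high_overhead = overhead_gb
    copies = 2 * model_gb
    return (copies + low_overhead, copies + high_overhead)


if __name__ == "__main__":
    low, high = process_full_memory_gb(10)
    print("A 10 GB model needs roughly {:.0f} to {:.0f} GB".format(low, high))
```

For the 10 GB model this gives 25 GB to 30 GB, which is where the "about 30 GB" rule of thumb comes from.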
Table Queries Are Unaltered During Processing
In a Multidimensional database, the heavy lifting for dimensions is transferred from SSAS to the relational database by executing “Select Distinct” statements for each attribute. For measure groups in a Multidimensional database, the query is wrapped in a subselect with only the necessary columns returned. This also means that CTEs, Order By clauses, and stored procedures are not an option in a Multidimensional database. In a Tabular Model, however, the query for the table is unaltered, which means that using CTEs, procedures, Order By clauses, and various other T-SQL features is now possible. But this also means that the responsibility for returning only the required columns in the query now falls on the developer. If the unnecessary columns are not removed from the query, processing will be less efficient and could take longer.
Like I said earlier, this is a great whitepaper and I highly recommend that you check it out!
With this post I begin a series of blog posts covering one of the most talked about features of SQL Server 2012: Tabular Modeling. Being new to this like most of us are, I hope to learn much about Tabular Modeling as we walk through the basics of building your first Tabular Model. In this installment, we’ll talk about what a Tabular Model is, when a Tabular Model is the right choice, and of course how to create a Tabular Model.
What is a Tabular Model?
With the release of SQL 2012, we are (re)introduced to tabular modeling. If you’re familiar with Power Pivot, you’re going to notice many similarities and will most likely pick up the tabular modeling aspect of SSAS pretty quickly. Basically a Tabular Model is an in-memory database in SQL Server Analysis Services. The VertiPaq engine that was previously only used in Power Pivot is now utilized within Power Pivot and SSAS 2012 Tabular as xVelocity. The xVelocity technology allows you to perform complex analytics of your data all in-memory while making use of column-oriented storage. This eliminates expensive IO, unlike SSAS Multidimensional Modeling where IO is a real concern.
The Tabular Model also allows us to bring together multiple data source types very easily, similarly to Power Pivot. Bringing together data stored in a SQL Server Database, Oracle, Excel, and Access is not only possible but straightforward.
Once you’ve imported your data from whatever sources you need, defining relationships is very easy. Simply dragging and clicking an arrow from one object to another is all that is required here.
When Do I Choose Tabular Over Multidimensional Modeling?
You might be asking yourself, “Self, if Tabular is so fast and great, why would I ever use Multidimensional Modeling?” That’s a valid question, so let’s go over some of the perks of each and when one or the other would be the optimal choice.
1. If you need access to many different external data sources, choose Tabular. Multidimensional can do this to an extent, but if you need to relate an Excel spreadsheet, a text file, an SSRS Report Feed, and your database data, Tabular is the way to go here.
2. If you need complex calculations, scoping, and named sets, choose Multidimensional.
3. If you need mind-numbing speed and consistently fast query time, choose Tabular.
4. If you need Many-to-Many relationships, choose Multidimensional. You can model this relationship type in Tabular, but these more complex relationships are still easier to create and manage in Multidimensional.
5. If you are planning on using Power View, choose Tabular. At this time it’s impossible to build Power View reports against a Multidimensional model, but that could change in the future.
6. If you don’t know DAX and want to use Tabular, either take the time to learn or use Multidimensional ;).
7. If your solution requires complex modeling, choose Multidimensional.
Take these points into consideration when choosing Tabular vs. Multidimensional. This isn’t every single consideration to think about, but should at least get you started in understanding the differences between Tabular and Multidimensional.
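The checklist above can be written down as a simple lookup, which makes it easy to see when a project pulls in both directions. The Python sketch below is just that: the requirement labels are names made up for this example, and point 6 is simplified to the case where the team decides not to learn DAX. It tallies the points rather than picking a winner, since these are considerations and not hard rules.

```python
TABULAR = "Tabular"
MULTIDIMENSIONAL = "Multidimensional"

# One entry per point in the list above; the keys are hypothetical labels.
CHECKLIST = {
    "many_external_sources": TABULAR,
    "complex_calculations_scoping_named_sets": MULTIDIMENSIONAL,
    "consistently_fast_queries": TABULAR,
    "many_to_many_relationships": MULTIDIMENSIONAL,
    "power_view": TABULAR,
    "team_will_not_learn_dax": MULTIDIMENSIONAL,  # simplification of point 6
    "complex_modeling": MULTIDIMENSIONAL,
}


def tally(requirements):
    """Group a project's requirements by the model type each one points to."""
    votes = {TABULAR: [], MULTIDIMENSIONAL: []}
    for requirement in requirements:
        votes[CHECKLIST[requirement]].append(requirement)
    return votes


if __name__ == "__main__":
    project = ["power_view", "many_external_sources", "many_to_many_relationships"]
    for model, reasons in tally(project).items():
        print(model, "<-", ", ".join(reasons) or "(none)")
```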
How Do I Create a Tabular Model?
So now that we have a general understanding of what the Tabular Model is and some of the scenarios in which we should choose it, let’s start creating our first Tabular Model.
For you to play along with my example, you’re going to need SSAS 2012 installed in Tabular Mode.
If you’re not sure if your instance of AS is in Tabular Mode, just connect to Analysis Services in SQL Server Management Studio and look at the icons next to your instance of SSAS.
The Tabular SSAS instance has the nifty little blue icon and the Multidimensional instance has the same icon as before in 2008.
You will also need SQL Server Data Tools and the AdventureWorksDW2012 sample database.
So first things first. Open SQL Server Data Tools.
Then go to File, select New, then click Project.
Under Business Intelligence, highlight Analysis Services, and select Analysis Services Tabular Project. I’m naming my project FirstTabularProject. Then click OK.
After clicking OK you can see the new project in the Solution Explorer with an empty model, Model.bim.
With the project created, your empty model should be open in the Designer Window. So now it’s time to create a connection to our data source(s). In the top left of the menu tool bar, click the Import From Data Source icon.
Then select the type of data source you want to connect to. In my case, I’m connecting to a SQL Server database. Select Microsoft SQL Server and click Next. Specify the Server name, the credentials, and the Database.
After clicking Next, we must specify the Impersonation Information. These are the credentials that Analysis Services will use to access the data source when importing and processing the data. We can either specify specific credentials or tell it to use the AS Service Account. I’m specifying credentials.
On the next screen, we need to choose how to import the data. We have two options: We can either select from a list of the tables and views which objects we’d like to import or we can write a query to specify the data to import. I’m selecting from the list of tables.
On the Select Tables and Views screen, you’ll see a list of the Tables and Views in your database. I can browse through this list and place checks next to all the tables and views I’d like to import. Or I can select a table and then click the Select Related Tables button. This will use the referential integrity of the database to determine which tables to check for you. Be careful clicking Select Related Tables. If you accidentally click the button and the wizard selects 20 other tables, there’s no easy way to unselect the newly selected tables. I’ve selected FactInternetSales and allowed the wizard to select the dimensions based on the referential integrity.
Before clicking Finish, you’ll want to make sure that you highlight each table you want to import and then click the Preview & Filter button. This will allow you to not only preview the data, but also uncheck any fields that you do not wish to import into your model. This is important since the database will be stored in memory, and we do not want to store any unnecessary data. You can see that I’ve gone through the FactInternetSales table and unchecked the fields I don’t want to import.
After filtering out the unnecessary fields, click Finish. The importing of the data will begin.
Once it is finished, click Close. You’ll notice the data has been imported and is now viewable in the Designer Window.
If after importing your data you decide you need to bring in another table from the same data source, click the Existing Connections icon.
Then click Open and you are able to add new tables, views, or named queries to your model.
In the Designer Window we have two views. The Grid view allows us to see the imported data, with each table on an individual tab.
We can also switch to the Diagram View by clicking the Diagram View icon at the bottom right of the Designer Window. The Diagram View is ideal for viewing all the imported tables and their relationships at one time.
So now that we’ve imported our data, we need to add some measures to our model. Switch back to the Grid view and click over to the tab for the fact table, FactInternetSales. Select the first text box in the Measures Grid directly below the Sales Amount field. If the Measures Grid is not visible, just click the Show Measures Grid icon to toggle it back on.
After highlighting the text box beneath the Sales Amount field, click the Sum (Sigma) icon. This will automatically create a measure with an aggregation type of Sum. Then go into the properties of your new measure and give it a friendly name since this is the name that your users will see when browsing the cube.
Then do the same for the Order Quantity field.
Now let’s deploy and process our model. By default, the model will be deployed to the default instance of SSAS on the local machine. We can change the server we want to deploy to by right-clicking the project in the Solution Explorer and selecting Properties. You can also change the name of the database that will be created when you deploy the Model.
In the Menu bar, click Build, then click Deploy.
This will begin the deployment and processing steps. If you specified specific credentials to use for impersonation, you’ll need to enter the user’s password during this step.
Once the model is deployed, we can now view our model deployed to the AS server and browse it with Excel. Click the Analyze in Excel icon and your model will open in Excel so you can browse it.
We’ve created our first Tabular Model. I hope this gives you a good introduction on what Tabular is, when Tabular is the right choice, and the basics of creating a Tabular Model.
In the next article, we’ll get more into modifying our model by building hierarchies in our dimensions and other more advanced topics so stay tuned for the next article. And as always, post any questions or comments here and I’ll answer them as best I can.