Category Archives: Power Query

Power BI Learning Path – Free and Paid Resources

This week’s T-SQL Tuesday challenge is on learning something other than SQL. I’ve written before about how to keep up with technology. When you are starting out with a technology, it’s just plain hard to get a lay of the land.

So I thought I’d put together a learning path for Power BI, a technology that changes literally every month. This is a bit of a challenge because there are so many moving parts when it comes to Power BI. So let’s break down those moving parts into different categories.

image

So, when I think about Power BI, I like to think about the flow of data. First we have the data prep piece with Power Query, where we clean up dirty data. Next we model the data with DAX. I’ve written before about the difference between Power Query and DAX. They are like peanut butter and jelly and complement each other well.

Now, if you are a SQL expert, you may not need to worry about Power Query or DAX much. Maybe you do a lot of the work in SQL. But either way, once your data is modeled, you need to visualize it in some way. You need to learn how to create your reports with Power BI Desktop. Once your report is created, you then need to publish it.

Finally, there is what I would call the IT Ops side of Power BI. You have to install an on-premises data gateway to access local data. You need to license your users. You need to lock down security. All of these things might be outside of what a normal BI developer has to deal with, but are still important pieces. However, unlike the data flow model we talked about, the ops pieces happen at all of the stages of development and deployment.

With that overview in place, let’s get on to the individual sections and the learning paths as a whole.

Getting started with Power BI

When it comes to getting started with Power BI, I have two recommendations. First, get your hands dirty, and second, buy a book. Power BI is in many ways an amalgamation of disparate technologies. It took me a long time to understand it and it didn’t really click until I took the edX course and did actual labs.

The reason I say to buy a book is this is a technology that is hard to learn piecemeal. When you are starting out you are much better off having a curated tour of things.

Free resources

  • Check out Adam Saxton’s getting started video.
  • Search YouTube for Dashboard in an Hour. This is a standardized presentation that will show you the basics in under an hour.
  • Follow the guided learning. This will walk you through bite sized tasks with Power BI.
  • Take the edX course. It has actual labs where you have to work with data inside of Power BI.
  • Check out the Introducing Microsoft Power BI book from Microsoft Press. It’s a bit dated at this point, but it’s free and is a great start.
  • Check out the Power BI: Rookie to Rockstar book from Reza Rad (b|t). The last update was July 2017, but it’s also very comprehensive and good.

Paid resources

  • Stacia Misner Varga (b|t) has a solid course on Pluralsight. It’s worth a watch.
  • Consider reading Applied Power BI by Teo Lachev (b|t). It’s a real deep dive, which is great, but it can be a lot to take in if you are just getting started. A neat feature is that it’s organized by job role.

Learning Power Query and M

When it comes to self-service data preparation, Power Query is THE tool. The way I describe it is as a macro language for manual data manipulations. If you can pay someone minimum wage to do it in Excel, you can automate it in Power Query. Again, check out this post for the differences between Power Query and DAX.

Free Resources

  • Start with the guided learning. This quickly covers the basics.
  • Reza Rad has a solid getting started post on Power Query that you can follow along with.
  • Matt Masson has a phenomenal deep dive video on the Power Query formula language, a.k.a. M, from a year ago. It really helps elucidate the guiding principles of Power Query and M.
  • Blogs to check out:
    • Imke Feldmann (b|t) regularly has complex functions and interesting transformations on her blog.
    • Ken Puls (b|t) focuses on Excel and along with that, Power Query.
    • Gil Raviv (b|t) often has neat examples of things you can do with Power BI and Power Query.
    • Chris Webb (b|t) regularly dives into the innards of Power Query and what you can do with it.

Paid Resources

  • Ben Howard (b|t) has a Pluralsight course on Power Query. It’s a bit introductory, but great if you are just getting started.
  • Gil Raviv recently (October 2018) released a book on Power Query. What I really like about this book is it has more of a progression style instead of a cookbook kind of feel.
  • Ken Puls and Miguel Escobar (b|t) also have a book on Power Query that has a cookbook feel. I found it helpful in learning Power Query, but it’s heavily aimed at Excel users.
  • Finally, Chris Webb also has a book on Power Query. He goes into a lot of detail with it. However, the 2014 publish date means it’s starting to get a bit old.

Learning DAX

I always say that DAX is good at two things: aggregating and filtering. If you aren’t doing those two things, then DAX is the wrong tool for you. DAX provides a way for you to encapsulate quirky business logic into your data model, so that end users don’t have to worry about edge cases and such.

Free Resources

  • Read the DAX Basics article from Microsoft
  • Check out the guided learning on DAX
  • Learn the difference between Calculated columns and Measures in DAX. They can be confusing.
  • Make sure you understand the basics with SUM, CALCULATE and FILTER
  • Understand Row and Filter contexts. They are critical for advanced work in DAX
  • Blogs to check out
    • Matt Allington (b|t) has a blog with Excel right in the name but also writes about all the different parts of Power BI Desktop.
    • Rob Collie (b|t) has a voice all his own. Read his blog to learn about DAX and PowerPivot without taking yourself too seriously.
    • Alberto Ferrari (b|t) and Marco Russo (b|t) are THE experts on DAX. Read their blog. Also see their site DAX.guide.
    • Avi Singh (b|t) regularly posts videos on Power BI and will often take live questions.

Paid Resources

Power BI Visuals

The most prominent piece of Power BI is the visuals. While it’s incredibly easy to get started, I find this area to be the most difficult. If you are heavily experienced in reporting, this shouldn’t be too difficult to learn.

Free resources

Paid resources

  • A really interesting book is The Big Book of Dashboards. While it doesn’t mention Power BI, it covers all the ways to highlight data and what really makes a dashboard.

Administering Power BI

Power BI is much more than a reporting tool. It is a reporting infrastructure. This means at some point you may have to learn how to administer it as well.

Free resources

Paid resources

Keeping up with Power BI

One of the big challenges with Power BI is just keeping up. They release new features each and every month. Here are a few resources to stay on top of things:

Going Deeper

Finally, you may want to go even deeper with things. Here are some final recommendations:

I’m starting a BI newsletter. 5 BI links every week.

I’ve written before about how to keep up with technology. In the post, I describe 3 currencies we can spend to extend our learning: time, focus and actual money. As you get older, you start to get less time and even less focus, but your pay rate goes up. So, every year it becomes more and more important to lean on curation to find just the good stuff.

As part of that I’m starting my own curated mailing list for BI links. Power BI changes on a monthly basis and it’s such a pain to keep up with it. This week is the 3rd week so far.

So what’s the catch? Well, I’ll also be including whatever things I’m up to at the bottom of each email. So if you don’t like me, maybe don’t sign up, hah. Here is this week’s weekly BI 5:

  1. David Eldersveld talks a bit about #MakeoverMonday. This sounds like a great community program and I always find making things pretty to be the hardest part.
  2. Wolfgang Strasser is keeping track of all the November updates for Power BI. I keep seeing memes about this from Microsoft employees, so I’m expecting something big to drop at PASS Summit.
  3. Ginger Grant continues her series on SSAS best practices. I love seeing posts about how to do things right instead of just how to do the basics. Great stuff.
  4. Chris Webb also continues his series on using Power Query with Microsoft Flow. The expanded use of Power Query fits neatly into my conspiracy theories about where Power BI is going. Also keep an eye out for announcements about data flows.
  5. Finally, if you are going to PASS Summit, check out the BI Power Hour. All learning will be accidental.

Sign up to the list today!

 

#SQLChefs: Power BI Datasets, Reports and Dashboards

This week we’ve got another episode of SQLChefs with Bert Wagner, where we talk about the difference between datasets, reports and dashboards in Power BI.

What are datasets?

A Power BI Dataset is a series of Power Query queries that have been shaped in a DAX model. Each dataset can combine different files, database tables and online services all into one tabular model. In our cookie analogy, these are all different “ingredients”.

Unlike SSRS, a dataset in Power BI does not represent a single table or query of data. A dataset should be considered more like a “flavor” of data used to accomplish a specific type of reporting: financial, operational, HR, etc. So in our analogy, the dataset is the “raw dough”.

So in Power Query, you are going to have a set of queries which each combine a data source with a usually linear set of transformations.

image

Then, in DAX, you are going to take each of those outputs and combine them into a model. This consists of defining relationships between the outputted tables and adding business logic via calculated columns and measures.

image

For more on the difference between Power Query and DAX, see our previous episode of SQLChefs.

What are reports?

A Power BI report is a series of visualizations, filters and static elements on a canvas. Power BI reports are saved as a single PBIX file and connect to a single dataset. Remember, a Power BI dataset can have many data sources.

image

(Demo file courtesy of Microsoft, MIT License)

Each report can have multiple sheets, just like an Excel workbook. In our analogy, this is us placing our “cookies” on multiple “cookie sheets” making one big batch, all of the same “flavor”.

One report per dataset

A quick aside on something that used to confuse me. In most cases, a report and a dataset are going to have a one-to-one relationship. A dataset can have one report and a report can have one dataset.

Recently this has changed, however. A while back, they added the ability to use an existing dataset as a data source for a report, and at Ignite they announced the ability to share datasets outside of the app workspace they were made in.

That being said, while you are still learning Power BI, it’s easier to remember that in many cases, your dataset and your report are going to have a one-to-one relationship and be tightly linked.

What are dashboards?

In Power BI, dashboards are a way of pulling together visualizations from various reports. When you think dashboard, you are probably thinking something like Microsoft’s definition: “A Power BI dashboard is a single page, often called a canvas, that uses visualizations to tell a story. Because it is limited to one page, a well-designed dashboard contains only the most-important elements of that story.”

However, if you look at the report example above, it probably fits that definition, yet it is not a Power BI Dashboard. In Power BI, a dashboard is a tool for pinning visuals from different reports and other sources of data.

image

In my opinion, a Power BI Dashboard is as much a tool for organization and navigation, as it is for actual reporting. I think that’s the real value add with Power BI dashboards.

M vs DAX: Chopping Broccoli vs Planning a Menu

Last week, I had the pleasure of recording some video with Bert Wagner about Power BI. In the video, I got to use one of my favorite analogies for M versus DAX: Are you chopping broccoli or planning a menu?

One of the challenges with learning Power BI is that you have to learn not one, but two new data manipulation languages. And it’s not always clear what they are good for, especially if you come from the SQL world.

Is M a general purpose knife, or one of those weird egg slicers?

Head Chefs versus Sous Chefs

I have never worked in the restaurant business, but I’m going to make some gross generalizations anyway.

Sous chefs, as far as I can tell, do a lot of the prep work. They are cutting vegetables, cleaning food, making sauces, etc. While this is all important work, much of it doesn’t inform the final outcome. If you are making beef teriyaki or if you are making broccoli salad, you still need to chop the broccoli.

The head chef however, gets paid for her brains just as much as her hands. The head chef is figuring out the menu and how to combine all of the ingredients. She is involved very heavily with what the final result is going to be. A head chef has to think of the broader goals and strategy of the restaurant, not just how to get the immediate task done.

M is the Sous Chef; DAX is the Head Chef

Again this is all a gross generalization, but in the restaurant called Casa De Meidinger this is actually the case! I do a lot of the grunt work when we cook a meal. My wife says, “zest this lemon” and I mindlessly do it. I could probably be replaced with a robot some day, and that would be fine by me.

Annie, however, actually enjoys planning a meal, deciding what to cook, and thinking about how to make the final product. To me, cooking is just a necessary evil for eating. I don’t necessarily get any joy from the process itself.

Working with M

I like to think of M as this sous chef. It does all the grunt work that we’d like to automate. Let’s say that my boss asks for a utilization report for all of the technicians. What steps am I going to do in M?

  1. Extract the data from the line of business system
  2. Remove extraneous columns
  3. Rename columns
  4. Enrich the services table with a Billable / NonBillable column
  5. Generate a date table

This is all important work, but I would have to do the same work for a variety of reports. Many of the steps tell me nothing about the final product. I would generate a date table for most of my reports, for example.
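
To make those steps concrete, here is a rough sketch of the same prep work. It is written in Python rather than M, purely to show the logic. The column names (`tech_nm`, `svc_cd`, `hrs`, `internal_id`), the sample rows and the set of billable service codes are all made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical rows, standing in for an extract from the line of business system.
raw_rows = [
    {"tech_nm": "Pat", "svc_cd": "INSTALL", "hrs": 6.0, "internal_id": 17},
    {"tech_nm": "Pat", "svc_cd": "TRAINING", "hrs": 2.0, "internal_id": 18},
]

# Hypothetical business rule: which service codes count as billable.
BILLABLE_SERVICES = {"INSTALL", "REPAIR"}


def prepare_services(rows):
    """Drop extraneous columns, rename the rest, and add a Billable flag."""
    return [
        {
            "Technician": row["tech_nm"],
            "Service": row["svc_cd"],
            "Hours": row["hrs"],
            "Billable": row["svc_cd"] in BILLABLE_SERVICES,
        }
        for row in rows
    ]


def build_date_table(start, end):
    """Generate one row per calendar day from start to end, inclusive."""
    day_count = (end - start).days + 1
    table = []
    for offset in range(day_count):
        day = start + timedelta(days=offset)
        table.append({
            "Date": day,
            "Year": day.year,
            "Month": day.month,
            "IsoWeekday": day.isoweekday(),  # 1 = Monday, 7 = Sunday
        })
    return table


services = prepare_services(raw_rows)
dates = build_date_table(date(2018, 1, 1), date(2018, 1, 7))
```

Notice that nothing in here knows what “utilization” means. The same prep would feed just about any report.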

Working with DAX

Now, if I’m working in DAX, what am I going to do?

  1. Ask what the heck “utilization” really means

This was a real-life example that happened to me. What is utilization as a key metric? Well, it turns out it depends on what you are trying to report on. A simple definition is usage divided by availability. If a technician billed 20 hours and clocked in 40, his utilization would be 50%. Or so you would think.

How do we handle internal projects? Let’s say we have a technician who billed 2 hours to a customer, but spent 38 hours on an internal database migration. What was his utilization? Well, if we are looking for billable utilization, it’s 5%. If we are looking for total utilization, it is 100%. These are questions that you are going to encapsulate in your DAX formulas.
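
Here is that arithmetic as a small Python sketch. It is not DAX, just the kind of logic a measure would end up encapsulating. The function name and the `(hours, is_billable)` entry shape are assumptions made up for the example.

```python
def utilization(entries, clocked_hours, billable_only=False):
    """Usage divided by availability.

    entries is a list of (hours, is_billable) tuples.
    When billable_only is True, internal hours are left out of the usage.
    """
    if clocked_hours <= 0:
        raise ValueError("clocked_hours must be positive")
    used = sum(
        hours
        for hours, is_billable in entries
        if is_billable or not billable_only
    )
    return used / clocked_hours


# The simple case: 20 billed hours out of 40 clocked hours.
simple = utilization([(20.0, True)], 40.0, billable_only=True)  # 20 / 40

# The awkward case: 2 billable hours plus 38 hours on an internal project.
week = [(2.0, True), (38.0, False)]
billable = utilization(week, 40.0, billable_only=True)  # 2 / 40
total = utilization(week, 40.0)  # (2 + 38) / 40
```

Same technician, same week, two very different numbers. Which one the report shows is a business decision, not a data prep decision.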

The whole idea of a BI semantic layer is to hide away the meaning from the end users. When someone orders a cobb salad, they don’t want to have to articulate the ingredient list. They just want a darn salad.

Are you paid for your hands or your brain?

In the SQL Data Partners podcast, episode 114, there was a question: what’s the difference between a contractor and a consultant. One of the answers was this: a contractor is a set of hands, and a consultant is a set of brains.

I think this answer relates to M versus DAX. M is an automated set of hands, able to do work you’d normally do by hand in Excel. DAX lets you take your domain knowledge and encode it into a data model. It’s an externalized representation for your brain.

And if you think about it, which do you want to be paid for? Do you want to get paid to unpivot data by hand every week? Or do you want to get paid for thinking, for understanding the business and for working at a higher level?

M allows you to automate the first step, so you can do more of the latter with DAX.

Wrangling GotoWebinar Stats with Power Query: Part one

So, this week I gave my first presentation to GroupBy.org. It was very exciting, and I learned that I need a chair that doesn’t swivel so much.

So, I said to Brent, “How many people attended? I want to update my speaking log.” He said, “210? I’ll get you the data tomorrow.”

Here’s what he gave me:

image

Ugh.

Power Query to the rescue

Normally this would be a giant pain to work with. When it comes to data quality, this is quite the tohubohu. Thankfully, I can clean things up quickly with Power Query.

So, first I’m going to click on the data and select Add to Data Model, under the Power Pivot tab.

image

Excel is going to make some assumptions about what is part of the table. This is convenient for our needs, but we’ll have to find a workaround when we want to scale to multiple Excel files.

image

We can’t tell it we have headers, because it’s going to think that first row is a header. We’ll deal with that later. Once we click OK, we are taken to the Power Query / Power Pivot window.

image

I made a mistake

Hmm, so it looks like I made a mistake. I hope my honesty won’t lose me any izzat, or ability to command respect. I think it’s important to see how people really learn and really solve problems. So, I’m including my screw ups in this post.

Apparently, I created a linked table and I can’t see how to edit the Power Query portion for that. A linked table is a nice way to pull raw data from the Excel workbook. It’s great for reference tables, but doesn’t solve our problem.

image

Trying again

Let’s take a different approach. I’m going to open a blank Excel workbook and pull the data into there. Okay, so let’s go to Manage under the Power Pivot tab.

image

Next, we are going to click “Get External Data From Other Sources”

image

Then I’m going to scroll to the bottom and select Excel File.

image

Once selected, I only have the whole first sheet as an option. If I had table objects or named ranges, that would be different.

image

Hmmm, I still can’t find a way to edit the Power Query. Fiddlesticks!

Normally, in Power BI it would be right here:

image

Trying to do this in Excel is quite the boyg, or vague obstacle.

Third time is a charm

Sigh, okay let’s try this a third time. I’m going to go to the Data tab and the “Get and Transform” group. “Get and Transform” is the new name for Power Query.

image

Okay, let’s try opening that Excel file. Ah, much better. Now I want to click Edit at the bottom right.

image

Cleaning the Data

So, the first thing we need to do is get rid of all of the non-header rows at the top.

image

To do that, I just select Remove Rows –> Remove Top Rows.

image

Then I specify I want to get rid of the top 7 rows.

image

Next, I want to turn the actual header row into a header.

image

Okay, so now it looks like a real table.

image
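
If it helps to see the logic outside of the UI, here is a minimal Python sketch of those two steps. The sample rows are invented, and the function names just mirror the menu options. This is not the M that Power Query generates.

```python
def remove_top_rows(rows, count):
    """Skip the first `count` rows, like Remove Top Rows."""
    return rows[count:]


def promote_headers(rows):
    """Use the first remaining row as column names for the rows below it."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sheet: 7 rows of preamble, then the real header, then data.
preamble = [["Report preamble line %d" % n, ""] for n in range(1, 8)]
sheet = preamble + [
    ["Attendee", "Time in Session"],
    ["Pat", "58 minutes"],
    ["Sam", "41 minutes"],
]

table = promote_headers(remove_top_rows(sheet, 7))
```

The hard-coded 7 is the fragile part. If the export ever adds or drops a preamble line, the header row moves and this step needs to change with it.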

Comma Delimited BS

Okay, so now we need to parse out the times someone was watching. The problem is that some people were in and out. Their entries are comma delimited. Ugh.

image

Okay, let’s split them up. I’m going to select Split Column –> By Delimiter

image

Unfortunately, splitting a column by delimiter a) splits it into more columns and b) makes you specify how many.

image

Thankfully, we can select those new columns and unpivot them.

image

Perfect. Now we have a row for every time a person was watching.

image
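
Here is a rough Python sketch of what the split and the unpivot accomplish together. The attendee names and the session text are hypothetical stand-ins for the real export, and the sketch goes straight to one row per session instead of creating the intermediate columns first.

```python
def split_to_rows(rows, column, delimiter=","):
    """Turn one delimited cell into one row per piece, copying the other columns."""
    result = []
    for row in rows:
        for piece in row[column].split(delimiter):
            piece = piece.strip()
            if piece:  # ignore empty pieces left by stray delimiters
                result.append({**row, column: piece})
    return result


# Hypothetical data: Pat dropped off and came back, Sam stayed the whole time.
attendees = [
    {"Attendee": "Pat", "Sessions": "12:01 PM - 12:20 PM, 12:35 PM - 12:58 PM"},
    {"Attendee": "Sam", "Sessions": "12:05 PM - 12:58 PM"},
]

sessions = split_to_rows(attendees, "Sessions")
```

One nice side effect of going to rows is that it doesn’t matter how many times someone dropped off. There is no column count to guess.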

String parsing

Okay, so now we just need to parse out the dates. First, we are going to split on the dash, and then the parenthesis.

image

This is starting to look good.

image

Now we just need to get rid of the timezone and convert it to a datetime. First we need to select Replace Values.

image

image

Lastly, we select the data type we want.

image
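
For anyone who thinks better in code, here is a sketch of the same parsing in Python. The real format is only visible in the screenshots, so the sample string, the `EDT` label and the date format below are all assumptions. Adjust them to whatever your export actually contains.

```python
from datetime import datetime

# Assumed layout of a cleaned timestamp, e.g. "10/05/2018 12:01 PM".
TIME_FORMAT = "%m/%d/%Y %I:%M %p"


def to_datetime(text, timezone_label):
    """Replace the time zone label with nothing, then convert to a datetime."""
    cleaned = text.replace(timezone_label, "").strip()
    return datetime.strptime(cleaned, TIME_FORMAT)


def parse_session(text, timezone_label="EDT"):
    """Split on the dash, then on the parenthesis, and convert both halves."""
    start_text, rest = text.split(" - ", 1)
    end_text = rest.split("(", 1)[0]  # drop the trailing "(NN minutes)" part
    return (
        to_datetime(start_text, timezone_label),
        to_datetime(end_text, timezone_label),
    )


# Hypothetical entry in the assumed format.
joined, left = parse_session(
    "10/05/2018 12:01 PM EDT - 10/05/2018 12:58 PM EDT (57 minutes)"
)
```

Keep in mind that throwing away the time zone leaves you with plain local times. That is fine for one webinar, but worth a second look if you ever combine sessions recorded in different zones.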

What’s next?

Now that our data is cleaned up, we’ll join to the sessions table and do some simple data modeling. But that’s for the next blog post.