Category Archives: Power Query

Query folding workaround for Azure Devops and Power BI

Query folding is one of the most powerful tools in Power Query and Power BI. It is the automatic process of pushing down filters and other transformations back to the data source. This can dramatically improve performance for your queries.

Unfortunately, OData is not guaranteed to support query folding. According to the Power BI documentation on incremental refresh.

Most data sources that support SQL queries support query folding. However, data sources like flat files, blobs, web, and OData feeds typically do not. In cases where the filter is not supported by the datasource back-end, it cannot be pushed down. In such cases, the mashup engine compensates and applies the filter locally, which may require retrieving the full dataset from the data source.

I recently did some tests on this for the OData source for Azure Devops. When I tested with the sample Northwind database, query folding was working. I was able to see with Fiddler that my date filter was getting pushed back down.

clip_image001

However, when I tried to the same with Azure Devops, none of my filters where getting pushed down to the source. As a workaround, I was able to put my filters in the URL. So to filter based on date, I used the following url:
https://analytics.dev.azure.com/eugene1234/TestingOdata/_odata/v2.0/WorkItems?$select=WorkItemId,WorkItemType,Title,State,ChangedDate&$filter=ChangedDate%20gt%202018-12-30T23:59:59.99Z

clip_image001[5]

In this case, I was manually specifying the date filter in the URL. But it should be possible to use M code to dynamically generate the URL. Another option might be to create a custom data connector for oData that supports query folding.

Power Query Is No Longer DAX’s Little Brother

I’ve talked before about the difference between the Power Query Formula language, or M, and the DAX language.

I would describe Power Query as the intern you pay minimum wage or the sous chef, and DAX as the $35 per hour analyst or the head chef. This wasn’t to be mean but instead was just because Power Query was all about automating repetitive data manipulations. It handled the less exciting, less complicated work.

Last week, however, I presented on Power Query, and I had to update the slide about where it’s available. I used to say that wherever DAX is, Power Query was not very far behind. Doing all of the grunt work so that DAX could shine. But this time I had to update my slides because Power Query is starting to take center stage.

Now instead of just being available in Excel, Power BI and SSAS, Power Query is available in Microsoft Flow, SSIS and ADF! At the time this post was published, these are all in preview. But it’s really exciting to see Power Query no longer trailing behind DAX, ready to take center stage.

Create a Power Query custom data connector in 10 minutes

Getting set up

When I heard about custom data connectors for Power Query, I had assumed there would be a lot of work involved. While there is definitely quite a bit of work in implementing advanced features like query folding,  creating your very first connector is simple.

So, first you need Visual studio installed and the Power Query SDK installed as well. Once you do that, you will see Power Query as an option when creating a new project. Visual studio will also have support for .pq or Power Query files.

image

Once you create a new data connector project, you are presented with two main Power Query files. The first one, is simply a test query you can run on demand to test your connector.

image

The other file is your data connector. It has a bit of boilerplate to specify the types of credentials it accepts and publishing details such as beta status. Otherwise there is just a little bit of code defining the actual functionality. In this case we are defining the Contents function, which acts as a hello world:

image

If we run it as is, our test query will be run and we’ll see the results in a testing program.

image

Adding a function

So now, what if we want to add some more functionality? Say maybe a function to square numbers. First, we’ll add a SquareNumbers.Squared function to the main file:

shared SquareNumbers.Squared = (x as number) =>
let
y = x * x
in
y;

Then we update the sample query to call out function:

let
result = SquareNumbers.Square(7)
in
result

And it works as expected:

image

Exporting the connector

Once you have the connector working the way that you want,  run a release build in visual studio. This will create a .mez in the bin/Release folder of your solution. Copy that file to the [Documents]\Power BI Desktop\Custom Connectors folder. You will likely have to create that folder.

Whenever you open Power BI Desktop, it will recognize the connector but won’t let you use it because of security settings.

image

To get around this, go into the options for Power BI Desktop and then security. Under security, select “Allow any extension to load without validation or warning.” Then Restart Power BI Desktop.

image

Now we can see it is available in our list of connectors.

image

By default it will call the Contents function:

image

But we can easily modify the M code to call our squared function as well.

image

Which will give us the output we expect.

image

What next?

If you are interested in going deeper with Custom Data Connectors, such as adding a navigation view or  query folding, check out the TripPin tutorials.

Power BI Learning Path – Free and Paid Resources

how to keep up with technology. When you are starting out with a technology, it’s just plain hard to get a lay of the land.

So, I thought I’d put together a learning path for Power BI, a technology that changes literally every month. This is a bit of challenge because there are so many moving parts when it comes to Power BI. Accordingly, let’s break down those moving parts into different categories.

image

So, when I think about Power BI, I like to think about the flow of data. First we have the Data prep piece with Power Query, where we clean up dirty data. Next we model the data with DAX. I’ve written before about the difference between Power Query and DAX. They are like peanut butter and jelly and compliment each other well.

Now, if you are a SQL expert, you may not need to worry about Power Query or DAX much. Maybe you do a lot of the work in SQL. But either way, once your data is modeled, you need to visualize it in some way. You need to learn how to create your reports with Power BI Desktop. Once your report is created, you then need to publish it.

Finally, there is what I would call the IT Ops side of Power BI. You have to install an on-premises data Gateway to access local data. You need to license your users. You need to lock down security. All of these things might be outside of what a normal BI developer has to deal with, but are still important pieces. However, unlike the data flow model we talked about, the ops pieces happens at all of the stages of development and deployment.

With that overview in place, let’s get on to the individual sections and the learning paths as a whole.

Getting started with Power BI

When it comes to getting started with Power BI, I have two recommendations. First get your hands dirty, and secondly buy a book. Power BI is in many ways an amalgamation of disparate technologies. It took me a long time to to understand it and it didn’t really click until I took the edX course and did actual labs.

The reason I say to buy a book is this is a technology that is hard to learn piecemeal. When you are starting out you are much better off having a curated tour of things.

Free resources

  • Check out Adam Saxton’s getting started video.
  • Search Youtube for Dashboard in an Hour. This is a standardized presentation that will show you the basics in under an hour.
  • Follow the guided learning. This will walk you through bite sized tasks with Power BI.
  • Take the edX course. It has actual labs where you have to work with data inside of Power BI.
  • Check out the Introducing Microsoft Power BI book from Microsoft Press. It’s a bit dated at this point, but it’s free and is a great start.
  • Check out the Power BI: Rookie to Rockstar book from Reza Rad (b|t). The last update was July 2017, but it’s also very comprehensive and good.

Paid resources

  • Stacia Misner Varga (b|t) has a solid course on Pluralsight. It’s worth a watch.
  • Consider reading the Applied Power BI by Teo Lachev (b|t). It’s a real deep dive which is great, but can be a lot to take in if you are just getting started. A neat feature is that it’s organized by job role.

Learning Power Query and M

When it comes to self-service data preparation, Power Query is THE tool. The way I describe it is as a macro language for manual data manipulations. If you can pay someone minimum wage to do it in Excel, you can automate it in Power Query. Again, check out this post for the differences between Power Query and DAX.

Free Resources

  • Start with the guided learning. This quickly covers the basics
  • Reza Rad has a solid getting started post on Power Query that you can follow along with.
  • Matt Masson has a phenomenal deep dive video on the Power Query formula language, a.k.a M, from a year ago. It really helps elucidate the guiding principals of Power Query and M.
  • Blogs to check out:
    • Imke Feldmann (b|t) regularly has complex functions and interesting transformations on her blog.
    • Ken Puls (b|t) focuses on Excel and along with that, Power Query.
    • Gil Raviv (b|t) often has neat examples of things you can do with Power BI and Power Query.
    • Chris Webb (b|t) regularly dives into the innards of Power Query and what you can do with it.

Paid Resources

  • Ben Howard (b|t) has a Pluralsight course on Power Query. It’s a bit introductory, but great if you are just getting started.
  • Gil Raviv recently (October 2018) released a book on Power Query. What I really like about this book is it has more of a progression style instead of a cookbook kind of feel.
  • Ken Puls and Miguel Escobar (b|t) also have a book on Power query that has a cookbook feel. I found it helpful in learning Power Query, but it’s heavily aimed at excel users.
  • Finally, Chris Webb also has a book on Power Query. He goes into a lot of detail with it. However, the 2014 publish date means it’s starting to get a bit old.

Learning DAX

I always say that DAX is good at two things: aggregating and filtering. You aren’t doing those two things, then DAX is the wrong tool for you. DAX provides a way for you to encapsulate quirky business logic into your data model, so that end users doing have to worry about edge cases and such.

Free Resources

  • Read the DAX Basics article from Microsoft
  • Check out the guided learning on DAX
  • Learn the difference between Calculated columns and Measures in DAX. They can be confusing.
  • Make sure you understand the basics with SUM, CALCULATE and FILTER
  • Understand Row and Filter contexts. They are critical for advanced work in DAX
  • Blogs to check out
    • Matt Allington (b|t) has a blog with Excel right in the name but also writes about all the different parts of Power BI Desktop.
    • Rob Collie (b|t) has a voice all his own. read his blog to learn about DAX and PowerPivot without taking yourself too seriously.
    • Alberto Ferrari (b|t) and Marco Russo (b|t) are THE experts on DAX. Read their blog. Also see their site DAX.guide.
    • Avi Singh (b|t) regularly posts videos on Power BI and will often take live questions.

Paid Resources

Power BI Visuals

The piece of Power BI that is most prominent are they visuals. While it’s incredibly easy to get started, I find this area to be the most difficult. If you are heavily experience in reporting this shouldn’t be too difficult to learn.

Free resources

Paid resources

  • A really interesting book is The Big Book of Dashboards. While it doesn’t mention Power BI, it covers all the ways to highlight data and what really makes a dashboard.

Administering Power BI

Power BI is much more than a reporting tool. It is a reporting infrastructure. This means at some point you may have to learn how to administer it as well.

Free resources

Paid resources

Keeping up with Power BI

One of the big challenges with Power BI is just keeping up. They release to new features each and every month. Here are a few resources to stay on top of things:

Going Deeper

Finally, you may want to go even deeper with things. Here are some final recommendations:

I’m starting a BI newsletter. 5 BI links every week.

I’ve written before about how to keep up with technology. In the post, I describe 3 currencies we can spend to extend out learning: time, focus and actual money. As you get older, you start to get less time and even less focus, but your pay rate goes up. So, every year it becomes more and more important to learn on curation to find just the good stuff.

As part of that I’m starting my own curated mailing list for BI links. Power BI changes on a monthly basis and it’s such a pain to keep up with it. This week is the 3rd week so far.

So what’s the catch? Well, I’ll also be including whatever things I’m up to at the bottom of each email. So if you don’t like me, maybe don’t sign up, hah. Here is this week’s weekly BI 5:

  1. David Eldersveld talks a bit about #MakeoverMonday. This sounds like a great community program and I always find making things pretty to be the hardest part.
  2. Wolfgang Strasser is keeping track of all the November updates for Power BI. I keep seeing memes about this from Microsoft employees, so I’m expecting something big to drop at Pass Summit.
  3. Ginger Grant continues her series on SSAS best practices. I love seeing posts about how to do things right instead of just how to do the basics. Great stuff.
  4. Chris Webb also my conspiracy theories about where Power BI is going. Also keep an eye out for announcements about data flows.
  5. Finally, If you are going to PASS Summit, check out the BI Power Hour. All learning will be accidental.

Sign up to the list today!

 

#SQLChefs: Power BI Datasets, Reports and Dashboards

This week we’ve got another episode of SQLChefs with Bert Wagner, where we talk about the different between datasets, reports and dashboards in Power BI.

What are datasets?

A Power BI Dataset is a series of Power Query queries that have been shaped in a DAX model. Each dataset can combine different files, database tables and online services all into one tabular model.  In our cookie analogy, these are all different “ingredients”.

Unlike SSRS, a dataset in Power BI does not represent a single table or query of data. A dataset should be considered more like a “flavor” of data used to accomplish a specific type of reporting: financial, operational, HR, etc. So in our analogy, the dataset is the “raw dough”.

So in Power Query, you are going to have a set of queries which each combine a data source with a usually linear set of transformations.

image

Then, in DAX, you are going to take each of those outputs and combine them into a model. This consists of defining relationships between the outputted tables and adding business logic via calculated columns and measures.

image

For more on the difference between Power Query and DAX, see our previous episode of SQLChefs.

What are reports?

A power BI report is a series of visualizations, filters and static elements on a canvas. Power BI reports are saved as a single PBIX file and connect to a single dataset. Remember, a Power BI dataset can have many data sources.

image

(Demo file courtesy of Microsoft, MIT License)

Each report can have multiple sheets, just like an Excel workbook. In our analogy, this is us placing our “cookies” on multiple “cookie sheets” making one big batch, all of the same “flavor”.

One report per dataset

A quick aside to something that used to confuse me. In most cases, a report and a dataset are going to have a one to one relationship. A dataset can have one report and a report can have one data set.

Recently this has changed, however. A while back, they added the ability to use an existing dataset as a data source for a report. and at Ignite they announced the ability to share datasets outside of the app workspace they were made in.

That being said, while you are still learning Power BI, it’s easier to remember that in many cases, your dataset and your report are going to have a one-to-one relationship and be tightly linked.

What are dashboards?

In Power BI, dashboards are a way of pulling together visualizations from various reports. When you think dashboard, you are probably thinking something like Microsoft’s definition: “A Power BI dashboard is a single page, often called a canvas, that uses visualizations to tell a story. Because it is limited to one page, a well-designed dashboard contains only the most-important elements of that story.”

However, if you look at the report example above, it probably fits that definition. It is not a Power BI Dashboard. In Power BI, a dashboard is tool for pinning visuals from different reports and other sources of data.

image

In my opinion, a Power BI Dashboard is as much a tool for organization and navigation, as it is for actual reporting. I think that’s the real value add with Power BI dashboards.

M vs DAX: Chopping Broccoli vs Planning a Menu

Last week, I had the pleasure of recording some video with Bert Wagner about Power BI. In the video, I got to use one of my favorite analogies for M versus DAX: Are you chopping broccoli or planning a menu?

One of the challenges with learning Power BI, is that you have to learn not 1, but 2 new data manipulations languages. And it’s not always clear what they are good for, especially if you come from the SQL world.

Is M a general purpose knife, or one of those weird egg slicers?

Head Chefs versus Sous Chefs

I have never worked in the restaurant business, but I’m going to make some gross generalizations anyway.

Sous chefs, as far as I can tell, do a lot of the prep work. They are cutting vegetables, cleaning food, making sauces, etc. While this is all important work, much of it doesn’t inform the final outcome. If you are making beef teriyaki or if you are making broccoli salad,  you still need to chop the broccoli.

The head chef however, gets paid for her brains just as much as her hands. The head chef is figuring out the menu and how to combine all of the ingredients. She is involved very heavily with what the final result is going to be. A head chef has to think of the broader goals and strategy of the restaurant, not just how to get the immediate task done.

M is the Sous Chef; DAX is the Head Chef

Again this is all a gross generalization, but in the restaurant called Casa De Meidinger this is actually the case! I do a lot of the grunt work when we cook a meal. My wife says, “zest this lemon” and I mindlessly do it. I could probably be replaced with a robot some day, and that would be fine by me.

Annie, however, actually enjoys planning a meal, deciding what to cook, and thinking about how to make the final product. To me, cooking is just a necessary evil for eating. I don’t necessarily get any joy from the process itself.

Working with M

I like to think of M as this sous chef. It does all the grunt work that we’l like to automate. Let’s say that my boss asks for a utilization report for all of the technicians. What steps am I doing to do in M?

  1. Extract the data from the line of business system
  2. Remove extraneous columns
  3. Rename columns
  4. Enrich the services table with a Billable / NonBillable column
  5. Generate a date table

This is all important work, but I would have to do the same work for a variety of reports. Many of the steps tell me nothing about the final product. I would generate a date table for most of my reports, for example.

Working with DAX

Now, if I’m working DAX, what am I going to do?

  1. Ask what the heck “utilization” really means

This was a real-life example that happened to me. What is utilization as a key metric? Well it turns out it depends what you are trying to report on. A simple definition is usage divided by availability. If a technician billed 20 hours and clocked in 40, his utilization would be 50%. Or so you would think.

How do we handle internal projects? Let’s say we have a technician who billed 2 hours to a customer, but spent 38 hours on an internal database migrations. What was his utilization?Well, if we are looking for billable utilization, it’s 5%. If we are looking for total utilization, it is 100%. These are questions that you are going to encapsulate in your DAX formulas.

The whole idea of a BI semantic layer is to hide away the meaning from the end users. When someone orders a cobb salad, they don’t want to have to articulate the ingredient list. They just want a darn salad.

Are you paid for your hands or your brain?

In the SQL Data Partners podcast, episode 114, there was a question: what’s the difference between a contractor and a consultant. One of the answers was this: a contractor is a set of hands, and a consultant is a set of brains.

I think this answer relates to M versus DAX. M is an automated set of hands, able to do work you’d normally do by hand in Excel. DAX let’s you take your domain knowledge and encode it into a data model. It’s an externalized representation for your brain.

And if you think about it, which do you want to be paid for? Do you want to get paid to unpivot data by hand every week? Or do you want to get paid for thinking, for understanding the business and for working at a higher level.

M allows you to automate the first step, so you can do more of the latter with DAX.

Wrangling GotoWebinar Stats with Power Query: Part one

So, this week I gave my first presentation to image

Ugh.

Power Query to the rescue

Normally this would be a giant pain to work with. When it comes to data quality, this is quite the image

Excel is going to make some assumptions about what is part of the table. This is convenient for our needs, but we’ll have to find a work around when we want to scale to multiple excel files.

image

We can’t tell it we have headers, because it’s going to think that first row is a header. We’ll deal with that later. Once we click OK, we are taken to the Power Query / Power Pivot window.

image

I made a mistake

Hmm, so it looks like I made a mistake. I hope my honesty won’t lose me any image

Trying again

Let’s take a different approach. I’m going to open a blank excel workbook and pull the data into there. Okay, so let’s go to manage under the Power Pivot tab.

image

Next, we are going to click “Get External Data From Other Sources”

image

Then I’m going to scroll to the bottom and select Excel File.

image

Once selected, I only have the whole first sheet as an option. If I had table objects or named ranges, that would be different.

image

Hmmm, I still can’t find a way to edit the Power Query. Fiddlesticks!

Normally, in Power BI it would be right here:

image

Trying to do this in Excel is quite the image

Okay, let’s try opening that Excel file. Ah, much better. Now I want to click Edit at the bottom right.

image

Cleaning the Data

So, First thing we need to do is get rid of all of the non-header rows at the top.

image

To do that, I just select Remove Rows –> Remove Top Rows.

image

Then I specify I want to get rid of the top 7 rows.

image

Next, I want to turn the actual header row into a header.

image

Okay, so now it looks like a real table.

image

Comma Delimited BS

Okay, so now we need to parse out the times someone was watching. The problem is that some people were in and out. Their entries are comma delimited. Ugh.

image

Okay, let’s split them up. I’m going to select Split Column –> By Delimiter

image

Unfortunately, splitting by column a) splits into more columns and b) you have to specify how many.

image

Thankfully, we can select those new columns and unpivot them.

image

Perfect. Now we have a row for every time a person as watching.

image

String parsing

Okay, so now we just need to parse out the dates. First, we are going to split on the dash, and then the parenthesis.

image

This is starting to look good.

image

Now we just need to get rid of the timezone and convert it to a datetime. First we need to select Replace Values.

image

image

Lastly, we select the data type we want.

image

What’s next?

Now that are data is cleaned up, we’ll join to sessions table and do some simple data modeling. But that’s for the next blog post.