All posts by Eugene Meidinger

GroupBy Conf was a stepping stone in my career. You should submit a session.

The year was 2017, and I was feeling a bit stagnant and frustrated. I had spoken at a handful of different SQL Saturdays, and I wanted to know how to get to the big conferences:

“There’s a joke about industries with no entry level jobs. How do those types of industries exist? How do they not just die out? But I don’t think entry level is the issue here. There are tons of opportunities to speak at user groups. SQL Saturdays are a clear stepping stone after that. The next step is…less clear. Do you blog? Do you speak at virtual user groups? Do you go to 20 SQL Saturdays a year?

No seriously, I don’t know. Can someone tell me? You wanna know why I started presenting at virtual user groups? Because I asked someone how to get to Summit, and they suggested that as a stepping stone. I think it’s important to give “new” speakers like me a path for reaching the top.”

Two months later, I was accepted to speak at the GroupBy conference about how to keep up with technology. It was a little nervewracking, but it went great and was truly a stepping stone in my career. It was my second virtual presentation and bigger than anything I had ever done. It helped me grow as a speaker, and later that year I was accepted to speak at PASS Summit.

I can’t ever know if that was what got me accepted to Summit. It might have been my abstracts, it might have been the fact that Summit was during Halloween, or that I had done some Pluralsight courses by then. But I truly believe it was part of the equation.

Now as someone on the other side of the selection process, I can definitely say that you will benefit from having an online record of your presentations. Having online presentations allows people to see that you can speak well and get a better idea of the type of material you cover.

Submissions are open for a few more days for the October session. I hope you’ll join me in sending in a few sessions, and I hope to see you there.

T-SQL Tuesday #116 – I only do demos on SQL for Linux

T-SQL Tuesday

When I first heard about SQL on Linux back in the beta days, I really didn’t see the point. Was this an attempt to steal customers away from Oracle? Was this intended to appeal to Postgres and MySQL users who were used to $0 licensing? Were that many organizations that were fully Linux except for the one SQL server?

And then as I learned more about devops and containers, it started to click for me. And while I’m still fumbling with devops, I’ve found one really good use for Linux and containers: demos.

Specifically, making a presentation on SQL Server 2019 is what convinced me. I presented on it back when it was still called vNext. Then 5 months later I was presenting it again. But now I had to uninstall it and reinstall the latest version. And I have to find whatever virtual machine I had installed it on in the first place. I had a virtual machine because I didn’t want to contaminate my host laptop.

But now, I can run 5 lines of code or less and suddenly I have the latest CTP ready to present against. All in the time it takes to down the update plus a minute or two. I couldn’t do this without SQL on Linux and Linux containers. So, at least for demos, I’m never going back.

Are You the Same as Your Business?

In Pennsylvania, if you start a business it starts out as a sole proprietorship. Legally, your business has the same name as you, and until you get an EIN, it can be identified by your social security number. In a very real sense, your fledgling business is you. But the question is, should it be?

I recently read two books that have got me thinking about that question. First is “The E-Myth” by Michael E. Gerber, which breaks people down into 3 contradictory personalities: the entrepreneur, the technician and the manager. Whenever you start a business, there is a little bit of each in you. Often times, however, you are mostly a technician. You quit your job because you were good at doing the work. But as I learned the hard way, there is more to running a business than just being technically proficient.

Another thing that it talks about is how businesses should be designed like the prototype for a franchise model. There should be standard operating procedures, code and processes for everything. Literally everything. As a solopreneur, this seems so strange at first.

We get even stranger when we get into “Built to Sell” by John Warrillow. This book is all about making a business that can be sold and live without you. What are the things that make a business appealing to buyers? I say this is strange because, as a consultant, if I left the business there wouldn’t be anything left. It literally has the same name as me, right?

But I’ve been thinking about it more, and not everything has to or should be me. When I make video courses, I pay an editor to do all of the video editing. As a consultant, in theory, I can make anywhere from $100-$200 per hour for what I do. Most of the things I do, such as accounting, or social media management, are nearly as valuable. So logically, even if I can’t separate myself from my business, or ever sell it, I should be thinking about the piece I can carve off. I may be at the core of my business, but I don’t have to be all of it.

Power Query Is No Longer DAX’s Little Brother

I’ve talked before about the difference between the Power Query Formula language, or M, and the DAX language.

I would describe Power Query as the intern you pay minimum wage or the sous chef, and DAX as the $35 per hour analyst or the head chef. This wasn’t to be mean but instead was just because Power Query was all about automating repetitive data manipulations. It handled the less exciting, less complicated work.

Last week, however, I presented on Power Query, and I had to update the slide about where it’s available. I used to say that wherever DAX is, Power Query was not very far behind. Doing all of the grunt work so that DAX could shine. But this time I had to update my slides because Power Query is starting to take center stage.

Now instead of just being available in Excel, Power BI and SSAS, Power Query is available in Microsoft Flow, SSIS and ADF! At the time this post was published, these are all in preview. But it’s really exciting to see Power Query no longer trailing behind DAX, ready to take center stage.

T-SQL Tuesday #114 – An Unsolved SQL Puzzle

This week’s T-SQL Tuesday invitation is all about puzzles. I’ve got an accidental puzzle that I’ve never quite solved, from one of my demos. I’m sure the answer will be a “Well duh!” moment.

I give presentations on SQL Server execution plans. As part of that, I like to show that if you pull a single row from a heap, it has to read everything. As part of that demo, I try to push everything out of memory by disabling readahead reads, taking a checkpoint, and dropping clean buffers. But for some reason… it never quite works!

IF NOT EXISTS ( SELECT  *
FROM    sys.schemas
WHERE   name = N'Demo' )
EXEC('CREATE SCHEMA [Demo] AUTHORIZATION [dbo]');
GO

IF OBJECT_ID('Adventureworks2014.Demo.Person', 'U') IS NOT NULL
BEGIN
DROP TABLE AdventureWorks2014.Demo.Person;
END

SET STATISTICS IO ON

SELECT *
INTO [AdventureWorks2014].[Demo].[Person]
FROM [AdventureWorks2014].[Person].[Person]

DBCC TRACEON (652,-1);

CHECKPOINT
DBCC DROPCLEANBUFFERS

SELECT * FROM [AdventureWorks2014].[Demo].[Person]
WHERE BusinessEntityID = 25

You can see here that it shows 3,808 logical reads, but 5 physical reads.

Screen shot of statistics IO showing the number of reads.

I’m sure there is some simple way to force it to do all of the physical reads, but I have yet to figure it out. Or it may be that I’m misunderstanding something and physical reads are the only pages used. But when I look at the execution plan, it says it read all of the rows.

I’d love to get an answer to this puzzle. I’m sure it’s something simple.

Update: Andy G. asked if maybe the issue is I’m not using all the rows. Here I tried a heap of a single row, and I get 1 logical read and 0 physical reads.

Table 'Person'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Push Your Outlook Calendar to Google Calendar with Microsoft Flow

Sometimes you may want your Outlook Calendar events copied to Google Calendar. This can be done with a handful of clicks and Microsoft Flow. Additionally, this is completely free if you have Office 365.

EDIT: Based on some comments, I would like to clarify that this template only works for copying over your calendar events when they are initially created, i.e. inserts. It does not process updates or changes to your calendar events. You will likely have to look into paid software for this functionality. I’ve changed the title from “Sync” to “Push” to reflect this.

What is Microsoft Flow?

Microsoft flow is the third piece of the Microsoft Power Platform:

  1. Power BI – Interactive analytics.
  2. PowerApps – Low-code mobile and web applications.
  3. Flow – User-friendly event integrations.

The Power Platform is a set of tools aimed at business users that want capabilities that were originally limited to professional coders or BI developers.

Out of the three, Microsoft Flow is the weirdest because it’s so granular. The unit of measure for Power BI is the report, the unit of measure for PowerApps is the application, and the unit of measure for Microsoft Flow is the flow or event trigger. And event triggers are really, really tiny.

Essentially, a flow is a trigger and then a series of actions, much like you might map out with a flow chart. It functions similarly to IFTTT or Zapier. I think of it as the glue or connective tissue between different applications.

In this post, we are going to glue together our Outlook Calendar to our Google Calendar.

Why connect calendars?

Back when I worked a normal job, I had two calendars: Office 365 for work and Google for home. Now that I work for myself, that’s a lot more complicated. Sometimes a customer will create an account for me in their network. Sometimes I’ll partner with other consultants and work as part of their team. And of course, I’ve got my own work email at eugene@sqlgene.com.

I need all of these calendars to consolidate to one place. My natural inclination and personal preference is to put it all into Google. Now, there are sync apps available, but this sort of problem is a perfect use case. A calendar event is created in outlook, a flow is triggered, and that information is transferred to Google.

Using Flow

To use Flow, I simply went to https://flow.microsoft.com and searched for Google Calendar. The template search for Flow sorts by popularity, and unsurprisingly the top result was exactly what I wanted.

image

Once I selected the template, then I needed to log into my Google account.
image

Then I just needed to select the calendars from both accounts that I wanted to sync.
image

And that’s it! I was pleasantly surprised how easy it was to do, and I’m interested to see where else I can use Microsoft Flow.

DAX Error: The Expression Refers to Multiple Columns. Multiple Columns Cannot Be Converted to a Scalar Value.

Sometimes, when working with DAX, you might get the following error:

The expression refers to multiple columns. Multiple columns cannot be converted to a scalar value.

This error occurs whenever the DAX engine was expecting a single value, or scalar, and instead received a table of values instead. This is an easy error to make because many DAX functions, such as FILTER, SUMMARIZE and ALL, return table values. There are three situations where this error commonly occurs:

  1. Assigning a table value to a measure or calculated column
  2. Forgetting to use a DAX aggregation
  3. Treating ALL or FILTER as an action, not a function

In the rest of the post, we’ll cover each scenario and how to fix it.

Assigning a table value to a measure or calculated column

Let’s say that you were doing some analysis on the products table in the AdventureWorks sample database. In this case, maybe you want to only look at the black products. So you create a measure with the following code:

BlackProducts = FILTER(Products, Products[Color] = “Black”)

image

One solution to this problem is instead of assigning the code to a measure, which is intended to display a single value, you can create a calculated table instead.

To do so, go to Modeling –> New table in Power BI Desktop. Then ender the same code as before but for the calculated table. Now you will see a table filtered accordingly.
image

Forgetting to use a DAX aggregation

Now, what if we actually did want a single value instead of a table? Let’s say we want to count the number of black products. In that case, we could wrap our code in an aggregation function, such as COUNTROWS which can take in a table and return a single value.

CountOfBlackProducts = COUNTROWS(FILTER(Products, Products[Color] = “Black”))

This code will return the count of all products, but only if they have black as the color.

Treating ALL or FILTER as an action, not a function

Sometimes, people will try to use functions like ALL or FILTER to filter information on the report. By themselves, these functions actually return a table. However, when they are used with CALCULATE and CALCULATETABLE then you can use them to filter your data appropriately.

Want to learn more?

If you want to learn more about DAX, then check out my free learning path and my paid Pluralsight course.

Fumbling in the Dark with DevOps and Automation

In the past, I’ve been skeptical about how much things like PowerShell, Devops and Docker are relevant to me personally. It makes sense if you are writing application code. It makes sense if you are managing hundreds of servers.

But I do Business Intelligence. How do you write unit tests for a report? Why do I need PowerShell when I can just hit Publish on Power BI Desktop? Do I really need Powershell if I manage 3 SQL Servers?

This year, however, there have been a number of events that have been slowly changing my mind:

I don’t know what I’m doing

I’ve talked before about how automation is a relative term. But I’d like to do some true automation, I’d like to make a script like Cody’s where I can spin up a multi-server homelab with SQL Server, Sample databases and client tools all installed.

And right now I have no idea what I’m Doing and I’m fumbling in the dark. I’ve made a github project and I’ve gotten Lability to create the virtual machines. I know I need to learn Desired State Configuration, and I can’t quite get it to work with Lability yet.

And beyond that, I have no idea what I’m doing. And that’s okay. I suspect that this is a pain a lot of people run into with devops and why they put it off. The reason I write this is to remind people is that it’s okay to suck at something.

Image result for adventure time suck

I’ll keep y’all updated as I slowly make progress, fumbling in the dark.

DAX error: A function ‘XXXX’ has been used in a True/False expression that is used as a table filter expression. This is not allowed.

Whenever you start trying to use more complicated filters in the CALCULATE or CALCULATETABLE functions in DAX, you may start to get the following error:

A function 'MAX' has been used in a True/False expression that is used as a table filter expression. This is not allowed.

image

The function in single quotes may vary. Instead of MAX, it could be SUM, MIN, AVERAGE or nearly anything. Sometimes, you may not even be using a function and the error will just say CALCULATE is the problem:

A function 'CALCULATE' has been used in a True/False expression that is used as a table filter expression. This is not allowed.

image

What causes this error?

The error is caused by using a TRUE/FALSE expression, something that evaluates to TRUE or FALSE, to filter the table in a way that CALCULATE or CALCULATETABLE doesn’t support.  So the error is saying you can’t use a boolean comparison to filter your table except in very specific circumstances.

The following comparisons are not supported:

    1. Comparing to a column to a measure. SalesHeader[TerritoryID] = [LargestTerritory]
    2. Comparing a column to a an aggregate value. SalesHeader[TerritoryID] = MAX(TerritoryID[TerritoryID]])
    3. Comparing a column to a What-If parameter. SalesHeader[TerritoryID] =

TerritoryParameter[TerritoryParameter Value]

In fact, you only have three options if you want to filter a column in a CALCULATE/CALCULATETABLE function:

  1. Compare the column to a static value. SalesHeader[TerritoryID] = 6
  2. Use variables to create a static value. VAR LargestTerritory = MAX(SalesHeader[TerritoryID])
  3. Use a FILTER function instead of a true/false expression. FILTER(SalesHeader, SalesHeader[TerritoryID] = [LargestTerritory])

This is because CALCULATE was designed for safety and performance. Complex row based comparisons can dramatically affect performance. So, in order to do more complex comparisons, you have to take the safety feature off and use the FILTER function.

How do I fix it?

In order to fix the issue, wrap your expression in the FILTER function. To use the FILTER function, you need to pass in the table you want to filter, and then a TRUE/FALSE expression to determine which rows get return. So, let’s say we had the following code:

CALCULATE (
    SUM ( SalesHeader[TotalDue] ),
    SalesHeader[TerritoryID] = [LargestTerritory]
)

to use the FILTER function, we would use this:

CALCULATE (
    SUM ( SalesHeader[TotalDue] ),
    FILTER ( ALL ( SalesHeader[TerritoryID] ), SalesHeader[TerritoryID] =    [LargestTerritory] )
)

The ALL function isn’t strictly necessary, but normally when we filter a single column in a CALCULATE function, it will undo any existing filters on that column. We use ALL here to replicate that behavior. In order to understand the specifics better, check out this article at sqlbi.com

Want to learn more about DAX? Check out my free learning path, or my paid Pluralsight course where I cover CALCULATE, FILTER, ALL and more in how to use DAX.

Getting Kubernetes and Containers to “click” for me

Today I had the pleasure of co-hosting the GroupBy Conference. Part of that involved co-hosting as Anthony Nocentino present on Kubernetes. His talk was based on his Pluralsight video on the same topic. After watching his presentation, Kubernetes finally clicked for me. I think I get it.

Before you can get what Kubernetes is about, you need to understand one layer lower and get what containers are about. Aaron Nelson has written a great article on setting up SQL in containers in 5 lines of code. This helped me see how quick and easy it is to spin up a container. Additionally, I see how useful it is to be able to set up a container, kill it and spin up a new one, all in a matter of seconds.

Once you start playing around with containers, you realize you need some way to control and organize them. If you are going to treat them like cattle, not pets, then you need to higher a cattle wrangler. Kubernetes is that cattle wrangler. Or should I call it a kattle wrangler?

I wrote last week about how The Phoenix Project totally altered the way I think about work. It also altered the way I think about deployments and devops. To go fast, to make 10 deploys per day, you need to remove humans as much as possible. You need infrastructure as code. Kubernetes turns your datacenter into code.

I still have some reservations about SQL Server Big Data edition, and I have to wonder when Kubernetes is overkill. But when you need to do dozens of deployments, or blue-green deployments, or implement stateless microservices, it’s a total no-brainer.