Wednesday, November 20, 2019

Quick tip: Sharing Power BI Dataflows


Problem

What rights does a user need to access a Power BI dataflow?

Just a quick post on Power BI dataflows after I saw this question on the Power BI Community forum about sharingdataflows - only via member/contributor permission?.
The user on the forum wants to know how to share dataflows with other users with the least privilege possible.
From my testing, it seems you need to provide people Contributor/Member/Admin to the workspace with the dataflow. If you choose the lowest permission Contributor - then they can't edit the dataflow BUT they can take over ownership and change the scheduling ! Not Good. I just want people to be able to connect to the dataflow.Have I missed anything ?
I thought this was just an easy lmgtfy-question, because there are already a lot of sources on dataflows:
Returning to finding the answer to this question, it turned out it wasn't that easy. There even is an open issue on the documentation on GitHub to add this information to the docs. So I ended up testing it myself.

Solution

To access a PowerBI dataflow you need either
  • access to the workspace the dataflow resides in as an Admin, Member or Contributor, or
  • Viewer role access to the workspace

Option 1, edit rights to the workspace, might not be desirable for your use case. But we have one other option: the new Viewer role that was introduced back in June of this year can also be used to share dataflows. I already wrote about the new Viewer role in PBI Desktop earlier.

When you are granted Viewer role access to a workspace where a dataflow is created, you can connect to and use that dataflow in Power BI Desktop:


Final thoughts

On a dataset you also have the ability to give the Build permission to give someone access to create content from that dataset (e.g. reports, dashboards, pinned tiles from Q&A, and Insights Discovery, but also content outside Power BI). This feature is still in preview however at the moment of writing. As the documentation says: Build permission is only relevant for datasets, for now? :-)

There is an idea that sounds like the Build permission on dataflows: Dataflow Permissions to give access to an individual dataflow.
Maybe the Build permission will also be added to dataflows in a later phase?

Monday, November 11, 2019

Power Query and The Curious Case of Keep Top Rows

The Power Query Editor in Power BI has many transformations that can be set from the UI. Keep Top Rows is one of them. I'd like to take you on my journey through this curious case I ran into a couple of weeks back when preparing for my session on Query Folding in Power BI.

Transformations in Power Query

Problem

I was preparing my demo's for my session at dataMinds Connect on Query Folding in Power BI (slides). If you're not familiar with query folding: transformations in Power Query are transformed into the native (data source) language and executed at the source for better performance. For a more thorough overview have a look at the post How Query Folding Works from Matt Allington (B | @ExceleratorBI).

If we look at SQL Server, some transformations can be folded (like Sorting, a simple Group By and Keep Top Rows), others cannot (merging two columns from the UI, using the Buffer() M-function and Keep Top Rows). See what I did there? :) Keep Top Rows on its own is a foldable transformation, it is (obviously) transformed into the TOP operator in SQL.
However, when using Keep Top Rows in combination with some other transformations it (looks like it) is not folding anymore. Your best chance is to use the View Native Query option in Power Query SQL Server Profiler to check if the query is folding or not. More on that later :)
A bit of contemplationI do think the real use case of the Keep Top Rows transformation might be mostly in development environments, where you can limit the number of rows to import for example. I still think the findings were rather odd and worth sharing. So, let's have a look at how it works.

Setup

I used import mode for the Fact.Movement table from the WideWorldImportersDW database. I then added the Keep Top (100) Rows transformation in the Power Query Editor.
Keep Top Rows


As you can see it folds to SQL and all is good for now.

View Native Query with TOP 100






























Adding transformations


Let's add a transformation on the Quantity column and apply an absolute value to it.
Add Absolute value of Quantity

So now the option to View Native Query is gone. Let's see if Power Query is speaking the truth (spoiler: No) 😀
I started SQL Server Profiler and captured the queries running against the database after I Close and Apply my changes. As you can see, now both the TOP operator and ABS function are folded.
SQL Server Profiler: TOP 100 and ABS

Now let's try something different and add a filter on the Date Key. Something strange has happened, because I already filtered the first 100 rows, so only 2 dates remained after that. But when I click on the Calculated Absolute Value query step I suddenly have a lot more dates that shouldn't be there in my opinion.
Filter on Date Key


Let's add a field from the Customer dimension this time and see what happens.
Expand Customer Dimension

The query for Fact.Movement is still the same, so the TOP operator and ABS function are still folding. And this is what is sent to the database for the Customer dimension:
SQL Server Profiler: Customer Dimension with WHERE clause

The Curious Case

Now watch closely to the where clause: so it actually already filters out the Customer Key based on the steps performed earlier in the Query Editor. But this Customer dimension is a separate query that returns 1 row, which will be joined to the other query inside Power Query.
Back to that WHERE clause: that Customer Key = 0 is valid for my (broken) Calculated Absolute Value query step, not for the Kept First 100 Rows as you see in the picture above. So I actually get the wrong a different result back into Power Query than I would expect.
I see you're thinking: "But you didn't add an explicit order, right?" So the top 100 is a little arbitrary. That's true, so let's add that order by.
Add Sorted Rows

This is a slightly better result in my opinion. Because now the Power Query steps are in sync with each other. The issue with View Native Query still remains the same by the way. It stops working at the Calculated Absolute Value query step, but the query send to the database is the following, so including the ABS function:

View Native Query after Sort

Recap

I showed several variations in the Power Query Editor of using Keep Top Rows and showed that adding subsequent steps brakes the View Native Query option in the editor, although query folding sometimes does take place. And adding an order to your query before applying a TOP operator is always a good idea, not only in SQL.

The steps I used might not all be best practices when creating a data model in Power BI. In case of the join to the customer dimension for example, it might be better to model it as a separate dimension, because STAR SCHEMA ALL THE THINGS. But I used those steps to highlight the problem.

I already mentioned this 
bug feature to Patrick and Adam (members of the Power BI CAT team) from Guy in a Cube (B | @GuyInACube) when I was at the Precon at Data Saturday Holland. Patrick LeBlanc (B | @patrickdba) also hadn't seen this before and hopeully will take it up with one of his team members. I thought I'd make a write up for him to easily reproduce it.

Featured Post

My DataGrillen Adventure: Speaking, Connecting, and New Friendships

I just got back from an incredible trip to DataGrillen, and I can’t wait to share my experiences with you. If you haven’t heard of it,  Data...