Power Query in Power BI plays the role of a data connection technology. It does the data mashup i.e. connect, combine and refine data from many sources to meet the need of our data analysis.
Power Query is available in Excel 2016 or later version of Excel. It can also be added in Excel 2010 as an add-in. It is mainly used for data Extraction-Transformation and Load (ETL) in Excel worksheet or Power BI model.
ETL is something which takes the major portion of time of a data analyst. To ease this task Power Query takes raw data from the source and convert to something more workable form. This form of data is easy to analyze and to draw insights.
Data sources for Power Query
Power Query in Power BI and Excel allows us to extract data from almost any external sources and Excel itself. Here are some examples of the external sources we can bring data from. And there are many more…
After the data has been extracted from the desired source, Power Query helps us clean and prepare the data.
Using Power Query, we can easily append or stack different data tables. We can create relationships by merging different data tables, group and summarize using Pivot feature provided by Power Query.
The beauty of Power Query in Power BI lies in the fact that all this data transformation does not affect the original data set. The data transformation happens in the Power BI memory and we can anytime get back our old data just by removing any particular data transformation step.
Once we have summarized the data extracted from diverse sources, the report can be refreshed with one click. Every time new data added in the source data folder, Power Query helps us to update the report accordingly with this refresh feature.
The M language and structure of Power Query
The M language is at the core of Power Query. It is the same as the F# language, case sensitive and contains code blocks starting with "let"
and "in"
as shown below.
let
<em> variable </em> = <em> expression </em> [,....]
in
<em> variable </em>
These blocks consists prcocedural steps of declaring and defining variables. Power Query is very flexible with physical position of these logical steps. That means we can declare a variable at the begining of coding and then can define at the last.
But such a type of coding with a different logical and physical structure is very tough to debug. So, unless absolutely necessary, we should maintain the same logical and physical structure of Power Query.
Editing the Power Query
Luckily we don’t need to write the Power Query in Power BI from scratch. It is already written in the background when we perform the data transformation steps. If it is needed we can tweak the Power Query to make desired changes.
First of all, we need to open the data transformation window by clicking the “Transform data
” option in Power BI. Then the Power Query can be edited using either the “Advanced Editor
” or editing the code for each “Applied Steps
” of “Query Settings
“.
The image below consists of an example of Power Query where the data is stored in a variable called “source
“. Some other variables are also declared here to store the data with different transformation steps.
The variables can be of any supported type with a unique name. Only if the variable name contains spaces, then the variable must contain a hashtag in the beginning and enclosed with quotes. It is the protocol of declaring Power Query variables.