How to add data from website to Power BI desktop

Add data from website to Power BI desktop

Webpages are a rich source of data. Whether structured or unstructured, this data is very useful and can provide good insights.

Power BI has recently enhanced its existing feature for extracting data from the web. The feature was already compelling, and with the recent enhancement it has become even more powerful.

In this article I will discuss this feature in detail with a practical example.

A practical example of web scraping with Power BI desktop

I have a data set with information on state-wise crop production in India. The data was collected from data.world. I have discussed this data and how I analyzed it with Power BI desktop in this article.

My purpose now was to analyze this crop production data in the context of India’s economic growth. A country’s GDP (Gross Domestic Product) and a state’s GSDP (Gross State Domestic Product) are very good indicators of economic growth.

So we need to collect this data in order to correlate the state-wise crop production with the GDP and GSDP of the corresponding state.

But the problem is that such data is not readily available. So…

Is web scraping an alternative option?

In this scenario, web scraping is generally the only solution.

You can read this article to learn how to write web scrapers with Python to collect the necessary information.

But writing a web scraper with Python needs coding knowledge. It is not feasible for a person with no background in software development or in at least one programming language (preferably Python).

Add data from website to Power BI desktop

Here comes Power BI desktop with its immensely powerful feature to add data directly from a website. It is really a boon for data analysts and data scientists who want this data import process to be smooth.

It does not require any software development background. Anyone with no knowledge of any programming language at all (really!) can use it.

So without any further ado, let's jump in and see how you can do the same.

The data source

I will use a well-known website, Wikipedia, as the source of open data. If you simply search Google with the query “Indian states and union territories with their GDP”, the first result you get is from Wikipedia.

Google search result for Indian GDP

Importing the data from website to Power BI desktop

Power BI allows importing data from numerous sources, as I have mentioned in the introductory article on Power BI. One of these sources is “Web”.

As you can see in the figure below, you need to select “Web” from the “Get data” option under the “Home” tab. A new window will then open where the URL is to be provided.

Importing data using URL of particular website

If you have already imported data from a website once, the address gets stored in “Recent sources”. This helps you quickly import the data again whenever you need it.

"Recent source" of previously used website URL

As you provide the URL, it will first establish a connection to fetch the data. Next, a “Navigator” will open to show you a preview of the data.

See the screenshot below of the Power BI app on my computer. On the left side, all the tables from the web page are listed. You can click the table you are interested in, and its preview will be displayed on the right-hand side.

Navigator and table view
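Power BI needs no code for this step. Purely as an illustration of what the Navigator does when it lists the tables of a page, here is a minimal Python sketch using only the standard library. The class name `TableExtractor` and the sample HTML are my own assumptions, and the sketch does not handle nested tables or download a real page.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a page as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables
        self._table = None  # table currently being read
        self._row = None    # row currently being read
        self._cell = None   # text pieces of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text is kept only while we are inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = None
            self._row = None


# Made-up page fragment; a real page would have to be downloaded first.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>GSDP</th></tr>
  <tr><td>State A</td><td>100</td></tr>
  <tr><td>State B</td><td>250</td></tr>
</table>
"""

parser = TableExtractor()
parser.feed(SAMPLE_HTML)
tables = parser.tables  # one entry per table found on the page
```

Each entry of `tables` corresponds to one item in the Navigator's left pane, and its rows are what the preview on the right would show.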

Similarly, if you want to get a glimpse of the web view, just click the “Web view“. A web view of the page will be displayed as shown below.

Web view of the page

Transform/load data to Power BI desktop

Now if you are satisfied and have found the information you are looking for, proceed by clicking “Load” or “Transform”. I would suggest going for the transform option, as it enables you to make the necessary changes to the data.

The figure below shows the data before loading it into Power BI. With the help of Power Query, I have made minor changes like renaming columns, handling blank rows, replacing “Null” values, and adding necessary columns.

I have already described all these operations in the data transformation steps here.

Window for data transformation
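The clean-up steps mentioned above are done with clicks in Power Query. As a rough sketch of the same logic in Python, the hypothetical helper `clean_rows` below renames columns, drops fully blank rows and fills missing values. The column names and numbers are made up for illustration.

```python
def clean_rows(rows, renames, null_default):
    """Rename columns, drop fully blank rows, and fill missing values."""
    cleaned = []
    for row in rows:
        # Skip rows in which every value is missing or empty.
        if all(value in (None, "") for value in row.values()):
            continue
        new_row = {}
        for column, value in row.items():
            name = renames.get(column, column)
            new_row[name] = null_default if value in (None, "") else value
        cleaned.append(new_row)
    return cleaned


# Made-up rows standing in for a table imported from a web page.
raw = [
    {"State/UT": "State A", "GSDP": 100},
    {"State/UT": None, "GSDP": None},
    {"State/UT": "State B", "GSDP": None},
]
result = clean_rows(raw, renames={"State/UT": "State_Name"}, null_default=0)
```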

Once you are satisfied with the table you created, you can load it in Power BI for further processing.

Add table using examples

Another very useful feature is “Add table using examples”. You can see this option at the bottom-left corner of the window in the screenshot below.

Add table using example

This option is very helpful when the tables Power BI detects automatically do not serve your purpose.

Suppose that for the above web page you can see almost all the structured data in table form in the left pane, but you are looking for some information that is scattered across the page and not in table form.

In that case, if you click “Add table using examples”, you will be provided with a blank table along with the web view, as shown in the above image.

When you click a cell of the table, Power BI suggests several values from the page that you can choose from to fill it. As shown in the above image, information with no table structure is offered to populate the table.

You can also add more columns by clicking the column header marked “*”. Column names can be changed here or later at the data transformation stage.

Final words

I hope this article helps you collect the required information from a web page using this feature of Power BI. It is very simple to use; you only need to be aware of its existence.

My purpose was to provide a practical example with real-world data to make you familiar with this feature, and also to document all the steps for my own future reference.

For data analysts and those who just want to get their desired data, writing web scrapers is mostly a waste of time. I have written several web scrapers myself. They certainly have benefits and can get you very specific data from several webpages.

But if your data is not scattered across multiple webpages and can be fetched from a specific URL, Power BI is your best friend. It will save you a lot of time collecting data, and you can jump straight to the main task of data analysis.

If you find the article helpful, please let me know by commenting below. Also, if you have any questions regarding the topic, I would love to discuss them with you.

How to join tables in power bi desktop: a practical example

Join tables in Power BI desktop

Joining tables is an important way to combine information from several tables. We can join tables in Power BI desktop with its very nifty merge feature.

In this article, I am going to demonstrate how to join tables in Power BI desktop with some practical data. The data are all open source. You can collect them from the links provided and use them to practice.

Combining data

When it comes to combining data in tables, it can be done in two ways. One is adding new rows of data to a table. This type of combination is known as “Appending”.

Whereas when you add columns with new information to an existing table, it is called “Merging”.

Power BI provides both of these features under the “Home” tab (as in the figure below). Use them according to your requirement.

Merge and Append queries
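To make the difference concrete, here is a small Python sketch of the two operations on made-up rows. The district names and values are hypothetical; in Power BI both operations are done from the ribbon, not with code.

```python
# Two small tables with the same columns (made-up values).
crops_2019 = [{"District": "D1", "Area_ha": 120}]
crops_2020 = [{"District": "D2", "Area_ha": 80}]

# Append: same columns, more rows.
appended = crops_2019 + crops_2020

# Merge: same rows, more columns, matched on a key column.
rainfall = {"D1": 900, "D2": 650}  # District -> rainfall in mm
merged = [
    dict(row, Rainfall_mm=rainfall.get(row["District"]))
    for row in appended
]
```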

This particular article is on joining tables, so we are going to discuss only the Merge queries option here, with a suitable example.

Let's first discuss the different kinds of joins. This overview will help you make the right decision when selecting the join type while merging queries. I also suggest going through the Microsoft documentation page for details.

The data set

The data set I have used for demonstration is India’s state-wise crop production, collected from data.world, along with another data set on India’s state-wise rainfall for different years.

Both data sets offer a real-world experience. The data was collected in raw form and refined using the data transformation features of Power BI. You can go through all the data transformation steps here.

The measures used for different calculations are described in this article.

The first table, the crop production table, contains the area under different crops in hectares (ha) for different districts of different Indian states. The second table, the rainfall table, has the rainfall record in mm for different districts of Indian states.

Replacing/removing errors

While importing data from a CSV file or from the web, you may face missing or malformed values. In Power BI, values that cannot be converted to the column's data type are shown as “Error”.

You cannot proceed without handling these errors. One way is to replace them with suitable values, such as the mean or median of the remaining values of that variable. Alternatively, you can simply remove the rows with missing values.

In the screenshot below you can see how I have replaced the error with a suitable descriptive value. The “Remove errors” option will simply remove the corresponding rows.
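The two strategies, replacing or removing, can be sketched in a few lines of Python. Here `None` stands in for a value Power BI would flag, and the rainfall numbers are made up.

```python
from statistics import mean


def replace_missing_with_mean(values):
    """Fill each missing value with the mean of the known values."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]


def remove_missing(values):
    """Drop missing values, like removing the affected rows."""
    return [v for v in values if v is not None]


rainfall_mm = [800, None, 1000, 600]  # made-up values
filled = replace_missing_with_mean(rainfall_mm)
trimmed = remove_missing(rainfall_mm)
```

Replacing keeps the row count intact, which matters if the table is joined with another one later. Removing is simpler but loses the other columns of those rows.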

Now suppose we want information on both the cropped area and the rainfall of particular states and districts. Then we need to join the tables with proper conditions.

Different kinds of join tables in power bi desktop

Power BI will ask you which join type you want to apply. To choose the correct join, we should understand the different kinds of joins. So here is a description of each, with examples.

For demonstration purposes I have picked a few rows from both tables and created two small tables.

Left outer join

It is the most common join and the default one in Power BI: the left or first table (as in the figure below) retains all of its rows, along with the matching rows from the right or second table. As the “Merge queries” option of Power BI puts it, “All from first, matching from second”.

Suppose we want to know the rainfall of some particular districts with a definite amount of cropped area. How will you join the two tables to fetch the information? The left outer join is here to help you. See the figure below.

Left outer join

In the above figure, the information we are interested in is coloured green. The corresponding information has the same colour code in the second table.

So, as per the rule of the left outer join, all the rows (yellow) of the left table and the green rows of the right table have been joined in the new table.

The Venn diagram at the bottom-right corner describes the joining process in colour codes. In this diagram, the matching rows form the intersection of the two sets. So the new table consists of the complete left set plus the intersection of the sets.
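The rule “all from first, matching from second” can be written down as a short Python sketch. The function `left_outer_join` and the sample rows are my own illustration, and the sketch assumes each key appears at most once in the right table.

```python
crop_area = [
    {"District": "D1", "Area_ha": 120},
    {"District": "D2", "Area_ha": 80},
    {"District": "D3", "Area_ha": 45},
]
rainfall = [
    {"District": "D2", "Rainfall_mm": 650},
    {"District": "D3", "Rainfall_mm": 900},
    {"District": "D4", "Rainfall_mm": 400},
]


def left_outer_join(left, right, key):
    """All rows from `left`, plus the matching columns from `right`."""
    right_by_key = {row[key]: row for row in right}
    right_columns = [c for c in right[0] if c != key]
    joined = []
    for row in left:
        match = right_by_key.get(row[key])
        # Rows without a match keep their place but get empty values.
        extra = {c: (match[c] if match else None) for c in right_columns}
        joined.append({**row, **extra})
    return joined


result = left_outer_join(crop_area, rainfall, "District")
```

Calling the same function with the tables swapped, `left_outer_join(rainfall, crop_area, "District")`, gives the right outer join of the original pair.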

Right outer join

Now suppose we want exactly the opposite of what we fetched earlier: this time we want to know the cropped area of districts having particular rainfall. Here we need to perform the right outer join.

Right outer join of tables in Power BI desktop

Full outer join

When we need information on all the states and districts with their cropped area and rainfall, we should go for a full outer join. As the name suggests, this join returns all the records from both tables, whether they match or not.

Full outer join of tables in Power BI desktop

Here is the final output from the full outer join. The rows contain both the rainfall and cropped area information including the matching rows (in green colour).
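As a hedged sketch of the same idea in Python, the hypothetical function `full_outer_join` below keeps every key from both tables and leaves the missing side empty. The rows are made up, and keys are assumed to be unique within each table.

```python
def full_outer_join(left, right, key):
    """All rows from both tables; unmatched sides are filled with None."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    # Keys in first-seen order: left table first, then unmatched right keys.
    keys = list(left_by_key) + [k for k in right_by_key if k not in left_by_key]
    # Collect every column name that occurs in either table.
    columns = []
    for row in left + right:
        for column in row:
            if column not in columns:
                columns.append(column)
    joined = []
    for k in keys:
        combined = {**left_by_key.get(k, {}), **right_by_key.get(k, {})}
        joined.append({c: combined.get(c) for c in columns})
    return joined


crop_area = [
    {"District": "D1", "Area_ha": 120},
    {"District": "D2", "Area_ha": 80},
]
rainfall = [
    {"District": "D2", "Rainfall_mm": 650},
    {"District": "D3", "Rainfall_mm": 900},
]
result = full_outer_join(crop_area, rainfall, "District")
```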

Inner join

Again, suppose we need the rainfall only for those districts that have cropped area data, since we have no use for rainfall data from districts without any cropped area information. In this situation, the inner join produces the desired result.

See the figure below, where we have applied an inner join to the two tables.

Inner join of tables in Power BI desktop

As the above image shows, only the matching rows are kept and all other rows have been excluded.
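In code, an inner join is the shortest of all the joins. The following Python sketch uses made-up rows and a hypothetical function name, and again assumes unique keys in the right table.

```python
def inner_join(left, right, key):
    """Only the rows whose key is present in both tables."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**row, **right_by_key[row[key]]}
        for row in left
        if row[key] in right_by_key
    ]


crop_area = [
    {"District": "D1", "Area_ha": 120},
    {"District": "D2", "Area_ha": 80},
]
rainfall = [
    {"District": "D2", "Rainfall_mm": 650},
    {"District": "D3", "Rainfall_mm": 900},
]
result = inner_join(crop_area, rainfall, "District")
```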

Left anti join

Suppose, for the sake of data analysis, we need only those states and respective districts for which we don’t have any rainfall data. How can we fetch the required information?

Not to worry: the join type we need here is the left anti join, which keeps only the rows of the left table that have no match in the right table. See the example below.

Left anti join of tables in Power BI

In the above image we have just the required information from the left table of state-wise cropped area, and everything else has been excluded.

Right anti join

Now suppose we need exactly the opposite, i.e. only the rainfall data of those districts for which we don’t have any information on the cropped area. The join we apply here is the right anti join.

See the below demonstration with the two tables.

Right anti join of tables in Power BI

In the above figure, the crop production table has been joined with the rainfall table using the right anti join. You can see that only the rows of the rainfall table without a match in the crop production table have been retained.
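Both anti joins follow the same logic with the roles of the tables swapped, which a small Python sketch makes clear. The function `anti_join` and the rows are hypothetical.

```python
def anti_join(keep, other, key):
    """Rows of `keep` whose key does not occur in `other`."""
    other_keys = {row[key] for row in other}
    return [row for row in keep if row[key] not in other_keys]


crop_area = [
    {"District": "D1", "Area_ha": 120},
    {"District": "D2", "Area_ha": 80},
]
rainfall = [
    {"District": "D2", "Rainfall_mm": 650},
    {"District": "D3", "Rainfall_mm": 900},
]

# Left anti: districts with cropped area but no rainfall record.
left_anti = anti_join(crop_area, rainfall, "District")
# Right anti: districts with a rainfall record but no cropped area.
right_anti = anti_join(rainfall, crop_area, "District")
```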

Join tables in Power BI desktop

Now let's see a practical application of joining tables in Power BI desktop.

When you click the Merge queries option of Power BI desktop, you will see the first table as the active table (the table that was selected when you clicked Merge queries).

You need to select a column from this table with unique values that can act as a key. Then a drop-down allows you to choose the second table from the available tables. Again you have to select a column from the second table on which the two tables can be joined.

Then select the join type from the drop-down, as shown below. The join options are available in the same sequence as we discussed above.

Join tables options in Power BI desktop

For example here in the above image, you can see I have selected column “State_name” from the table “India_statewise_crop_production” and the column “SUBDIVISION” from the second table “rainfall_India” as the key columns for joining.

Our purpose was to keep all the crop production information with the matching rainfall data of the corresponding districts. So I opted for “Left outer join”, which is also the default option. You can choose any of them as per your requirement.

Now simply click OK and watch the merge run.

Selecting rows from the joined table

The newly created column will now appear in the table. You can see in the image below that the whole second table is displayed as a single column. You need to select the particular columns you want and deselect the option “Use original column name as prefix”.

Selecting particular column from merged table

As the above image shows, by default the complete table appears as a row element in the first table. You need to select the particular columns of that table you want here.
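The expand step can be pictured as flattening a nested table. In the Python sketch below, the helper `expand_column`, the column “ANNUAL” and the values are hypothetical. The prefixed naming (table name, a dot, then the column name) mirrors what I understand Power Query produces when the prefix option stays selected.

```python
def expand_column(rows, nested_column, wanted, use_prefix=False):
    """Replace a column holding nested rows with the selected sub-columns."""
    expanded = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != nested_column}
        # A row without matches still yields one output row with empty values.
        nested_rows = row[nested_column] or [{}]
        for nested in nested_rows:
            new_row = dict(base)
            for column in wanted:
                name = f"{nested_column}.{column}" if use_prefix else column
                new_row[name] = nested.get(column)
            expanded.append(new_row)
    return expanded


# Made-up result of a merge: each row carries the matching rainfall rows.
merged = [
    {"State_Name": "State A",
     "rainfall_India": [{"SUBDIVISION": "State A", "ANNUAL": 1200}]},
    {"State_Name": "State B", "rainfall_India": []},
]
flat = expand_column(merged, "rainfall_India", ["ANNUAL"])
prefixed = expand_column(merged, "rainfall_India", ["ANNUAL"], use_prefix=True)
```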

Use of advanced editor

The advanced editor of Power BI gives more flexibility: you can change the M script itself to make the desired changes in the output. The M language is at the core of every Power Query transformation in Power BI.

As the image below shows, the “Advanced Editor” option under the “Home” tab shows the M script behind the current query. You can tweak it to change the join type, column names, etc.

Editing M script for Joining tables in Power BI desktop

Final words

So, that is all about joining tables in Power BI desktop. I have tried to cover the basics of each kind of join with examples, so that you can understand the logic behind the joins and apply the right kind for your need.

Joining tables and fetching the exact information is at the core of data modelling. A good Power BI report needs good data modelling behind its visuals.

I hope this article helps you get a good grip on this very fundamental operation of Power BI. If you have any queries or doubts, please comment below. I would like to answer them.

How to create a map visualization in Power BI with example

Map visualization in Power BI

Map visualization in Power BI is a very useful feature to show location-wise values. In this article, I will demonstrate how to use this feature with a practical example.

I have used maps a lot with ArcGIS to create boundary maps and display locations as supplementary information. It certainly requires expertise and considerable time.

Being a data analytics guy, I always wondered whether there was a tool that would make the mapping task easier so that I could focus more on the analysis part. And here comes the Map feature of Power BI to make my life easier.

This feature has been well described by Microsoft in its Power BI help documentation, so there is no point in repeating the same thing here. Rather, I will discuss the map feature with some real-world data so that you can apply the same to yours.

I will also discuss some common problems a Power BI user may face while using this feature for the first time (which I have also faced). There are certain things you need to keep in mind while using maps in Power BI:

  • The data you want to map needs to be correctly geocoded. Otherwise, it will cause ambiguity while mapping the locations.
  • The purpose of map visualization is to show geospatial information, such as values by location or the distance between locations. So, your project should have variables assigned to locations.
  • To remove any ambiguity regarding locations, you may have to provide the latitude and longitude of the locations in decimal values.
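If your coordinates are recorded in degrees, minutes and seconds, they have to be converted to decimal degrees first. A minimal Python sketch of the conversion follows; the function name and the sample coordinates are illustrative only.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative by convention.
    return -decimal if hemisphere in ("S", "W") else decimal


latitude = dms_to_decimal(28, 36, 0, "N")   # about 28.6
longitude = dms_to_decimal(77, 12, 0, "E")  # about 77.2
```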

Types of Map visualization in Power BI

Power BI offers three types of Map visualizations.

  • The simple Map option creates a bubble map. The size of each bubble varies with the variable set in the Size field.
  • The Filled map option shades the area of each location.
  • The third option creates a map using the ArcGIS tool.

All these options are available in Power BI visualization pane as shown in the below image.

Different Map visualizations of Power BI

The Map and Filled map visuals are powered by Bing, and the ArcGIS visual relies on Esri's online services. So you need an active internet connection to use these features. The maps are created dynamically over the internet.

Power BI is quite flexible about location values: it accepts a city name, a PIN code, or a specific latitude and longitude.

Many a time you may face an error from Power BI due to some ambiguity in the location name. For example, a state and a district may have the same name (a good instance is Washington in the US, which is the name of a state and also of the capital, Washington DC).

In such cases Power BI cannot decide by itself how to mark the location you want, so you need to tell Power BI whether it is a state name or a district name.

You can do that either by editing the data in Power Query or by providing the specific latitude and longitude of the location.
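One simple way to remove such ambiguity at the data level is to build a fully qualified location string, which in Power Query would be a custom column. The Python sketch below only shows the idea; the function `qualified_location` and the place names are hypothetical.

```python
def qualified_location(district, state, country="India"):
    """Join the non-empty parts into one unambiguous location string."""
    parts = [p.strip() for p in (district, state, country) if p and p.strip()]
    return ", ".join(parts)


full_name = qualified_location("District X", "State Y")
state_only = qualified_location("", "State Y")
```

A string like “District X, State Y, India” leaves the geocoder far less room to pick the wrong place than “District X” alone.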

Practical example: map visualization of crop production in India

Without much ado, let's dive into the example where I tried this Map option to create some beautiful and informative maps.

The data I have used here contains the crop production statistics of different states of India. I collected the data from the data.world website. It is in a very refined state, so you don’t need to do any serious transformation.

Just some removal of blank values and filtering, as I have shown in this article, is enough to start with. I have also made custom keys from the main table for the purpose of data modelling (here is the process).

The data set contains the following columns:

  • “State_Name” contains the different states of India
  • “District_Name” contains the districts of each state
  • “Crop_Year” has the year of production
  • “Season” has the particular season of crop production
  • “Crop” is the particular crop name
  • “Area_ha” contains the total area under the crop in hectares
  • “Production_tonnes” has the total production of the season in tonnes

Creating the Bubble map

This is the most interesting Map visualization offered by Power BI, and we will create this map first. Bubble maps are used very frequently in data analytics. The size of a bubble helps us compare the magnitude of a parameter across different locations.

First we need to click the Map icon in the visualization pane, as shown in the first image.

Then, as shown in the figure below, I have selected “India_states” for the Location field and “Crop_productivity” for the Size field. The purpose is to show the crop productivity of different states on the map of India.

Selecting location and size for bubble in map visualization

So here is the report output. See the screenshot below of my Power BI desktop report page. The different states of India have been located with bubbles. The bubbles are of different sizes, proportional to the values of crop productivity.

You can see how easy it has become to visually compare the crop productivity of different states of India, which is otherwise a tough call in a table.

The bubble map with different state wise crop productivity

Changing the fields to add more information in the map visualization

Let's make a few changes to the fields of the map we created and try to add some more information. Suppose we need district-level information. We also want the map to show the different states separately.

To do that we can use the “Legend” field, which we left blank earlier. See that we have added “India_states” in that field. Also, this time I have replaced “Crop_productivity” with “Area_ha”.

Here I want to mention one important point. You should drag each field from the right pane to the drop-down fields on the left instead of selecting it with a left click (see the figure below).

If you select a field with a click, Power BI assigns it to one of those fields automatically, which may not be what you want.

Selecting the fields

See the result below. All the states are shown in different colour shades. Districts within states have bubbles with size proportional to the cropped area. Such a map can provide an overview of the crop production scenario at a glance.

You can see the “Bing” icon at the extreme bottom-left corner. It indicates that these map functionalities are provided by Microsoft Bing. So you need to be online to create the maps.

Formatting the Map visualization in Power BI

The basic map feature has several formatting options. All of them are available in the visualization pane under the brush icon, as shown in the figure below.

Format option for basic map visualization

With the help of this format option, you can change the category, data colors, bubble size, map styles, map controls and many other visual aspects of the map you created.

You can also change the theme of your map if you are using the Power BI online service. By default the “Road” theme is applied. The other themes available are Aerial, Dark, Light and Grayscale. Power BI desktop does not have this option.

Filled map

This is another basic map visualization option, where the territory of each location is displayed as a filled shape. For example, if we apply the same fields as we set for the Map option, it looks like the image below.

To create this map I just selected the existing map and clicked the “Filled map” icon in the visualization pane. The old map was replaced by this new filled map in the report.

Filled map visualization in Power BI

Here we can see that the selected states are simply filled with different shades, as shown in the legend. The same format options are available for the filled map too. If you want to vary its appearance from the default, you can apply them.

ArcGIS map visualization in Power BI

The third and last map visualization option available in Power BI is the ArcGIS map. The process of creating such a map is the same. You just need to click the ArcGIS option while the already created filled map is selected.

In the image below, the same map with the same options has been converted to an ArcGIS map.

ArcGIS map visualization in Power BI

An ArcGIS map has some advanced features which the other map visualizations do not offer:

  • Clustering of points that adjusts as you zoom in and out of the map
  • Advanced options like drive time and distance radii
  • Heatmap visualization
  • Reference layers from ArcGIS online repositories
  • Some infographics options available by default in the report

These advanced features come with some limitations:

  • This map visualization option is not available for Power BI Report Server
  • If you publish to the web or embed the report, this map is not visible
  • The custom shape options are available only if the maps are added to ArcGIS Online and publicly shared

Final words

So, we have covered all the map visualization options available in Power BI, using a practical data set to demonstrate each of them.

I have kept this as simple as possible and discussed all the important functions so that you do not need to waste time. Otherwise, first-time users often lose a lot of valuable time looking for the correct steps.

I hope you find the article helpful. I would appreciate any questions or suggestions on the topic in the comments below. I would like to answer them and improve the content further.

How to use Measures in Power BI? a practical example

Measures in Power BI

Measures in Power BI are really a beautiful feature. They are fast to calculate and have the benefit of reusability. Measures can be applied to multiple tables. We create measures to obtain counts, averages, sums, rankings, percentiles, year-to-date aggregations and many more handy calculations.

Measures are calculated dynamically. And importantly, they carry over to other applications like MS Excel through the report format. Good data modelling and creating useful measures are among the core skills of Power BI Pro users.

So, in a nutshell, measures are one of the most important features of Power BI that you must know and use. In this article, I am going to discuss how to use them with a simple application using real-world data.

I would suggest going through this article for in-depth knowledge of Power BI measures. It is from Microsoft, the creator of Power BI. This article will give you an overview and the necessary background on measures to apply the steps described here.


The data set used

The data set used here is the same one used to demonstrate the data modelling process in Power BI. For first-time readers, a brief description of the data is given below.

The data I used here is crop production data of different states of India. It is accessible at data.world. This is real-world data and is free of garbage values.

The columns are:

  • “State_Name” contains the different states of India
  • “District_Name” contains the districts of each state
  • “Crop_Year” has the year of production
  • “Season” has the particular season of crop production
  • “Crop” is the particular crop name
  • “Area_ha” contains the total area under the crop in hectares
  • “Production_tonnes” has the total production of the season in tonnes

Use of DAX (Data Analysis Expressions) to create Measures

We use Data Analysis Expressions (DAX) in Power BI to create measures. DAX is also used in Power Pivot for Excel, and many of its functions resemble Excel formulas. The key difference is that Excel formulas work on cells and ranges, whereas DAX works on tables and columns of the data model. DAX calculations are also very fast.

Implicit and explicit measures: which one you should use?

Measures are of two kinds: implicit measures, which are calculated by default, and explicit measures. Both can yield the same result.

But there are subtle differences between the calculation approaches of the two, which you should know and use accordingly.

Implicit measures are easy to create, and you need not write any DAX expressions for them. But they have their own disadvantage: implicit measures are not reusable.

On the other hand, explicit measures require some knowledge of writing DAX expressions to achieve the desired result. But they can be reused. You can also make small changes to a DAX expression you wrote to use it for other calculations.

An example of the limitations of implicit measures is their inability to handle division by 0 (zero). If a calculation involves a division and your data contains zeros in the denominator, implicit measures will give garbage values.

Such a situation is handled well by explicit measures, which throw a tiny error displaying “NaN”, i.e. “Not a Number”. In this way, you can understand the possible problem and correct it.

New Measure and Quick Measure

These are two other options provided by Power BI when you need to create measures. Again, creating a Quick Measure is easy and does not require writing DAX expressions.

Creating a New Measure, in contrast, needs a few lines of DAX. But it has some advantages over Quick Measures. I will discuss them with an example in a bit.

The figure below displays the options for creating “New measure”, “Quick measure”, “New column” and “New table”.

Different measures option in Power BI

In the case of “Quick measure”, a wizard opens which helps us define the calculation. Again, it is easy because we don’t have to bother about writing DAX, but at the cost of losing the flexibility and reusability that DAX offers.

Common mistake while creating new measure

Here I would like to mention a very common mistake made while creating your first ever New measure. It may sound silly, but it relates to a very basic concept of measures in Power BI.

The role of a measure in Power BI is to apply aggregate functions to the column values in the report. So it is important to remember that measures cannot be created from single-row values.

See the error thrown by Power BI in the figure below when I tried to create a new measure using single-row values. My intention was to calculate productivity by dividing crop production by the area.

Common mistake while creating measures for the first time

The error clearly told me that a “single value” cannot be determined. We need to create a “New column” for that purpose.
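The distinction between a row-level column and an aggregating measure can be sketched in Python with made-up numbers. This only mirrors the idea; it is not how Power BI evaluates DAX internally.

```python
# Made-up rows from the crop production table.
rows = [
    {"Area_ha": 100, "Production_tonnes": 250},
    {"Area_ha": 300, "Production_tonnes": 450},
]

# Calculated column: evaluated once per row, using that row's own values.
productivity_column = [r["Production_tonnes"] / r["Area_ha"] for r in rows]


def productivity_measure(visible_rows):
    """Measure: aggregate each column first, then combine the aggregates."""
    total_production = sum(r["Production_tonnes"] for r in visible_rows)
    total_area = sum(r["Area_ha"] for r in visible_rows)
    return total_production / total_area


overall = productivity_measure(rows)
```

A measure has no “current row” to read from, which is why a bare column reference in a measure cannot be resolved to a single value.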

Creating a “New column”

To calculate the productivity from the crop production and area values, we will create a “New column” here. Simply click the “New column” option. A formula bar will appear where we need to write the DAX for the calculation.

Creating "New column"

Now see the figure below, where the DAX has been written for the crop productivity calculation. It is important to note that I have provided an alternate value of 0 at the end of the expression. It avoids the problem that arises when dividing by 0.

Creating new column with crop productivity
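An alternate value at the end of a division behaves like the optional third argument of the DAX DIVIDE function. That the formula in the figure uses DIVIDE is my assumption here. The Python sketch below shows the same behaviour with a hypothetical `divide` helper.

```python
def divide(numerator, denominator, alternate_result=0):
    """Return numerator / denominator, or the alternate result for a zero denominator."""
    if denominator == 0:
        return alternate_result
    return numerator / denominator


normal = divide(250, 100)   # regular division
fallback = divide(250, 0)   # zero area, so the alternate value is returned
```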

Creating “New measure”: an Explicit measure

When we create a new measure by clicking the “New measure” option, a formula bar appears in the window. See the figure below. The “Fields” pane on the right also starts to display the measure with a calculator icon. By default, a new measure has the name “Measure”.

Creating a "New measure"

Now we have written a DAX expression for calculating the average crop production. It’s very simple; remember to select the relevant table while writing the DAX. Power BI will then help you select the right column by prompting with the columns of the selected table.

Creating a new measure

As I have already said, explicit measures are reusable: one explicit measure can be used to create another explicit measure.

Application of an Explicit measure

Use of explicit measure
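Reusing one explicit measure inside another is much like calling one function from another. Here is a minimal Python sketch of that idea; the function names and numbers are made up and do not reproduce the DAX in the figures.

```python
def total_production(rows):
    """First 'measure': sum of production over the visible rows."""
    return sum(r["Production_tonnes"] for r in rows)


def row_count(rows):
    """Second 'measure': number of visible rows."""
    return len(rows)


def average_production(rows):
    """Third 'measure', built from the two above instead of repeating them."""
    return total_production(rows) / row_count(rows) if rows else None


rows = [{"Production_tonnes": 200}, {"Production_tonnes": 400}]
average = average_production(rows)
```

If the definition of `total_production` changes later, every measure built on it picks up the change, which is the practical benefit of reusability.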

Implicit measures

Here we will see how a measure can be created implicitly. It is an easier option compared to explicit measures. But it is always advisable to go for an explicit measure for the several advantages I have discussed before.

In the figure below you can see that a bar plot has been created for the Area_ha column. By default the sum is selected. You can change to the other options available in the figure.

Each of these options creates an implicit measure. You will not find them in the Fields list, which means that implicit measures are not reusable. Every time you want to do the same calculation, you have to go through the same process.

Creating an implicit measure

The bar plot takes only the measure “Area_ha”, which gives the sum of the cropped area from all the states. So, it gives a single bar for a single value. You can drill down further to any level by adding the corresponding fields to the visual.

For example, in the image below the cropped area is displayed district-wise.

Application of implicit measure
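What the visual does here is a plain sum, first over everything and then per district. A short Python sketch with made-up rows shows both steps; `sum_by` is a hypothetical helper.

```python
from collections import defaultdict


def sum_by(rows, group_column, value_column):
    """Sum `value_column` separately for each value of `group_column`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_column]] += row[value_column]
    return dict(totals)


rows = [
    {"District_Name": "D1", "Area_ha": 100},
    {"District_Name": "D1", "Area_ha": 50},
    {"District_Name": "D2", "Area_ha": 80},
]

# Single bar: the sum over all rows.
total_area = sum(r["Area_ha"] for r in rows)
# One bar per district after drilling down.
area_by_district = sum_by(rows, "District_Name", "Area_ha")
```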

Final words

In this article, we have mainly focused on creating measures, both implicit and explicit. Both have been shown using practical examples.

Having discussed the merits and demerits of both, it is clear that explicit measures have the edge. Implicit measures are easy to apply, which is why new Power BI users are prone to use them.

In the long run, though, as users delve into more advanced use of Power BI and the data size and complexity increase, they find that implicit measures are not helpful because they cannot be reused.

Also, as explicit measures are calculated dynamically, they put less burden on the computer's memory and are quick to give results.

I hope this article helps you get a good grip on the use of measures. If you have any queries or issues regarding the application of measures, let me know through the comments below.

I would like to address them, which will enrich my knowledge too.