By: Matt Wollner
Senior BI Consultant at Key2 Consulting
In my previous post, I showed how you can use Microsoft Power BI Desktop to pull data from the web and use it to create simple charts. Also, in another previous post, we laid out the steps for sharing Power BI reports. But what if you wanted to pull data from multiple charts or web sites? In this post, I’m going to outline the basic steps to joining data when using Microsoft Power BI Desktop.
I’m going to compare fantasy football points for quarterbacks in 2016 to their projected points for 2017. To start, I will use the “Get Data” option to pull data from the web. For your reference, here are the websites I am pulling data from:
2016 NFL QB Stats – http://www.pro-football-reference.com/years/2016/passing.htm
2017 NFL QB projections – http://fftoolbox.scout.com/football/2017/cheatsheets.cfm?player_pos=QB
Once you’ve imported the data, use the Data Tab to view the imported tables. Passing Table comes from pro-football-reference.com and Table 1 comes from fftoolbox.scout.com.
Both sets of data have player information. We would like to join the data together using the Player Name. If you try and join the data, you will notice that some of the QB names have an asterisk (*) that needs to be cleaned up. Let’s quickly walk through some steps to clean up the data.
To begin transforming the data, select Edit Queries.
This will open the Query Editor in a new window. The Query Editor will allow you to do many types of data manipulations. We will begin with some basic edits.
1. Update the Player name
• Open the Passing table. You’ll notice the player column name is empty. Rename it to Player.
• Remove Special Characters.
• Right click on the Player column header and select Replace Values for each special character. You’ll do this step twice.
• Replace the * and + with an empty string.
2. We will also need to add a column to calculate total fantasy points in 2016.
• Add a new column, Custom Column.
• Name the column FF Points.
• Set the value based on the fantasy scoring.
• ([Yds]/25) + ([TD]*6) – ([Int]*2)
• You will also need to change the data type to a decimal number. You can use the Detect Data Type option under the Transform menu.
3. Under Table 1, you will notice some extra rows. Let’s remove these.
• Find the rows with the header data and footer data.
• To remove the header, you can either select:
• Use First Row as Header OR
• Remove Rows Remove Top Rows Number of Rows: 1
• Remove the footer row
• Remove Rows Remove Bottom Rows Number of Rows: 1
Here are some other transformations that may be useful:
• Split Column – use this to split a column into 2 columns based on character, such as a comma.
• Remove Duplicates
• Change Type – change the data type of the column.
You’ll notice that every change you make will be saved in the Applied Steps on the right-hand side. Note that every step that you make can be changed, edited, or removed.
Once you’ve completed modifying the data, select Close & Apply (under the Home Ribbon).
We’ll now need to join the 2 data sets together. Select the Relationship tab and you’ll see the 2 tables. Find the column Player in Table 1 and drag it over to the Player column in the Passing Table. To modify the connection or to update the Join type, double click on the join line or select the Manage Relationships in the Ribbon bar to create the connection.
The final step is to view our data. Under Passing Table, add Player, Tm, and FF Points. Under Table 1, select Player and Proj. Pts.