Home > Pivot > Layout > Data Analysis ## Excel Data Analysis - Sports DataGet started with Excel's quick data analysis features, and turn raw data into clear insights. Using sports team player data, I compared hockey teams using pivot tables, pivot charts, and other Microsoft Excel tools and features. Author: Debra Dalgleish |

Sometimes you're lucky, and have an Excel file with clean, well-organized data, stored in named Excel tables. With just a few clicks, you can create pivot tables to show insights from the data, or add visuals, like a bar chart or line chart.

Unfortunately, life isn't always like that! Sometimes you have to do some hard work:

- First, collect the data yourself, from a few different sources
- Next, you need to clean up the data, and find an efficient way to store it in Excel.
- After that, you can decide what insights you want to pull from the data
- Finally, choose the Excel tools and features to analyze your data

In the hockey team player data example below, you can see the steps that I followed, to collect, clean, and analyze this sports data.

During the 2018 Winter Olympics, the hockey team rosters were available online.

As a fun challenge, I decided to download the rosters for the men's and women's hockey teams, from Canada and the USA.

I wanted to use that data for a hockey player data analysis, using Excel pivot tables. Would the player data give any insights as to why the USA women's team beat our Canadian women's team?

To begin the player data analysis, I went to the team rosters that were available online.

**Get this File**: You can get my Excel file, with the cleaned up data, in the Download section, at the bottom of this page.

On the roster pages, I couldn't find any buttons that said "Download Roster", so:

- I selected and copied the data in each table
- Then I did a Paste Special, Values, onto a blank Excel sheet.

Each country's rosters were set up a bit differently, so that's why I pasted them onto separate sheets at first.

For example, here's a screen shot from the Canadian Women's hockey team roster, and the Canadian Men's team had the same table layout.

Next, here's a screen shot from the USA Women's hockey team roster, and the USA Men's team had the same table layout.

In this roster, position is a single letter, height is feet/inches and centimetres, and weight is pounds and kilograms.

All of the team rosters had player names, height, date of birth, etc., but there was some cleanup work to do, before starting the data analysis.

- First, I set up column headings on a blank sheet, where I wanted to store the cleaned up data.
- Next, I rearranged the spreadsheet columns in each roster, to match the order on the new sheet
- For the USA teams, I did these additional clean up steps:
- Used Find and Replace to change the height dashes to apostrophes.
- Used the Text to Columns feature to remove the (cm) and (kg) values from the height and weight columns.
- Formatted the dates in Short Date format, to match the Canadian teams' date format.

- After that, I copied each roster's data onto the new sheet
- Note: I added "Women" or "Men" in the Team column, and "Canada" or "USA" in the Country column, before moving to the next roster.

- Next, I formatted the list on the new sheet as a named Excel table
- And finally, I filled in the ID column, with numbers from 1 to 96, using the AutoFill feature.

In my Excel file, I named the main sheet as "PlayerData".

- The Excel table headings are in row 3
- There are 11 columns of data, copied from the team rosters
- There are 96 rows of data, with players from 4 team rosters

Here are the team data table's column headings, and a short description for each column's data:

**ID**: unique ID number for player**Team**: type of hockey team - Men, Women**Country**: country of hockey team - Canada, USA**NameF**: player's first name (given name)**NameL**: player's last name (family name)**Weight**: player weight in pounds**Height**: player height in feet and inches**DOB**: player date of birth**Hometown**: player's hometown city**Prov**: player's hometown province**Pos**: player's position on hockey team

To make it easier to analyze the player data, I created 4 calculated fields in the Excel table:

**Age**: player's age in years, based on date of birth**HeightFt**: player's height converted to feet, with 2 decimal places shown**HtIn**: player's height converted to inches (HeightFt x 12)**BMI**: player's BMI based on Weight and HtIn

The details for these calculations are in the sections below, and there are additional formulas in the Age and Height Calculation section.

Player age would be a useful statistic, when comparing the teams' players, so that was the first calculated column that I added.

- In the downloaded data, each of the rosters had each player's date of birth.

To calculate the ages, based on date of birth, I used the following DATEDIF formula.

=DATEDIF([@DOB],TODAY(),"y")

**Warning**: Be careful with the Excel DATEDIF function - it can give incorrect results in some versions of Excel. You can find more age formulas in the Age and Height Calculation section.

In the player data, the height was listed as text, with feet and inches, and a foot mark, such as 5'7.

For data analysis, that text measurement should be converted to a real number. For example, change from 5'7 to 5.58.

To convert the height's text measurement to feet, I used the following formula:

=SUM(LEFT([@Height],1), RIGHT([@Height], LEN([@Height])-2)/12)

Here's how the formula works:

- The first character at the LEFT is the feet (there were no double-digit heights)
- The 1 or 2 characters at the RIGHT are the inches
- Those inches are divided by 12, to convert them into decimal portions of a foot
- For example, 6 inches divided by 12 = 0.5 feet

- SUM the feet and the inch calculation, to get the height in feet
- For example, 5'6 = 5.5 feet

You can find more height formulas in the Age and Height Calculation section.

The weight was entered as a number, so that didn't need to be changed.

I thought it would be interesting to get a height/weight ratio, so I used the BMI calculation for that. weight in pounds / [height in inches x height in inches] x 703

First, I needed another column, to convert the height to inches, from feet.

That was a simple formula – multiply the calculated foot height by 12:

=[@HeightFt]*12

Then, for the BMI calculation, I used the following formula – rounding to zero decimals, to limit the number of BMIs. (You could round to 1 or 2 decimals, if you needed greater precision. )

=ROUND([@Weight]/([@HtIn]*[@HtIn])*703,0)

After the calculations were finished, it was time to start the quick data analysis, to compare the teams. My favourite way to do that is with pivot tables!

My analysis steps are described in the sections below

-- 1) What Matters?

-- 2) Summarize Data in Pivot Table

-- 3) Analyze Team Data

-- 4) Why Did Team USA Win?

**Get this File**: You can get my completed data analysis Excel file in the Download section, at the bottom of this page.

When you're analyzing data, a key step is deciding what the reports should show.

The sports team roster data had size, age, and home town information for each player, so the following questions were my starting point.

**Age**: What age range is represented in the hockey teams. Are there differences among the teams? Did a team with younger players do better?**Size**: You'd expect the players to be in top physical condition, so what does the height and weight data show? Did the bigger team win, or is there an advantage to being smaller?**Home Town**: Finally, are the players from a snowy region? Most of Canada gets snow and ice in winter, but what about the USA players? Are they all from the northern states?

I'm a big fan of pivot tables, for quick Excel data analysis, so I started with those.

The video below shows the steps for creating a pivot table in Excel 2013 and later (using different sample data).

**Tip**: For detailed steps on building an Excel pivot table, go to the Create a Pivot Table in Excel page.

Another quick way to create a pivot table is with the Excel Analyze Data tool.

- First, select a cell in the Excel table where your data is stored
- Next, on the Excel Ribbon, go to the Home tab
- At the right end of the tab, click the Analyze Data button
- When the Analyze Data tab opens, click in the box at the top
- In the "Ask a question" box, type a question about your data, then press the Enter key, to see the suggestions

In the example shown below:

- I typed "Which country team has the tallest players"
- The answer showed a pivot chart, with a bar for each team, showing Average of Weight. (not quite what I asked for, but close!)
- Below the chart, there was a button - Insert PivotChart
- When I clicked that, Excel inserted a new worksheet, named Suggestion1, with a pivot table and pivot chart.

**Note**: Excel has other built-in tools for data analysts, like the Analysis ToolPak.

After you create one pivot table from the source data, you can quickly copy and paste that pivot table onto different sheets.

Then, on each new worksheet, change the pivot table layout, to get different views of the data.

In the sample workbook, there are several pivot tables, comparing the teams. Details are in the sections below.

To analyze the team data, I used pivot tables with different layouts and formatting.

Some pivot tables have Slicers, that let you filter the data, for ad hoc analysis.

The first table shows player counts, for each position, plus age, height and weight.

In the pivot table, the age, height and weight are shown using 3 different summary functions - Average, Minimum, and Maximum.

- After you add the value, right-click on one of the numbers
- Then, click Summarize Values by, and choose one of the functions.

In the country subtotal row, pivot table conditional formatting highlights two things:

- higher amounts in red
- lower amounts in green.

**Tip**: At the top of the worksheet, click the Slicer, to select the team type -- in the screen shot below, the Women's data was selected.

**Observation**: On average, the USA team is a couple of years younger, a bit shorter and 4 pounds lighter.

For the Men, the data is similar – the USA team is a little younger, shorter and lighter.

For the next analysis, I wanted to see the differences between the teams' age, weight and height numbers.

Instead of doing the arithmetic in your head, you can get a pivot table to do custom calculations.

For these calculations, I added two more pivot tables, below the first one, connected to the same Slicer.

- first pivot table shows the difference as an amount
- second pivot table shows the percent difference, based on the amounts for Canada.

**Note**: I hid the rows for Canada, because they are empty.

Next, I created a pivot table and pivot chart, to show the age ranges for the Mens and Womens players, from both countries.

**Observation**: The men have a substantially longer career, from what this pivot table shows.

NOTE: I created a new Theme for the workbook, with red and blue as the first 2 accent colours. That way, all the charts had red and blue bars.

Next, I set up a pivot table to show the BMIs, with a count of players in each position – Defence, Forward and Goalie.

- I changed the values to
**% of Column total** - From the pivot table, I created a stacked bar pivot chart to show the results.

**NOTE**: To keep the BMI list consistent, I set that field to "Show Items with No Data". Otherwise, some of the BMIs would not be listed, when you switch between the data for Men and Women.

**Observation**: There's a noticeable difference between the BMIs for the two Women's teams.

Or, just look at the player counts per BMI on each of the Women's teams – the patterns are quite different. Was that body type difference a factor in the USA victory?

Finally, to see where the player home towns were located, I used a different data visualization tool - the 3D Maps feature in Excel. That command is on the Insert tab, between Charts and Sparklines.

NOTE: The maps are not in the sample workbook – I created them in a separate file. If you have the 3D Map tool, you can build your own maps after you download my file.

I was surprised that the map was able to get the locations from the shortened state and province names, e.g. "Ont.", instead of requiring full names or official postal abbreviations.

Here's the map of home towns for all the Canadian players. They're spread from coast to coast, with most cities close to the US border, and some further north, in the the prairie provinces.

And here is the map of USA player home towns. Most are in the north and northeast, with a few scattered across the rest of the country. There were none in Alaska or Hawaii.

From my analysis of the data in the hockey player rosters, here's what I observed:

- USA Women team members were slightly younger, shorter and lighter, on average, than Canada's team members.

Maybe that gave them an advantage against our Canadian team.

- Younger players might have more stamina, and do better at the end of a long game
- Smaller, lighter players could be better at moving on the ice, especially in tight situations

Or maybe the USA womens' players performed better under the pressure of a game-deciding shootout!

If you analyze the data, you might find more clues about why the USA Women's team won, but I couldn't find any obvious answers.

In one of my Excel newsletters, I asked readers what formulas they'd use to calculate age and height, based on the hockey player data. There were several excellent suggestions, and you can download the hockey player solutions file to see the formulas.

First, here are the suggested formulas for calculating age, based on date of birth.

=DATEDIF([@DOB],TODAY(),"y")

=INT(YEARFRAC([@DOB],TODAY(),1))

=ROUNDDOWN(YEARFRAC([@DOB],TODAY()),0)

=INT(TODAY()-[@DOB])/365.25

=CONVERT(TODAY()-[@DOB],"d","yr")

Next, here are the suggested formulas for calculating height in feet, based on the text height, such as 5'7. By the length of these formulas, you can see that this calculation was tricky!

=SUM(LEFT([@Height],1), RIGHT([@Height], LEN([@Height])-2)/12)

=SUM(LEFT([@Height],1), CONVERT(RIGHT([@Height], LEN([@Height])-2), "in","ft"))

=LEFT([@Height])+ RIGHT([@Height], LEN([@Height])-2)/12

=12*TRIM(SUBSTITUTE([@Height],"'"," ") & "/12")/12

=12*(SUBSTITUTE([@Height],"'"," ") & "/12")/12

=(((LEFT([@Height],1)*12))+ IF(LEN([@Height])=4, RIGHT([@Height],2), RIGHT([@Height],1)))/12

=LEFT([@Height]) + MID([@Height],3,2)/12

**1) Hockey - Data Only**: Click here to get the hockey player data file, with the data only -- there are NO pivot table sheets. The zipped Excel file is in xlsx format, and does not contain any macros.

**2) Hockey - With Data Analysis**: Click here to get the hockey data file, with pivot tables and pivot charts, analyzing the data. The zipped Excel file is in xlsx format, and does not contain any macros.

Here are a few of my Contextures tutorials on working with data in Excel, organizing it in tables, and summarizing the data in pivot tables.

-- Excel Tables

-- Pivot Tables

Here's a video from CBC Sports, showing some of the Women's Hockey gold medal game highlights.

Last updated: February 28, 2024 3:23 PM