Adventures in Excel: Power to the Pivot
- Create an Iterative List in Excel Without Any Real Coding
- Dynamic Tension! Creating and Using Dynamic Named Ranges in Excel
- Active-X Gonna Give it to Ya
- Your Invitation to the Excel Pivot Party
- Slicing Excel: The Way of The Knife
- Yes...If(And)): Excel’s Mighty “IF” Statement
- Getting Iffy With it: Conditional Statements in Excel
- Taking Out the Trash: Dirty Data in Excel
- Dealing with Dirty Data in Excel
- Doing the Excel Data Dance
- Adventures in Excel: Power to the Pivot
- The Art of Excel Pivot Tables
- The One Excel Formula to Rule Them All
- Excel Showdown: Index Match vs. The Lookups
- How I Stopped Worrying and Learned to Love the Formula
- Excel Data Manipulation: a Guide For Everyone
During the last discussion, you've (hopefully) learned how to generate a pivot table, and learned about the four "buckets" that can house your columns:
I'm also going to make the wild assumption that you've played around with your newly birthed pivot table, taking your column headings from your "raw" data (in the lingua franca of the numerically inclined, this means the data just as you've received it, before you add formulae or engage in any other manipulation.) Hopefully this gave you a feel of how a pivot table works, and if your synapses are wired in a very specific way you may even begin to feel out why pivot tables are such a game-changer. With this base in mind, in this post, we're going to learn what each of those buckets actually do, and explore some really cool ways that you can mess with your raw data within the pivot environment.
Before we go further, we must first understand some foundation facts about working with Pivots that will save you heartache and frustration out of the process:
- Remember, any data within the pivot table does not affect the raw data, this makes pivot tables the ideal way to look at your data through different lenses.
- You can utilize a heading in your pivot table in multiple locations (or even multiple times in the same location!) to view your data in different ways. The exact way this works is difficult to explain, so try it yourself and find out.
- Any changes to your raw data (especially if you didn't highlight the entirety of your data) will not be reflected in the pivot table UNLESS you right click the table, and press "refresh", this will then take into account any data changed in the raw data (assuming that the change took place within the area you highlighted, if you add a column or row, you need to redo the table)
- You can copy/paste any part of the pivot table, and contents that you highlighted to copy will appear. HOWEVER, if you highlight all visible parts of the pivot table, you'll actually copy the whole pivot (which you can continue modifying). This is very useful if you want to make a minor change to a pivot table, and compare them side by side (we'll dive deeper into this topic in a later post).
- This isn't exclusive to pivot tables, but remember the absolute golden rule of Excel: much like an obnoxiously large Leatherman (or a Swiss army knife, depending on how over-prepared you want to be) while there are nearly infinite tools or settings available, a hypermajority of Excel users will never touch even half of the options available, and are still strong in the ways of magic (#wizard), so just because you can press all the buttons, doesn't mean you should.
With all three of these rules in mind, let's dive into each of the buckets, what they do, and how you can use this to really start to get wild with your data. Let's start with the most straightforward:
The filter bucket allows you to place one (or more!) of your headings (AKA your columns in the raw data) in order to filter in or out those values out of the entire pivot table. For instance, If you have a column of people's names, and you put names into the filter bucket, you can then choose the names that you want to include the associated data. One great use for the filter bucket is if you have a dataset with dates, you can filter out a specific Month/Day/Year and the rest of your data will only display information from that particular time period.
Rows are the bread and butter of a pivot table (and as we discussed last time, the thing that clicking on a heading defaults to). This is the main area where your data will live. For example, if you have a dataset of Names, IDs, and Birthdays, and you click Names, the pivot table will show all of the DISTINCT names in a column within the pivot. For example, if you have the dataset of names:
|COLUMN A||COLUMN B|
Your pivot table will display:
|COLUMN A||COLUMN B|
In other words, it treats all Bobs the same, and this functionality is *Muy importanté
However, what if you want to see how many Bobs exist within your data? Well...
The values tab is perhaps the most flexible part of the pivot table. By putting a heading within the value tab, it will default to showing you the Count, or the number of appearances that a particular value shows up within a column of your dataset. So for example, if you drag names into both the row and values buckets, you'll wind up with this (a pivot table will always show a grand total (unless you specifically tell the pivot not to, which is a possibility, however, because you can copy and paste any part of the pivot, it's rarely necessary):
COLUMN ACOLUMN B Bob3 Sally1 Bob1 Jim1 Tom1 **Total****7**
Now, notice that above I said that by dragging a heading into the values bucket it'll default to a count (unless the column is all numbers, then it'll default to a sum), however, this is not the only possibility. By left clicking the heading within the value bucket and choosing "value field settings), you can choose from a mind-boggling array of possible mathematical operations, the ones that you will generally use most often are probably going to be count (the number of appearances of a given value within a dataset) and sum (the combined total of all the numbers associated with each value in your row bucket). One important piece of advice though, you can not sum text, Excel won't let you (even if the text looks like a number), so if you're using numbers, be sure to tell Excel that you're looking at numbers (this issue will happen more often than you could possibly imagine!)
Other useful operators are count distinct (which you can only utilize if you checked add to data model when you built the table) and min or max (returning the lowest or highest value associated with a particular value in your rows, and you can only use this if you did not click add to data model when you built the table). A good example of using Max or Min is when you have say names and salaries, where you can show the highest and lowest salary for a particular name. Keep in mind that you can have more than one value operator (like min and max) per heading, thus putting both the minimum and maximum value next to each other on the table. Which leads us to the generally least used bucket...
One thing I intentionally left out was what a pivot table actually does with your data. A gross over-simplification of the functionality is that it consolidates (picking out those distinct values), summarizes (shows totals, and grand totals), and most importantly, it flips your columns and rows (hence the reason it's called a pivot table.) Thus, the column bucket is generally just used as a place where your values go automatically when you populate the values bucket. The reason columns get kind of cumbersome is because it literally widens your pivot by how many ever values exist in your dataset, thus, anything more than say 20 values becomes almost impossible to read, let alone use in a meaningful way.
There are some rare cases in which using a heading in the column bucket may make sense though, for instance, if you have birth months as one of your headings, and names for rows, you can put the months in the column bucket and then get a count showing each how many people with a particular name land at a particular month in a grid or "heat map" (a fancy way to say a chart showing concentrations of something) fashion. Another use case (and one I use on the day to day) is finding the numbers of providers who utilize a particular procedure for which there is a known value (there are 12 that I look at). You have the procedure names as the columns, and you have the providers who administer the procedures as the rows. You then put measure in the value bucket. This gives me a nice, concise grid for which I can see which providers give how many of which of the 12 procedures...which on it's own is sorta useful, but when combined with a trick we already learned, some epic level spellcasting is afoot. Unfortunately, that'll have to wait until next time.
In summary, we learned:
- The purpose of the filter bucket within a pivot table
- The purpose of the rows bucket within a pivot table and the fact that it shows distinct values
- The purpose of the values bucket within a pivot table, and how you can have the column use different mathematical operators (such as count, sum, min, and max) to find different information within a dataset... and
- The purpose (or lack thereof) of the columns bucket within a pivot table, and potential use case scenarios for this function.
Next time, we start combining knowledge in order to get freaky with your data in ways that the lord of Excel themselves did not consider.
Pivot till you puke,