If you've been following along, we discussed in the last several posts of this series how, if you're not working in a very "tech-forward" organization (like my two compatriots on his site). Still, you have the same title; you're probably obtaining your data from another department (or it might be a sentient sponge or a gang of squirrels with dreams of world domination; you'll actually have no idea) with who you will have no contact with. As a side effect of this...rather strange situation (ya know, like being a child and being abandoned briefly by your mother in a laboratory into the care of a stranger in order to track your response...what, no other Bowlby heads out there?) you'll almost certainly get this data in a form that is completely unusable until you, dear reader, get your hands on it.
As such, in this post, we'll begin to dive into the myriad ways in which you might go about doing that, but first, I need to put you in the right frame of mind. I am a therapist, after all. Remember back to one of my first posts: when you get down to it, all computer code boils down to a series of binary choices (generally represented as 1s and 0s) indicating that a specific gate on the actual hardware is "on" or "off," and by layering millions of these on top of each other is how every single piece of computer software you have comes into being. Think about other binary decisions you've made: Yes or no, Odd or Even, Coke or Pepsi (despite what RC cola would like you to think), and perhaps the most bemoaned of all for everyone who has ever been a student: True or False.
Interestingly, this true/false dichotomy is the very essence of how all of the chopping and screwing you'll be doing on your messy data (and, in fact, the majority of all computer programming) begins; by simply asking Excel if what you typed in is true...or false.
Now that you've been successfully induced, we can jump into the technical stuff. As you may have garnered from my post about formulae, Excel has, DEEP within its bowels, a fairly robust programming language underpinning it: Visual Basic for Applications (VBA). Now, there are two general ways to tap into it:
- The way that most wizards do it: through the function bar (that is, typing an "
=
" and letting it rip), and... - The way that the particularly adept wizards engage in it: through the command line, which I'm not going to bother touching here, because if you're writing entire Excel spreadsheets in VBA, why are you still using Excel? Move onto Python or R and continue feeling smugly superior!
So, with that being said, how does one tap into this unlimited wellspring of potential using nothing but the features on the Excel main page? I already told you, simply ask Excel =If()
!
All flowery language and allegory aside, the majority of computer programming languages (the one underneath Excel included) hinge on this "If" statement, the only difference is how the particular language wants you to write these statements.
In Excel parlance, when one types in =If(
you'll be asked for three things separated by commas (that Excel could really make a bit clearer), they are:
- The logical argument (AKA, what do you want Excel to check as being true or false?) A good example would be say "Does the value in Cell A1 equal the value of the Cell in B1?" which would be written as
=if(A1=B1
(Protip: unlike other programming languages, the way you write "does not equal" in Excel is<>
) - The value you want to grab if the value is true (leaving this blank will just write "true") but in this case, let's make it say "Yes". This will be written as
=if(A1=B1,"Yes"
. (Protip: Within Excel, if you ever want to return something other than a number you always need to include it in quotation marks.) - The value that you want to return if the value is false (leaving this blank will just write
"false"
instead of). In this case, let's make it say "No". This will be written as=if(A1=B1,"Yes","No")
.
After completing this formula with the three elements followed by a closing parenthesis, you'll get a cell that either says "Yes" or "No" depending on if A1
and B1
. By containing the same values. The powerful thing here is what comes next: by double-clicking the lower right-hand corner of the cell with the formula, it'll automatically populate all of the rows that are adjacent to information with Ax = Bx
double-clicking (where x equals the row number), allowing you to check each row to see if the two columns match!
To add an extra wrinkle, let's say you want to check to see if each cell in a column is equal to the values in a specific cell; you'd do that as follows: =if(A1 = $B$1,"Yes","No")
double-click. If you then double click the cell as you did before, that will check every value in column A against ONLY the value in B1. By adding the dollar sign to the location of the cell, You've identified the "Absolute Reference" instead of the "Relative Reference." Feel free to disregard these names immediately and refer to them as "dollar signs." You can even manipulate it by only using the Absolute reference on the row (A$1) OR the column ($A1
) if you're populating, vertically, horizontally, or both (yes, technically putting B$1
in the above example would have gotten you the same answers).
In summation, in this post we've learned:
- Excel is actually built on top of a "real" programming language called VBA.
- You can write tiny programs in Excel through the function bar (or big programs through the command line...showoff).
- All "programming" really is, is the manipulation of true/false statements underpinning the binary code even further beneath all computational tasks.
- How to write your own "If" statement in Excel and how to "phrase" the returning of numbers vs. anything else using quotation marks.
- How to utilize the absolute reference (AKA the "Dollar Sign") to change what gets included in the statements in each line of your spreadsheet.
Next time, we'll get more grammatically complex, leveraging Ands and Ors in our Ifs. We might even get crazy and throw some Ifs in our Ifs...so you can drive while you drive.
Ever Upward,
-Snacks