It's been a hell of a decade for software development. I've personally enjoyed observing endless moaning and complaining from fellow developers about the topic of "new frameworks." Standard rhetoric would have one think our careers are somehow under siege as a result of learning new technologies. I can't help but wonder how the same contingent of professionals has reacted to the rise of cloud services. To those who fear learning, I can only imagine the sentiment they hold for implementing entirely new paradigms that come with budding cloud services. Alternatively, there are people like myself: grown children masquerading in an adult world. While we may appear to be older, we've simply reallocated our life's obsessions from Lego sets to cloud services.
I've fallen victim to more "new technology" hype cycles than I'd like to admit. Straying down the path of dead-end tools is exhausting. I'm sure you'll recall the time the entire world arbitrarily decided to replace relational databases with NoSQL alternatives. I recognize how impulsive architectural decisions create devastating technical debt. Thus, I would never think to push tools that have yet to stand the test of time and usefulness in the wild. Redis is one tool that has passed that test, perhaps ranking somewhere in my top 5 of useful new technologies in the past decade.
Why Redis Tho?
Why might I choose to proverbially shit on NoSQL databases prior to advocating for Redis: a NoSQL datastore? Make no mistake: Redis has almost zero similarities to NoSQL databases like MongoDB, in both intent and execution. MongoDB stores records on disk, as do SQL databases. Allocating disk space to store a record comes with the implication that the information being stored is intended to persist, like user accounts, blog posts, permissions, or whatever. Most data worth saving falls into this category.
Surely not everything we do on the internet is worth saving forever. It would be weird (and inefficient) if we stored information like items in a user's shopping cart or the last page of our app a user had visited. This sort of information could be useful in the short term, but let's not fuck up the ACID-compliant databases our businesses depend on with endless disk I/O. Luckily for us, there's a little thing called RAM, which has handled similar scenarios since the dawn of computing. Redis is an in-memory database that stores data as key/value pairs. Reading and writing to memory is faster than reading and writing to disk, which makes memory suitable for storing secondary data. This data might enable better user experiences while simultaneously keeping our databases clean. If we decide at a later time that data in memory is worth keeping, we can always write to disk (such as a SQL database) later on.
In-memory data stores like Redis or Memcached lie somewhere in your architecture between your app and your database. A good way to think of these services would be as alternatives to storing information via cookies. Cookies are bound to a single browser and can't reliably be depended on to work as expected, as any user can purge or disable browser cookies at any time. Storing session data in the cloud eliminates this problem with the added benefit of having session data shared across devices! Our user's shopping cart can now be visible from any of their devices as opposed to whichever browser they had been using at the time.
Today we'll familiarize ourselves with Python's go-to Redis package: redis-py. redis-py is obnoxiously referred to as redis within Python, presumably because the author of redis-py loves inflicting pain. See for yourself by visiting the official docs of redis-py: it's a single page of every method in the library listed in alphabetical order. Pretty hilarious shit.
Setting Up Redis
If you don't have a Redis instance just yet, the folks over at Redis Labs offer a generous free tier for broke losers like you and me. They're quite a reputable Redis host, probably because they're the company sponsoring the development of Redis itself. Once you have an instance set up, take note of the host, password, and port.
Enough chit-chat, let's dig into Python. You know what to do: install the library with pip install redis.
Like regular databases, we can connect to our Redis instance by creating a connection string URI. Here's what a real-life Redis URI might look like:
Every Redis URI follows the pattern [CONNECTION_METHOD]://:[PASSWORD]@[HOSTNAME]:[PORT]/[DATABASE]. Here's what's happening piece-by-piece:
- CONNECTION_METHOD: This is a prefix at the start of every Redis URI, specifying how you'd like to connect to your instance.
redis:// is a standard connection,
rediss:// (double S) attempts to connect over SSL,
redis-socket:// is reserved for Unix domain sockets (redis-py's own from_url() spells this scheme unix://), and
redis-sentinel:// is a connection type for high-availability Redis clusters... presumably for people less broke than ourselves.
- HOSTNAME: The URL or IP of your Redis instance. If you're using a cloud-hosted instance, chances are this will look like an AWS EC2 address. This is a small side-effect of modern-day capitalism shifting to a model where all businesses are AWS resellers with preconfigured software.
- PASSWORD: Redis instances have traditionally had passwords yet lacked users, presumably because a memory-based data store would inherently have a hard time managing persistent usernames.
- PORT: Is your preferred port of call after pillaging British trade ships. Just making sure you're still here.
- DATABASE: If you're unsure of what this is supposed to be, set it to 0. People always do that.
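To make the pieces above concrete, here's a minimal sketch that glues them into a URI. The helper name build_redis_uri and the host, port, and password values are placeholders I made up for illustration; swap in the details of your own instance.

```python
def build_redis_uri(host, port, password, db=0, use_ssl=False):
    """Assemble a Redis URI from the pieces described above."""
    # rediss:// (double S) asks the client to connect over SSL.
    scheme = "rediss" if use_ssl else "redis"
    # The username slot before the colon is left empty on purpose.
    return f"{scheme}://:{password}@{host}:{port}/{db}"


# Placeholder values only, not a real instance.
uri = build_redis_uri("redis.example.com", 6379, "s3cret")
print(uri)
```

Keeping the URI in one string makes it easy to stash in an environment variable instead of hardcoding credentials.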
Create a Redis Client
Awesome, we have our URI. Let's connect to Redis by creating a Redis client object:
Why StrictRedis, you might ask? There are two ways to instantiate Redis clients: redis.Redis() and redis.StrictRedis(). StrictRedis makes an effort to properly enforce Redis datatypes in ways that old Redis instances did not. redis.Redis() was backward-compatible with legacy Redis instances with trash data, where redis.StrictRedis() is not. Since redis-py 3.0 the legacy class is gone and both names point to the same strict client, so the distinction only matters on older versions. When in doubt, use StrictRedis.
There are many other arguments we can (and should) pass into redis.StrictRedis() to make our lives easier. I highly recommend passing the keyword argument decode_responses=True, as this saves you the trouble of explicitly decoding every value that you fetch from Redis. It doesn't hurt to set a character set either:
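Here's a rough sketch of what that client setup could look like, assuming redis-py 3.0 or newer (where the character set argument is named encoding). The function names client_options and make_client are my own invention, not part of redis-py.

```python
def client_options():
    """Keyword arguments shared by every client the app creates."""
    return {
        "decode_responses": True,  # hand back str instead of raw bytes
        "encoding": "utf-8",       # character set used for that decoding
    }


def make_client(uri):
    """Build a client from a connection URI such as redis://:password@host:6379/0."""
    # Imported lazily so this module still loads where redis-py isn't installed.
    import redis

    return redis.StrictRedis.from_url(uri, **client_options())
```

With that in place, r = make_client(uri) gives you the r object used throughout the rest of this post.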
Birds-Eye View of Redis
Redis' key/value store is a similar concept to Python dictionaries, hence the meaning behind the name: Remote Dictionary Server. Keys are always strings, but there are a few data types available to us to store as values. Let's populate our Redis instance with a few records to get a better look at how Redis databases work:
r.set([KEY], [VALUE]) is your bread-and-butter for setting single values. It’s as simple as it looks: the first parameter is your pair’s key, while the second is the value assigned to said key.
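As a minimal sketch of setting single values, the function below writes two session values. It assumes r is a client created with decode_responses=True; store_session is a hypothetical helper name, while the user_agent and timestamp keys are the ones discussed in this post.

```python
import time


def store_session(r, user_agent):
    """Write a couple of session values as plain key/value pairs."""
    r.set("user_agent", user_agent)
    # An integer goes in, but Redis will hold on to it as a string.
    r.set("timestamp", int(time.time()))
    # Read one value back to confirm the write landed.
    return r.get("user_agent")
```

Calling store_session(r, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3)") should hand back the same user agent string.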
Like a regular database, we can connect to our Redis instance via a GUI like TablePlus to check out our data. Here’s what my database looks like after running the snippet above:
| key | value | type | TTL |
|---|---|---|---|
| user_agent | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) | STRING | -1 |
It looks like everything went swimmingly, eh? We can learn quite a bit about Redis simply by looking at this table. Let's start with the type column.
Data Types in Redis
Values we store in Redis can be any of 5 data types:
- STRING: Any value we create with r.set() is stored as a string type. You'll notice we set the value of timestamp to be an integer in our Python script, but it appears here as a string. This may seem obnoxiously inconvenient, but there's more to Redis strings than meets the eye. For one, Redis strings are binary-safe, meaning they can be used to store the contents of nearly anything, including images or serialized objects. Strings also have several built-in functions that can manipulate strings as though they were numbers, such as incrementing with the INCR command.
- LIST: Redis lists are mutable arrays of strings, kept in the order in which items were inserted. Once a list is created, new items can be appended to the end of the array with the RPUSH command, or added at the zero-index via the LPUSH command. It's also possible to cap the number of items in a list using the LTRIM command. If you're the kind of nerd that likes to do things like build LRU caches, you'll likely recognize the immediate benefit of this.
- SET: A set is an unordered collection of strings. Like Python sets, Redis sets cannot contain duplicates. Sets have the unique ability to perform unions or intersections with other sets, which is a great way to combine or compare data quickly.
- ZSET (Sorted Set): Sorted sets are a variation on sets. While sharing the constraint of disallowing duplicates, each item in a sorted set carries a numeric score that determines its position, and that score can be changed after creation, making ZSETs very useful for ranking unique items. They're somewhat similar to ordered dictionaries in Python, except without a separate key per value (which would be unnecessary, as all set values are unique).
- HASH: Redis hashes are key/value pairs in themselves, allowing you to assign a collection of key/value pairs as the value of a key/value pair. Hashes cannot be nested, because that would be crazy.
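Since strings can hold serialized objects, one way to squeeze a nested structure into Redis is to serialize it yourself. The sketch below uses JSON for that; save_profile, load_profile, and the key naming are hypothetical choices of mine, and r is assumed to be a client created with decode_responses=True.

```python
import json


def save_profile(r, key, profile):
    """Serialize a dict to JSON and store it as an ordinary Redis string."""
    r.set(key, json.dumps(profile))


def load_profile(r, key):
    """Fetch the JSON string and turn it back into a dict (None if the key is missing)."""
    raw = r.get(key)
    return json.loads(raw) if raw is not None else None
```

The tradeoff is that Redis sees one opaque string, so you can't update a single field without rewriting the whole value. If you need that, a HASH is likely the better fit.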
Data Expiration Dates
Our Redis database contains a fourth column labeled TTL. So far, each of our rows has a value of -1 for this column, which means the data never expires. When this number is set to a positive integer, it represents the number of seconds remaining before the data expires. We've established that Redis is a great place to store data that is temporarily useful but usually not valuable enough to hold on to. This is where setting an expiration on a new value comes in handy: it automatically ensures our instance's memory isn't bogged down with information that has become irrelevant.
Revisiting our example where we stored some user-session information, let's try setting a value with an expiration:
This time around we pass a third value to r.set() which represents the number of seconds our pair will be stored before self-destructing. Let's check out the database now:
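A small sketch of setting a value with an expiration might look like this. The helper name remember_last_page and the last_page:<id> key format are made up for the example; r is assumed to be an existing redis-py client.

```python
def remember_last_page(r, user_id, path, seconds=3600):
    """Store the last page a user visited and let Redis forget it after a while."""
    key = f"last_page:{user_id}"  # hypothetical key naming scheme
    # ex= is the expiration in seconds (the third positional argument of set()).
    r.set(key, path, ex=seconds)
    # ttl() reports the seconds left before the key disappears.
    return r.ttl(key)
```

Calling r.ttl() on a key without an expiration gives back -1, which is the same number the TTL column shows.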
| key | value | type | TTL |
|---|---|---|---|
| user_agent | Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) | STRING | -1 |
If we use our GUI to refresh our table periodically, we can actually see this number counting down:
Working With Each Data Type
I'm sure you love hearing me drone on, but we both know why you're here: to copy & paste some code you need for your day job. I'll spoil you with a few cheatsheets that demonstrate some common use-cases when working with Redis' 5 datatypes.
If strings contain integer values, there are many methods we can use to modify the string as though it were an integer. Check out how we use .set(), .incr(), .decr(), and .incrby() below. This assigns a value to index, increments the value of index by 1, decrements the value of index by 1, and finally increments the value of index by 3:
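Those four steps could be sketched roughly as follows. The starting value of 1 is my assumption, and run_counter is just a wrapper name for the example; r is an existing redis-py client.

```python
def run_counter(r, key="index"):
    """Treat a Redis string as a counter."""
    r.set(key, 1)       # assumed starting value: 1
    r.incr(key)         # 1 + 1 = 2
    r.decr(key)         # 2 - 1 = 1
    r.incrby(key, 3)    # 1 + 3 = 4
    # The value comes back as a string (or bytes), so cast it for arithmetic.
    return int(r.get(key))
```

Each of these commands runs atomically on the server, which is why counters like this are safer than reading, adding, and writing back in Python.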
Below we add items to a Redis list using a combination of .lpush() and .rpush(), as well as popping an item out using .lpop(). You know, typical list stuff:
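Here's one way that typical list stuff might look. The key name my_list and the letters being pushed are arbitrary, and r is assumed to be a client created with decode_responses=True.

```python
def build_queue(r, key="my_list"):
    """Push to both ends of a list, then pop from the front."""
    r.delete(key)            # start from an empty list
    r.lpush(key, "A")        # ["A"]
    r.rpush(key, "B", "C")   # ["A", "B", "C"]
    r.lpush(key, "Z")        # ["Z", "A", "B", "C"]
    first = r.lpop(key)      # removes "Z" from the front
    # lrange with 0, -1 fetches the whole list.
    return first, r.lrange(key, 0, -1)
```

The comments track the state of the list after each command.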
Here's what our list looks like after each command:
Redis sets are powerful partly due to their ability to interact with other sets. Below we create two separate sets and perform .sunion() and .sinter() on both:
As expected, our union combines the sets without duplicates, and the intersect finds values common to both sets:
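A quick sketch of two sets and the operations between them. The key names set1 and set2 and their members are invented for the example, and r is assumed to be a client created with decode_responses=True.

```python
def compare_sets(r):
    """Create two sets, then compute their union and intersection."""
    r.delete("set1", "set2")           # start clean
    r.sadd("set1", "a", "b", "c")
    r.sadd("set2", "b", "c", "d")
    # Wrapped in set() so the result is a Python set whatever the client hands back.
    union = set(r.sunion("set1", "set2"))
    common = set(r.sinter("set1", "set2"))
    return union, common
```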
Adding records to sorted sets using .zadd() has a bit of an interesting syntax. Take note below how adding sorted set records expects a dictionary in the format of {[VALUE]: [INDEX]}:
Values in a sorted set are unique, so adding a value that already exists simply moves it to its new index rather than creating a duplicate. Several values can share the same index, in which case Redis orders the ties lexicographically. We're also able to change the indexes of values after we've created them:
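Here's a hedged sketch of a sorted set in action, assuming redis-py 3.0 or newer (where .zadd() takes a dictionary). The leaderboard key and the player names are made up, and r is assumed to be a client created with decode_responses=True.

```python
def rank_players(r, key="leaderboard"):
    """Add scored members to a sorted set, bump one score, and read the ranking."""
    r.delete(key)                                    # start clean
    # The dictionary maps each value to its score.
    r.zadd(key, {"alice": 1, "bob": 2, "carol": 3})
    # Raise alice's score by 5 (1 -> 6), which moves her to the end.
    r.zincrby(key, 5, "alice")
    # Lowest score first; withscores=True pairs each value with its score.
    return r.zrange(key, 0, -1, withscores=True)
```

After the bump, the ranking should run bob, carol, alice.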
Alright, I didn't do much of anything interesting with hashes. Sue me. Still, creating and fetching a hash value is kind of cool, right?
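For the record, a bare-bones hash round trip could look something like this. It assumes redis-py 3.5 or newer (for the mapping= argument of .hset()); save_user and the field names in the usage note are placeholders of mine.

```python
def save_user(r, key, fields):
    """Store a flat dict as a Redis hash, then read the whole hash back."""
    # mapping= writes every field of the dict in a single call.
    r.hset(key, mapping=fields)
    # hgetall returns all fields and values of the hash as a dict.
    return r.hgetall(key)
```

Something like save_user(r, "user:1", {"name": "Todd", "role": "admin"}) would do. Keep the values flat, since hashes cannot be nested.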
The output is exactly the same as the input... no surprises there:
There are plenty of Redis and redis-py features we’ve left unexplored, but you should have more than enough to get going. More importantly, hopefully this tutorial has had some part in demonstrating why Redis could be useful in your stack… or not! If I can claim any part in preventing awful system architecture, that’s a victory.
Oh yeah, and check out the repo for this tutorial here: