Advanced datasets with Redis - Performance at Scale with Amazon ElastiCache

Advanced datasets with Redis

Let's briefly look at some use cases that ElastiCache (Redis OSS) can support.

Game leaderboards

If you've played online games, you're probably familiar with top 10 leaderboards. What might not be obvious is that calculating a top n leaderboard in near-real time is actually quite complex. An online game can easily have thousands of people playing concurrently, each with stats that are changing continuously. Re-sorting these users and reassigning a numeric position is computationally expensive.

Sorted sets are particularly interesting here, because they simultaneously guarantee both the uniqueness and ordering of elements. Redis sorted set commands all start with Z. When an element is inserted in a Redis sorted set, it is reranked in real time and assigned a numeric position. Here is a complete game leaderboard example in Redis:

ZADD "leaderboard" 556 "Andy"
ZADD "leaderboard" 819 "Barry"
ZADD "leaderboard" 105 "Carl"
ZADD "leaderboard" 1312 "Derek"

ZREVRANGE "leaderboard" 0 -1
1) "Derek"
2) "Barry"
3) "Andy"
4) "Carl"

ZREVRANK "leaderboard" "Barry"
1

When a player's score is updated, the Redis command ZADD overwrites the existing value with the new score. The sorted set is instantly re-sorted, and the player receives a new rank; note that ranks are zero-based, so the top player has rank 0. For more information, refer to the Redis documentation on ZADD, ZREVRANGE, and ZREVRANK.
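To make the sorted-set behavior concrete, here is a minimal in-memory sketch of the semantics in Python. The `Leaderboard` class and its method names are illustrative only; a real application would issue the ZADD, ZREVRANGE, and ZREVRANK commands to Redis through a client library rather than reimplement them.

```python
# Minimal in-memory sketch of Redis sorted-set leaderboard semantics.
# Models ZADD, ZREVRANGE, and ZREVRANK for illustration only.

class Leaderboard:
    def __init__(self):
        self.scores = {}  # member -> score

    def zadd(self, member, score):
        # ZADD overwrites any existing score for the member
        self.scores[member] = score

    def zrevrange(self, start, stop):
        # Members ordered from highest to lowest score
        ordered = sorted(self.scores, key=self.scores.get, reverse=True)
        stop = len(ordered) if stop == -1 else stop + 1
        return ordered[start:stop]

    def zrevrank(self, member):
        # Zero-based rank in descending score order
        return self.zrevrange(0, -1).index(member)

board = Leaderboard()
board.zadd("Andy", 556)
board.zadd("Barry", 819)
board.zadd("Carl", 105)
board.zadd("Derek", 1312)

print(board.zrevrange(0, -1))   # ['Derek', 'Barry', 'Andy', 'Carl']
print(board.zrevrank("Barry"))  # 1
```

Updating a score is just another `zadd` call: the next `zrevrange` reflects the new ordering immediately, which mirrors how Redis re-ranks a sorted set on every insert.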

Recommendation engines

Similarly, calculating recommendations for users based on other items they've liked requires very fast access to a large dataset. Some algorithms, such as Slope One, are simple and effective but require in-memory access to every item ever rated by anyone in the system. Even if this data is kept in a relational database, it has to be loaded in memory somewhere to run the algorithm.

Redis data structures are a great fit for recommendation data. You can use Redis counters to increment or decrement the number of likes or dislikes for a given item. You can use Redis hashes to maintain a list of everyone who has liked or disliked that item, which is the type of data that Slope One requires. Here is a brief example of storing item likes and dislikes:

INCR "item:38923:likes"
HSET "item:38923:ratings" "Susan" 1
INCR "item:38923:dislikes"
HSET "item:38923:ratings" "Tommy" -1

From this simple data, not only can we use Slope One or Jaccard similarity to recommend similar items, but we can use the same counters to display likes and dislikes in the app itself. In fact, a number of open-source projects use Redis in exactly this manner, such as Recommendify and Recommendable. In addition, because Redis supports persistence, this data can live solely in Redis. This placement eliminates the need for any data loading process, and also offloads an intensive process from your main database.

Chat and messaging

Redis provides a lightweight pub/sub mechanism that is well-suited to simple chat and messaging needs. Use cases include in-app messaging, web chat windows, online game invites and chat, and real-time comment streams (such as you might see during a live streaming event). Two basic Redis commands are involved: PUBLISH and SUBSCRIBE:

SUBSCRIBE "chat:114"
PUBLISH "chat:114" "Hello all"
["message", "chat:114", "Hello all"]
UNSUBSCRIBE "chat:114"

Unlike other Redis data structures, pub/sub messaging doesn't get persisted to disk. Redis pub/sub messages are not written as part of the RDB or AOF backup files that Redis creates. If you want to save these pub/sub messages, you will need to add them to a Redis data structure, such as a list. For more details, refer to Using Pub/Sub for Asynchronous Communication in the Redis Cookbook.

Also, because Redis pub/sub is not persistent, you can lose data if a cache node fails. If you're looking for a reliable topic-based messaging system, consider evaluating Amazon SNS.
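The pattern suggested above, mirroring each published message into a list so it survives, can be sketched in pure Python. The names `publish_with_history` and `chat:114` are illustrative; against a real Redis server this would be a PUBLISH plus an LPUSH, typically wrapped together in a MULTI/EXEC transaction.

```python
# In-memory sketch of pub/sub plus a durable history list. In Redis,
# publish_with_history would issue LPUSH (durable copy) and PUBLISH
# (fan-out to subscribers) for the same channel.

from collections import defaultdict, deque

channels = defaultdict(list)   # channel -> subscriber callbacks
history = defaultdict(deque)   # channel -> persisted messages

def subscribe(channel, callback):
    channels[channel].append(callback)

def publish_with_history(channel, message):
    history[channel].appendleft(message)   # the LPUSH half: durable copy
    for cb in channels[channel]:           # the PUBLISH half: live fan-out
        cb(["message", channel, message])

received = []
subscribe("chat:114", received.append)
publish_with_history("chat:114", "Hello all")

print(received)                   # [['message', 'chat:114', 'Hello all']]
print(list(history["chat:114"]))  # ['Hello all']
```

Subscribers that were offline at publish time can later replay the history list, which is the property raw pub/sub lacks.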

Queues

Although we offer a managed queue service in the form of Amazon Simple Queue Service (Amazon SQS) and we encourage customers to use it, you can also use Redis data structures to build queuing solutions. The Redis documentation for RPOPLPUSH covers two well-documented queuing patterns. In these patterns, Redis lists are used to hold items in a queue. When a process takes an item from the queue to work on it, the item is pushed onto an in-progress queue, and then deleted when the work is done.
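The pending/in-progress pattern can be sketched with two Python deques standing in for the two Redis lists. The job names are made up; in Redis, the pop-and-push step is the single atomic command RPOPLPUSH (or LMOVE in newer versions), and the acknowledgment is an LREM.

```python
# Pure-Python sketch of the RPOPLPUSH reliable-queue pattern: items
# move from a pending list to an in-progress list, and are removed
# only once the work completes.

from collections import deque

pending = deque()        # the main queue (a Redis list)
in_progress = deque()    # the backup / in-progress list

pending.appendleft("job-1")  # producers LPUSH onto the head
pending.appendleft("job-2")

def take():
    # RPOPLPUSH: pop from the tail of pending, push onto in_progress
    job = pending.pop()
    in_progress.appendleft(job)
    return job

def ack(job):
    # LREM: delete the item once the work is done
    in_progress.remove(job)

job = take()
print(job)                # job-1
ack(job)
print(list(in_progress))  # []
print(list(pending))      # ['job-2']
```

If a worker crashes between `take` and `ack`, the item is still sitting in the in-progress list, so a recovery process can detect and requeue it rather than losing it.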

Open-source solutions such as Resque use Redis as a queue; GitHub uses Resque.

Redis does have certain advantages over other queue options, such as very fast speed, once and only once delivery, and guaranteed message ordering. However, pay careful attention to ElastiCache (Redis OSS) backup and recovery options (which we will cover shortly) if you intend to use Redis as a queue. If a Redis node terminates and you have not properly configured its persistence options, you can lose the data for the items in your queue. Essentially, you need to view your queue as a type of database, and treat it appropriately, rather than as a disposable cache.

Client libraries and consistent hashing

As with Memcached, you can find Redis client libraries for the currently popular programming languages. Any of these will work with ElastiCache (Redis OSS):

Table 2 — Redis client libraries

Unlike with Memcached, it is uncommon for Redis libraries to support consistent hashing, because the advanced data types discussed previously cannot simply be horizontally sharded across multiple Redis nodes. This leads to another, very important point: Redis as a technology cannot be horizontally scaled easily. Redis can only scale up to a larger node size, because its data structures must reside in a single memory image in order to perform properly.

Note that Redis Cluster was first made available in Redis version 3.0. It aims to provide scale-out capability with certain data types. Redis Cluster currently only supports a subset of Redis functionality, and has some important caveats about possible data loss. For more details, refer to the Redis cluster specification.
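The Redis cluster specification defines how Redis Cluster assigns each key to one of 16,384 hash slots: CRC16 (XMODEM variant) of the key, modulo 16384, with an optional "hash tag" in braces forcing related keys onto the same slot. Here is a sketch of that mapping in pure Python; `key_slot` is an illustrative name, not a Redis API.

```python
# Sketch of Redis Cluster's key-to-slot mapping per the cluster spec:
# HASH_SLOT = CRC16(key) mod 16384, where only the substring inside
# the first {...} hash tag (if non-empty) is hashed.

def crc16(data):
    # CRC-16/XMODEM: poly 0x1021, init 0x0000, no reflection
    crc = 0
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key):
    s = key.find("{")
    if s != -1:
        e = key.find("}", s + 1)
        if e != -1 and e != s + 1:   # non-empty hash tag
            key = key[s + 1:e]
    return crc16(key.encode()) % 16384

# Keys sharing a hash tag land on the same slot, so multi-key
# operations on them remain possible within a cluster.
print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

This mapping is also why the advanced data types resist sharding: a single sorted set or hash lives entirely within one slot on one node, so cross-slot operations on it are not possible.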