October 19, 2022

Serving millions of contest join requests at My11Circle


A typical match lifecycle



Let’s see what the lifecycle of a match looks like.

The very first phase begins when a match enters our system. Registration then opens, and we move into phase 2 of the match lifecycle, where multiple innovative contests of different sizes and prize callouts see a rush of players joining as per their interest. The next crucial phase begins when the actual toss takes place on the field. The toss is important because this is the first time both teams declare their final list of on-field players. Once the line-ups are announced, players come back to our platform to amend their fantasy teams, if required, and join more contests until the match starts. This is where we get more than 50% of the team joins in a match. This sweet period of roughly 30 minutes poses interesting challenges, which include but are not limited to supporting around 18K join requests per second and 300K API requests per second, keeping fast-filling contests available, and maintaining consistency and high availability.

All of these numbers revolve around a single match, and at any given point we have multiple matches running in parallel in our system.

What did we need?



  • System to support 300K API requests per second with p99 latency as low as 25ms.
  • A mechanism to keep fast-filling contests available to players.
  • A persistent data store that supports high concurrent write and read throughput.
  • Transaction support to achieve consistency.
  • A mechanism to handle a high volume of concurrent contest-join requests along with a rollback facility.
  • Ability to publish real-time join counts for each contest.

How did we achieve it?



The choice of database was an open ground. We put quite a few potential data stores into play, analyzing challenges in data design and consistency and listing their pros and cons. Though this is an interesting topic in itself, for now let's skip that part and conclude that we went ahead with MySQL for managing contests and player join information.

No matter which data store we chose, we needed a distributed caching solution above it to support such a high request rate. We use Redis as our caching layer, and for a few other use cases which we will talk about.

Fast-filling contest availability



We host several contests that must be created dynamically and made available to players as soon as existing ones get filled. Head-to-Head contests are a good example: tons of them get filled within a second. Moreover, we can never fail a player's request to join such a contest, even if the last seat in the requested contest was taken just as we started processing the request. The player's experience should be smooth, without any failures or errors, while we silently seat them in a new contest. In other words, it cannot be like musical chairs 🙂

If we create a contest only after discovering that the requested contest has no seats left, that will take time, as contest creation involves a few DB operations. That means increased latency for players, which is not an option. We also don't want to end up in a situation where multiple players are sitting alone in different contests when they could have been matched together! This means we need to fill contests sequentially and, at the same time, be super quick.

To keep the inventory available, we pre-create a number of such fast-filling contests. The number of contests to create is decided by an underlying algorithm that considers liquidity, contest properties, and join rate as factors. Our algorithm also maintains that same number in the inventory, ensuring the inventory does not grow unnecessarily as contests fill in parallel. But a healthy inventory is not enough. It needs to be backed by a good seat-allocation algorithm. Let's discuss that now.
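The replenishment idea above can be sketched in a few lines. This is a minimal illustration, not our production algorithm: the sizing heuristic, the `Contest` class, and the function names are all assumptions made for the example.

```python
import itertools

_contest_ids = itertools.count(1)

class Contest:
    """Hypothetical stand-in for a pre-created fast-filling contest."""
    def __init__(self, size):
        self.contest_id = next(_contest_ids)
        self.size = size          # total seats, e.g. 2 for Head-to-Head
        self.seats_left = size

def target_inventory(join_rate_per_sec, fill_time_sec, contest_size):
    """Rough sizing: enough open contests to absorb the expected join
    burst while new contests are still being created in the background."""
    expected_joins = join_rate_per_sec * fill_time_sec
    return max(1, -(-expected_joins // contest_size))   # ceiling division

def replenish(inventory, join_rate_per_sec, fill_time_sec, contest_size=2):
    """Top the inventory back up to the target count of open contests."""
    target = target_inventory(join_rate_per_sec, fill_time_sec, contest_size)
    open_contests = [c for c in inventory if c.seats_left > 0]
    for _ in range(target - len(open_contests)):
        inventory.append(Contest(contest_size))
    return inventory
```

Running `replenish([], join_rate_per_sec=100, fill_time_sec=1)` pre-creates 50 Head-to-Head contests, and calling it again after some fill up tops the pool back to 50 open ones.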

Seat allocation

Joining a contest with a team is a lengthy process. What we mean by this is that it involves updating multiple data sets and communicating with multiple microservices. We need to run validations, deduct the player's wallet balance, update the cache, update the database, and so on. The most complicated joins involve several such interactions, and if any step fails, we need to roll back everything. Also, if the join request for the last seat fails, that contest should be given higher priority than newly created contests, to be fair to early joiners and to avoid a reduction of the prize callout.
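The multi-step join with rollback can be sketched as a saga-style sequence: each step has a compensating action, and a failure anywhere undoes everything done so far, in reverse order. The step names and the in-memory `state` dict are hypothetical stand-ins for the real microservice calls.

```python
def join_contest(state, entry_fee, fail_at=None):
    """Run the join steps; on failure, compensate in reverse order.
    `fail_at` lets us simulate a step failing mid-way."""
    steps = [
        # (name, do-action, compensating undo-action)
        ("validate",     lambda: None,                              lambda: None),
        ("debit_wallet", lambda: state.__setitem__("wallet", state["wallet"] - entry_fee),
                         lambda: state.__setitem__("wallet", state["wallet"] + entry_fee)),
        ("update_cache", lambda: state["cache"].add("joined"),
                         lambda: state["cache"].discard("joined")),
        ("update_db",    lambda: state["db"].add("joined"),
                         lambda: state["db"].discard("joined")),
    ]
    done = []
    try:
        for name, do, undo in steps:
            if name == fail_at:                # simulated failure
                raise RuntimeError(f"{name} failed")
            do()
            done.append(undo)
        return True
    except RuntimeError:
        for undo in reversed(done):            # roll back what succeeded
            undo()
        return False
```

A successful run leaves the wallet debited and both stores updated; a run that fails at `update_db` leaves the state exactly as it was before the join started.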

We developed our seating algorithm using Redis, which lets us maintain the available seats in a contest, select the right contest based on priority, and handle rollback scenarios. Being an in-memory store, Redis gives us very low latency at the scale we need. Redis shards help us distribute data across nodes, which automatically distributes player traffic as well. We can also add TTLs to our keys, offloading us from any potential memory leak issues. With a good seating algorithm ready, we then need to maintain consistency between our data and the state of other microservices.
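To make the allocation policy concrete, here is a pure-Python model of it: a priority queue of open contests plus per-contest seat counters, where a rolled-back contest jumps ahead of fresh inventory. In production these structures live in Redis; the priority values and names below are illustrative assumptions, not our actual schema.

```python
import heapq

class SeatAllocator:
    RECLAIMED = 0     # rolled-back contests go to the front of the queue
    NORMAL = 1        # freshly created inventory

    def __init__(self):
        self.heap = []      # entries: (priority, insertion order, contest_id)
        self.seats = {}     # contest_id -> seats left
        self._order = 0

    def add_contest(self, contest_id, seats, priority=NORMAL):
        self.seats[contest_id] = seats
        heapq.heappush(self.heap, (priority, self._order, contest_id))
        self._order += 1

    def allocate(self):
        """Take one seat from the highest-priority open contest."""
        while self.heap:
            priority, order, cid = self.heap[0]
            if self.seats.get(cid, 0) > 0:
                self.seats[cid] -= 1
                if self.seats[cid] == 0:
                    heapq.heappop(self.heap)   # contest full, drop entry
                return cid
            heapq.heappop(self.heap)           # stale entry, skip it
        return None                            # inventory exhausted

    def rollback(self, contest_id):
        """Return a seat and move the contest ahead of fresh inventory."""
        self.seats[contest_id] = self.seats.get(contest_id, 0) + 1
        heapq.heappush(self.heap, (self.RECLAIMED, self._order, contest_id))
        self._order += 1
```

Note how `rollback` re-inserts the contest with the `RECLAIMED` priority, so the next join lands on it before any newly created contest, which is exactly the fairness property described above.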

Consistency

To maintain consistency across microservices, we follow the old-school method: reverting things in other services when something fails in one. But we also need to maintain consistency between the single and multiple datasets lying in our cache and database. The single-threaded nature of Redis for a given shard gives us consistency for a single cache item. For consistency between multiple items in the cache, we need transactional support. To achieve this we use Lua scripts, which allow us to read and modify multiple datasets lying on the same Redis shard atomically. This also helps us serve new-player requests from the cache; otherwise, we might end up making frequent database calls for players who have not created any team or joined any contest, just to check whether they have.
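An example of the kind of Lua script this enables: checking a seat counter and recording the player's membership across two keys on the same shard, atomically. This script and its key layout are illustrative, not our production code, and running it needs a live Redis, so a plain-Python function with the same semantics is included alongside.

```python
# Illustrative Lua script of the kind run via EVAL. Because Redis executes
# the whole script single-threaded on its shard, the check and the two
# writes cannot interleave with other commands.
JOIN_SCRIPT = """
local seats = tonumber(redis.call('GET', KEYS[1]) or '0')
if seats <= 0 then
    return 0                             -- contest already full
end
redis.call('DECR', KEYS[1])              -- take the seat
redis.call('SADD', KEYS[2], ARGV[1])     -- remember who joined
return 1
"""

# With redis-py this would be registered once and invoked per join, e.g.:
#   join = client.register_script(JOIN_SCRIPT)
#   ok = join(keys=[f"seats:{cid}", f"members:{cid}"], args=[player_id])

def join_atomically(store, contest_id, player_id):
    """Plain-Python equivalent of the script's semantics, for illustration;
    `store` is a dict standing in for the Redis keyspace."""
    seats_key, members_key = f"seats:{contest_id}", f"members:{contest_id}"
    if store.get(seats_key, 0) <= 0:
        return 0
    store[seats_key] -= 1
    store.setdefault(members_key, set()).add(player_id)
    return 1
```

Either way, the seat counter and the membership set can never disagree: a caller sees the join fully applied or not at all.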

Conclusion

This is how we handle millions of team join requests in a single match: combining the power of Redis and Lua scripting with the goodness of MySQL to achieve beautiful results. If this is something that excites you, come join us, we are hiring!