Using the Redis Allocator in Rust

November 12, 2019

Introduction

While developing redismodule-rs, the Rust API for writing Redis modules, I encountered the need to set up a custom memory allocator.

Normally, when a Rust program needs to allocate some memory, such as when creating a String or Vec instance, it uses the global allocator defined in the program. Since Redis modules are built as shared libraries to be loaded into Redis, Rust will use the System allocator, which is the default provided by the OS (using the libc malloc(3) function).

This behavior is problematic for several reasons.

First of all, Redis may not be using the system allocator at all, relying on jemalloc instead. The jemalloc allocator is an alternative to the system malloc that includes many tweaks to avoid fragmentation, among other features. If the module uses the system allocator and Redis uses jemalloc, the allocation behavior will be inconsistent.

Secondly, even if Redis always used the system allocator, memory allocated directly by the module would not be visible to Redis: it would not show up in commands such as info memory and would not be influenced by cleanup operations performed by Redis such as eviction of keys.

For these reasons, the Redis Modules API provides hooks such as RedisModule_Alloc and RedisModule_Free. These are used much like the standard malloc and free calls but make Redis aware of the allocated memory in addition to actually passing the call on to the memory allocator.

Using a custom allocator

Rust lets a program replace its global allocator: any type that implements the GlobalAlloc trait can be registered as the allocator behind every heap allocation in the program.
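To make the shape of the trait concrete, here is a minimal sketch that is not taken from the module itself. GlobalAlloc has two required methods, alloc and dealloc, both of which receive a Layout describing the size and alignment of the request. The CountingAlloc type and the ALLOCATED counter are hypothetical names used only for illustration; the allocator simply forwards to System and tracks how many bytes are outstanding.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical allocator for illustration: forwards every request to the
/// system allocator and tracks how many bytes are currently outstanding.
pub struct CountingAlloc;

/// Bytes currently allocated through `CountingAlloc`.
pub static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// To install it for the whole program, one would add:
//
//     #[global_allocator]
//     static GLOBAL: CountingAlloc = CountingAlloc;
```

Once registered with #[global_allocator], every String, Vec, and Box in the program goes through these two methods, which is exactly the hook we need in order to hand allocations over to Redis.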

We can use it by implementing the GlobalAlloc trait with our own methods that delegate the allocation to Redis. For this, we need a way to call the Redis Module API functions from Rust. That is a topic for another post, but in short, we achieve this by using the bindgen crate to generate Rust bindings from the redismodule.h C header file.

The header file defines the functions as follows:

These functions, like the rest of the Modules API, are defined as function pointers. When calling the functions from Rust, we need to dereference the function pointer first, which we do using the unwrap() method. We also need to do some casting to match up the pointer types. Finally, we need to use the unsafe keyword since we dereference raw pointers, which is not allowed in safe Rust for good reasons.
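A sketch of such a naive allocator follows. It is an approximation under stated assumptions, not the crate's exact code: the two static declarations imitate what the generated bindings roughly look like (nullable C function pointers become Option values), and mock_alloc, mock_free, HEADER, and install_mock_api are hypothetical stand-ins for Redis so that the example can run on its own.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::os::raw::c_void;

// Approximation of the generated bindings. In a real module Redis fills
// these in during module initialization; here they start out as `None`.
#[allow(non_upper_case_globals)]
pub static mut RedisModule_Alloc: Option<unsafe extern "C" fn(bytes: usize) -> *mut c_void> = None;
#[allow(non_upper_case_globals)]
pub static mut RedisModule_Free: Option<unsafe extern "C" fn(ptr: *mut c_void)> = None;

pub struct RedisAlloc;

unsafe impl GlobalAlloc for RedisAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Copy the pointer out of the static; `unwrap()` panics if it is `None`.
        let alloc_fn = unsafe { RedisModule_Alloc }.unwrap();
        // Only the size is forwarded, so this relies on the underlying
        // allocator returning sufficiently aligned memory.
        unsafe { alloc_fn(layout.size()) as *mut u8 }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, _layout: Layout) {
        let free_fn = unsafe { RedisModule_Free }.unwrap();
        unsafe { free_fn(ptr as *mut c_void) }
    }
}

// ---- Hypothetical stand-ins for Redis, only so the sketch runs by itself ----

/// Size of the hidden prefix in which the mock stores the allocation size.
const HEADER: usize = 16;

unsafe extern "C" fn mock_alloc(bytes: usize) -> *mut c_void {
    let layout = Layout::from_size_align(bytes + HEADER, HEADER).unwrap();
    unsafe {
        let base = System.alloc(layout);
        if base.is_null() {
            return std::ptr::null_mut();
        }
        (base as *mut usize).write(bytes); // remember the size for `mock_free`
        base.add(HEADER) as *mut c_void
    }
}

unsafe extern "C" fn mock_free(ptr: *mut c_void) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        let base = (ptr as *mut u8).sub(HEADER);
        let bytes = (base as *mut usize).read();
        System.dealloc(base, Layout::from_size_align(bytes + HEADER, HEADER).unwrap());
    }
}

/// Plays the role of Redis filling in the API pointers.
pub fn install_mock_api() {
    unsafe {
        RedisModule_Alloc = Some(mock_alloc);
        RedisModule_Free = Some(mock_free);
    }
}
```

Note that the allocator works only after install_mock_api() has run. Any allocation made while the pointers are still None ends in a failed unwrap().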

The crash

Unfortunately, it’s not that simple. When we build a module with this custom allocator and load it into Redis, it crashes on us. Redis does print a nice stack trace when it crashes, so let’s look at it:

So, it looks like we had a null pointer dereference here (3 ??? 0x0000000000000000 0x0 + 0), but what are all these weird symbols starting with _ZN…?

After a bit of searching, we find that this is the way Rust does name mangling: unlike in C, and similarly to C++, multiple functions with the same name can coexist in Rust, since there are various namespace mechanisms such as modules and traits to distinguish them. To generate unique symbols that are C-compatible, the compiler mangles these into long and ugly unique names. To demangle these names back into the originals, we can filter the output through rustfilt.
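As a small illustration of why mangling is needed, the sketch below defines two functions that are both called init in different modules. The module names cache and stats and the helper path_of are made up for this example. The helper uses std::any::type_name, whose exact output is not guaranteed by the standard library, to show that the compiler tells the two functions apart by their full paths; a mangled symbol has to encode the same information.

```rust
mod cache {
    pub fn init() -> &'static str {
        "cache"
    }
}

mod stats {
    pub fn init() -> &'static str {
        "stats"
    }
}

/// Returns the compiler's name for the type of `_f`. For a function item this
/// includes the module path, which is what a mangled symbol has to encode.
fn path_of<T>(_f: T) -> &'static str {
    std::any::type_name::<T>()
}
```

Calling path_of(cache::init) and path_of(stats::init) yields two different paths even though both functions are named init, so a plain C-style symbol called init could not represent both.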

This gives us the following stack trace (uninteresting parts removed):

It still took me a lot of head-scratching and experimenting to figure it out, but here’s what happened:

The functions of the Redis modules API are accessed via C function pointers. Instead of relying on the dynamic linker to initialize these pointers, they are initialized explicitly by Redis as part of the module initialization process.

As the stack trace shows, during the loading of the module, we call the CString::new function. This standard library function allocates memory for a string. This, in turn, calls our allocator, which would then call RedisModule_Alloc.unwrap()… to actually perform the allocation. This causes a chicken-and-egg problem. The Redis module is not ready yet, meaning our function pointers have not yet been initialized, so we can’t call the relevant API to perform the allocation.
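The ordering problem can be reproduced in miniature. In this sketch, which is an illustration rather than the module's code, the static imitates a generated binding, while alloc_api_ready, fake_alloc, and simulate_module_init are hypothetical names. The pointer is None (a null pointer at the C level) until the simulated initialization step has run, so an allocator that unconditionally unwraps it cannot serve any allocation made before that point.

```rust
use std::os::raw::c_void;

type AllocFn = unsafe extern "C" fn(bytes: usize) -> *mut c_void;

// Imitates a generated binding: empty until the host fills it in.
#[allow(non_upper_case_globals)]
static mut RedisModule_Alloc: Option<AllocFn> = None;

/// Returns true once the API pointer has been filled in.
fn alloc_api_ready() -> bool {
    let alloc_fn = unsafe { RedisModule_Alloc }; // copy the pointer out
    alloc_fn.is_some()
}

/// Hypothetical placeholder; a real host would install its own allocator.
unsafe extern "C" fn fake_alloc(_bytes: usize) -> *mut c_void {
    std::ptr::null_mut()
}

/// Stand-in for the initialization step in which Redis sets the pointers.
fn simulate_module_init() {
    unsafe {
        RedisModule_Alloc = Some(fake_alloc);
    }
}
```

Before simulate_module_init() is called, alloc_api_ready() returns false. That is the window in which CString::new ran in our module.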

The solution

I tried various approaches to solve this, but there seemed to be no clean way to avoid the allocation during module initialization. The next-best thing would be to use the standard allocator until the module is ready and then switch to the custom one. However, Rust doesn’t allow changing the global allocator at runtime, so we can’t do that directly.

I ended up adding a flag to the custom allocator that causes allocations to be passed through to the system allocator at startup. After the module initialization is complete, the flag is toggled so that further allocations are performed via the Redis allocator. This solution still has edge cases. Most importantly, it requires that all memory allocated before the switch is freed before the switch; otherwise, that memory would leak. However, it’s good enough for our purposes.

Here is what the final code looks like:

We add a static flag named USE_REDIS_ALLOC that determines whether we should use the Redis allocator or the system one. It’s important to guarantee safety when mutating static data, so we use an AtomicBool here that is false by default.

In the module initialization code, we call use_redis_alloc when the module is ready to use. At this point, we can safely start using the Redis allocator, and all future allocations will be accounted for by Redis.
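The following sketch shows one way the pieces described above could fit together. It is a reconstruction under stated assumptions rather than the crate's exact code: the two static declarations imitate the generated bindings, and the choice of Ordering::SeqCst is an assumption. USE_REDIS_ALLOC and use_redis_alloc are the names used in this post.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::os::raw::c_void;
use std::sync::atomic::{AtomicBool, Ordering};

// Approximation of the generated bindings; Redis fills these in at load time.
#[allow(non_upper_case_globals)]
pub static mut RedisModule_Alloc: Option<unsafe extern "C" fn(bytes: usize) -> *mut c_void> = None;
#[allow(non_upper_case_globals)]
pub static mut RedisModule_Free: Option<unsafe extern "C" fn(ptr: *mut c_void)> = None;

/// False at startup, so early allocations go to the system allocator.
static USE_REDIS_ALLOC: AtomicBool = AtomicBool::new(false);

pub struct RedisAlloc;

unsafe impl GlobalAlloc for RedisAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if USE_REDIS_ALLOC.load(Ordering::SeqCst) {
            let alloc_fn = unsafe { RedisModule_Alloc }.unwrap();
            unsafe { alloc_fn(layout.size()) as *mut u8 }
        } else {
            unsafe { System.alloc(layout) }
        }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        if USE_REDIS_ALLOC.load(Ordering::SeqCst) {
            let free_fn = unsafe { RedisModule_Free }.unwrap();
            unsafe { free_fn(ptr as *mut c_void) }
        } else {
            unsafe { System.dealloc(ptr, layout) }
        }
    }
}

/// Called once module initialization is complete and the API pointers are set.
pub fn use_redis_alloc() {
    USE_REDIS_ALLOC.store(true, Ordering::SeqCst);
}

// In the module crate the allocator would be registered with:
//
//     #[global_allocator]
//     static ALLOC: RedisAlloc = RedisAlloc;
```

With the flag still false, alloc and dealloc never touch the API pointers, so allocations made during module loading no longer hit an uninitialized pointer. The sketch also makes the edge case visible: a block obtained from System before the switch but released after it would be handed to RedisModule_Free, which is why everything allocated early has to be freed before use_redis_alloc is called.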

This took care of the crash, and the fix ended up in the redis-module crate. Feel free to check it out and let me know how you like it!