11 Dec 2010

How to run Redis natively on Xen

In this post we investigate how Redis, a popular key-value store, can be run natively on Xen, i.e., without the support of a conventional operating system such as Linux, and what implications this has for performance.

Introduction

In recent months there has been a lot of buzz about the increasing complexity and the number of abstraction layers in virtualized environments. A post at HighScalability.com touches on this issue and highlights the performance tax of these layers. It is necessary to reconsider current architectures and to simplify the complex layering.

Two recently announced projects focus on providing language runtime environments that run directly on Xen: Mirage for OCaml and HaLVM for Haskell. This removes the conventional operating system that typically hosts these runtime environments, and can potentially improve efficiency.

However, these approaches require that the software is written in either OCaml or Haskell, and the majority of software, which is written in C, still requires a conventional operating system layer. A solution to this problem was presented in a paper about HPC using lightweight Xen VMs, where software written in C is built using a special toolchain for Xen that produces a Xen VM image rather than an executable binary for a conventional operating system. Unfortunately, no benchmarks or concrete implementations are presented.

Hypothesis

In virtualized environments, services running in virtual machines will experience a significant performance improvement when the conventional operating system layer is removed and replaced with a vastly simplified one.

Redis

Redis is a popular in-memory key-value store written in C. A few articles describe the architecture and background of Redis, and we simply refer to them: “Redis, from the Ground Up” and “Redis: under the hood”.

We picked Redis as an example for several reasons:

Small and clean code base (~20k lines)
In-memory storage; persistence to disk can be deactivated
Mostly single process and no threading

Overall, this simplicity was the main reason for choosing Redis as a proof of concept.

Xen Mini-OS and Stub Domains

Xen Mini-OS started as a small example kernel to demonstrate to developers how to port their kernels to Xen (for paravirtualization). More features were added over time (cf. Xen 3.3 Feature: Stub Domains), such as a C library, a TCP/IP stack, and a POSIX environment, and other application scenarios for Mini-OS were discovered, e.g., PVGrub is based on Mini-OS. Nowadays, stub domains are small Xen domains based on Mini-OS and these added features, and they are tightly integrated into the Xen build system (cf. xen-unstable.hg/stubdom). The “Hello World” stub domains written in C and OCaml can be used as a basis for developing your own stub domains.

Implementation

Getting Redis to run as a Xen stub domain basically consists of the following steps:

1) Take the git version of Redis
2) Integrate Redis as a new stubdom in Xen’s build system
3) Hack and try to compile until it works
4) Run as Xen VM and fix problems until it does not crash
5) “Done”

Steps 1) and 2) are straightforward; step 2) just required some Makefile modifications.

To make it compile, that is step 3), we had to make some adaptations (read: hacks) to the Mini-OS environment and to Redis. For example, minor changes had to be made to the calls for randomness generation and process synchronization. We also wrapped the Redis main function in the following code, which works around some problems with the standard file descriptors and error handling.

#include <stdio.h>
#include <errno.h>
#include <unistd.h>

/* Ugly binary compatibility with Linux */
FILE *_stderr asm("stderr");
FILE *_stdout asm("stdout");
FILE *_stdin asm("stdin");
int *__errno_location;
void *__ctype_b_loc;

extern int redis_main(int argc, char **argv);

int main(int argc, char **argv, char **envp)
{
    _stderr = stderr;
    __errno_location = &errno;

    printf("starting redis\n");
    /* Wait before things might hang up */
    sleep(1);

    redis_main(argc, argv);
    return 0;
}

Furthermore, for step 4), calling fork() crashes the stub domain because Mini-OS does not support it. We therefore had to disable the database dump to hard disk and the virtual memory support (cf. Redis Virtual Memory).

The result is the following Xen VM image:

mini-os-20101211.gz

(It’s a proof of concept, so do not use it in any production environment, and use it at your own risk.)

Run the Redis Xen Image

Download the Xen image mentioned above to your dom0 host system and store the following Xen VM configuration in redis_minios.conf.

kernel = "/path/to/redis/mini-os.gz"
name = "redis_minios"
memory = 512
vif = ['ip="10.0.0.1"']
on_crash = "destroy"

Set an IP alias for the dom0 ethernet interface:

ifconfig eth0:0 10.0.0.2

Start the Redis Xen VM using:

xm create redis_minios.conf

You can now connect to the Redis instance running on 10.0.0.1.
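To check that the instance answers without installing a client library, a PING can be sent over a plain TCP socket. The following is a minimal sketch, assuming the image listens on the Redis default port 6379 and accepts the multi-bulk request format; the helper names encode_command and ping are ours and not part of Redis.

```python
import socket


def encode_command(*args):
    """Encode a command in the Redis multi-bulk request format."""
    parts = [b"*" + str(len(args)).encode() + b"\r\n"]
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def ping(host="10.0.0.1", port=6379, timeout=2.0):
    """Send PING and return the first reply line without the trailing CRLF.

    The host matches the address used in this post; the port is an
    assumption (Redis default).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_command("PING"))
        reply = sock.makefile("rb").readline()
    return reply.rstrip(b"\r\n")
```

Calling ping() from dom0 should return the status reply +PONG if the VM is up and reachable; a timeout or a refused connection usually points to the IP alias or the vif configuration.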

Benchmark

Since our hypothesis is that by removing the conventional operating system layer we gain a significant performance improvement, we had to benchmark the resulting Mini-OS-based Redis version and compare it to a traditional Linux-based one.

Our test machine is a Debian Lenny x86-64 box running Xen version 3.2-1 with a 2.2 GHz Athlon 64 3700+ CPU and 2 GB of memory. We have two VMs running Redis (one on Mini-OS and one on Linux) and run redis-benchmark from dom0.

Redis on MiniOS Xen VM:

PING: 14087.32 requests per second
PING (multi bulk): 13352.47 requests per second
SET: 11682.24 requests per second
GET: 13949.79 requests per second
INCR: 13125.98 requests per second
LPUSH: 13589.67 requests per second
LPOP: 13333.33 requests per second
SADD: 13175.23 requests per second
SPOP: 12970.17 requests per second
LPUSH (again, in order to bench LRANGE): 13123.36 requests per second
LRANGE (first 100 elements): 9330.22 requests per second
LRANGE (first 300 elements): 4894.76 requests per second
LRANGE (first 450 elements): 3387.53 requests per second
LRANGE (first 600 elements): 2905.29 requests per second

Redis on Debian Linux Xen VM:

PING: 11264.04 requests per second
PING (multi bulk): 11210.76 requests per second
SET: 10857.92 requests per second
GET: 11098.78 requests per second
INCR: 10854.66 requests per second
LPUSH: 10896.74 requests per second
LPOP: 11100.55 requests per second
SADD: 11080.84 requests per second
SPOP: 11251.12 requests per second
LPUSH (again, in order to bench LRANGE): 11061.95 requests per second
LRANGE (first 100 elements): 8849.56 requests per second
LRANGE (first 300 elements): 4944.28 requests per second
LRANGE (first 450 elements): 4075.79 requests per second
LRANGE (first 600 elements): 3558.16 requests per second
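To compare the two runs programmatically, the result lines can be turned into a dictionary. This is a small sketch that assumes the output has exactly the "NAME: N requests per second" shape shown above; the function name parse_benchmark is ours.

```python
import re

# Matches lines such as "SET: 11682.24 requests per second".
RESULT_LINE = re.compile(r"^(?P<name>.+): (?P<rps>\d+(?:\.\d+)?) requests per second$")


def parse_benchmark(text):
    """Map each benchmark name to its requests-per-second figure."""
    results = {}
    for line in text.splitlines():
        match = RESULT_LINE.match(line.strip())
        if match:  # headings and blank lines are skipped
            results[match.group("name")] = float(match.group("rps"))
    return results
```

Feeding it the two listings gives one dictionary per VM, which makes it easy to compute per-operation ratios instead of comparing the numbers by eye.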

Analysis

The most interesting operations for common applications are SET and GET. Based on the numbers above, the Mini-OS-based Redis version is about 8% faster for SET and about 26% faster for GET than the Linux-based one. This is not a dramatic improvement, but it demonstrates the performance tax of the conventional operating system layer.
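The relative difference for SET and GET can be recomputed directly from the requests-per-second figures listed in the benchmark section. A minimal sketch, with the values copied from the two listings and a helper name of our own choosing:

```python
# Requests per second for SET and GET, copied from the two result listings.
MINIOS = {"SET": 11682.24, "GET": 13949.79}
LINUX = {"SET": 10857.92, "GET": 11098.78}


def improvement_percent(new, old):
    """Relative throughput gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


for op in ("SET", "GET"):
    gain = improvement_percent(MINIOS[op], LINUX[op])
    print(f"{op}: {gain:+.1f}% (Mini-OS vs. Linux)")
```

Since these are single runs, the percentages should be read as rough indications rather than precise measurements.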

Limitations

In the current proof of concept we have the following limitations:

No DHCP support, therefore the image has a hard-coded 10.0.0.1 IP address
Redis functionality disabled: virtual memory and persistent database

Next Steps

As next steps we have to do more thorough benchmarking and profiling of the Mini-OS-based Redis in order to determine bottlenecks in the current implementation. Furthermore, we need to get DHCP working so that the image can be run in dynamic network environments. For example, we could create an Amazon Web Services EC2 image using Amazon’s use-your-own-kernel technology, which is, by the way, based on Mini-OS.