[입 개발] Redis Scan은 어떻게 동작할까? PART #1

처음부터 꾸준히 입만 놀리는 입개발(or 혀로그래머) CharSyam입니다. Redis의 기능중에 사람들이 쓰면 안되지만, 그 단맛에 끌려 어쩔 수 없이 치게 되는 명령이 KEYS 입니다. KEYS를 쓰는 순간, Redis는 이 명령을 처리하기 위해서 멈춰버립니다. 특히 트래픽이 많은 서버는… 이 KEYS 명령 하나에 많은 장애를 내게 됩니다. 그런데… 어느 순간… Redis에 Scan이라는 명령이 생겼습니다. KEYS의 단점을 없애면서도, 느리지 않은… 그렇다면, Redis에서는 어떻게 Scan 이 동작하게 될까요?

Scan의 내부 동작을 알기 전에… 먼저 Redis가 어떻게 데이터를 저장하는지 부터 다시 한번 집고 넘어가야 합니다. Redis 의 가장 기초적인 자료구조는 KV 즉 Key/Value 형태를 저장하는 것입니다.(String 타입이라고도 합니다.) 이를 위해 Redis는 Bucket을 활용한 Chained Linked List 구조를 사용합니다. 최초에는 4개의 Bucket에서 사용하여… 같은 Bucket에 들어가는 Key는 링크드 리스트 형태로 저장하는 거죠. 즉 다음 그림과 같습니다.

이 Chained Linked List에는 다음과 같은 약점이 있습니다. 즉 한 Bucket 안에 데이터가 많아지면 결국 탐색 속도가 느려집니다. 이를 위해서 Redis는 특정 사이즈가 넘을때마다 Bucket을 두 배로 확장하고, Key들을 rehash 하게 됩니다. 먼저 이 때 Key의 Hash로 사용하는 해시함수는 다음과 같습니다. MurmurHash2를 사용합니다.

/* MurmurHash2, by Austin Appleby
 * Note - This code makes a few assumptions about how your machine behaves -
 * 1. We can read a 4-byte value from any address without crashing
 * 2. sizeof(int) == 4
 *
 * And it has a few limitations -
 *
 * 1. It will not work incrementally.
 * 2. It will not produce the same results on little-endian and big-endian
 *    machines.
 */
unsigned int dictGenHashFunction(const void *key, int len) {
    /* 'm' and 'r' are mixing constants generated offline.
     They're not really 'magic', they just happen to work well.  */
    uint32_t seed = dict_hash_function_seed;
    const uint32_t m = 0x5bd1e995;
    const int r = 24;

    /* Initialize the hash to a 'random' value */
    uint32_t h = seed ^ len;

    /* Mix 4 bytes at a time into the hash */
    const unsigned char *data = (const unsigned char *)key;

    while(len >= 4) {
        uint32_t k = *(uint32_t*)data;

        k *= m;
        k ^= k >> r;
        k *= m;

        h *= m;
        h ^= k;

        data += 4;
        len -= 4;
    }

    /* Handle the last few bytes of the input array  */
    switch(len) {
    case 3: h ^= data[2] << 16;
    case 2: h ^= data[1] << 8;
    case 1: h ^= data[0]; h *= m;
    };

    /* Do a few final mixes of the hash to ensure the last few
     * bytes are well-incorporated. */
    h ^= h >> 13;
    h *= m;
    h ^= h >> 15;

    return (unsigned int)h;
}

그리고 hash 값이 들어가야 할 hash table 내의 index를 결정하는 방법은 다음과 같습니다.

/* Returns the index of a free slot that can be populated with
 * a hash entry for the given 'key'.
 * If the key already exists, -1 is returned.
 *
 * Note that if we are in the process of rehashing the hash table, the
 * index is always returned in the context of the second (new) hash table. */
static int _dictKeyIndex(dict *d, const void *key)
{
    ......
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        ......
    }
    return idx;
}

table에는 Key를 찾기위해 비트 연산을 하기 위한 sizemask가 들어가 있습니다. 초기에는 table의 bucket이 4개 이므로 sizemask는
이진수로 11 즉 3의 값이 셋팅됩니다. 즉 해시된 결과 & 11의 연산결과로 들어가야 하는 Bucket이 결정되게 됩니다.

여기서 Key 가 많아지면 Redis는 Table의 사이즈를 2배로 늘리게 됩니다. 그러면 당연히 sizemask도 커지게 됩니다. Table size가 8이면 sizemask는 7이 됩니다.

먼저 간단하게 말하자면, Scan의 원리는 이 Bucket을 한 턴에 하나씩 순회하는 것입니다. 그래서 아래 그림과 같이 처음에는 Bucket Index 0를 읽고 데이터를 던져주는 것입니다.

        t0 = &(d->ht[0]);
        m0 = t0->sizemask;

        /* Emit entries at cursor */
        de = t0->table[v & m0];
        while (de) {
            fn(privdata, de);
            de = de->next;
        }

다음 편에서는 실제 사용되는 cursor 가 어떤 식으로 만들어지는 지, 그 외의 예외 케이스는 어떤 것이 있는지… 그리고 scan 을 사용할 때 주의사항에는 어떤 것이 있는지 알아보도록 하겠습니다.(이거 적고, 안 쓸지도 ㅎㅎㅎ)

Pingback: [입개발] Redis Scan은 어떻게 동작할까? PART #2 | Charsyam's Blog
hexists 7:40 am on January 28, 2015

궁금한게 있어 질문 남깁니다.
redis에서는 data type이 strings, lists, sets, hashes, sorted sets 이렇게 지원되는 것으로 아는데,
이 data type들이 모두 bucket을 활용한 chained linked list 구조로 저장되나요?
아니면 각각의 data type에 따라 저장 방식이 다른가요? ^^
인터넷을 많이 찾아봤지만, 명확히 해답을 얻지 못하여 글 남깁니다.
- charsyam 7:46 am on January 28, 2015
  
  일단 기본적인 데이터 구조는 bucket을 활용한 chained linked list 구조입니다.
  기본적인 구조라고 한것은 redis.conf의 옵션에 따라서 내부 구조가 ziplist와 intset 같은 몇가지 구조가 더 있습니다.
  메모리를 아끼기 위해서이구요. 그리고 최신 소스를 보면 list는 또 예전에는 더블링크드 리스트였다가 quicklist라는(list + ziplist)
  형태로 바뀌었습니다. Sorted Set은 skiplist 이구요.
  
  간단하게 정리하면
  String(Key/Value) : Hash(Bucket을 활용한 Chained Linked List)
  Hash : Hash(Bucket을 활용한 Chained Linked List)
  Set : intset/Hash(Bucket을 활용한 Chained Linked List)(Main)
  Sorted Set : ziplist/SkipList(Main)
  List : Double LinkedList(이전), QuickList(list + ziplist 블럭)
  
  이런식입니다. 감사합니다.
  - hexists 7:58 am on January 28, 2015
    
    감사합니다. 자세히 설명해주신 덕분에 잘 이해했습니다.
    혹시, 관련 doc 링크도 알려주실 수 있나요?^^
charsyam 9:11 am on January 28, 2015

@hexists 슬프게도, 그런 내용이 있는 doc은 잘 모르겠네요. 저는 그냥 소스를 보고 이해한거라 T.T
- hexists 9:23 am on January 28, 2015
  
  그렇군요^^ 감사합니다!!

Share this:

Related