Fastbin dup with tcache

Aug 30, 2022

While learning the fastbin dup attack, all the online resources I found made the same assumptions: GLIBC is either compiled without tcache support, or in one of the exploit steps calloc is called instead of malloc, and that made me confused.

In this article I will cover how to use the fastbin dup attack with a modern GLIBC and shed some light on this exploitation technique.

The original fastbin dup attack

The original fastbin dup attack leverages a so-called double free. A double free occurs when you call free on an already free’d chunk. The fastbin dup attack takes advantage of the double free and forces malloc to return the same chunk two times. This can later be used to edit the chunk’s metadata and obtain an arbitrary read/write primitive.

What is a fastbin?

A fastbin is one of the freelists that malloc uses to keep track of free chunks. Bins are usually differentiated based on the size of chunks they contain. There are 10 fastbins, each containing a non-circular singly linked list of different single-sized chunks.

The first fastbin contains 0x20 sized chunks, the second fastbin contains 0x30 sized chunks, the third 0x40, up to the tenth that contains 0xb0 sized chunks. Under default conditions, only the first 7 fastbins are available.

When a chunk is free’d into a fastbin, that chunk is linked to the HEAD of the fastbin. The fastbins’ HEAD pointers are stored in the main_arena variable, which is of type malloc_state.

// glibc/malloc/malloc.c
struct malloc_state
{
  /* Serialize access.  */
  __libc_lock_define (, mutex);

  /* Flags (formerly in max_fast).  */
  int flags;

  /* Set if the fastbin chunks contain recently inserted free blocks.  */
  /* Note this is a bool but not all targets support atomics on booleans.  */
  /* This field is only present from glibc 2.26 onwards, it was part of
     flags before */
  int have_fastchunks;

  /* Fastbins */
  mfastbinptr fastbinsY[NFASTBINS];

  /* Base of the topmost chunk -- not otherwise kept in a bin */
  mchunkptr top;

  /* The remainder from the most recent split of a small request */
  mchunkptr last_remainder;

  /* ... other fields that we won't cover */
};

This roughly translates to the following scheme. It is not complete, since main_arena has many more fields, but they are not relevant to our discussion now.

A table representing the struct main_arena If you’re curious about the actual fields contained in the malloc_state struct, you can check them out in the source code.

The following picture shows how the 0x20 fastbin could look like once some chunks are free’d into it. Note how the first quadword of user data gets repurposed as an fd pointer. The NULL pointer of the third chunk indicates that no further chunks are linked into the list.

Please also note how malloc pointers point eight bytes before the size field.

A picture representing how the fastbin list looks like

When does a chunk get linked into a fastbin?

In GLIBCs that do not have support for tcache or have been compiled without support for tcache, a free chunk that fits in a fastbin’s size gets linked into that fastbin.

// fastbin_demo_no_tcache.c
//
// glibc 2.31 compiled without tcache
// gcc -o fastbins_demo_no_tcache fastbin_demo_no_tcache.c -g
#include <stdio.h>
#include <stdlib.h>

int main() {
  // This is needed to prevent printf buffer (which is stored
  // in the heap) messes everything up.
  setbuf(stdout, 0);

  void *a = malloc(0x18);
  void *b = malloc(0x18);
  free(a);
  free(b);

  return 0;
}

If we added a breakpoint to the return 0 instruction, we would see that the fastbin contains only the chunk A.

Let’s check it.

pwndbg> vis
0x603000	0x0000000000000000	0x0000000000000021	........!.......	<-- fastbins[0x20][1]
0x603010	0x0000000000000000	0x0000000000000000	................
0x603020	0x0000000000000000	0x0000000000000021	........!.......	<-- fastbins[0x20][0]
0x603030	0x0000000000603000	0x0000000000000000	.0`.............
0x603040	0x0000000000000000	0x0000000000020fc1	................	<-- Top chunk

pwndbg> fastbins
fastbins
0x20: 0x603020 -> 0x603000 <- 0x0
0x30: 0x0
0x40: 0x0
0x50: 0x0
0x60: 0x0
0x70: 0x0
0x80: 0x0

We can see two chunks: chunk A, starting at 0x603000 and chunk B, starting at 0x603020. Since chunk A has been free’d first, it appears last in the fastbin freelist. Note how the first quadword of chunk B has been repurposed to hold the fd pointer, which points to the following chunk in the 0x20 fastbin, chunk A.

After chunk B we can see a strange value (0x20fc1). This is part of the top chunk, and holds the remaining space in the current arena. When a request’s size exceeds the top chunk size field, malloc tries to request additional space from the OS to serve the request.

When does a chunk get linked out of a fastbin?

In GLIBCs that don’t support tcache or are compiled without support for tcache, when a request size matches the fastbin range, malloc checks for free chunks in the correct size fastbin. If the bin’s not empty, malloc allocates a chunk from the head of the fastbin, removing it from the list.

The double free vulnerability

With everything we’ve learned so far, we know that if we free a proper sized chunk, this gets linked into a list of free chunks, called fastbin.

When a same sized request arrives, in GLIBCs that don’t support tcache, malloc tries to allocate a chunk looking into the fastbin first and then falling back to allocate a new chunk.

What happens if we try to free the same chunk two times? Based on what we know, we might expect to see it linked two times in the fastbin.

Let’s try that and see what happens.

// double_free_demo_no_tcache.c
//
// glibc 2.31 compiled without tcache
// gcc -o double_free_demo_no_tcache double_free_demo_no_tcache.c -g

#include <stdio.h>
#include <stdlib.h>

int main() {
  // This is needed to prevent printf buffer (which is stored
  // in the heap) messing everything up.
  setbuf(stdout, 0);

  void *a = malloc(0x18);

  free(a);
  free(a);
}

// If we run it, we get the following message:
// double free or corruption (fasttop)
// fish: Job 1, './double_free_demo_no_tcache' terminated by signal SIGABRT (Abort)

The program explodes with a message, saying double free or corruption (fasttop). We have hit one of the GLIBC heap hardening measures. We can search the source code to find the offending line. malloc checks if the head pointer of the fastbin is the same as the chunk being free’d. To bypass this check we can free another chunk between the two free(a) calls.

// double_free_2_demo_no_tcache.c
//
// glibc 2.31 compiled without tcache
// gcc -o double_free_2_demo_no_tcache double_free_2_demo_no_tcache.c -g

#include <stdio.h>
#include <stdlib.h>

int main() {
  // This is needed to prevent printf buffer (which is stored
  // in the heap) messing everything up.
  setbuf(stdout, 0);

  void *a = malloc(0x18);
  void *b = malloc(0x18);

  free(a);
  free(b);
  free(a);

  return 0;
}

This time the program doesn’t explode and behaves. Let’s see what happens with pwndbg.

pwndbg> vis

0x603000	0x0000000000000000	0x0000000000000021	........!.......	<-- fastbins[0x20][0], fastbins[0x20][0]
0x603010	0x0000000000603020	0x0000000000000000	0`.............
0x603020	0x0000000000000000	0x0000000000000021	........!.......	<-- fastbins[0x20][1]
0x603030	0x0000000000603000	0x0000000000000000	.0`.............
0x603040	0x0000000000000000	0x0000000000020fc1	................	<-- Top chunk

pwndbg> fastbins
fastbins
0x20: 0x603000 —▸ 0x603020 ◂— 0x603000
0x30: 0x0
0x40: 0x0
0x50: 0x0
0x60: 0x0
0x70: 0x0
0x80: 0x0

pwndbg> dq &main_arena 14
00007ffff7dcfb60     0000000000000000 0000000000000001
00007ffff7dcfb70     0000000000603000 0000000000000000 <--- the first quadword is the head to the 0x20 fastbin
00007ffff7dcfb80     0000000000000000 0000000000000000
00007ffff7dcfb90     0000000000000000 0000000000000000
00007ffff7dcfba0     0000000000000000 0000000000000000
00007ffff7dcfbb0     0000000000000000 0000000000000000
00007ffff7dcfbc0     0000000000603040 0000000000000000 <--- the top chunk pointer

We turned a non circular linked list into one. If we now request a chunk, malloc will hand us the chunk starting at 0x603000 and it will set the top of the 0x20 fastbin to the fd of the 0x603000 chunk, removing it from the fastbin. At this point, the head of the 0x20 fastbin points to 0x603020, if we malloc(0x18) again, malloc will return the 0x603020 chunk. As always, before returning the chunk to the user, malloc will remove the chunk from the fastbin, setting the head of the fastbin to the chunk’s fd pointer. But this fd pointer is the same of the first malloc allocation.

#include <stdio.h>
#include <stdlib.h>

int main() {
  // This is needed to prevent printf buffer (which is stored
  // in the heap) messing everything up.
  setbuf(stdout, 0);

  void *a = malloc(0x18);
  void *b = malloc(0x18);

  free(a);
  free(b);
  free(a);

  a = malloc(0x18);
  malloc(0x18);
  b = malloc(0x18);

  printf("A points to %p\n", a);
  printf("B points to %p\n", b);

  return 0;
}

The fastbin dup attack has worked, a and b point to the same address. So far the attack is straightforward. But what happens if we throw tcache in the mix?

If we try to execute the code above on a tcache enabled GLIBC, the program aborts with free(): double free detected in tcache 2. Let’s head into what is this tcache.

Tcache

The tcache is a per-thread cache that contains a small collection of chunks that can be accessed without needing to lock an arena, offering a substantial performance optimization in certain workloads.

These chunks are stored as an array of singly-linked lists, like fastbins, but with links pointing to the payload (user data), not the chunk header (as opposed to fastbins).

Each bin contains one size chunk, so the array is indexed by chunk size. Unlike fastbins, the tcache is limited in how many chunks are allowed in each bin. The tcache_count field in struct malloc_par defines this limit. The default value is 7. This limit can be changed using Tunables.

When the tcache bin is empty for a given requested size, the request is passed to the normal malloc routines.

The tcachebins hold chunk sizes that span from 0x20 to 0x410 bytes. tcache saves the number of chunks linked in every tcachebin in an array of counters called counts, in the tcache_perthread_struct.

When does a chunk get linked in a fastbin when `tcache` is enabled?

When a free’d chunk is in tcachebin range, the free algorithm tries to link the chunk there. When there’s no more room for other chunks in the tcachebin (i.e. the number of linked chunks in a tcachebin equals tcache_count) and the size is valid for a fastbin, the chunk is linked into the relevant fastbin instead.

When does a chunk get linked out of a fastbin when `tcache` is enabled?

When a request size is in the fastbins range, since the tcachebins and the fastbins’ sizes partially overlap, malloc tries to service the request from the relevant tcachebin first. If the tcachebin is empty, malloc falls back to fastbins.

Tcache double-free protection

One could think that we could attack tcachebins in the same way we attacked fastbins, but tcache has some quirks that make them harder to exploit.

In tcache chunks, the second quadword of user data is repurposed as a key value. When a chunk is linked into a tcachebin, the key field is set to a pointer to a malloc struct.

When a new chunk is a candidate for free, the key field is checked: if it matches the malloc struct’s address, the candidate chunk’s address is checked against every chunk in the tcachebin, if free finds a match, the program aborts.

A simple bypass would be to overwrite the key field with a random value, but that’s not always possible, depending on the bug primitives you have available.

The introduction of the safe linking protections in newer GLIBCs makes things even harder without a heap leak.

However, the bypass of the tcachebin double-free protections is out of the scope of this article.

If we control a sufficient number of mallocs and frees, we can bypass the use of tcachebins altogether and fall back to fastbins, in order to perform a standard fastbin dup attack, even with the tcache enabled.

The exploit on how2heap that accounts for `tcache` but uses `calloc` (?)

Legendary CTF super-team Shellphish manages a public repository on GitHub, called how2heap, that demonstrates the most famous heap exploits and vulnerabilities. The fastbin dup attack is also covered.

Their demo exploit accounts for the presence of the tcache filling the tcachebins before proceeding to the actual attack.

After this, it uses the calloc function instead of malloc to allocate chunks, and this, at first, confused me a lot.

Turns out that calloc does not allocate from the tcache. And, apparently, no one knows why.

This made me even more confused.

Does this mean that the fastbin dup attack does not work unless calloc is used instead of malloc, while using GLIBCs with tcache? Fortunately no, I guess the shellphish team used calloc for clarity and to avoid useless clutter, but it made me scratch my bald head for quite a minute.

I managed to make the shellphish exploit work with mallocs only. It’s a little more cluttered, but it works.

You only have to keep in mind the cases when malloc and free turn to fastbin, keeping the tcachebins full or empty, based on our needs.

This means, if we want to free a chunk into the fastbin, we have to fill up the respective tcache beforehand, so that our target chunk gets free’d into the fastbin, and not into a tcachebin (since it’s full).

On the contrary, if we want to malloc a particular chunk from fastbins, we need to make sure that the respective tcachebin is empty, so that malloc turns to fastbins to select candidates to return to the user.

For example, when we need to put the three chunks in the fastbin, we have to make sure that the tcachebin is full. Likewise, when we need to get the chunks out of the fastbin, we need to make sure that the tcachebin is empty.

The shellphish exploit edit with only mallocs is described and commented here.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

int main()
{
    setbuf(stdout, NULL);
  
    printf("This file demonstrates a simple double-free attack with fastbins.\n");
  
    printf("Allocating 3 buffers.\n");
    int *a = malloc(8);
    int *b = malloc(8);
    int *c = malloc(8);
   
    printf("Fill up tcache.\n");
    void *ptrs[7];
    for (int i=0; i<7; i++) {
        ptrs[i] = malloc(8);
    }
      
    // Fill the 0x20 sized tcachebin.
    for (int i=0; i<7; i++) {
        free(ptrs[i]);
    }

    printf("1st malloc(8): %p\n", a);
    printf("2nd malloc(8): %p\n", b);
    printf("3rd malloc(8): %p\n", c);
  
    printf("Freeing the first one...\n");
    // This should go in the 0x20 fastbin, since the 0x20 tcachebin is full.
    free(a);
  
    printf("If we free %p again, things will crash because %p is at the top of the free list.\n", a, a);
    // free(a);
  
    printf("So, instead, we'll free %p.\n", b);
    // This should go in the 0x20 fastbin too.
    free(b);
  
    printf("Now, we can free %p again, since it's not the head of the free list.\n", a);
    // Here's our double free, the chunk A is put into the 0x20 fastbin a second time.
    free(a);
  
    // Empty the 0x20 tcache, so that `malloc` has to look into the 0x20 fastbin to allocate.
    for (int i=0; i<7; i++) {
        ptrs[i] = malloc(8);
    }
    
    printf("Now the free list has [ %p, %p, %p ]. If we malloc 3 times, we'll get %p twice!\n", a, b, a, a);
    a = malloc(8);
    b = malloc(8);
    c = malloc(8);
    printf("1st malloc(8): %p\n", a);
    printf("2nd malloc(8): %p\n", b);
    printf("3rd malloc(8): %p\n", c);
  
    assert(a == c);
}

Conclusions

We learned how to use the fastbin dup attack, even with the tcache enabled.

I hope that I didn’t forget anything or didn’t commit any major error. For anything, errors, ideas or if you just want to say hi, don’t esitate to contact me on Telegram (see About page) or via e-mail at manuel[dot]romei[at]outlook[dot]it.

Pwn the world and have fun!

Many thanks to Pratik Devkota for proofreading this article and suggesting improvements.