TCMalloc and MySQL

jeffdavis · on Feb 22, 2013

Warning: tcmalloc does not release memory back to the OS, ever:

"TCMalloc currently does not return any memory to the system."[1]

That means if you have many long-running processes, each of them will keep holding the peak amount of memory it has ever used. Not good for a multi-tenant setup.

If it's a dedicated server running one multi-threaded application, maybe that's OK, although I'd be a little bit wary anyway.

I should note that, even if the application doesn't let the memory go, the OS could page out the inactive regions. Not really something that I would like to rely on, though. There are some other caveats also, like it would make memory accounting a little trickier ("Wow, that process is huge! Oh, never mind, it's mostly paged out.").

For what it's worth, I just spent considerable effort to get rid of tcmalloc due (in part) to problems like this. [2]

[1] http://goog-perftools.sourceforge.net/doc/tcmalloc.html

[2] You wouldn't think it would be a lot of effort, but we were using dynamic libraries that were linking against tcmalloc, which is outright dangerous if the main executable isn't linked against tcmalloc (you don't want to replace the allocator in a running executable). And some of those libraries were actually using the tcmalloc-specific features/symbols, so I had to get away from that first.

sghemawat · on Feb 22, 2013

About [1] Sorry about that: the document you linked to is amazingly stale. tcmalloc has been releasing memory to the system for many years. See for example the IncrementalScavenge routine in a version of page_heap.cc from Dec 2008:

https://code.google.com/p/gperftools/source/browse/trunk/src...

One caveat: physical memory and swap space are released, but the process's virtual size will not decrease, since tcmalloc uses madvise(MADV_DONTNEED) to release memory.
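To see this caveat on a live process, one option is to compare the virtual size against the resident set. This is a minimal sketch, assuming Linux's /proc/<pid>/status format; the helper name mem_summary is made up for illustration and is not part of tcmalloc or MySQL.

```shell
# Hypothetical helper: print the virtual size, resident set, and swap usage
# recorded in a Linux status file (normally /proc/<pid>/status).
mem_summary() {
    awk '/^VmSize:|^VmRSS:|^VmSwap:/ { printf "%s %s %s\n", $1, $2, $3 }' "$1"
}

# Usage, assuming a single running mysqld:
#   mem_summary "/proc/$(pidof mysqld)/status"
```

If memory is being handed back to the kernel, VmRSS should drop after a burst of frees while VmSize stays roughly where it was.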

About [2], code using tcmalloc-specific features/symbols is definitely a problem. I would strongly advise against doing that, and recommend sticking to the libc interfaces instead, for the reason you pointed out.

jeffdavis · on Feb 22, 2013

Strange, the page showed up first when I googled "tcmalloc" and the problem was also present in the version that I was using (at least I think it was). My apologies.

Yeah, regarding [2], that was definitely not my idea.

sghemawat · on Feb 22, 2013

Not your fault. We just plain forgot to update the documentation, so the freshest available document is a few years out of date.

jeffffff · on Feb 22, 2013

wow i didn't realize madvise could actually modify the memory contents and return pages, but it makes sense that it can because that is a useful feature. very cool!

jeffffff · on Feb 22, 2013

i don't know of any malloc implementations that return memory to the system. the only way to do that is to pass a negative value to sbrk, which requires all the memory being returned to be at the end of the data segment. even if you free 99% of the memory you were using, if one byte is still in use at the end of the data segment no memory can be returned. this is almost always the case in practice, so no malloc implementations bother to return memory in this way. on the other hand, unmapping an anonymous mmap does return the memory to the system. most mallocs handle large allocations by delegating to anonymous mmap for this reason. you probably could've tweaked the mmap threshold for tcmalloc to get the same result.

edit: apparently tcmalloc is using mmap to allocate more of its memory than i realized, not sure why i thought it was using sbrk for everything

priteau · on Feb 22, 2013

OpenBSD's malloc implementation does: "On a call to free, memory is released and unmapped from the process address space using munmap."

http://en.wikipedia.org/wiki/C_dynamic_memory_allocation#Ope...

gsg · on Feb 22, 2013

You mean address space. The OS can and will reclaim all the memory it likes, when it likes.

Pages that are mapped but are left untouched for a long time aren't problematic in modern systems. There is a small cost for the PTE but nothing like an entire page of physical memory.

jeffdavis · on Feb 22, 2013

I was talking about memory that was used in the past, but is no longer used.

The OS can either keep it resident or swap it out, but it has no way of knowing that it is no longer in use (short of something like madvise()). In the realm of sanity, the OS can't just arbitrarily throw away memory that a process has written to (unless it also kills the process).

antirez · on Feb 21, 2013

Good allocators are good for different things, but what the glibc allocator is good for is yet to be discovered: it fragments like a glass dropped on the floor and has contention issues.

scott_s · on Feb 22, 2013

Stability. Give it credit for being - probably - the most widely used implementation of malloc in the world.

I say this as someone who has implemented a lock-free memory allocator for multithreaded applications. I cared about performance, and I was willing to sacrifice nice things like detecting double-frees. I moved away from the project largely because I didn't want to be in a performance race with TCMalloc. (At the end, TCMalloc outperformed my allocator in some benchmarks, but not in others. But, surprisingly, there were also some places where glibc outperformed both.)

malkia · on Feb 22, 2013

It could be that the MSVCRT implementation is the most used one (or actually maybe the one behind HeapAlloc and so on). How would one know for sure :)

It's probably also used in all Xbox-es too...

HarrisonFisk · on Feb 21, 2013

jemalloc is the new hotness for MySQL. We are using it at Facebook (and I know percona/oracle use it for benchmarks and testing as well).

Good benchmark showing the impact of the different options:

http://www.mysqlperformanceblog.com/2012/07/05/impact-of-mem...

thrownaway2424 · on Feb 22, 2013

That blog makes it look like jemalloc and tcmalloc are roughly on par. Do you think there's any possibility that FB leans toward jemalloc chiefly because the author works there, and the author of tcmalloc works at Google?

cpeterso · on Feb 22, 2013

Firefox uses jemalloc, too.

ihsw · on Feb 22, 2013

Redis as well.

ck2 · on Feb 22, 2013

Apparently once you start having more threads than cores, tcmalloc really shines:

http://i.imgur.com/4RzmQD6.png

Looks like those on centos can install it easily via

   yum install gperftools-libs --enablerepo=epel

which installs

  /usr/lib64/libtcmalloc.so.4
  /usr/lib64/libtcmalloc_minimal.so.4

then you just need to edit your mysql init script?

  test -e /usr/lib64/libtcmalloc_minimal.so.4 && export LD_PRELOAD="/usr/lib64/libtcmalloc_minimal.so.4"

You can also try jemalloc which supposedly is close to as good as tcmalloc but uses less memory

   yum install jemalloc  --enablerepo=epel

which installs

  /usr/lib64/libjemalloc.so.1

and for your init.d

  test -e /usr/lib64/libjemalloc.so.1 && export LD_PRELOAD="/usr/lib64/libjemalloc.so.1"

ck2 · on Feb 22, 2013

oh and apparently mysql 5.5 users (not 5.1) can just set it directly in my.cnf

  [mysqld_safe]
  malloc-lib=/usr/lib64/libtcmalloc_minimal.so.4

or

  malloc-lib=/usr/lib64/libjemalloc.so.1

http://dev.mysql.com/doc/refman/5.5/en//mysqld-safe.html#opt...

no export or script editing required

stock_toaster · on Feb 22, 2013

This is what I do at $dayjob.

We have been using tcmalloc for a while on our databases, as well as disabling the transparent huge pages and transparent huge page defrag (centos6). It made a big difference for us.
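For anyone wanting to try the same thing, here is a rough sketch of turning off both knobs. It is not necessarily how it was done above: the function name disable_thp is hypothetical, and the sysfs directory is an assumption (CentOS 6 kernels expose it as /sys/kernel/mm/redhat_transparent_hugepage, mainline kernels as /sys/kernel/mm/transparent_hugepage).

```shell
# Hypothetical helper: write "never" to the "enabled" and "defrag" knobs
# found under the given transparent huge page sysfs directory.
disable_thp() {
    thp_dir="$1"
    for knob in enabled defrag; do
        if [ -w "$thp_dir/$knob" ]; then
            echo never > "$thp_dir/$knob"
        else
            echo "skipping $thp_dir/$knob (missing or not writable)" >&2
        fi
    done
}

# Usage as root on CentOS 6 (path is an assumption, check your kernel):
#   disable_thp /sys/kernel/mm/redhat_transparent_hugepage
```

The setting does not survive a reboot, so it would have to be reapplied at boot, for example from rc.local.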

ck2 · on Feb 22, 2013

Can I ask you a dumb question: I think I just turned it on properly but I have no idea how to proactively confirm that mysql is actually using jemalloc, rather than just wait for better performance numbers?

Because it's an external environment variable, it doesn't actually show inside any of mysql's settings. No startup errors or runtime problems is always nice but I really am curious to know for a fact it worked.

Will probably have to ask this on stackexchange if you don't know.

stock_toaster · on Feb 22, 2013

With tcmalloc I get a few messages in the log about a large allocation on startup, but you can probably find it with this.

    # as root or sudo
    pmap -x $(pidof mysqld)|grep malloc
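If pmap isn't available, the same check can be done against /proc directly. A minimal sketch, assuming Linux and that the library file names contain "libjemalloc" or "libtcmalloc" as in the paths mentioned earlier in the thread; loaded_allocator is a made-up helper name.

```shell
# Hypothetical helper: report which alternative allocator (if any) is mapped
# into a process, given the path to its maps file (/proc/<pid>/maps on Linux).
loaded_allocator() {
    maps_file="$1"
    if grep -q 'libjemalloc' "$maps_file"; then
        echo jemalloc
    elif grep -q 'libtcmalloc' "$maps_file"; then
        echo tcmalloc
    else
        echo default
    fi
}

# Usage (as root or via sudo), assuming a single running mysqld:
#   loaded_allocator "/proc/$(pidof mysqld)/maps"
```

If it prints "default", the preload most likely never reached the mysqld process.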

josephscott · on Feb 21, 2013

jemalloc is another strong option -

http://www.mysqlperformanceblog.com/2012/07/05/impact-of-mem...

http://www.quora.com/Is-tcmalloc-stable-enough-for-productio...

ComputerGuru · on Feb 22, 2013

nedmalloc [0] is my absolute favorite and pretty much owns everything else in terms of performance (esp. multi-threaded memory allocations), though I would not use it at the scale Facebook and GitHub are running on. It has subtle bugs that creep in and get fixed down the road. jemalloc and tcmalloc are very heavily tested and vetted, though, and are great options. Basically, anything other than the default allocator on Windows/Mac/Linux is fine :)

The author of nedmalloc is working on a very exciting C++ API (actually, I think it's API-complete now) to make it a drop-in STL allocator. I personally use the C API in my C++ applications without a problem, mainly as a pool allocator. For me, the Windows allocators (both the old default and the new "low-fragmentation" default) are absolutely abysmal at deallocation. Pool allocators in general make that go away.

0: http://www.nedprod.com/programs/portable/nedmalloc/

xal · on Feb 21, 2013

Shopify runs TCMalloc for mysql as well.

telemachos · on Feb 21, 2013

I don't work with MySQL or Rails, but I read this all the way through, mostly because the story was well told.

Strikes me as a perfect example of a culture that works hard and enjoys the hell out of it too.

SEJeff · on Feb 21, 2013

They found a proverbial "silver bullet" in performance land. This almost never happens, but props to them for finding it. Now time to try this out!

kev009 · on Feb 22, 2013

It's fairly common to see double-digit percent changes when swapping out lower-level component implementations or versions (compiler, JVM, OS, kernel, etc). That can be a good thing, as in the case here where they found a win, or an awful thing.

minimax · on Feb 22, 2013

This is interesting and from a black box view of MySQL, this is a good solution. For the MySQL developers, it seems like an opportunity for improvement. When you get bottlenecked on malloc() it usually means you are frequently allocating many small objects. To me this sounds like a good opportunity to use a memory pool allocator (or find a way in the code to do fewer allocations).

malkia · on Feb 22, 2013

I've had mixed feelings about tcmalloc on Windows - that was 4-5 years ago, so things might be better. It was doing some hooking, looking for places to replace standard malloc/free/etc. throughout the whole address space, and again in each newly loaded DLL. Other than that, except when it was crashing for no reason (on some Windows 2003 servers for example), it was pretty good.

thrownaway2424 · on Feb 22, 2013

Try, as well, setting the value of tcmalloc.max_total_thread_cache_bytes to something larger than 16MB (the default). Reasonable values might range all the way to 1GB or more. Best to experiment and get data.
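One way to set this without touching code is the TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES environment variable that gperftools reads at startup. A sketch only: the 256MB figure is an arbitrary starting point, mb_to_bytes is a made-up helper, and this assumes the variable is exported in the same environment that launches mysqld (for example the init script that sets LD_PRELOAD).

```shell
# Hypothetical helper: convert a size in megabytes to bytes.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# 256MB is an illustrative value, not a recommendation; measure before and after.
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES="$(mb_to_bytes 256)"
export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES
```

Whether a bigger thread cache helps depends on thread count and allocation pattern, so treat any value as something to benchmark.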