Posted in C++ on 02 Apr 2010 19:11:11
C++ Set Performance
(newer version from 2012-02-20, older std::map benchmarks for GNU g++ 4.3.2 and MSVC++ 2010 beta 2)
A performance comparison of various set implementations used in C++. The benchmark uses 16383 unique elements, across which 1000000 insert, lookup, iterate, and erase operations are performed.
And this time I’ve left the raw tick numbers and enabled table sorting so you can compare for yourself. Just be aware that the Linux and Windows numbers cannot be compared against each other.
Sources
- Containers: std::set, std::unordered_set, boost::unordered_set, sorted_vector, sorted_deque, google::sparse_hash_set, google::dense_hash_set, sti::sset (Windows only)
- Ticks counted via cycle.h (local mirror)
- Source: speed-set-int.cpp and speed-set-string.cpp
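For readers who don't want to dig through the sources, the benchmark pattern is roughly the following. This is a hypothetical sketch (the names `make_keys` and `run_inserts` are mine, not from speed-set-int.cpp): the 16383 unique keys are cycled over until 1000000 operations have been performed, so duplicate inserts are no-ops and the container ends up holding exactly the unique keys.

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>
#include <numeric>
#include <set>
#include <unordered_set>
#include <vector>

// Generate n unique uint32_t keys (0, 1, 2, ...).
std::vector<uint32_t> make_keys(std::size_t n) {
    std::vector<uint32_t> keys(n);
    std::iota(keys.begin(), keys.end(), 0u);
    return keys;
}

// Perform `ops` inserts, cycling over the unique keys; in the real
// benchmark this loop is what gets timed via cycle.h.
template<typename Set>
std::size_t run_inserts(Set& s, const std::vector<uint32_t>& keys, std::size_t ops) {
    for (std::size_t i = 0; i < ops; ++i)
        s.insert(keys[i % keys.size()]);   // duplicate inserts are no-ops
    return s.size();                       // == number of unique keys
}
```

The lookup, iterate, and erase phases follow the same cycling pattern, which is why the per-container tick totals below are directly comparable within one table.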
Linux: GNU g++ 4.4.1
- Updated 2010-04-14: Rerun with g++ 4.4.1. Almost identical numbers, but boost::unordered_set::erase() is much faster in this version.
- Compiler: GNU g++ 4.4.1 -std=c++0x -O3
- Arch: Linux 2.6.27.9-159.fc10.x86_64, 2.66GHz Xeon, 8 GiB RAM
- Memory usage counted via Valgrind
Key: uint32_t
g++ 4.4.1 <uint32_t> | Insert | Lookup | Iterate | Erase | Allocs | Bytes Allocated |
---|---|---|---|---|---|---|
std::set | 363054416 | 349605792 | 769264 | 35981600 | 16393 | 655520 |
std::unordered_set | 33186608 | 30150224 | 184160 | 30409952 | 16405 | 746448 |
boost::unordered_set | 33497792 | 30615040 | 145888 | 7092176 | 16407 | 657536 |
sorted_vector | 360044096 | 262507936 | 24736 | 112005632 | 24 | 131228 |
sorted_deque | 530982432 | 420268752 | 42560 | 171634064 | 143 | 68608 |
google::sparse_hash_set | 172455808 | 162434576 | 582080 | 1321102576 | 42586 | 3709528 |
google::dense_hash_set | 26997632 | 10709024 | 158496 | 117169760 | 20 | 262176 |
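The sorted_vector numbers above make more sense with the underlying idea in view: a flat set kept as a sorted contiguous array. This is a minimal sketch of that idea (not the benchmarked class itself) — lookups are a cache-friendly binary search and iteration is a linear scan, which matches the tiny Iterate and Allocs figures, while insert and erase must shift elements, which matches the high Insert/Erase ticks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch of a flat set: a std::vector kept sorted and unique.
struct flat_set {
    std::vector<uint32_t> v;

    void insert(uint32_t x) {
        auto it = std::lower_bound(v.begin(), v.end(), x);
        if (it == v.end() || *it != x)
            v.insert(it, x);               // O(n): shifts the tail
    }
    bool contains(uint32_t x) const {
        return std::binary_search(v.begin(), v.end(), x);  // O(log n)
    }
    void erase(uint32_t x) {
        auto it = std::lower_bound(v.begin(), v.end(), x);
        if (it != v.end() && *it == x)
            v.erase(it);                   // O(n): shifts the tail
    }
};
```

The near-zero allocation count follows directly: the vector reallocates only when its capacity is exhausted, instead of once per element as node-based containers do.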
Key: std::string
g++ 4.4.1 <string> | Insert | Lookup | Iterate | Erase | Allocs | Bytes Allocated |
---|---|---|---|---|---|---|
std::set | 1670245616 | 1636179776 | 1030912 | 181400576 | 16393 | 655520 |
std::unordered_set | 231392336 | 224671712 | 971824 | 120232400 | 16405 | 746448 |
boost::unordered_set | 207438368 | 198460448 | 651152 | 26624448 | 16407 | 657536 |
sorted_vector | 3500242208 | 1603444144 | 132976 | 1846192656 | 24 | 262296 |
sorted_deque | 2747723712 | 1867292544 | 131840 | 1092511920 | 272 | 136688 |
Windows: MSVC++ 2010 RC1
- Compiler: MSVC++ 2010 RC1 _SECURE_SCL=0
- Arch: Windows 7 64 bit, 3.0GHz Core2Duo, 4 GiB RAM
- Extra Container: sti::sset (wouldn’t compile with g++, so omitted there for now)
Key: uint32_t
VC++ 2010 RC1 <uint32_t> | Insert | Lookup | Iterate | Erase |
---|---|---|---|---|
std::set | 714514707 | 342805817 | 663471 | 75306186 |
std::unordered_set | 411334245 | 47025360 | 113256 | 118094400 |
boost::unordered_set | 50255388 | 17265717 | 121761 | 10382427 |
sorted_vector | 685004391 | 283634424 | 21754 | 117755028 |
sorted_deque | 1152645203 | 470817207 | 255663 | 650932533 |
sti::sset | 294897976 | 267982380 | 67113 | 54127557 |
Key: std::string
VC++ 2010 RC1 <string> | Insert | Lookup | Iterate | Erase |
---|---|---|---|---|
std::set | 1370212416 | 876426329 | 699075 | 166895604 |
std::unordered_set | 677187891 | 141716270 | 224622 | 137859399 |
boost::unordered_set | 173249656 | 133769232 | 366102 | 20043045 |
sorted_vector | 2529399042 | 899673525 | 45090 | 2404031290 |
sorted_deque | 2560535001 | 1167986520 | 270522 | 1887514182 |
sti::sset | 724926886 | 639082764 | 97317 | 112394043 |
on 27 Oct 2010 at 22:34:49 1. Mahbub Murshed said …
The numbers are: the smaller the better. Right?
on 27 Oct 2010 at 22:37:41 2. Tino Didriksen said …
Yep. The numbers denote time/ticks and memory used performing the action, so smaller is better.
on 25 Jan 2011 at 13:55:58 3. Ethan Schreiber said …
Why do you suppose google::dense_hash_set used less memory than google::sparse_hash_set? According to Google, this is the opposite of what one should expect.
on 25 Jan 2011 at 16:05:46 4. Tino Didriksen said …
Very likely because I am using key type uint32_t.
sparse_hash_set allocates each element separately, whereas dense_hash_set allocates contiguous chunks to be doled out later. For a type as small as uint32_t that is much more efficient, but I would expect that for a larger type it would be drastically inverted.
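The per-element versus chunked allocation point in the comment above can be illustrated with a counting allocator. This is a sketch (the counter is mine, not Valgrind's Allocs column): a node-based container such as std::set performs one heap allocation per inserted element, while a contiguous std::vector allocates only when its capacity grows geometrically.

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>
#include <new>
#include <set>
#include <vector>

// Global allocation counter for the sketch below.
static std::size_t g_allocs = 0;

// Minimal C++11 allocator that counts calls to allocate().
template<typename T>
struct counting_allocator {
    using value_type = T;
    counting_allocator() = default;
    template<typename U> counting_allocator(const counting_allocator<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template<typename T, typename U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template<typename T, typename U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

// Node-based: roughly one allocation per element inserted.
std::size_t allocs_for_set(std::size_t n) {
    g_allocs = 0;
    std::set<uint32_t, std::less<uint32_t>, counting_allocator<uint32_t>> s;
    for (uint32_t i = 0; i < n; ++i) s.insert(i);
    return g_allocs;
}

// Contiguous: allocations only on geometric capacity growth.
std::size_t allocs_for_vector(std::size_t n) {
    g_allocs = 0;
    std::vector<uint32_t, counting_allocator<uint32_t>> v;
    for (uint32_t i = 0; i < n; ++i) v.push_back(i);
    return g_allocs;
}
```

For small keys the chunked strategy wins on both allocation count and locality; for large elements the unused slack in each chunk starts to dominate, which is the inversion predicted above.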