C++ 16 Feb 2010 19:58:03
C++ Convert String to Int Speed
(There is also an opposite int-to-string performance test.)
(Updated 2010-04-15: Re-run with g++ 4.4.1 and VC++ 2010 Express; updated tables to show both ticks and relative factor, instead of just percentage of baseline.)
A performance benchmark of which method is faster of converting an std::string to an integer. The goal is ending up with an integer with the value represented in an std::string.
The tested methods are:
- atoi()
- atol()
- strtol()
- std::stringstream
- std::stringstream, reusing the object
- boost::lexical_cast<int>()
- a hand-written naive loop
Source for the test is at speed-string-to-int.cpp with cycle.h.
The compilers are Microsoft Visual C++ 2010 Express as VC10 with _SECURE_SCL disabled, GNU g++ 4.4.1, and LLVM clang++ from svn.
Tests were run for converting 100000 string containing integers in the range -50000 to 50000. The result for atoi() is set as the baseline 100% and the other numbers is time spent relative to that. The naive loop wins by a large margin.
Windows: MSVC++ 2010 Express
- Compiler: MSVC++ 2010 Express _SECURE_SCL=0
- Arch: Windows 7 64 bit, 1.83GHz Core2Duo, 4 GiB RAM
VC++ 2010 | Ticks | Relative Factor |
---|---|---|
atoi() | 12414303 | 1.00 |
atol() | 12373119 | 1.00 |
strtol() | 11709643 | 0.94 |
lexical_cast | 596050246 | 48.01 |
stringstream | 667230146 | 53.75 |
stringstream reused | 365722225 | 29.46 |
naive | 2307921 | 0.19 |
Linux: GNU g++ 4.4.1
- Compiler: GNU g++ 4.4.1 -std=c++0x -O3
- Arch: Linux 2.6.27 x86_64, 2.66GHz Xeon, 8 GiB RAM
g++ 4.4.1 | Ticks | Relative Factor |
---|---|---|
atoi() | 7916992 | 1.00 |
atol() | 7953520 | 1.00 |
strtol() | 7946112 | 1.00 |
lexical_cast | 92099568 | 11.63 |
stringstream | 188648432 | 23.83 |
stringstream reused | 43137456 | 5.45 |
naive | 2688976 | 0.34 |
Linux: LLVM clang++ 1.5 (trunk 101368)
- Compiler: clang++ 1.5 (trunk 101368) -std=c++0x -O3
- Arch: Linux 2.6.27 x86_64, 2.66GHz Xeon, 8 GiB RAM
clang++ 1.5 (trunk 101368) | Ticks | Relative Factor |
---|---|---|
atoi() | 8026304 | 1.00 |
atol() | 7823456 | 0.97 |
strtol() | 7597696 | 0.95 |
lexical_cast | 94250752 | 11.74 |
stringstream | 182800752 | 22.78 |
stringstream reused | 43234320 | 5.39 |
naive | 2951504 | 0.37 |
on 20 Mar 2010 at 03:47:49 1.Alex said …
It’s funny.
But why boost::lexical_cast is so slow?
on 20 Mar 2010 at 10:28:03 2.Tino Didriksen said …
boost::lexical_cast is so slow because it wraps around a std::stringstream and performs many extra checks on top of that, such as ensuring that the entire input was used for conversion.
on 09 Mar 2011 at 13:18:58 3.Tomalak Geret'kal said …
Does MSVS2010 use a C++0x stdlib by default?
on 09 Mar 2011 at 13:57:42 4.Tino Didriksen said …
Yes it does.
on 20 Apr 2011 at 01:40:45 5.graphitemaster said …
the naive method is the fastest – always is.
on 13 Jul 2011 at 21:30:04 6.phax said …
static uint64_t decdigits[100] =
{
0ll,0ll,0ll,0ll,0ll,0ll,0ll,0ll,0ll,0,
1ll,10ll,100ll,1000ll,10000ll,100000ll,1000000ll,10000000ll,100000000ll,1000000000ll,
2ll,20ll,200ll,2000ll,20000ll,200000ll,2000000ll,20000000ll,200000000ll,2000000000ll,
3ll,30ll,300ll,3000ll,30000ll,300000ll,3000000ll,30000000ll,300000000ll,3000000000ll,
4ll,40ll,400ll,4000ll,40000ll,400000ll,4000000ll,40000000ll,400000000ll,400000000ll,
5ll,50ll,500ll,5000ll,50000ll,500000ll,5000000ll,50000000ll,500000000ll,5000000000ll,
6ll,60ll,600ll,6000ll,60000ll,600000ll,6000000ll,60000000ll,600000000ll,6000000000ll,
7ll,70ll,700ll,7000ll,70000ll,700000ll,7000000ll,70000000ll,700000000ll,7000000000ll,
8ll,80ll,800ll,8000ll,80000ll,800000ll,8000000ll,80000000ll,800000000ll,8000000000ll,
9ll,90ll,900ll,9000ll,90000ll,900000ll,9000000ll,90000000ll,900000000ll,9000000000ll
};
long long int naive2(const char * __restrict__ p)
{
bool neg((*p==’-‘)?1:0);
register int64_t num(0);
register size_t pos(strlen(p+neg)-1);
neg && ++p;
while (*p)
num+= decdigits[(*p++ – ‘0’)* 10 + pos–];
return (neg?-num:num);
}
on 14 Jul 2011 at 15:06:49 7.phax said …
Another naive() in asm though, only tested on Linux g++ 4.4.3, not sure if VC++ supports this semantics for inlining asm.
long long int __inline__ naive2(const char * p)
{
int result(0);
int neg(0);
// result -> ecx, p -> ebx
asm (
“movl %0, %%ecx\t\n”
“movl %2, %%ebx\t\n”
“movl $0, %%edx\t\n”
“movb (%%ebx),%%dl\t\n”
“subb $45, %%dl\t\n”
“jnz loop1\t\n”
“movl $1, %1\t\n”
“incl %%ebx\t\n”
“loop1:\t\n”
“movl $0, %%edx\t\n”
“movb (%%ebx), %%dl\t\n”
“testb %%dl,%%dl\t\n”
“jz exitloop1\t\n”
“subb $48, %%dl\t\n”
“movl %%ecx, %%eax\t\n”
“shll $3,%%ecx\t\n”
“addl %%eax, %%ecx\t\n”
“addl %%eax, %%ecx\t\n”
“addl %%edx,%%ecx\t\n”
“incl %%ebx\t\n”
“jmp loop1\t\n”
“exitloop1:\t\n”
“movl %%ecx, %0\t\n”
:
“=r” (result), “=m” (neg)
:
“r”(p),”0″(result), “m” (neg)
: “%ebx”, “%eax”, “%ecx”, “%edx”
);
return neg?-result:result;
}
on 25 Oct 2015 at 04:39:23 8.Ariel said …
Could you please help me with something I can’t understand?
define inf 5000000000LL
I don’t really get what is that meaning.
Why there are two ‘LL’ behind the number?
thank you
on 25 Oct 2015 at 11:02:03 9.Tino Didriksen said …
It means “long long”, see http://en.cppreference.com/w/cpp/language/integer_literal. You’d be better off asking such questions on IRC: Freenode’s ##C++ or QuakeNet’s #C++