January 6, 2010

boost.spirit2 vs. atoi

I ran a very primitive performance test of the new version of boost.spirit. I had read somewhere that boost.spirit v.2 outperforms atoi when parsing integers, and I decided to check this with the program shown below. Compiled in release mode and run for 10000000 iterations, it gave the following results:
atoi   = (10000000 rep): 00:00:00.404165
spirit = (10000000 rep): 00:00:00.043559
Update: I slightly modified the parser so that it handles whitespace before numbers (as atoi does), and boost.spirit still shows better results:
atol   = (10000000 rep): 00:00:00.411694
spirit = (10000000 rep): 00:00:00.112000
Next, I'll test more complex parsers.

#include <boost/spirit/include/qi.hpp>
#include <cstdlib>   // atoi
#include <iostream>
#include <boost/date_time/local_time/local_time.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>

using namespace boost::spirit;

typedef void (*void_func)(int);
void measure_time(const char *str, void_func func, int number_of_repeat) {
    using namespace boost::posix_time;
    ptime tStart = microsec_clock::local_time();
    (*func)(number_of_repeat);
    ptime tEnd = microsec_clock::local_time();
    time_duration tVal = tEnd - tStart;
    std::cout << str << " (" << number_of_repeat << " rep): " << to_simple_string(tVal)  << std::endl;
}

const char* nstr="123456";
const char* const nstr2="123456";

void test_spirit(int number_of_repeat) {
    int val=0;
    for(int i=0; i < number_of_repeat; i++) {
        qi::parse(nstr,nstr+6,int_,val);
        if (val != 123456)
            std::cout << "Spirit Error! val=" << val << std::endl;
    }
}

void test_atol(int number_of_repeat) {
    long val=0;
    for(int i=0; i < number_of_repeat; i++) {
        val=atoi(nstr2);
        if (val != 123456)
            std::cout << "Atoi Error! val=" << val << std::endl;
    }
}

int main(int /*argc*/, char ** /*argv*/) {
    measure_time("atol   =", test_atol, 10000000);
    measure_time("spirit =", test_spirit, 10000000);
    return 0;
}

15 comments:

The Peripatetic Programmer said...

Are you certain that spirit's int_ parser recognizes the same input as atoi?

Alex Ott said...
This comment has been removed by the author.
Joel de Guzman said...

This post attracted some interest here: http://www.facebook.com/djowel?v=feed&story_fbid=276147167924

Alex Ott said...

With a small modification of the parser that allows it to skip spaces, I got the following results:
atol = (10000000 rep): 00:00:00.411694
spirit = (10000000 rep): 00:00:00.112000
Integer overflow is checked by default, as far as I remember...
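
For comparison on the atoi side: std::atoi skips leading whitespace, returns 0 when no digits are found, and has undefined behavior when the value is out of range, so it cannot report overflow at all. A minimal sketch of a checked equivalent built on std::strtol is shown here. The helper name checked_atoi is hypothetical (it is not part of the post's test program or of Spirit), and the sketch says nothing about how Spirit implements its own checks.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper (not from the original test program): parses a decimal
// int the way atoi does, but reports failure instead of returning 0 or
// overflowing silently.
bool checked_atoi(const char* s, int& out) {
    errno = 0;
    char* end = nullptr;
    long v = std::strtol(s, &end, 10);             // skips leading whitespace, like atoi
    if (end == s) return false;                    // no digits were consumed
    if (errno == ERANGE) return false;             // value does not fit in long
    if (v < INT_MIN || v > INT_MAX) return false;  // value does not fit in int
    out = static_cast<int>(v);
    return true;
}
```

Called as checked_atoi("  123456", val), it accepts the same leading whitespace as atoi, while input such as "abc" or a number too large for int is rejected rather than mapped to 0 or to an unspecified value.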

KissTheGoat said...

Que? The meaningful parts of your numbers are off the right side of the column in my very-late-model Mozilla-based browser.

Mateusz Loskot said...

Alex, I've been thinking of doing similar checks, but due to laziness...
It's wonderful news, confirming that it's worth considering Spirit for small things too. It doesn't bite as many may think anyway :)

Alex Ott said...

Yep, Mateusz - I also thought that atoi would outperform spirit.
Now I'm working on a more complex test - parsing an HTTP Date string - I have a hand-written parser in our internal project, and I also want to compare the spirit2 parser with the spirit.classic parser

Mateusz Loskot said...

Alex,

Great, thanks in advance.
By the way, do you have any observations regarding the size of the generated binary files? That is sometimes quite a concern with Spirit.

Alex Ott said...

With debug information stripped, this test has a size of 56464. With many templates it will be bigger, but not by much. The biggest executables are produced when you put debug information into them, yes - in that case Spirit adds a lot...

Mateusz Loskot said...

Thanks Alex for this additional detail.

Lovely Day said...

Hi, I tested your code, and I found that if "int val=0;" is moved inside the loop, only the first parse result is correct. Did I get anything wrong?

自由的马甲 said...

Alex, glad to see your test, but I copied your code and ran it on my machine, and it shows:
atol = (10000000 rep): 00:00:00.589094
spirit = (10000000 rep): 00:00:01.269398

Any comments?

Alex Ott said...

It depends on the compilation flags; please use -O2 and no debug

Jean-Paul Rigault said...

I was somewhat puzzled by the remark from "Lovely Day" on this blog (about moving 'int val=0' inside the loop). In fact your test program is somewhat flawed, if I may: in the statement 'qi::parse(nstr,nstr+6,int_,val)' the first iterator (nstr) is passed by reference and is modified by parse(). On return its value is (old)nstr+6, which means that all the calls to qi::parse() but the first operate on an empty sequence. I easily fixed the program by reinitializing nstr at each iteration.

Your initial program gave the following result on my machine:

atol = (10000000 rep): 00:00:00.288255
spirit = (10000000 rep): 00:00:00.037653

After the fix, we have:

atol = (10000000 rep): 00:00:00.293819
spirit = (10000000 rep): 00:00:00.090364

The speedup is only about 3 instead of 7, but Spirit still has the lead (Intel Xeon 4 procs, Linux x86_64, boost-1.43, gcc-4.5.0 with -O2).

Maybe you had already realized the problem, but I have seen no mention of it on the web... Thanks anyhow for raising this interesting point.
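
The pitfall described in the comment above can be reproduced without Boost. The sketch below uses a hypothetical stand-in, parse_int, which, like qi::parse, takes its first iterator by reference and advances it past the consumed input. It is not Spirit's implementation, and it is simplified: the end pointer is kept fixed, so the reused iterator really does leave an empty range. The helpers count_reused and count_reset are likewise illustrative names only.

```cpp
#include <cstddef>

// Hypothetical stand-in for qi::parse (NOT Spirit itself): `first` is taken by
// reference and advanced past whatever was consumed.
bool parse_int(const char*& first, const char* last, int& out) {
    const char* it = first;
    int value = 0;
    while (it != last && *it >= '0' && *it <= '9') {
        value = value * 10 + (*it - '0');
        ++it;
    }
    if (it == first) return false;  // nothing consumed: `out` is left untouched
    out = value;
    first = it;                     // the caller's iterator now points past the number
    return true;
}

// Flawed loop: the same iterator variable is reused, so only the first call
// sees any input.
int count_reused(const char* text, std::size_t len, int repeats) {
    const char* first = text;
    const char* last = text + len;
    int ok = 0, val = 0;
    for (int i = 0; i < repeats; ++i)
        if (parse_int(first, last, val)) ++ok;
    return ok;
}

// Fixed loop: the iterator is reinitialized on every iteration.
int count_reset(const char* text, std::size_t len, int repeats) {
    const char* last = text + len;
    int ok = 0, val = 0;
    for (int i = 0; i < repeats; ++i) {
        const char* first = text;   // fresh copy each time
        if (parse_int(first, last, val)) ++ok;
    }
    return ok;
}
```

In this sketch count_reused("123456", 6, 5) reports a single successful parse, while count_reset("123456", 6, 5) reports five. Because a failed parse leaves the output variable untouched, a val declared outside the loop keeps the stale 123456 from the first iteration and the check never fires, which is consistent with the error only becoming visible once "int val=0;" is moved inside the loop.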

Alex Ott said...

Yes, this problem was pointed out to me in another discussion. But the Spirit developers have shown that Spirit2 is also faster in other parsing examples