Discussion:
attempt to compare antlr vs bison/flex performance
Minas Hambardzumyan
2007-12-19 23:22:43 UTC
Permalink
Hello,

In an effort to compare antlr and bison/flex parsers, I did the
following exercise:

- found and compiled bison/flex "C" language parser from the following
location:
ftp://ftp.uu.net/usenet/net.sources/ansi.c.grammar.Z
- downloaded and compiled antlr3 implementation of "C" language parser
from this directory in antlr3 examples:
antlr/examples-v3/C/C
- preprocessed a "C" language file and stripped it such that both parser
would pass it through (mostly minor changes)
- compared runtime for each parser


The results of this exercise show that the antlr parser is ~9 times
slower than the flex/bison parser. I have used the same version of GNU C
compiler for both parsers, with exact same compile options.

Although the grammar definitions come from different sources, I still
think this exercise gives a general idea about performance differences
of these parsers. Could anyone please tell me if this is an expected
runtime difference or you think I could do some optimizations to get
faster performance from antlr3.

I can send the complete code I used for both parsers, or any parts of
it, if necessary. Just tell me which part would be of interest.

Thanks,
Minas
Johannes Luber
2007-12-19 23:31:10 UTC
Permalink
Post by Minas Hambardzumyan
Although the grammar definitions come from different sources, I still
think this exercise gives a general idea about performance differences
of these parsers. Could anyone please tell me if this is an expected
runtime difference or you think I could do some optimizations to get
faster performance from antlr3.
I know that in the perforce repository the current source is supposed to
perform at least 50% better than the one available. So it is possible to
optimize stuff. But Ter should know more about this.

Johannes
Terence Parr
2007-12-19 23:39:46 UTC
Permalink
Post by Johannes Luber
Post by Minas Hambardzumyan
Although the grammar definitions come from different sources, I still
think this exercise gives a general idea about performance
differences
of these parsers. Could anyone please tell me if this is an expected
runtime difference or you think I could do some optimizations to get
faster performance from antlr3.
I know that in the perforce repository the current source is
supposed to
perform at least 50% better than the one available. So it is
possible to
optimize stuff. But Ter should know more about this.
I'm working on ANTLR's analysis engine with Kay Roepke. A TSQL
grammar that took 1m30s is now taking about 10s! I have not optimized
the runtime.

Also note that the C grammar for v3 is optimized for readability not
at all for speed. It is not LL(1) by any stretch. I could alter to
be superfast but less readable...

Ter
Terence Parr
2007-12-20 17:39:36 UTC
Permalink
Hi Minas. Can you try it on the Java grammar(s)? I spent a bit of
time making that fast.

Ter
Post by Minas Hambardzumyan
Hello,
In an effort to compare antlr and bison/flex parsers, I did the
- found and compiled bison/flex "C" language parser from the
ftp://ftp.uu.net/usenet/net.sources/ansi.c.grammar.Z
- downloaded and compiled antlr3 implementation of "C" language
antlr/examples-v3/C/C
- preprocessed a "C" language file and stripped it such that both
parser would pass it through (mostly minor changes)
- compared runtime for each parser
The results of this exercise show that the antlr parser is ~9 times
slower than the flex/bison parser. I have used the same version of
GNU C compiler for both parsers, with exact same compile options.
Although the grammar definitions come from different sources, I
still think this exercise gives a general idea about performance
differences of these parsers. Could anyone please tell me if this
is an expected runtime difference or you think I could do some
optimizations to get faster performance from antlr3.
I can send the complete code I used for both parsers, or any parts
of it, if necessary. Just tell me which part would be of interest.
Thanks,
Minas
Edwards, Waverly
2007-12-20 19:11:58 UTC
Permalink
If possible, would you create a before and after grammar. By having
both it would give me ideas on what
to look for and think about in making a faster grammar.


Thank you,


W.

-----Original Message-----
From: antlr-interest-bounces-***@public.gmane.org
[mailto:antlr-interest-bounces-***@public.gmane.org] On Behalf Of Terence Parr
Sent: Thursday, December 20, 2007 12:40 PM
To: Minas Hambardzumyan
Cc: antlr-interest-***@public.gmane.org
Subject: Re: [antlr-interest] attempt to compare antlr vs
bison/flexperformance

Hi Minas. Can you try it on the Java grammar(s)? I spent a bit of time
making that fast.

[snip]
Minas Hambardzumyan
2007-12-20 22:27:29 UTC
Permalink
I guess I could compare antlr java parser with the bison version in the
gcc compiler. I will send an update when I get some results...

One conclusion seams to be obvious though. If the speed is of essence,
then one has to simplify the grammar definition as close to LL(1) as
possible. Doing so though moves the grammar definition away from the
simple to read EBNF format -- one of the most important advantages of
using antlr. So it is up to the user to pick -- easy to use/read vs not
os easy but fast...

Regards,
Minas.
Hi Minas. Can you try it on the Java grammar(s)? I spent a bit of time
making that fast.
Ter
Post by Minas Hambardzumyan
Hello,
In an effort to compare antlr and bison/flex parsers, I did the
- found and compiled bison/flex "C" language parser from the
ftp://ftp.uu.net/usenet/net.sources/ansi.c.grammar.Z
- downloaded and compiled antlr3 implementation of "C" language
antlr/examples-v3/C/C
- preprocessed a "C" language file and stripped it such that both
parser would pass it through (mostly minor changes)
- compared runtime for each parser
The results of this exercise show that the antlr parser is ~9 times
slower than the flex/bison parser. I have used the same version of
GNU C compiler for both parsers, with exact same compile options.
Although the grammar definitions come from different sources, I still
think this exercise gives a general idea about performance
differences of these parsers. Could anyone please tell me if this is
an expected runtime difference or you think I could do some
optimizations to get faster performance from antlr3.
I can send the complete code I used for both parsers, or any parts of
it, if necessary. Just tell me which part would be of interest.
Thanks,
Minas
Terence Parr
2007-12-28 17:56:36 UTC
Permalink
Post by Minas Hambardzumyan
I guess I could compare antlr java parser with the bison version in
the gcc compiler. I will send an update when I get some results...
One conclusion seams to be obvious though. If the speed is of
essence, then one has to simplify the grammar definition as close to
LL(1) as possible.
Just LL(*) or perhaps LL(k). It's the backtracking that is inefficient.
Post by Minas Hambardzumyan
Doing so though moves the grammar definition away from the simple to
read EBNF format -- one of the most important advantages of using
antlr. So it is up to the user to pick -- easy to use/read vs not os
easy but fast...
The point of LL(*) is to get power w/o forcing unnatural grammars or w/
o deadly performance. 90% of the time, you'll be fine with ANTLR.

Ter

Arnulf Heller
2007-12-21 21:10:13 UTC
Permalink
Hi,

a point that you might also consider is the performance of the Lexer alone.

As far as I can remember there were statements in this mailing list
that the ANTLR Lexer performance is not mind-blasting :-)
These folks also found ways to use Flex for the lexing part with good
speed improvements.

This gives the power of ANTLR parsing and the speed of Flex lexing :-)

Maybe you manage to separate your benchmark into the lexing and full
parsing part.
With ANTLR that should be perfectly possible because you can use the
Lexer "standalone" and pop tokens out.
If you are able to do that with Flex too (and I assume that is
possible but my Flex experience is rusty) you can compare the lexing
and parsing processing overhead for both approaches.

It might also prove difficult to compare the performance due to
caching of the file in the OS, the CPU, ...
I assume it's good advice to repeat the test several times an
alternating the two competitors.

In case you have reproducible results, please post it - I'm also curious :-)
Post by Minas Hambardzumyan
Hello,
In an effort to compare antlr and bison/flex parsers, I did the
- found and compiled bison/flex "C" language parser from the
ftp://ftp.uu.net/usenet/net.sources/ansi.c.grammar.Z
- downloaded and compiled antlr3 implementation of "C" language
antlr/examples-v3/C/C
- preprocessed a "C" language file and stripped it such that both
parser would pass it through (mostly minor changes)
- compared runtime for each parser
The results of this exercise show that the antlr parser is ~9 times
slower than the flex/bison parser. I have used the same version of
GNU C compiler for both parsers, with exact same compile options.
Although the grammar definitions come from different sources, I
still think this exercise gives a general idea about performance
differences of these parsers. Could anyone please tell me if this is
an expected runtime difference or you think I could do some
optimizations to get faster performance from antlr3.
I can send the complete code I used for both parsers, or any parts
of it, if necessary. Just tell me which part would be of interest.
Thanks,
Minas
Terence Parr
2007-12-22 17:27:37 UTC
Permalink
Post by Arnulf Heller
Hi,
a point that you might also consider is the performance of the
Lexer alone.
As far as I can remember there were statements in this mailing list
that the ANTLR Lexer performance is not mind-blasting :-)
that was v2. v3 performs much better.

Ter
Loading...