Thiruvel Thirumoolan
2012-11-30 14:56:54 UTC
Hello,
I work on Apache Hive, which currently uses ANTLR 3.0.1. We would like to upgrade to ANTLR 3.4 so it's easier to work with other Apache projects on Hadoop that use ANTLR 3.4. We found that the parse tree generated from Hive.g [1] differs between 3.0.1/3.1/3.2 and 3.3/3.4.
I have stripped down the lengthy grammar to a smaller version (Insert.g [2]) and pushed a small mvn v3 project to https://github.com/thiruvel/HiveANTLR34Issue that uses ANTLR the way Hive does. The tree difference is shown here; the entire output is on GitHub. Running "mvn test" reproduces it.
ANTLR 3.0.1/3.1/3.2:
( TOK_DESTINATION( TOK_TAB( TOK_TABNAME( TABLE_X))( TOK_PARTSPEC( TOK_PARTVAL( DIM_1)( 'A'))( TOK_PARTVAL( DIM_2)( 'B')))))
ANTLR 3.3/3.4:
( TOK_DESTINATION( TOK_TAB))
Are we missing something in the grammar, or is this a bug that is addressed in v4? I am afraid we can't move to v4, as that would mean moving all the other projects to v4 as well. Are there any workarounds we can use with ANTLR 3.4 to ensure that a similar tree is generated?
Any help is greatly appreciated.
Thank you!
Thiruvel
[1] - http://svn.apache.org/repos/asf/hive/branches/branch-0.9/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
[2] - https://github.com/thiruvel/HiveANTLR34Issue/blob/master/src/main/antlr3/com/yahoo/antlr/Insert.g
List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address