Discussion:
Occasionally match hidden channel
Zenzike
2008-12-18 18:04:55 UTC
I'm trying to make it so that input only occasionally matches tokens
that are sent to the hidden channel.
My grammar has something along the following lines:

SLASHSLASH : '\\\\' {$channel=HIDDEN;}; // For the symbols "\\"

This is regarded as whitespace in the language I'm trying to define,
and should be able to appear between any two tokens. However,
occasionally, I want to make sure that there is definitely a
SLASHSLASH token between two other tokens.

As far as I can tell, the way to do this is to hide the SLASHSLASH
(which I have done), and in places where it is needed, I should query
the hidden channel -- but I can't find any documentation about how
this is done (although there seems to be plenty saying that it is
possible!)

Something like this would be ideal:

rule : TOKEN1 {channel.hidden.match(SLASHSLASH)} TOKEN2 ;

What is the right syntax?

Many thanks!

zenzike

List: http://www.antlr.org:8080/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org:8080/mailman/options/antlr-interest/your-email-address
Nicolas Wu
2008-12-18 18:13:48 UTC
To make things worse, sometimes I want to make sure that there is
definitely no SLASHSLASH between elements.

I know it sounds like it should just be an ordinary token, but it isn't,
since there are only two specific places in the grammar where
these funny requirements crop up. The rest of the time,
SLASHSLASH is invisible.

Thanks again,

zenzike
Post by Zenzike
I'm trying to make it so that input only occasionally matches tokens
that are sent to the hidden channel.
SLASHSLASH : '\\\\' {$channel=HIDDEN}; // For the symbols "\\"
Jim Idle
2008-12-18 18:42:18 UTC
Post by Nicolas Wu
To make things worse, sometimes I want to make sure that there is
definitely no SLASHSLASH between elements.
I know it sounds like it should just be an ordinary token, but it isn't,
since there are only two specific places in the grammar where
these funny requirements crop up. The rest of the time,
SLASHSLASH is invisible.
To predicate on it not being there use this:

r : SOMETHING { ((TokenStream)input).get( input.index()-1 ).getType() != SLASHSLASH }?=> SOMETHINGELSE
  ;
Jim Idle
2008-12-18 18:40:41 UTC
Post by Zenzike
I'm trying to make it so that input only occasionally matches tokens
that are sent to the hidden channel.
SLASHSLASH : '\\\\' {$channel=HIDDEN}; // For the symbols "\\"
This is regarded as whitespace in the language I'm trying to define,
and should be able to appear between any two tokens. However,
occasionally, I want to make sure that there is definitely a
SLASHSLASH token between two other tokens.
As far as I can tell, the way to do this is to hide the SLASHSLASH
(which I have done), and in places where it is needed, I should query
the hidden channel -- but I can't find any documentation about how
this is done (although there seems to be plenty saying that it is
possible!)
rule : TOKEN1 {channel.hidden.match(SLASHSLASH)} TOKEN2 ;
What is the right syntax?
This will do it:

((TokenStream)input).get( input.index()-1 ).getType() == SLASHSLASH

(or whichever other Token method you need in place of getType())

Jim
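
To see what this expression inspects, here is a minimal, self-contained sketch in plain Java. It does not use the ANTLR runtime: `PrevTokenCheck`, `Tok`, and `previousRawTokenIs` are hypothetical names, and the model assumes that `input.index()` is the position of the next on-channel token within the full token buffer, so that `index()-1` addresses the raw token just before it, hidden or not.

```java
import java.util.Arrays;
import java.util.List;

public class PrevTokenCheck {
    // Hypothetical token types and channel numbers, for illustration only.
    public static final int TOKEN1 = 1, TOKEN2 = 2, SLASHSLASH = 3, WS = 4;
    public static final int DEFAULT_CHANNEL = 0, HIDDEN = 99;

    /** Minimal stand-in for an ANTLR token: just a type and a channel. */
    public static final class Tok {
        public final int type, channel;
        public Tok(int type, int channel) { this.type = type; this.channel = channel; }
    }

    /** True if the raw token directly before position index has the given type. */
    public static boolean previousRawTokenIs(List<Tok> buffer, int index, int type) {
        return index > 0 && buffer.get(index - 1).type == type;
    }

    public static void main(String[] args) {
        // TOKEN1 \\ TOKEN2: the parser's next visible token is TOKEN2 at raw index 2.
        List<Tok> direct = Arrays.asList(
            new Tok(TOKEN1, DEFAULT_CHANNEL),
            new Tok(SLASHSLASH, HIDDEN),
            new Tok(TOKEN2, DEFAULT_CHANNEL));
        System.out.println(previousRawTokenIs(direct, 2, SLASHSLASH)); // true

        // TOKEN1 \\ <space> TOKEN2: a hidden WS token now sits directly before TOKEN2.
        List<Tok> spaced = Arrays.asList(
            new Tok(TOKEN1, DEFAULT_CHANNEL),
            new Tok(SLASHSLASH, HIDDEN),
            new Tok(WS, HIDDEN),
            new Tok(TOKEN2, DEFAULT_CHANNEL));
        System.out.println(previousRawTokenIs(spaced, 3, SLASHSLASH)); // false
    }
}
```

The second case shows the limit of looking only one position back: any other hidden token between SLASHSLASH and the next visible token makes the check fail.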
Zenzike
2008-12-20 13:32:29 UTC
Post by Jim Idle
((TokenStream)input ).get( input.index()-1 ).sometokenmethodsuchasgetType() == SLASHSLASH
Thanks! I'm amazed at how friendly and quick to reply this community is!

Unfortunately, the problem is a little more subtle than that; there
might be real whitespace elements that I don't want to parse between
SLASHSLASH and OTHER.
However, I thought I solved this by using:

!((CommonTokenStream)input).getTokens(input.LT(-1).getTokenIndex(),
input.LT(1).getTokenIndex(), SLASHSLASH).isEmpty()
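
The intent of that expression can be sketched in plain Java without the ANTLR runtime (`RangeScan`, `Tok`, and `containsTypeBetween` are hypothetical names): scan every raw token between the previous visible token and the next one, and report whether any of them has the wanted type. The sketch excludes the two endpoints, whereas the expression above includes them; since both endpoints are visible tokens, the result should be the same. One caveat, stated as a recollection of the ANTLR 3 runtime and not checked against any particular version: `getTokens(start, stop, ttype)` may return null rather than an empty list when nothing matches, in which case `.isEmpty()` would throw, so a null check is probably worth adding.

```java
import java.util.Arrays;
import java.util.List;

public class RangeScan {
    // Hypothetical token types and channel numbers, for illustration only.
    public static final int TOKEN1 = 1, TOKEN2 = 2, SLASHSLASH = 3, WS = 4;
    public static final int DEFAULT_CHANNEL = 0, HIDDEN = 99;

    /** Minimal stand-in for an ANTLR token: just a type and a channel. */
    public static final class Tok {
        public final int type, channel;
        public Tok(int type, int channel) { this.type = type; this.channel = channel; }
    }

    /** True if any raw token strictly between positions start and stop has the given type. */
    public static boolean containsTypeBetween(List<Tok> buffer, int start, int stop, int type) {
        int from = Math.max(start + 1, 0);
        int to = Math.min(stop, buffer.size());
        for (int i = from; i < to; i++) {
            if (buffer.get(i).type == type) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // TOKEN1 \\ <space> TOKEN2: SLASHSLASH is found despite the hidden WS after it.
        List<Tok> spaced = Arrays.asList(
            new Tok(TOKEN1, DEFAULT_CHANNEL),
            new Tok(SLASHSLASH, HIDDEN),
            new Tok(WS, HIDDEN),
            new Tok(TOKEN2, DEFAULT_CHANNEL));
        System.out.println(containsTypeBetween(spaced, 0, 3, SLASHSLASH)); // true

        // TOKEN1 <space> TOKEN2: only whitespace in between.
        List<Tok> plain = Arrays.asList(
            new Tok(TOKEN1, DEFAULT_CHANNEL),
            new Tok(WS, HIDDEN),
            new Tok(TOKEN2, DEFAULT_CHANNEL));
        System.out.println(containsTypeBetween(plain, 0, 2, SLASHSLASH)); // false
    }
}
```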

But it turns out the problem is more subtle than this ... consider the
following snippet of my grammar (the whole thing is rather huge, but
hopefully this will be enough):

display
    : BDISPLAY item
      ( sep item
      | {!((CommonTokenStream)input).getTokens(input.LT(-1).getTokenIndex(),
             input.LT(1).getTokenIndex(), SLASHSLASH).isEmpty()}?=>
        // {((LateChannelTokenStream)input).setTokenTypeChannel(SLASHSLASH, Token.DEFAULT_CHANNEL);}
        // SLASHSLASH+
        // {((LateChannelTokenStream)input).resetTokenTypeChannel(SLASHSLASH);}
        // item
      )* EDISPLAY -> ^(DISPLAY item+)
    ;

sep
: SEMICOLON ;
...
SLASHSLASH : '\\\\' {$channel=HIDDEN;};


The trouble is that matching "item" by itself, rather than "sep item",
causes the "multiple alternatives" problem elsewhere in the grammar,
so a separation token like "sep" is really needed in this case. As it
stands, the grammar seems fine, but compiling gives me the following
error (due to the semantic predicate above):

Error creating resources for Aztex. Cause
java.util.NoSuchElementException: no such attribute: description in
template context [outputFile parser genericParser(...) cyclicDFA
if(dfa.specialStateSTs)_subtemplate anonymous cyclicDFAState
cyclicDFAEdge notPredicate evalPredicate(...)]

I expect this is because we need to tell the DFA that there is no
ambiguity here -- which wouldn't be possible for it to detect, since
SLASHSLASH is in the hidden channel.

To try to solve this, I implemented a LateChannelTokenStream that has
the following methods:

protected int skipOffTokenChannels(int i) {
    int n = tokens.size();

    while ( i<n && ((Token)tokens.get(i)).getChannel()!=channel ) {
        if (channelOverrideMap!=null) {
            Integer channelI = (Integer)
                channelOverrideMap.get(new Integer(((Token)tokens.get(i)).getChannel()));

            if (channelI.intValue() == channel) {
                ((Token)tokens.get(i)).setChannel(channel);
                break;
            }
        }
        i++;
    }
    return i;
}

public void resetTokenTypeChannel(int ttype) {
    if (channelOverrideMap != null)
        channelOverrideMap.remove(new Integer(ttype));
}

The idea is to add the SLASHSLASH token to the default channel by
using setTokenTypeChannel(SLASHSLASH, DEFAULT_CHANNEL) where
necessary, then switching it back off with resetTokenTypeChannel()
when it isn't.
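
For comparison, here is a minimal sketch of the same idea in plain Java, without the ANTLR runtime. `LateChannelSketch`, `Tok`, and `effectiveChannel` are hypothetical names. It differs from the snippet above in ways that are assumptions about the intent, not confirmed fixes: the override map is keyed by token type (matching the `ttype` parameter of `resetTokenTypeChannel`) instead of by the token's current channel, a missing map entry is treated as "no override" instead of being dereferenced, and tokens are not mutated, so resetting an override fully restores the old behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LateChannelSketch {
    // Hypothetical token types and channel numbers, for illustration only.
    public static final int TOKEN1 = 1, TOKEN2 = 2, SLASHSLASH = 3;
    public static final int DEFAULT_CHANNEL = 0, HIDDEN = 99;

    /** Minimal stand-in for an ANTLR token: just a type and a channel. */
    public static final class Tok {
        public final int type, channel;
        public Tok(int type, int channel) { this.type = type; this.channel = channel; }
    }

    private final List<Tok> tokens;
    private final int channel = DEFAULT_CHANNEL;           // channel the parser reads
    private final Map<Integer, Integer> channelOverrideMap = new HashMap<>(); // token type -> channel

    public LateChannelSketch(List<Tok> tokens) { this.tokens = new ArrayList<>(tokens); }

    public void setTokenTypeChannel(int ttype, int ch) { channelOverrideMap.put(ttype, ch); }

    public void resetTokenTypeChannel(int ttype) { channelOverrideMap.remove(ttype); }

    /** Channel a token is treated as being on, after applying any override for its type. */
    private int effectiveChannel(Tok t) {
        Integer override = channelOverrideMap.get(t.type);
        return override != null ? override : t.channel;
    }

    /** Index of the first token at or after i that is visible on the parser's channel. */
    public int skipOffTokenChannels(int i) {
        while (i < tokens.size() && effectiveChannel(tokens.get(i)) != channel) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        LateChannelSketch s = new LateChannelSketch(Arrays.asList(
            new Tok(TOKEN1, DEFAULT_CHANNEL),
            new Tok(SLASHSLASH, HIDDEN),
            new Tok(TOKEN2, DEFAULT_CHANNEL)));

        System.out.println(s.skipOffTokenChannels(1)); // 2: SLASHSLASH is skipped
        s.setTokenTypeChannel(SLASHSLASH, DEFAULT_CHANNEL);
        System.out.println(s.skipOffTokenChannels(1)); // 1: SLASHSLASH is now visible
        s.resetTokenTypeChannel(SLASHSLASH);
        System.out.println(s.skipOffTokenChannels(1)); // 2: hidden again
    }
}
```

This only models the stream itself. Whether generated prediction code would see the override depends on when the grammar action runs relative to lookahead, which the sketch does not address.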

For some reason this isn't working at all, though I expect it's
because the DFA doesn't execute the actions while doing lookahead, so
the hidden SLASHSLASH isn't visible, and the DFA fails.

To try and counter this, I tried to force backtracking, with the
options {k=0; backtrack=true;} set for this rule, but that doesn't
seem to have helped either.

I know this is a fairly involved problem -- I can supply the rest of
my grammar if it helps, but does anybody have any suggestions?

Thanks lots,

zenzike
