Token hook

Token hooks (a.k.a. pattern actions)

When a pattern matches the input, a Token instance is created and a function (referred to here as a token hook or pattern action) is called with the Token instance as its first parameter.

The function must expect 2 parameters:

  • Token instance: This is the token instance created from the matched input. Its properties (such as id or text) contain information about the token, such as what was matched.
  • Lexer instance: This object is an instance of Blex (the lexical scanner). It is useful if you want to interact with the Blex instance that triggered the hook. E.g. you can change the Blex status (in a similar way to YY_STATUS in LEX), its operation mode, etc.

An example of a token action could be:

def print_hook(token, lexer):
    # A single formatted argument prints the same line on Python 2 and Python 3
    print('%s was recognized as %s' % (token.text, token.id))

    return token

This function is expected to return a Token instance (usually the same one passed as the first parameter), which will eventually be passed to the parser. As stated before, if the function returns False, the matched input is discarded. If it returns None instead, the token is discarded and another pattern is tried.
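The three possible return values can be sketched as follows. This is only an illustration: the Token class is a minimal stand-in for the library's own class, and KEYWORDS, skip_hook, keyword_hook and describe are hypothetical names that are not part of Blex. The lexer argument is passed as None because these hooks do not use it.

```python
class Token(object):
    """Hypothetical stand-in for the library's Token class (id and text only)."""
    def __init__(self, id, text):
        self.id = id
        self.text = text


KEYWORDS = {'if', 'else', 'while'}  # illustrative keyword set, not from the library


def skip_hook(token, lexer):
    # Returning False asks the scanner to discard the matched input.
    return False


def keyword_hook(token, lexer):
    # Returning None rejects this token so that another pattern can be tried.
    if token.text not in KEYWORDS:
        return None
    return token


def describe(result):
    """Hypothetical helper: names the outcome that each return value stands for."""
    if result is False:
        return 'input discarded'
    if result is None:
        return 'token discarded, try another pattern'
    return 'token passed to the parser'


outcomes = [
    describe(skip_hook(Token('BLANK', '   '), None)),
    describe(keyword_hook(Token('KEYWORD', 'foo'), None)),
    describe(keyword_hook(Token('KEYWORD', 'while'), None)),
]
for line in outcomes:
    print(line)
```

The describe helper tests for False and None by identity, so a hook that returns a Token instance is never confused with either of the two special values.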

The following is an example of a hook that converts the matched input to uppercase. In the LEX/YACC framework, you would do something like this in C:

for (char *p = yytext; *p; ++p) *p = toupper((unsigned char)*p);  /* toupper() is declared in <ctype.h> */

Here, we could do the following:

def toupper_hook(token, lexer):
    token.text = token.text.upper()
 
    return token

So every token whose action hook is toupper_hook will have its token.text converted to uppercase before being passed to the parser. Each pattern can have an associated hook, which is assigned on the Blex instance by calling the add_token() method.
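To see the effect of such a hook without a full scanner, it can be called directly. The sketch below assumes a minimal stand-in Token class (hypothetical, with only id and text) and passes None as the lexer, since toupper_hook never touches it. The add_token() call itself is left out because its signature is not shown on this page.

```python
class Token(object):
    """Hypothetical stand-in for the library's Token class (id and text only)."""
    def __init__(self, id, text):
        self.id = id
        self.text = text


def toupper_hook(token, lexer):
    # Replace the matched text with its uppercase form and hand the token on.
    token.text = token.text.upper()
    return token


tok = Token('IDENTIFIER', 'hello')  # 'IDENTIFIER' is an illustrative token id
result = toupper_hook(tok, None)    # the lexer is unused here, so None stands in
print(result.text)
```

The hook returns the same Token object it received, so the uppercase text is what a parser would see.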