org.metasyntactic.math.automata.tool
Class LexicalTokenizer

java.lang.Object
  |
  +--org.metasyntactic.math.automata.tool.LexicalTokenizer

public class LexicalTokenizer
extends java.lang.Object

Pattern Matching Based on NFAs

One method for pattern matching is to construct the transition table of an NFA N for the composite pattern p1 | p2 | ... | pn. This can be done by first creating an NFA N(pi) for each pattern pi, then adding a new start state s0, and finally linking s0 to the start state of each N(pi) with an empty transition (an epsilon-transition).
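
The construction above can be sketched as follows. This is a minimal illustration, not the internal representation used by LexicalTokenizer: the class CombinedNfaSketch, its fields, and the restriction of each pattern to a literal string are assumptions made to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical NFA table; not the classes LexicalTokenizer uses internally. */
final class CombinedNfaSketch {
    /** Stands in for the empty transition (assumes '\0' never occurs in input). */
    static final char EPSILON = '\0';

    /** transitions.get(state) maps a symbol to the list of target states. */
    final List<Map<Character, List<Integer>>> transitions = new ArrayList<>();
    /** acceptingPattern.get(state) is the index of the pattern accepted there, or -1. */
    final List<Integer> acceptingPattern = new ArrayList<>();

    int newState() {
        transitions.add(new HashMap<>());
        acceptingPattern.add(-1);
        return transitions.size() - 1;
    }

    void addTransition(int from, char symbol, int to) {
        transitions.get(from).computeIfAbsent(symbol, k -> new ArrayList<>()).add(to);
    }

    /** Adds a linear NFA N(pi) for a literal string and returns its start state. */
    int addLiteral(String literal, int patternIndex) {
        int start = newState();
        int current = start;
        for (char c : literal.toCharArray()) {
            int next = newState();
            addTransition(current, c, next);
            current = next;
        }
        acceptingPattern.set(current, patternIndex);
        return start;
    }

    /** Builds the NFA for p1 | p2 | ... | pn; state 0 is the new start state s0. */
    static CombinedNfaSketch union(List<String> literals) {
        CombinedNfaSketch nfa = new CombinedNfaSketch();
        int s0 = nfa.newState();
        for (int i = 0; i < literals.size(); i++) {
            int start = nfa.addLiteral(literals.get(i), i);
            nfa.addTransition(s0, EPSILON, start); // link s0 to the start of N(pi)
        }
        return nfa;
    }
}
```

Recording the pattern index on each accepting state lets a later simulation report which pattern matched, not only that some pattern did.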

To simulate this NFA we can use a modification of our normal NFA accept method. During simulation, we construct the sequence of sets of states that the combined NFA can be in after seeing each input character. Finding a set of states that contains an accepting state is not enough to stop: to find the longest match, we must continue to simulate the NFA until it reaches termination, that is, a set of states from which there are no transitions on the current input symbol. The match is then the longest prefix after which an accepting state was seen.
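
The longest-match simulation described above might look like the sketch below. The class LongestMatchSketch, its table layout, and the example patterns "a" and "abb" are hypothetical and chosen only for illustration; they are not taken from LexicalTokenizer itself.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical set-of-states NFA simulation that remembers the last accepting position. */
final class LongestMatchSketch {
    static final char EPSILON = '\0'; // stands in for the empty transition

    /** state -> (symbol -> target states); call add() once per (state, symbol) pair. */
    final Map<Integer, Map<Character, List<Integer>>> table = new HashMap<>();
    final Set<Integer> accepting = new TreeSet<>();

    void add(int from, char symbol, Integer... to) {
        table.computeIfAbsent(from, k -> new HashMap<>()).put(symbol, Arrays.asList(to));
    }

    List<Integer> targets(int state, char symbol) {
        return table.getOrDefault(state, Collections.emptyMap())
                    .getOrDefault(symbol, Collections.emptyList());
    }

    /** All states reachable from the given set through empty transitions alone. */
    Set<Integer> closure(Set<Integer> states) {
        Set<Integer> result = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            for (int next : targets(work.pop(), EPSILON)) {
                if (result.add(next)) work.push(next);
            }
        }
        return result;
    }

    Set<Integer> move(Set<Integer> states, char symbol) {
        Set<Integer> result = new TreeSet<>();
        for (int s : states) result.addAll(targets(s, symbol));
        return result;
    }

    /** Length of the longest accepted prefix of input starting at index from, or -1. */
    int longestMatch(String input, int from) {
        Set<Integer> current = closure(Collections.singleton(0)); // state 0 is s0
        int lastAccept = -1;
        for (int i = from; ; i++) {
            if (!Collections.disjoint(current, accepting)) lastAccept = i - from;
            if (i == input.length()) break;
            current = closure(move(current, input.charAt(i)));
            if (current.isEmpty()) break; // termination: no transitions on this symbol
        }
        return lastAccept;
    }

    /** Combined NFA for the patterns "a" | "abb". */
    static LongestMatchSketch example() {
        LongestMatchSketch nfa = new LongestMatchSketch();
        nfa.add(0, EPSILON, 1, 3);
        nfa.add(1, 'a', 2);
        nfa.add(3, 'a', 4);
        nfa.add(4, 'b', 5);
        nfa.add(5, 'b', 6);
        nfa.accepting.addAll(Arrays.asList(2, 6));
        return nfa;
    }
}
```

With this example NFA, longestMatch("abbc", 0) returns 3 because the simulation keeps going past the accepting state for "a" and reaches the one for "abb". longestMatch("abc", 0) returns 1: the simulation terminates on 'c' and falls back to the last position at which an accepting state was seen.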


Constructor Summary
LexicalTokenizer(java.util.List patterns)
          Creates a new LexicalTokenizer.
LexicalTokenizer(java.util.List patterns, TokenHandler handler)
           
 
Method Summary
static void main(java.lang.String[] args)
           
 void tokenize(java.util.List buffer)
           
 void tokenize(java.util.List buffer, java.util.Map symbolToClasses)
           
 void tokenize(java.util.List buffer, TokenHandler handler, java.util.Map symbolToClasses)
          Notes: Because buffer is accessed randomly, it is recommended that you use a list with fast indexed lookup (such as Vector or ArrayList) rather than a sequential-access list (such as LinkedList).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LexicalTokenizer

public LexicalTokenizer(java.util.List patterns,
                        TokenHandler handler)

LexicalTokenizer

public LexicalTokenizer(java.util.List patterns)
Creates a new LexicalTokenizer.

Method Detail

tokenize

public void tokenize(java.util.List buffer)

tokenize

public void tokenize(java.util.List buffer,
                     java.util.Map symbolToClasses)

tokenize

public void tokenize(java.util.List buffer,
                     TokenHandler handler,
                     java.util.Map symbolToClasses)
Notes: Because buffer is accessed randomly, it is recommended that you use a list with fast indexed lookup (such as Vector or ArrayList) rather than a sequential-access list (such as LinkedList).

Parameters:
buffer - The buffer to tokenize
handler - The TokenHandler to which matched tokens are passed
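
One way to follow the note about buffer is to check for the standard java.util.RandomAccess marker interface before tokenizing and to copy the list otherwise. The helper class BufferSketch below is a hypothetical illustration and is not part of this package.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.RandomAccess;

/** Hypothetical helper; not part of org.metasyntactic.math.automata.tool. */
final class BufferSketch {
    /**
     * Returns the list itself if it advertises fast indexed access
     * (ArrayList and Vector both implement RandomAccess); otherwise
     * returns an ArrayList copy, for example when given a LinkedList.
     */
    static <T> List<T> ensureRandomAccess(List<T> buffer) {
        if (buffer instanceof RandomAccess) {
            return buffer;
        }
        return new ArrayList<T>(buffer);
    }
}
```

The result could then be passed as the buffer argument of tokenize. The copy costs one linear pass, which is usually cheaper than repeated indexed access into a LinkedList.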

main

public static void main(java.lang.String[] args)