org.metasyntactic.math.automata
Interface RegularExpression

All Superinterfaces:
FiniteAutomata, java.io.Serializable
All Known Implementing Classes:
AbstractRegularExpression

public interface RegularExpression
extends FiniteAutomata

FORMAL DEFINITION OF A REGULAR EXPRESSION

Say that R is a regular expression if R is

  1. a for some a in the alphabet Σ,
  2. ε,
  3. ,
  4. (R1R2), where R1 and R2 are regular expressions,
  5. (R1R2), where R1 and R2 are regular expressions, or
  6. (R1*), where R1 is a regular expression

In items 1 and 2, the regular expressions a and ε represent the languages {a} and {ε}, respectively. In item 3, the regular expression ∅ represents the empty language. In items 4, 5, and 6, the expressions represent the languages obtained by taking the union or concatenation of the languages R1 and R2, or the star of the language R1, respectively.

Don't confuse the regular expressions ε and ∅. The expression ε represents the language containing a single string, namely, the empty string, whereas ∅ represents the language that doesn't contain any strings.

Seemingly, we are in danger of defining the notion of regular expression in terms of itself. If true, we would have a circular definition, which would be invalid. However, R1 and R2 always are smaller than R. Thus we actually are defning regular expressions in terms of smaller regular expressions and thereby avoiding circularity. A definition of this type is called an inductive definition.

Parentheses in an expression may be omitted. If they are, evaluation is done in the precedence order: star, then concatenation, then union.

Note! While not enforceable by an interface, Regular Expressions should be immutable objects


Method Summary
 RegularExpression complement()
          Complements the language that this regular expression accepts.
 RegularExpression concatenate(RegularExpression re)
          Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1, and N2, Σ, δ2, q2, F2) recognize A2.
 boolean equals(RegularExpression regexp)
           
 boolean equivalent(RegularExpression regexp)
           
 int hashCode()
           
 RegularExpression star()
          Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1.
 NondeterministicFiniteAutomata toNondeterministicFiniteAutomata()
          Returns an NFA that accepts the same regular language that this regular expression accepts.
 java.lang.String toString()
           
 RegularExpression union(RegularExpression re)
          Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1, and N2 = (Q2, Σ, δ2, q2, F2) recognize A2.
 
Methods inherited from interface org.metasyntactic.math.automata.FiniteAutomata
accept
 

Method Detail

concatenate

public RegularExpression concatenate(RegularExpression re)
Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1, and N2, Σ, δ2, q2, F2) recognize A2.

Construct N = (Q, Σ, δ, q1, F2) to recognize A1⋅A2.

  1. Q = Q1∪Q2. The states of N are all the states of N1 and N2.
  2. The state q1 is the same as the start state of N1.
  3. The accept states F2 are the same as the accept states of N2.
  4. Define δ so that for any q ∈ Q and any a ∈ Σε,

Returns:
a new RegularExpression equal to N1⋅N2

union

public RegularExpression union(RegularExpression re)
Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1, and N2 = (Q2, Σ, δ2, q2, F2) recognize A2.

Construct N = (Q, Σ, δ, q0, F) to recognize A1∪A2.

  1. Q = {q0}∪Q1∪Q2. The states of N are all the states of N1 and N2, with the addition of a new start state q0
  2. The state q0 is the start state of N.
  3. The accept states F=F1∪F2. The accept states of N are all the accept states of N1 and N2. That way N accepts if either N1 or N2 accepts.
  4. Define δ so that for any q ∈ Q and and a ∈ Σε,

Returns:
a new RegularExpression equal to A1∪A2

star

public RegularExpression star()
Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1. Construct N = (Q, Σ, δ, q0, F) to recognize A1*.
  1. Q = {q0}∪Q1. The states of N are the states of N1 plus a new start state.
  2. The state q0 is the new start state
  3. F = {q0}∪F1. The accept states are the old accept states plus the new start state.
  4. Define δ so that for any q ∈ Q and any a ∈ Σε,

Returns:
a new RegularExpression equal to R*

complement

public RegularExpression complement()
Complements the language that this regular expression accepts. This method may or may not return a new RegularExpression. If it does not, then it must return a reference to the regexp complemented


toNondeterministicFiniteAutomata

public NondeterministicFiniteAutomata toNondeterministicFiniteAutomata()
Returns an NFA that accepts the same regular language that this regular expression accepts.

Returns:
An NFA equivalent to this

hashCode

public int hashCode()
Overrides:
hashCode in class java.lang.Object

toString

public java.lang.String toString()
Overrides:
toString in class java.lang.Object

equals

public boolean equals(RegularExpression regexp)

equivalent

public boolean equivalent(RegularExpression regexp)