Luminous  git-master
LuminousStatefulScanner Class Reference

Experimental transition table driven scanner.

Inheritance diagram for LuminousStatefulScanner: (graph not reproduced)
Collaboration diagram for LuminousStatefulScanner: (graph not reproduced)

Public Member Functions

 add_pattern ($name, $pattern, $end=null, $consume=true)
 Adds a pattern.
 add_transition ($from, $to)
 Adds a state transition.
 load_transitions ()
 Loads legal state transitions for the current state.
 main ()
 next_end_data ()
 Looks for the next state-pop sequence (close/end) for the current state.
 next_start_data ()
 Looks for the next legal state transition.
 pop_state ()
 Pops a state from the stack.
 push_child ($child)
 push_state ($state_data)
 Pushes a state.
 record ($str, $dummy1=null, $dummy2=null)
 record_range ($from, $to, $type=null)
 Helper function to record a range of the string.
 record_token ($str, $type)
 Records a complete token. This is shorthand for pushing a new node onto the stack, recording its text, and then popping it.
 state_name ()
 Gets the name of the current state.
 tagged ()
 Returns the XML representation of the token stream.

Protected Member Functions

 collapse_token_tree ($node)
 setup ()
 Sets up the FSM.

Protected Attributes

 $legal_transitions = array()
 Legal transitions for the current state.
 $patterns = array()
 Pattern list.
 $token_tree_stack = array()
 The token tree.
 $transitions = array()
 Transition table.
- Protected Attributes inherited from LuminousSimpleScanner
 $overrides = array()
 Overrides array.
- Protected Attributes inherited from LuminousScanner
 $case_sensitive = true
 Whether or not the language is case sensitive.
 $filters = array()
 Individual token filters.
 $ident_map = array()
 A map of identifiers and their corresponding token names.
 $rule_tag_map = array()
 Rule remappings.
 $state_ = array()
 State stack.
 $stream_filters = array()
 Token stream filters.
 $tokens = array()
 The token stream.
 $user_defs
 Identifier remappings based on definitions identified in the source code.

Private Attributes

 $last_state = null
 $setup = false
 $transition_rule_cache = array()

Additional Inherited Members

- Static Public Member Functions inherited from LuminousScanner
static guess_language ($src, $info)
 Language guessing.
- Public Attributes inherited from LuminousScanner
 $version = 'master'
 scanner version.

Detailed Description

Experimental transition table driven scanner.

The stateful scanner follows a transition table and generates a hierarchical token tree. As such, the states follow a hierarchical parent->child relationship rather than a strict from->to relationship.

A node in the token tree looks like this:

array('token_name' => 'name','children' => array(...))

'children' is an ordered list whose elements may be either other token nodes or plain strings. We override tagged() to try to collapse this tree into XML while still applying filters.
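
To make the node layout concrete, here is a minimal sketch of a token tree and a recursive collapse into XML. It does not use the Luminous classes: collapse_node() and the token names STRING and ESC are hypothetical, filters are left out, and the treatment of a null token name as "no wrapping tag" is an assumption.

```php
<?php
// Hypothetical stand-in for collapse_token_tree(); not Luminous's own code.
function collapse_node(array $node): string {
    $inner = '';
    foreach ($node['children'] as $child) {
        // A child is either a nested token node (array) or a plain string.
        $inner .= is_array($child)
            ? collapse_node($child)
            : htmlspecialchars($child, ENT_NOQUOTES);
    }
    $name = $node['token_name'];
    // Assumption: a node whose name is null contributes no wrapping tag.
    return $name === null ? $inner : "<$name>$inner</$name>";
}

// A root node holding two strings and one nested STRING node.
$tree = array(
    'token_name' => null,
    'children' => array(
        'x = ',
        array('token_name' => 'STRING', 'children' => array(
            '"a ',
            array('token_name' => 'ESC', 'children' => array('\\n')),
            '"',
        )),
        ';',
    ),
);

echo collapse_node($tree), "\n";
// prints: x = <STRING>"a <ESC>\n</ESC>"</STRING>;
```

Because children keep their order, a depth-first walk reproduces the source text with the tags nested in the same way as the states were.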

We now store patterns as the following tuple:

($name, $open_pattern, $terminate_pattern).

The termination pattern may be null, in which case the $open_pattern is complete. No transitions can occur within a complete state because the pattern's match is fixed.

We have two stacks. One is LuminousStatefulScanner::$token_tree_stack, which stores the token tree, and the other is a standard state stack which stores the current state data. State data is currently a pattern, stored as the tuple above.

Warning
Currently 'stream filters' are not applied, because at no point do we end up with a flat stream of tokens. The rule name remapper is still applied, however.

Member Function Documentation

LuminousStatefulScanner::add_pattern (   $name,
  $pattern,
  $end = null,
  $consume = true 
)

Adds a pattern.

Parameters
$name     The name of the pattern/state.
$pattern  Either the entire pattern, or just its opening delimiter.
$end      If $pattern was just the opening delimiter, $end is the closing delimiter. Separating the two delimiters like this makes the state flexible in length, as state transitions can occur inside it.
$consume  Not currently observed. Might never be. Don't specify this yet.
LuminousStatefulScanner::add_transition (   $from,
  $to 
)

Adds a state transition.

This is a helper function for populating LuminousStatefulScanner::$transitions; you can set that array directly instead.

Parameters
$from  The parent state
$to    The child state
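
As an illustration of how the two calls above fit together, the following sketch uses a stand-in class rather than the real scanner. PatternTableSketch is hypothetical, the state names are made up, and the shape of the transition table (parent name mapped to a list of child names) is an assumption that this page does not confirm.

```php
<?php
// Hypothetical stand-in class; it mirrors only the documented signatures.
class PatternTableSketch {
    public $patterns = array();
    public $transitions = array();

    public function add_pattern($name, $pattern, $end = null, $consume = true) {
        // $consume is accepted but ignored, as the documentation describes.
        $this->patterns[] = array($name, $pattern, $end);
    }

    public function add_transition($from, $to) {
        // Assumption: the table maps a parent state to a list of child states.
        $this->transitions[$from][] = $to;
    }
}

$s = new PatternTableSketch();
$s->add_pattern('COMMENT', '%//.*%');     // complete: no terminator
$s->add_pattern('STRING', '/"/', '/"/');  // opening and closing delimiter
$s->add_pattern('ESC', '/\\\\./');        // complete: backslash plus one char

$s->add_transition('initial', 'COMMENT');
$s->add_transition('initial', 'STRING');
$s->add_transition('STRING', 'ESC');      // ESC may only occur inside STRING

print_r($s->transitions);
```

Only STRING has a terminator, so it is the only state here that can contain child states; COMMENT and ESC are complete as soon as their pattern matches.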
LuminousStatefulScanner::collapse_token_tree (   $node)
protected

Recursive function to collapse the token tree into XML.

LuminousStatefulScanner::load_transitions ( )

Loads legal state transitions for the current state.

Loads the legal state transitions for the current state into the legal_transitions array.

LuminousStatefulScanner::main ( )

Generic main function which observes the transition table

Reimplemented from LuminousSimpleScanner.

LuminousStatefulScanner::next_end_data ( )

Looks for the next state-pop sequence (close/end) for the current state.

Returns
Data in the same format as get_next: a tuple of (next, matches). If no match is found, next is -1 and matches is null.
LuminousStatefulScanner::next_start_data ( )

Looks for the next legal state transition.

Returns
A tuple of (pattern_data, next, matches). If no match is found, next is -1 and both pattern_data and matches are null.
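
The return shape can be illustrated with a small standalone function. This is a sketch only: next_start_sketch() is hypothetical, it takes the source string, position and candidate patterns as arguments instead of reading scanner state, and choosing the earliest match among the candidates is an assumption about how "next" is selected.

```php
<?php
// Hypothetical helper, not the library's next_start_data().
// $candidates is a list of ($name, $open_pattern, $terminate_pattern) tuples.
function next_start_sketch(string $src, int $pos, array $candidates): array {
    $best = array(null, -1, null);
    foreach ($candidates as $pattern_data) {
        if (preg_match($pattern_data[1], $src, $m, PREG_OFFSET_CAPTURE, $pos) !== 1) {
            continue; // this pattern does not match at or after $pos
        }
        $index = $m[0][1];
        if ($best[1] === -1 || $index < $best[1]) {
            // Keep only the matched text of each group, dropping the offsets.
            $matches = array_map(function ($g) { return $g[0]; }, $m);
            $best = array($pattern_data, $index, $matches);
        }
    }
    return $best;
}

$candidates = array(
    array('COMMENT', '%//.*%', null),
    array('STRING', '/"/', '/"/'),
);

list($pattern_data, $next, $matches) = next_start_sketch('a = "x"; // note', 0, $candidates);
echo $pattern_data[0], ' at ', $next, "\n"; // prints: STRING at 4
```

A caller would compare this index with the one from next_end_data() to decide whether the next event is a push into a child state or a pop out of the current one.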
LuminousStatefulScanner::pop_state ( )

Pops a state from the stack.

The top token on the token_tree_stack is popped and appended as a child to the new top token.

The top state on the state stack is popped and discarded.

Exceptions
Exception  if there is only the initial state on the stack (we cannot pop the initial state, because then we have no state at all)
LuminousStatefulScanner::push_child (   $child)

Pushes a new token onto the stack as a child of the currently active token.

See Also
push_state
LuminousStatefulScanner::push_state (   $state_data)

Pushes a state.

Parameters
$state_data  A tuple of ($name, $open_pattern, $terminate_pattern). This should be as it is stored in LuminousStatefulScanner::patterns.

This actually causes two push operations: one onto the token_tree_stack, and the other onto the actual state stack. The former creates a new token; the latter is used for state information.
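
The interplay between push_state() and pop_state() can be sketched with a stand-in class that models only the two stacks. TwoStackSketch is hypothetical and its record() simply appends a string to the active node; the real methods do more than this.

```php
<?php
// Hypothetical stand-in; it models only the two stacks described above.
class TwoStackSketch {
    public $state_ = array();            // state stack
    public $token_tree_stack = array();  // stack of open token nodes

    public function push_state(array $state_data) {
        $this->state_[] = $state_data;
        $this->token_tree_stack[] = array(
            'token_name' => $state_data[0],
            'children' => array(),
        );
    }

    public function record($str) {
        $top = count($this->token_tree_stack) - 1;
        $this->token_tree_stack[$top]['children'][] = $str;
    }

    public function pop_state() {
        if (count($this->state_) <= 1) {
            throw new Exception('Cannot pop the initial state');
        }
        array_pop($this->state_); // the state data is discarded
        $node = array_pop($this->token_tree_stack);
        // The finished node becomes a child of the new top node.
        $top = count($this->token_tree_stack) - 1;
        $this->token_tree_stack[$top]['children'][] = $node;
    }
}

$s = new TwoStackSketch();
$s->push_state(array('initial', null, null));
$s->record('x = ');
$s->push_state(array('STRING', '/"/', '/"/'));
$s->record('"hi"');
$s->pop_state();
$s->record(';');
print_r($s->token_tree_stack[0]);
```

After the pop, a single root node remains on the token tree stack, with the STRING node sitting between the two recorded strings.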

LuminousStatefulScanner::record (   $str,
  $dummy1 = null,
  $dummy2 = null 
)

Records a string as a child of the currently active token.

Warning
The second and third parameters are not applicable to this method; they are only present to suppress PHP warnings. If you set them, an exception is thrown.

Reimplemented from LuminousScanner.

LuminousStatefulScanner::record_range (   $from,
  $to,
  $type = null 
)

Helper function to record a range of the string.

Parameters
$from  the start index
$to    the end index
$type  dummy argument

This is shorthand for $this->record(substr($this->string(), $from, $to - $from)).

Exceptions
RangeException  if the range is invalid (i.e. $to < $from)

An empty range (i.e. $to === $from) is allowed, but it is essentially a no-op.
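
The range rules can be sketched as a free function. record_range_sketch() is hypothetical: it takes the source string and a list of children as arguments, whereas the real method works on the scanner's own string and active token.

```php
<?php
// Hypothetical free function illustrating the documented range rules.
function record_range_sketch(string $src, int $from, int $to, array &$children): void {
    if ($to < $from) {
        throw new RangeException("Invalid range: $from to $to");
    }
    if ($to === $from) {
        return; // empty range: nothing is recorded
    }
    // substr() takes a length, so the end index is converted with $to - $from.
    $children[] = substr($src, $from, $to - $from);
}

$children = array();
record_range_sketch('let x = 1;', 4, 5, $children); // records "x"
record_range_sketch('let x = 1;', 5, 5, $children); // no-op
print_r($children);
```

The end index is exclusive here, which follows from the documented substr() call.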

Reimplemented from LuminousScanner.

LuminousStatefulScanner::record_token (   $str,
  $type 
)

Records a complete token. This is shorthand for pushing a new node onto the stack, recording its text, and then popping it.

Parameters
$str   the string
$type  the token type
LuminousStatefulScanner::setup ( )
protected

Sets up the FSM.

If the caller has not specified an initial state, one is created with valid transitions to all other known states. We also push the initial state onto the tree stack, and add a type mapping from the initial type to NULL.
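
The default described here might look like the sketch below. It is not the library's setup(): default_initial_sketch() is hypothetical, and both the state name 'initial' and the table shape (parent name mapped to a list of child names) are assumptions.

```php
<?php
// Hypothetical helper: fills in an initial state if the caller gave none.
function default_initial_sketch(array $patterns, array $transitions): array {
    if (!isset($transitions['initial'])) { // assumed name of the initial state
        $transitions['initial'] = array();
        foreach ($patterns as $pattern_data) {
            // The initial state may move to every known state.
            $transitions['initial'][] = $pattern_data[0];
        }
    }
    return $transitions;
}

$patterns = array(
    array('COMMENT', '%//.*%', null),
    array('STRING', '/"/', '/"/'),
);
$table = default_initial_sketch($patterns, array('STRING' => array('ESC')));
print_r($table['initial']); // COMMENT and STRING
```

A table that already defines the initial state is returned unchanged, so an explicit definition always takes precedence over the default.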

LuminousStatefulScanner::state_name ( )

Gets the name of the current state.

Returns
The name of the current state
LuminousStatefulScanner::tagged ( )

Returns the XML representation of the token stream.

This function triggers the generation of the XML output.

Returns
An XML-string which represents the tokens recorded by the scanner.

Reimplemented from LuminousScanner.

Member Data Documentation

LuminousStatefulScanner::$last_state = null
private

Remembers the state from the last iteration, so we know whether or not to load a new transition set.

LuminousStatefulScanner::$legal_transitions = array()
protected

Legal transitions for the current state.

See Also
LuminousStatefulScanner::load_transitions()
LuminousStatefulScanner::$patterns = array()
protected

Pattern list.

Pattern array. Each pattern is a tuple of

($name, $open_pattern, $terminate_pattern)
LuminousStatefulScanner::$setup = false
private

Records whether or not the FSM has been set up for the first time.

See Also
setup()
LuminousStatefulScanner::$token_tree_stack = array()
protected

The token tree.

The tokens we end up with are a tree which we build as we go along. The easiest way to build it is to keep track of the currently active node on top of a stack. When the node is completed, we pop it and insert it as a child of the element which is now at the top of the stack.

At the end of the process we end up with one element in here which is the root node.

LuminousStatefulScanner::$transition_rule_cache = array()
private

Cache of transition rules.

See Also
next_start_data()

The documentation for this class was generated from the following file: