comunic/3rdparty/luminous/docs/site/complex-scanner

41 lines
1.5 KiB
Plaintext
Raw Permalink Normal View History

2016-11-19 11:08:12 +00:00
# parent: Writing-a-language-scanner
=Complex hand-written scanners=
Hand written scanners should subclass LumiousScanner.
Scanning occurs in the main() method. There are two things you have to worry about:
# advancing the scan pointer, which is done by calls to scan(), get(), etc
# recording the string segments you're matching as their relevant tokens. This is done by calling `record($string, $token_name, $escaped?=false)`
By the time you exit main(), the string should have been fully recorded. main() doesn't return anything.
Imagine in your langauge you need to keep track of a context (state) by tracking curly braces. The basic workflow looks something like this:
{{{lang=php_snippet
class MyScanner extends LuminousScanner {
function init() {
// set up any last-minute stuff in here
}
function main() {
while (!$this->eos()) {
if ($this->scan('/some_pattern/') !== null) {
$this->record($this->match(), 'TOKEN_NAME');
}
elseif($this->scan('/some_other_pattern/') !== null ) {
$this->record($this->match(), 'SOME_OTHER_TOKEN');
}
...
else { // ensure we advance the scan pointer
$this->record($this->get(), null);
}
}
}
}
}}}
Obviously, to make it worthwhile to use an explictly written scanner, you will be evaluating quite a lot of logic inbetween the calls to scan().
*Examples*: languages/json.php is a fairly simple scanner which implements its own loop explicitly and records stack based state information.