mirror of
https://github.com/pierre42100/comunic
synced 2024-11-17 02:51:13 +00:00
41 lines
1.5 KiB
Plaintext
Executable File
41 lines
1.5 KiB
Plaintext
Executable File
# parent: Writing-a-language-scanner
|
|
=Complex hand-written scanners=
|
|
|
|
Hand written scanners should subclass LumiousScanner.
|
|
|
|
Scanning occurs in the main() method. There are two things you have to worry about:
|
|
# advancing the scan pointer, which is done by calls to scan(), get(), etc
|
|
# recording the string segments you're matching as their relevant tokens. This is done by calling `record($string, $token_name, $escaped?=false)`
|
|
|
|
By the time you exit main(), the string should have been fully recorded. main() doesn't return anything.
|
|
|
|
Imagine in your langauge you need to keep track of a context (state) by tracking curly braces. The basic workflow looks something like this:
|
|
|
|
{{{lang=php_snippet
|
|
|
|
class MyScanner extends LuminousScanner {
|
|
|
|
function init() {
|
|
// set up any last-minute stuff in here
|
|
}
|
|
|
|
function main() {
|
|
while (!$this->eos()) {
|
|
if ($this->scan('/some_pattern/') !== null) {
|
|
$this->record($this->match(), 'TOKEN_NAME');
|
|
}
|
|
elseif($this->scan('/some_other_pattern/') !== null ) {
|
|
$this->record($this->match(), 'SOME_OTHER_TOKEN');
|
|
}
|
|
...
|
|
else { // ensure we advance the scan pointer
|
|
$this->record($this->get(), null);
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}}}
|
|
|
|
Obviously, to make it worthwhile to use an explictly written scanner, you will be evaluating quite a lot of logic inbetween the calls to scan().
|
|
|
|
*Examples*: languages/json.php is a fairly simple scanner which implements its own loop explicitly and records stack based state information. |