mirror of
https://github.com/pierre42100/comunic
synced 2025-06-21 17:45:18 +00:00
First commit
148
3rdparty/luminous/docs/site/Scanning-API
vendored
Executable file
@ -0,0 +1,148 @@
|
||||
# parent: index
|
||||
=Scanning API=
|
||||
|
||||
\contents 2
|
||||
|
||||
== Introduction ==
|
||||
This page is of interest if you are trying to write your own language scanner. It should form a general overview, and does not replace the Doxygen API documentation for the Scanner classes.
|
||||
|
||||
The abstract class `LuminousScanner` is a kind of abstract state machine. The general idea is that the caller uses its methods to ask "does this match here?" ... "okay, how about this?" ... and so on. As matches are found, the string is consumed. It is regular expression based, so at some level it's a sort of super regular expression machine. It's fairly simple and low-level, but can be intimidating if you aren't familiar with the concept. Higher level automation functions are also provided.
|
||||
|
||||
|
||||
LuminousScanner exposes several interesting public methods:
|
||||
|
||||
`string($string)`
|
||||
`init()`
|
||||
`main()`
|
||||
`tagged()`
|
||||
|
||||
string() sets the string. It is both a getter and a setter: if you omit the argument, it returns the current string.
|
||||
|
||||
init() is where any setup should be performed. A scanner may have a slightly variable rule-set depending on its environment. Construction is too early to read any public settings the scanner exposes (because the caller won't have had time to set them), so init() exists. This happens in languages like CSS and JavaScript: they expose public variables for whether or not they're embedded in HTML (and need to observe terminator tags) and set their rules accordingly. main() performs scanning. You may have to override this.
|
||||
|
||||
tagged() returns an XML string which represents the tokenized source code. This is then passed to a formatter. You shouldn't override this unless you know what you're doing.
|
||||
|
||||
Shorthand for all of these is:
|
||||
|
||||
`highlight($string)`
|
||||
|
||||
|
||||
==Primitive String Scanning Methods==
|
||||
|
||||
*Warning*: A lot of these functions return either `null` or their match. Be very careful with this when using boolean testing: `if (null)` is false, but so is `if ('')`, and the two are very different situations. Worse, for reasons known only to the PHP designers, `if ('0')` also evaluates to false! This can lead to you losing data if you're not careful, because the scanner consumes the text but the caller may not realise. Always test for a failed match with `if ($return_value === null)`.
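The pitfall is easy to reproduce in plain PHP. The following is a minimal sketch: `fake_scan()` is a hypothetical stand-in for a scanner method (it is not part of Luminous) which returns the match or `null`.

{{{lang=php
<?php
// Hypothetical stand-in for a scanner method: returns the match or null.
function fake_scan($pattern, $subject) {
  return preg_match($pattern, $subject, $m) ? $m[0] : null;
}

$match = fake_scan('/^\d/', '0 + 1');   // matches the string '0'

// Wrong: '0' is falsy, so a successful match is treated as a failure.
$loose = $match ? 'matched' : 'no match';

// Right: only null means "no match".
$strict = ($match !== null) ? 'matched' : 'no match';

echo "$loose / $strict\n";   // prints "no match / matched"
}}}

The loose test silently drops the `'0'` even though it was matched, which is exactly the data loss described above.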
|
||||
|
||||
|
||||
=== Scanning ===
|
||||
|
||||
|
||||
`peek(chars=1)`
|
||||
`get(chars=1)`
|
||||
|
||||
peek returns the given number of characters from the current position onwards. get is identical but also consumes them. Neither logs their matches.
|
||||
|
||||
`scan($pattern)`
|
||||
|
||||
If the pattern matches at the current position, it is consumed and logged. Returns the match or null.
|
||||
|
||||
`scan_until($pattern)`
|
||||
|
||||
If the pattern matches somewhere beyond the current position, the substring up to the *start* of the match is consumed and logged. Returns the substring or null.
|
||||
|
||||
`check($pattern)`
|
||||
|
||||
Performs a lookahead at the current position. Identical to scan() but does not consume the string. Returns null if it fails.
|
||||
|
||||
`index($pattern)`
|
||||
|
||||
Returns the next index of a pattern, does not log or consume it.
|
||||
|
||||
`unscan()`
|
||||
|
||||
Reverts the last match, moving the scan pointer back to where it was before the match. *Warning*: calling this more than once before executing another scan/check is currently undefined behaviour.
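To make the difference between these methods concrete, here is a small sketch. `MiniScanner` is a hypothetical stand-in written for this page, not LuminousScanner itself: it mimics only the behaviour described above for `check()`, `scan()`, `scan_until()`, `unscan()` and `rest()`, and it does not model match logging.

{{{lang=php
<?php
// Hypothetical stand-in, NOT LuminousScanner.
class MiniScanner {
  private $src;
  private $pos = 0;        // the scan pointer
  private $last_pos = 0;   // where the pointer was before the last match

  function __construct($src) { $this->src = $src; }

  // Returns array(text, offset) for the next match at or after the pointer.
  private function find($pattern) {
    if (preg_match($pattern, $this->src, $m, PREG_OFFSET_CAPTURE, $this->pos))
      return $m[0];
    return null;
  }

  // Lookahead: succeeds only if the match starts exactly at the pointer.
  function check($pattern) {
    $m = $this->find($pattern);
    return ($m !== null && $m[1] === $this->pos) ? $m[0] : null;
  }

  // Like check(), but consumes the match.
  function scan($pattern) {
    $text = $this->check($pattern);
    if ($text !== null) {
      $this->last_pos = $this->pos;
      $this->pos += strlen($text);
    }
    return $text;
  }

  // Consumes everything up to the *start* of the next match.
  function scan_until($pattern) {
    $m = $this->find($pattern);
    if ($m === null) return null;
    $text = substr($this->src, $this->pos, $m[1] - $this->pos);
    $this->last_pos = $this->pos;
    $this->pos = $m[1];
    return $text;
  }

  function unscan() { $this->pos = $this->last_pos; }
  function rest()   { return substr($this->src, $this->pos); }
}

$s = new MiniScanner('x = 10; // done');
$ident   = $s->scan('/[a-z]+/');      // 'x'
$nothing = $s->scan('/\d+/');         // null: the pointer is at ' ', not a digit
$skipped = $s->scan_until('/\d+/');   // ' = '
$peeked  = $s->check('/\d+/');        // '10', pointer unchanged
$number  = $s->scan('/\d+/');         // '10', pointer advances
$s->unscan();                         // pointer moves back to before '10'
$rest    = $s->rest();                // '10; // done'
}}}

Note how `scan()` fails when the pattern only matches further along the string, whereas `scan_until()` succeeds and stops just before the match.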
|
||||
|
||||
===Accessing matches===
|
||||
|
||||
`match()`
|
||||
`match_groups()`
|
||||
`match_group($group=0)`
|
||||
`match_pos()`
|
||||
|
||||
match() returns the most recent match (i.e. group 0). match_groups returns an array of match groups, indexed by group name/number (corresponding to the regex grouping). match_group() returns a particular group, and match_pos() returns the position of the most recent match (the start), as an offset into the string.
|
||||
|
||||
=== Automation ===
|
||||
|
||||
`get_next($patterns)`
|
||||
|
||||
Iterates over the given patterns (array) and determines the closest match beyond the current scan pointer. The patterns are not consumed or logged; it is up to the caller to decide what to do with them. The return value is an array: array(0=>index, 1=>matches)
|
||||
|
||||
index will be -1 if none of the given patterns is found.
|
||||
|
||||
*?*: This is intended for nesting-situations, e.g. comment nesting in MATLAB/Haskell, one can keep calling this and increment or decrement a 'stack' depending on the text of the next match, and finally exit the comment state when the stack is 0/empty.
|
||||
|
||||
Similar to this is:
|
||||
|
||||
`add_pattern($name, $pattern)`
|
||||
`next_match($consume_and_log=true)`
|
||||
|
||||
next_match determines the next match for the patterns added with add_pattern. It returns: array(0=>$name, 1=>$index), and by default will consume and log the string so it is accessible by match*().
|
||||
|
||||
*Warning*:
|
||||
# this is mostly an internal function used to automate LuminousSimpleScanner
|
||||
# It will unset patterns if they are not found.
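The nesting use case described for get_next() can be sketched in plain PHP. This is an illustration under assumptions rather than Luminous code: the helper `skip_nested_comment()` is hypothetical, and it uses a single `preg_match()` with an alternation where a scanner would call get_next() with an array of patterns. It walks Haskell-style `{- ... -}` comments while tracking the nesting depth.

{{{lang=php
<?php
// Returns the offset just past the comment that opens at $start,
// or strlen($src) if the comment is never closed.
function skip_nested_comment($src, $start) {
  $depth = 0;
  $pos = $start;
  // Find whichever delimiter comes next, opening or closing.
  while (preg_match('/\{-|-\}/', $src, $m, PREG_OFFSET_CAPTURE, $pos)) {
    $depth += ($m[0][0] === '{-') ? 1 : -1;
    $pos = $m[0][1] + 2;             // step over the two-character delimiter
    if ($depth === 0) return $pos;   // the outermost comment has closed
  }
  return strlen($src);
}

$src = 'a {- outer {- inner -} still outer -} b';
$end = skip_nested_comment($src, 2);
echo substr($src, $end), "\n";   // prints " b"
}}}

The counter plays the role of the 'stack': it is incremented on every opening delimiter and decremented on every closing one, and the comment state ends when it returns to zero.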
|
||||
|
||||
=== Manually moving the pointer ===
|
||||
|
||||
`pos($pos=null)`
|
||||
`pos_shift($offset)`
|
||||
|
||||
pos is a getter and setter for the current string position (scan pointer). pos_shift moves the pointer by the given offset. Moving backwards is not currently recommended, because some internal caching may not account for it.
|
||||
|
||||
|
||||
=== misc ===
|
||||
|
||||
`rest()`
|
||||
|
||||
Returns the rest of the string, from the scan pointer onwards.
|
||||
|
||||
`bol()`
|
||||
`eol()`
|
||||
|
||||
Beginning/end of line. Returns true if the scan pointer is at the beginning or
|
||||
end of line, false otherwise.
|
||||
|
||||
`eos()`
|
||||
|
||||
Returns true if the scanner has reached the end of the string, else false.
|
||||
|
||||
`reset()`
|
||||
`terminate()`
|
||||
|
||||
Reset basically restarts the scanning process whereas terminate ends it, prematurely or otherwise. The scan pointer will be moved to the beginning or end of the string.
|
||||
|
||||
== Other important methods and properties ==
|
||||
|
||||
=== Consuming and tokenizing the string ===
|
||||
|
||||
`record($string, $token_name, $pre_escaped=false)`
|
||||
|
||||
Writes into the token array the given string segment with the given token name. The token name may be null. If the string is already an XML-tag, because you either wrote it yourself for some reason, or you got it from the tagged() method of a sub-scanner, set pre_escaped to true.
|
||||
|
||||
|
||||
|
||||
=== Filters ===
|
||||
|
||||
`add_filter([$name], $TOKEN_NAME, function);`
|
||||
`remove_filter($name);`
|
||||
|
||||
`add_stream_filter([$name], $function);`
|
||||
`remove_stream_filter($name);`
|
||||
|
||||
Filters are/will be explained elsewhere.
|
||||
|
||||
`$rule_tag_map`
|
||||
|
||||
A mapping of rule/token-names to tag-names. For example, if you have a CSS_VALUE rule name which you want mapped to 'VALUE' (so it can be highlighted by the VALUE CSS rule), you'd define `$rule_tag_map['CSS_VALUE'] = 'VALUE'`. You can also null certain tokens that you logged as part of the scanning process but which don't need highlighting. This is read by the 'rule-map' stream filter.
|
||||
|
||||
`add_identifier_mapping($target_token_name, $values)`
|
||||
|
||||
Tokens with name 'IDENT' are mapped by the 'map-ident' filter. You can use this to change 'function', 'if' and 'else' from an 'IDENT' into a 'KEYWORD', e.g. `add_identifier_mapping('KEYWORD', array('function', 'if', 'else'));`
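Conceptually, the identifier mapping is a lookup table applied to 'IDENT' tokens. The sketch below shows the idea in plain PHP; the token list layout (`array(name, text)` pairs) and the variable names are assumptions made for the illustration, not the filter's real internals.

{{{lang=php
<?php
// Hypothetical token list: array(token name, text) pairs.
$tokens = array(
  array('IDENT', 'if'),
  array('IDENT', 'count'),
  array('IDENT', 'function'),
);

// target token name => identifiers which should be renamed to it
$mappings = array('KEYWORD' => array('function', 'if', 'else'));

// Build a reverse lookup (identifier => token name) for fast access.
$lookup = array();
foreach ($mappings as $target => $values) {
  foreach ($values as $value) $lookup[$value] = $target;
}

// Rename any IDENT token whose text appears in the lookup.
foreach ($tokens as $i => $token) {
  list($name, $text) = $token;
  if ($name === 'IDENT' && isset($lookup[$text]))
    $tokens[$i][0] = $lookup[$text];
}

$names = array_column($tokens, 0);   // array('KEYWORD', 'IDENT', 'KEYWORD')
}}}

Identifiers which are not listed, such as `count` here, keep their 'IDENT' name.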
|
100
3rdparty/luminous/docs/site/User-API-Reference
vendored
Executable file
@ -0,0 +1,100 @@
|
||||
# parent: index
|
||||
=User's API reference=
|
||||
|
||||
|
||||
\contents 2
|
||||
|
||||
==Introduction==
|
||||
|
||||
This document gives a relatively high level overview of the user's API. For full API documentation, see the Doxygen HTML files in your distribution.
|
||||
|
||||
The entirety of the public API is contained within a class called `luminous`. This is used as a namespace: the methods within are static, which means you can call them directly without instantiating the class. For those unfamiliar with the syntax or terminology, this just means that you call the functions as normal but place `luminous::` in front of them, as shown on this page.
|
||||
|
||||
The functions in this namespace interact with a theoretically private singleton object called $luminous_. You should be aware of this if only to avoid overwriting it.
|
||||
|
||||
==Basic Functions==
|
||||
|
||||
The two main highlighting functions are:
|
||||
|
||||
`luminous::highlight($language, $source, $options=array())`
|
||||
`luminous::highlight_file($language, $path, $options=array())`
|
||||
|
||||
Note: in versions prior to 0.7, the third parameter was a boolean flag to switch on/off caching. This behaviour is preserved in 0.7 - you can still use a boolean flag instead of the options array. For options, keep reading this page.
|
||||
|
||||
$language can be a language code (open supported.php in a browser to see a list of what you have available), or your own instance of LuminousScanner.
|
||||
|
||||
Since 0.6.2 you can ask Luminous to guess the language of a piece of source code with the function:
|
||||
|
||||
`luminous::guess_language($src, $confidence=0.05, $default='plain')`
|
||||
|
||||
This will return a valid language code for the most-probable language. $confidence and $default are related: if no scanner is willing to say with above 0.05 (5%) certainty that it owns the source code, then $default is returned. It's probably best to leave this at 0.05.
|
||||
|
||||
*Warning:* For obvious reasons, language guessing is inherently unreliable. Luminous will try to latch on to unique parts of the language, but this is difficult in languages like C, C# and Java, which are syntactically very similar.
|
||||
|
||||
|
||||
`luminous::head_html()`
|
||||
|
||||
This will output several link and script tags. It tries to determine the correct path to the luminous/ directory relative to the document root, but may fail. In this case, you can override it to set it manually. The settings: 'theme', 'relative-root', 'include-javascript' and 'include-jquery' affect this.
|
||||
|
||||
|
||||
Since 0.6.6:
|
||||
`luminous::cache_errors()`
|
||||
|
||||
Returns a list of cache errors encountered for the most recent highlight, or `FALSE` if the cache was not enabled. See the [cache cache] page.
|
||||
|
||||
==Themes==
|
||||
|
||||
`luminous::themes()`
|
||||
`luminous::theme_exists($theme_name)`
|
||||
|
||||
themes() returns a list of themes present in the style/ directory. Use this if you're building a theme selector.
|
||||
|
||||
theme_exists() returns true if a theme exists in the style/ directory, else false.
|
||||
|
||||
==Settings==
|
||||
|
||||
`luminous::set($name, $value)`
|
||||
|
||||
Sets an internal setting to the given value. An exception is raised if the setting is unrecognised.
|
||||
|
||||
Since 0.6.2, you can set the first argument as an array of $name => $value, and omit the second argument.
|
||||
|
||||
`luminous::setting($name)`
|
||||
|
||||
Returns the value currently set for the given setting. An exception is raised if the setting is unrecognised.
|
||||
|
||||
===List of observed settings===
|
||||
|
||||
*Note:* What's listed here might not reflect your version. A definitive list of settings can be found in Doxygen (the LuminousOptions class) as of 0.6.2.
|
||||
|
||||
As with PHP, setting an integer setting to 0 or -1 will disable it. As of 0.6.2 some validation is applied to these options and exceptions will be thrown if you try to do something nonsensical.
|
||||
|
||||
====Cache====
|
||||
* cache (bool): Whether to use the built-in cache (default: `TRUE`)
|
||||
* cache-age (int): age in seconds at which to remove cached files. Age is determined by mtime, and cache hits trigger a 'touch', so this setting removes cached files which have not been accessed for the given time. 0 or -1 to disable. (default: 7776000, i.e. 90 days)
|
||||
* sql-function: See the [[cache]] page.
|
||||
|
||||
====Misc====
|
||||
|
||||
|
||||
* include-javascript (bool): controls whether luminous::head_html() outputs the javascript 'extras'.
|
||||
* include-jquery (bool): controls whether luminous::head_html() outputs jquery; this is ignored if include-javascript is false. You do not need this if your page already has jQuery!
|
||||
* relative-root (str): luminous::head_html() has to know the location of the luminous directory relative to the location of the document root. It tries to figure this out, but may fail if you are using symlinks. You may override it here.
|
||||
* theme: Sets the internal theme. The LaTeX and html-full formatters read this, and luminous::head_html observes this.
|
||||
* verbose (bool): Since 0.6.6. If `TRUE`, Luminous generates PHP warnings on problems (currently only cache problems which require attention from the caller). (default: `TRUE`)
|
||||
|
||||
====Formatter====
|
||||
|
||||
Formatting options relate to the display of highlighted output.
|
||||
|
||||
* auto-link (bool): if the formatter supports hyperlinking, URIs will be linked
|
||||
* html-strict (bool): Luminous uses the 'target' attribute of <a> tags. This is not valid for X/HTML4 strict, therefore it may be disabled. Note that this is purely academic: browsers don't care. Luminous produces valid HTML5 and HTML4 transitional output regardless.
|
||||
* line-numbers (bool): If the formatter supports line numbering, lines are numbered. (default: true)
|
||||
* start-line (int): If the formatter supports line numbering, lines start from this number. (default: 1)
|
||||
* max-height (int): if the formatter can control its height, it will constrain itself to this many pixels (you may specify this as a string with units) (default: 500)
|
||||
* format (string): Controls the output format:
|
||||
# 'html' (default): HTML. The HTML is contained in a <div> element. CSS must be included on the same page.
|
||||
# 'html-full': A full HTML page. The page is a valid and self-contained HTML document and includes all necessary CSS.
|
||||
# 'html-inline': This is a small variation on the HTML formatter which styles output for inline (in-text) display. The output is in an inline-block element, with line numbers and height constraints disabled. You probably want HTML.
|
||||
# 'latex': LaTeX.
|
||||
# 'none', null: the 'identity' formatter, i.e. no formatting is applied. The result is basically an XML fragment, the way Luminous embeds highlighting data in the string internally. This is implemented for debugging.
|
68
3rdparty/luminous/docs/site/Writing-a-formatter
vendored
Executable file
@ -0,0 +1,68 @@
|
||||
# parent: index
|
||||
=Writing a Formatter=
|
||||
|
||||
Luminous has two distinct stages in highlighting. The first consists of tokenizing the string (this is done by the scanners), and results in the string being represented in an intermediate format. The second stage occurs when a formatter is given the intermediate string and converts it to some output format.
|
||||
|
||||
The intermediate representation is a loose but well-formed XML structure, where the tags represent a way to embed the highlighting data.
|
||||
|
||||
Here's a brief example of a simple C program
|
||||
|
||||
{{{lang=c
|
||||
#include <stdio.h>
|
||||
int main() {
|
||||
/* NOTE: something */
|
||||
float f = 1.0f;
|
||||
return 100 * f;
|
||||
}
|
||||
}}}
|
||||
|
||||
and its resulting XML:
|
||||
|
||||
{{{lang=xml
|
||||
<PREPROCESSOR>#include &lt;<STRING>stdio.h</STRING>&gt;</PREPROCESSOR>
|
||||
<TYPE>int</TYPE> main() {
|
||||
<COMMENT>/* <COMMENT_NOTE>NOTE:</COMMENT_NOTE> something */</COMMENT>
|
||||
<TYPE>float</TYPE> f <OPERATOR>=</OPERATOR> <NUMERIC>1.0f</NUMERIC>;
|
||||
<KEYWORD>return</KEYWORD> <NUMERIC>100</NUMERIC> <OPERATOR>*</OPERATOR> f;
|
||||
}
|
||||
}}}
|
||||
|
||||
A few things to notice:
|
||||
# It's not _quite_ valid XML because it lacks a root tag and it doesn't have a `<?xml` declaration. But apart from that, it's XML. Tags can nest and it should be well formed.
|
||||
# The contents of the tags are HTML entity escaped: `&`, `<` and `>` become `&amp;`, `&lt;` and `&gt;` respectively.
|
||||
# Some things which are not deemed important for highlighting aren't inside tags.
|
||||
# We don't use (or need) attributes or self-closing tags.
|
||||
|
||||
*Note*: currently any multiline tokens are closed before the newline and re-opened after the newline. This is not yet configurable, it might be in future.
|
||||
|
||||
==The Formatter Class==
|
||||
A formatter should subclass `LuminousFormatter`. It needs to implement the method:
|
||||
|
||||
`format($str)`
|
||||
|
||||
This method receives the XML string and should return the formatted string.
|
||||
|
||||
It should respect the following class properties _if appropriate_.
|
||||
|
||||
`$wrap_length` - int - word wrap at n characters, 0 or -1 means no wrap
|
||||
`$line_numbers` - bool - whether or not to display line numbering
|
||||
`$link` - bool - convert URLs to hyperlinks
|
||||
`$height` - int or string - Constrain output to this height (applies mostly to HTML)
|
||||
|
||||
And if appropriate it should implement the following method:
|
||||
|
||||
`set_theme($css)`
|
||||
|
||||
This receives a CSS string representing the theme in the user's theme setting. The class `LuminousCSSParser` will parse this allowing you to translate colouring rules into whatever format is necessary (consult the LaTeX formatter to see this in action).
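As a rough illustration of what a `format()` implementation has to do, here is a standalone sketch which turns the intermediate XML into HTML `<span>` elements. It is deliberately not a `LuminousFormatter` subclass so that it runs on its own; the function name `format_as_spans()` and the choice of lower-cased token names as CSS classes are assumptions made for this example, and it ignores line numbering, wrapping and linking.

{{{lang=php
<?php
// Standalone sketch: a real formatter would put this logic in the format()
// method of a LuminousFormatter subclass.
function format_as_spans($str) {
  // Opening tags: <KEYWORD> becomes <span class="keyword">
  $str = preg_replace_callback('/<([A-Z_0-9]+)>/', function ($m) {
    return '<span class="' . strtolower($m[1]) . '">';
  }, $str);
  // Closing tags: </KEYWORD> becomes </span>
  return preg_replace('%</[A-Z_0-9]+>%', '</span>', $str);
}

$xml  = '<KEYWORD>return</KEYWORD> a &lt; <NUMERIC>100</NUMERIC>;';
$html = format_as_spans($xml);
echo $html, "\n";
// prints: <span class="keyword">return</span> a &lt; <span class="numeric">100</span>;
}}}

Because the tag contents are already entity escaped, any literal `<` in the input can only be the start of a tag, which is what makes a simple regular expression sufficient here.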
|
||||
|
||||
===Using The Formatter===
|
||||
|
||||
Insert an instance of the formatter as a setting, i.e.
|
||||
{{{lang=php
|
||||
<?php
|
||||
$formatter = new MyFormatter();
|
||||
Luminous::set('format', $formatter);
|
||||
}}}
|
||||
|
||||
That formatter (actually a clone of it) will be used to format subsequent calls to `highlight()`.
|
||||
|
87
3rdparty/luminous/docs/site/Writing-a-language-scanner
vendored
Executable file
@ -0,0 +1,87 @@
|
||||
# parent: index
|
||||
=Writing a language file (scanner)=
|
||||
|
||||
\contents 2
|
||||
|
||||
== Introduction==
|
||||
|
||||
Highlighting a language involves writing some logic to recognise syntax rules and identify different parts of a string of source code as matching different parts of the language's syntax. This is a process known generally as 'tokenization'. The machine we use to tokenize a string is called a 'scanner', which can be entirely automated or completely explicit depending on what's necessary for the language to be highlighted.
|
||||
|
||||
Generally, we need to consider a few classes of language:
|
||||
# Simple, flat languages like C# and Java, where we just want to provide a set of tokens and tell Luminous to figure it out.
|
||||
# Languages where context matters sometimes. For example, in JavaScript and related languages, a slash '/' sometimes means 'divide' and sometimes means 'regular expression delimiter'. This needs to be disambiguated somehow.
|
||||
# Languages heavily dependent on context, like CSS, LaTeX and JSON, where different symbols have different meanings depending on what they're nested inside.
|
||||
# Complex languages full of ambiguous constructs where it's best to just write an explicit scanner from scratch. An example is Ruby.
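To illustrate the second class, here is a sketch of one common way to disambiguate a slash in JavaScript-like languages: look at the previous significant token. This is a general heuristic and an assumption on our part, not a description of how the Luminous JavaScript scanner works; `slash_is_regex()` and its short keyword list are hypothetical.

{{{lang=php
<?php
// Heuristic: a slash starts a regex literal unless the previous significant
// token could be the end of an operand, in which case it means 'divide'.
function slash_is_regex($previous_token) {
  if ($previous_token === null) return true;   // start of input
  // these keywords look like identifiers but are followed by an expression
  if (in_array($previous_token, array('return', 'typeof', 'case'), true))
    return true;
  // identifiers, numbers and closing brackets end an operand
  $operand = '/^(?:[A-Za-z_$][\w$]*|\d[\w.]*|[)\]])$/';
  return preg_match($operand, $previous_token) !== 1;
}

var_dump(slash_is_regex('x'));        // bool(false): x / 2
var_dump(slash_is_regex('='));        // bool(true):  y = /ab+c/
var_dump(slash_is_regex('return'));   // bool(true):  return /ab+c/.test(s)
}}}

A check like this is the kind of logic that belongs in an override: the rest of the language can stay rule-driven while this one token gets explicit treatment.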
|
||||
|
||||
== Class Structure ==
|
||||
|
||||
Luminous implements a model for scanners via OO and class hierarchies. Each language to be highlighted will implement a class which extends one of Luminous's core scanning classes. The idea is to make it fairly easy to add support for new languages, while allowing each scanner to be as powerful as it needs to be.
|
||||
|
||||
The class hierarchy looks something like this (please excuse ASCII art):
|
||||
|
||||
{{{lang=plain
|
||||
. Scanner
|
||||
|
|
||||
LuminousScanner
|
||||
|
|
||||
LuminousSimpleScanner
|
||||
|
|
||||
LuminousStatefulScanner
|
||||
}}}
|
||||
|
||||
You will extend LuminousScanner or any of its subclasses.
|
||||
|
||||
In relation to the four classes of language mentioned above, the base scanners we would extend are as follows:
|
||||
|
||||
# LuminousSimpleScanner - a generic string traversal algorithm with no concept of state.
|
||||
# LuminousSimpleScanner again - with some *overrides*, which temporarily grant explicit, fine-grained programmatic control to *you*, the implementer for some tokens
|
||||
# LuminousStatefulScanner - A transition table driven implementation of LuminousSimpleScanner (with overrides available)
|
||||
# LuminousScanner - Gives you some helper methods but you write the actual highlighting (tokenization) logic yourself
|
||||
|
||||
The base classes define the methods init() and main(). init is where any kind of setup information should go, and main is where the lexing happens. If using the simple or stateful scanners, you won't have to override main.
|
||||
|
||||
If you need to use an override or an explicit scanner, you should at least look at the [[Scanning-API]] page to see what methods are available.
|
||||
|
||||
|
||||
== Examples ==
|
||||
For neatness, each scanner has its own page:
|
||||
# [simple-scanner Simple Scanner]
|
||||
# [stateful-scanner Stateful Scanner]
|
||||
# [complex-scanner Complex Scanner]
|
||||
|
||||
== Filters ==
|
||||
|
||||
Filters are an additional technique you can use for highlighting minor details. See the [[filters]] page.
|
||||
|
||||
|
||||
== Using your scanner ==
|
||||
|
||||
Once you have written your scanner, you can use it by either simply passing it as the 'language' parameter of the highlight function, e.g.
|
||||
|
||||
{{{lang=php
|
||||
<?php
|
||||
$scanner = new MyScanner();
|
||||
$out = luminous::highlight($scanner, 'some code');
|
||||
}}}
|
||||
|
||||
|
||||
or, if you have several you can use Luminous's internal scanner table. Let's say you've written a new Python scanner:
|
||||
{{{lang=php
|
||||
<?php
|
||||
|
||||
luminous::register_scanner(
|
||||
array('py', 'python'), // codes - if you only have one, this can be a string
|
||||
'PythonScanner', // the class name of your scanner (as string, yes)
|
||||
'Python', // human readable language name
|
||||
'/path/to/your/scanner/class/file.php'
|
||||
);
|
||||
|
||||
// this will use your new scanner
|
||||
$out = luminous::highlight('py', 'def something(): return 1');
|
||||
}}}
|
||||
|
||||
Using register_scanner() means you don't have to include or instantiate scanner classes and files yourself; Luminous performs lazy file inclusion when it needs to.
|
||||
|
||||
There is an optional final argument which is a list of dependencies or null. If you write several scanners which rely on each other, list their codes in the dependencies array. If you end up with circular include requirements*, write a dummy include file which includes everything needed, insert that first with classname=null, and list that insertion's code as a dependency in your real insertion.
|
||||
|
||||
* this can happen: you may have a 'compile time' dependency like a superclass's definition, and a 'runtime' dependency like a sub-scanner which needs to be instantiated (at runtime). These are conceptually different but handled in the same way, hence minor problems can occur.
|
163
3rdparty/luminous/docs/site/cache
vendored
Executable file
@ -0,0 +1,163 @@
|
||||
= The Cache =
|
||||
|
||||
\contents 2
|
||||
|
||||
== Introduction ==
|
||||
|
||||
Highlighting source code is fairly expensive from a computational point of view. Luminous is probably one of the slower highlighters around, for two main reasons:
|
||||
|
||||
# It's doing a lot of work to try to get highlighting as correct as possible
|
||||
# Apart from the regular expression library, most of the logic is self-implemented in PHP, which seems to be a fairly slow language for this kind of thing.
|
||||
|
||||
For this reason Luminous includes a caching system so that highlighting need only be calculated once, and it should be largely invisible to you after you've set it up.
|
||||
|
||||
Since Luminous 0.6.3 this can either be stored on the filesystem or in a MySQL table (support for other RDBMSs will hopefully come later).
|
||||
|
||||
== Enabling/Disabling Caching==
|
||||
|
||||
Caching is enabled and disabled at a per-highlight level, and is done in the call to `highlight` by the third argument:
|
||||
|
||||
{{{lang=php_snippet
|
||||
$use_cache = FALSE;
|
||||
luminous::highlight('c', 'printf("hi\n");', $use_cache);
|
||||
}}}
|
||||
|
||||
The default value is TRUE.
|
||||
|
||||
With either SQL or the filesystem, if the cache is unusable (i.e. there's a permissions problem or the database queries fail), Luminous will throw a PHP warning (not an exception), and the highlight will be calculated as normal.
|
||||
|
||||
== Filesystem vs SQL ==
|
||||
|
||||
For most use cases the file-system is the most obvious choice: it's simpler and it's faster.
|
||||
|
||||
The main reasons to use the SQL cache are:
|
||||
# It's neater to have everything hidden in an SQL table instead of lying around the filesystem.
|
||||
# It may be more secure on your setup, as the cache directory may require 777 permissions.
|
||||
# You are handling vast numbers of highlights and have file-system constraints on the number of small files you can have. e.g. ext3 formatted with -T largefile4 expects an average file size on the partition of 4MB, if there are a lot of smaller files, you risk running out of inodes.
|
||||
# Simply that you're storing a lot of highlights: there's a 24-hourly purge which removes filesystem files based on their most recent cache hit (internally we use mtime), this involves iterating over every file in the directory and looking at its mtime. If you're storing thousands+ of highlights this could cause a brief IO bottleneck. The SQL database instead purges on every write, but we can probably expect the database to handle this a lot better when there are a lot of cached items.
|
||||
|
||||
In summary: if you don't see the benefit of the SQL cache then don't use it, but if you do, don't be put off by the fact it's a bit slower.
|
||||
|
||||
== Enabling the File System cache ==
|
||||
|
||||
The filesystem is used by default, but Luminous might not be able to create it.
|
||||
|
||||
Luminous uses the directory luminous/cache, which you might have to create and assign writable permissions to (probably chmod 777, but your server might accept other values).
|
||||
|
||||
|
||||
|
||||
|
||||
== Enabling the SQL cache ==
|
||||
|
||||
Using the SQL cache is a little more complex and some of the responsibility falls on the programmer. To enable the SQL cache a configuration setting 'sql_function' must be set, which must be a function that performs SQL queries. The reason that this is left to the programmer is that Luminous doesn't want to tie itself to one family of SQL functions (e.g. MySQL vs PostgreSQL), and doesn't want to have to worry about your passwords, database names and so on.
|
||||
|
||||
The function should take a string (the SQL query) and return:
|
||||
# *false* if the query had some kind of error
|
||||
# *true* if the query was successful but didn't return anything
|
||||
# An array of results if the query was a SELECT (or similar). The results should be a numerically indexed set of rows, and each row should be an associative array keyed by field name
|
||||
|
||||
An example implementation is:
|
||||
|
||||
{{{lang=php_snippet
|
||||
function query($sql) {
|
||||
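// mysqli is used because the mysql_* functions were removed in PHP 7
mysqli_report(MYSQLI_REPORT_OFF); // return false on errors instead of throwing
$connection = mysqli_connect('server', 'user', 'password', 'your_database');
if ($connection === false) return false;
$r = mysqli_query($connection, $sql);
if (is_bool($r)) return $r;
$results = array();
while ($row = mysqli_fetch_assoc($r)) {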
|
||||
$results[] = $row;
|
||||
}
|
||||
return $results;
|
||||
}
|
||||
|
||||
luminous::set('sql_function', 'query');
|
||||
}}}
|
||||
|
||||
If you are using a framework, you should be able to plug this into your framework's database library without too much effort. Here's an example for CodeIgniter, implemented as a [http://codeigniter.com/user_guide/general/helpers.html helper] for simplicity but you could also plug it into a model or controller:
|
||||
|
||||
{{{lang=php_snippet
|
||||
function luminous_sql($sql) {
|
||||
$CI =& get_instance();
|
||||
$CI->load->database();
|
||||
$q = $CI->db->query($sql);
|
||||
if (is_bool($q)) return $q;
|
||||
$ret = array();
|
||||
if ($q->num_rows()) {
|
||||
foreach($q->result_array() as $row)
|
||||
$ret[] = $row;
|
||||
}
|
||||
return $ret;
|
||||
}
|
||||
|
||||
luminous::set('sql_function', 'luminous_sql');
|
||||
}}}
|
||||
|
||||
Note in CI that the database library will throw a fatal error if debug mode is enabled and the query fails (application/config/database.php, change the line `$db['default']['db_debug']`)
|
||||
|
||||
|
||||
=== Security ===
|
||||
|
||||
You might be concerned that Luminous doesn't want an SQL-escaping function specific to your RDBMS. This isn't actually a problem: Luminous has strict control over the data being used in queries and doesn't need to escape it: string data is encoded as either b16 or b64 (which have only 'clean' characters). This is double-checked when building each query and if somehow the data has been polluted the query is silently aborted.
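The reasoning can be checked with a few lines of plain PHP. This sketch only illustrates the idea of encoding followed by a character whitelist check; the variable names and the hostile input string are invented for the example, and it is not the code Luminous uses internally.

{{{lang=php
<?php
$untrusted = "'; DROP TABLE cache; --";

$b16 = bin2hex($untrusted);         // hexadecimal: only 0-9 and a-f
$b64 = base64_encode($untrusted);   // only A-Z, a-z, 0-9, '+', '/' and '='

// Double-check that nothing outside the expected alphabet is present
// before the value goes anywhere near a query string.
$b16_clean = preg_match('/^[0-9a-f]*$/', $b16) === 1;
$b64_clean = preg_match('%^[A-Za-z0-9+/=]*$%', $b64) === 1;

// The original text is recoverable when the row is read back.
$round_trip = hex2bin($b16);

var_dump($b16_clean, $b64_clean, $round_trip === $untrusted);
// prints bool(true) three times
}}}

Neither alphabet contains quotes, backslashes or semicolons, so the encoded value cannot terminate a string literal or start a new statement regardless of the RDBMS.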
|
||||
|
||||
|
||||
== Cache FAQ ==
|
||||
|
||||
=== How is the cache ID calculated? ===
|
||||
|
||||
The ID used for a cache element is a checksum calculated from a number of things, including the input source code, the language, and other less obvious properties such as the version number of Luminous, and the various options which you set at runtime.
|
||||
|
||||
This has the result that if you use a non-versioned copy of Luminous (i.e if you just keep pulling the master branch), your highlights may not be recalculated, and you may not see the benefit of bug fixes or improvements unless you clear your cache when you update. It is easier to use versioned copies of Luminous.
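The following sketch shows why the ID behaves this way. The helper `cache_id()`, the use of `md5()` and `serialize()`, and the version strings are all assumptions made for illustration; the real checksum may be built differently, but any checksum over these inputs has the same properties.

{{{lang=php
<?php
// Hypothetical helper: the hash function and exact inputs are assumptions.
function cache_id($source, $language, $version, array $options) {
  ksort($options);   // so that option order does not change the ID
  return md5(serialize(array($source, $language, $version, $options)));
}

$a = cache_id('int x;', 'c', '0.7.0', array('line-numbers' => true));
$b = cache_id('int x;', 'c', '0.7.0', array('line-numbers' => false));
$c = cache_id('int x;', 'c', '0.7.1', array('line-numbers' => true));

var_dump($a === $b);   // bool(false): a changed option gives a new ID
var_dump($a === $c);   // bool(false): a new version gives a new ID
}}}

If the version string never changes, as with an unversioned checkout, the ID for a given input never changes either, so stale cache entries keep being served until the cache is cleared.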
|
||||
|
||||
=== How often is the cache purged? ===
|
||||
|
||||
Elements in the cache are purged after a certain time of inactivity. The exact time is the 'cache-age' setting, `luminous::set('cache-age', $age)`, which is given in seconds, and defaults to 90 days.
|
||||
|
||||
The timeout is calculated by the last time the cache element was accessed. If using the file-system, the last access is stored in the file's mtime.
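An mtime-based purge of this kind can be sketched as follows. `purge_cache_dir()` is a hypothetical helper written for this page, not Luminous's own purge routine, and the demonstration runs in a throwaway temporary directory rather than in luminous/cache.

{{{lang=php
<?php
// Hypothetical helper: removes files in $dir whose mtime is more than
// $max_age seconds in the past. Returns the number of files removed.
function purge_cache_dir($dir, $max_age, $now = null) {
  if ($max_age <= 0) return 0;        // 0 or -1 disables purging
  if ($now === null) $now = time();
  $removed = 0;
  foreach (glob($dir . '/*') as $path) {
    if (is_file($path) && $now - filemtime($path) > $max_age) {
      unlink($path);
      $removed++;
    }
  }
  return $removed;
}

// Demonstration in a throwaway directory.
$dir = sys_get_temp_dir() . '/cache_demo_' . uniqid();
mkdir($dir);
touch($dir . '/stale', time() - 100 * 86400);   // last hit 100 days ago
touch($dir . '/fresh');                         // last hit just now

$removed    = purge_cache_dir($dir, 90 * 86400);   // 90 days = 7776000 seconds
$fresh_kept = file_exists($dir . '/fresh');

// Clean up the demonstration directory.
unlink($dir . '/fresh');
rmdir($dir);
}}}

Because every cache hit touches the file, only entries which have gone unused for the whole period are removed.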
|
||||
|
||||
=== How can I clear the cache manually? ===
|
||||
|
||||
For an SQL cache, just empty the table.
|
||||
|
||||
For the filesystem, remove everything in the luminous/cache/ directory.
|
||||
|
||||
|
||||
|
||||
== Troubleshooting ==
|
||||
|
||||
|
||||
If the cache fails for some reason, a PHP warning is generated like so:
|
||||
|
||||
{{{lang=plain
|
||||
Notice: Luminous cache errors were encountered.
|
||||
See luminous::cache_errors() for details. in /home/mark/projects/luminous/src/luminous.php on line 631
|
||||
}}}
|
||||
|
||||
and will be printed to your page (you can disable this warning by setting luminous::set('verbose', false), but you should only do this if you have some other error logging set up - keep reading).
|
||||
|
||||
*Since 0.6.6* `luminous::cache_errors()` will contain more information:
|
||||
|
||||
{{{lang=php_snippet
|
||||
$highlighted = luminous::highlight($lang, $code, true);
|
||||
if ($e = luminous::cache_errors()) {
|
||||
echo '<pre>';
|
||||
echo implode("<br/>", $e);
|
||||
echo '</pre>';
|
||||
}
|
||||
}}}
|
||||
|
||||
*NOTE*: `luminous::cache_errors()` returns either an array or, if the cache is disabled, `FALSE`. It will only keep information for the most recent highlight.
|
||||
|
||||
Will print something similar to this:
|
||||
|
||||
{{{lang=plain
|
||||
Error writing to "/home/mark/projects/luminous/cache/7a7b9073efe10b64c322de36db0f0a09"
|
||||
File exists: false
|
||||
Readable?: false
|
||||
Writable?: false
|
||||
Your cache dir ("/home/mark/projects/luminous/cache/") is not writable!
|
||||
}}}
|
||||
|
||||
What does this mean? It shows that Luminous cannot write into a file in the cache directory. The file doesn't exist (and it's not readable or writable), but in this case the last line is more insightful, which shows the cache directory is not writable. The solution is to give the cache directory write permissions.
|
||||
|
||||
If you don't want all this information being spammed out to your visitors, call `luminous::set('verbose', false)` to disable the warning, and set up your logging or email handler to create an entry when `luminous::cache_errors()` returns problems.
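A minimal sketch of such a handler might look like this. The helper name `sketch_cache_log_line` is made up; the only Luminous calls referred to are the ones already shown on this page, and they are left in comments so the snippet runs on its own.

```php
<?php
// $errors is whatever luminous::cache_errors() returned:
// an array of messages, or FALSE when the cache is disabled.
function sketch_cache_log_line($errors) {
  if ($errors === false || count($errors) === 0) {
    return null; // nothing to report
  }
  return 'Luminous cache: ' . implode(' | ', $errors);
}

// Intended wiring (needs Luminous, so it is only a comment here):
//   luminous::set('verbose', false);
//   $highlighted = luminous::highlight($lang, $code, true);
//   $line = sketch_cache_log_line(luminous::cache_errors());
//   if ($line !== null) error_log($line);

$none = sketch_cache_log_line(false);
$some = sketch_cache_log_line(array('Error writing to "x"', 'Writable?: false'));
```

`error_log()` is PHP's built-in logger; swap in your own logging or email call as needed.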
|
41
3rdparty/luminous/docs/site/complex-scanner
vendored
Executable file
@ -0,0 +1,41 @@
|
||||
# parent: Writing-a-language-scanner
|
||||
=Complex hand-written scanners=
|
||||
|
||||
Hand-written scanners should subclass LuminousScanner.
|
||||
|
||||
Scanning occurs in the main() method. There are two things you have to worry about:
|
||||
# advancing the scan pointer, which is done by calls to scan(), get(), etc
|
||||
# recording the string segments you're matching as their relevant tokens. This is done by calling `record($string, $token_name, $escaped?=false)`
|
||||
|
||||
By the time you exit main(), the string should have been fully recorded. main() doesn't return anything.
|
||||
|
||||
Imagine that in your language you need to keep track of a context (state) by tracking curly braces. The basic workflow looks something like this:
|
||||
|
||||
{{{lang=php_snippet
|
||||
|
||||
class MyScanner extends LuminousScanner {
|
||||
|
||||
function init() {
|
||||
// set up any last-minute stuff in here
|
||||
}
|
||||
|
||||
function main() {
|
||||
while (!$this->eos()) {
|
||||
if ($this->scan('/some_pattern/') !== null) {
|
||||
$this->record($this->match(), 'TOKEN_NAME');
|
||||
}
|
||||
elseif($this->scan('/some_other_pattern/') !== null ) {
|
||||
$this->record($this->match(), 'SOME_OTHER_TOKEN');
|
||||
}
|
||||
...
|
||||
else { // ensure we advance the scan pointer
|
||||
$this->record($this->get(), null);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}}}
|
||||
|
||||
Obviously, to make it worthwhile to use an explicitly written scanner, you will be evaluating quite a lot of logic in between the calls to scan().
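To show what that extra logic can look like, here is a toy version of the curly-brace idea in plain PHP. It deliberately does not use LuminousScanner (so it runs on its own); the function name and the 'MEMBER' token name are invented for the example.

```php
<?php
// Toy illustration only: plain PHP, not the LuminousScanner API.
function sketch_brace_scan($src) {
  $tokens = array(); // each entry: array(token name or null, text)
  $depth = 0;        // the "state": how many braces are currently open
  $pos = 0;          // the scan pointer
  $len = strlen($src);
  while ($pos < $len) {
    // \G anchors each pattern at the current scan pointer
    if (preg_match('/\G\{/', $src, $m, 0, $pos)) {
      $depth++;
      $name = 'OPENER';
    } elseif (preg_match('/\G\}/', $src, $m, 0, $pos)) {
      $depth = max(0, $depth - 1);
      $name = 'CLOSER';
    } elseif (preg_match('/\G[a-z_]\w*/i', $src, $m, 0, $pos)) {
      // the same text is recorded differently depending on the state
      $name = $depth > 0 ? 'MEMBER' : 'IDENT';
    } else {
      // nothing matched: still consume one character so the loop ends
      preg_match('/\G./s', $src, $m, 0, $pos);
      $name = null;
    }
    $tokens[] = array($name, $m[0]);
    $pos += strlen($m[0]);
  }
  return $tokens;
}

$tokens = sketch_brace_scan('a{b}c');
```

Here 'a' and 'c' come out as 'IDENT' while 'b' comes out as 'MEMBER', purely because of the brace depth at the time they were matched.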
|
||||
|
||||
*Examples*: languages/json.php is a fairly simple scanner which implements its own loop explicitly and records stack based state information.
|
153
3rdparty/luminous/docs/site/filters
vendored
Executable file
@ -0,0 +1,153 @@
|
||||
= Filters =
|
||||
|
||||
\contents 2
|
||||
|
||||
== Introduction ==
|
||||
|
||||
A filter is a simple piece of code which applies minor detail highlighting that's difficult or long winded to do during scanning, and should be usable by all scanners.
|
||||
|
||||
== Rationale ==
|
||||
|
||||
The highlighting engine Luminous provides is relatively difficult to generalise because it expects each scanner to be different. This leads to the situation where it can be cumbersome to apply similar minor highlighting details consistently across languages. An example of this is documentation comment annotations: in many languages and doc-comment systems they look something like this:
|
||||
|
||||
{{{lang=java
|
||||
/**
|
||||
* @brief some method, which does something
|
||||
* @param arg1 An argument
|
||||
* @returns something
|
||||
* @throws Exception, if you call it.
|
||||
*
|
||||
* Here's a method which does something.
|
||||
*/
|
||||
public int f(int arg1) {
|
||||
throw new Exception();
|
||||
return 1;
|
||||
}
|
||||
}}}
|
||||
|
||||
As you can hopefully see, the doc-comment tags are highlighted. But as you can imagine, since each scanner is separate, it's not practical to implement this level of detail by hand across all languages.
|
||||
|
||||
The answer to this is called a filtering system (a name I think I poached from the Python syntax highlighter Pygments, but I'm not sure if our filters are exactly the same as theirs). A filter defines some common code for highlighting small details. As well as doc-comments, they are used to highlight escape sequences in strings and special characters in regular expressions. A less obvious use for filters is to map identifier names to different token types: for example, most scanners just define an identifier as `[a-zA-Z_]\w*`, and a filter is then responsible for mapping each generic identifier token to a more specific token (like 'KEYWORD', or 'TYPE', or 'FUNCTION').
|
||||
|
||||
Other possible uses for filters might be to enforce consistent casing in case insensitive languages, or to add hyperlinks to function names.
|
||||
|
||||
|
||||
== Definition ==
|
||||
|
||||
We have two distinct forms of filter: individual filters (usually just referred to as 'filters') and stream filters.
|
||||
|
||||
* An individual filter is a function which takes and returns a token object. This filter will only be called on the tokens which it is registered for.
|
||||
* A stream filter is a function which takes and returns an ordered array of token objects.
|
||||
|
||||
A token object is actually an array (tuple), with indices as follows:
|
||||
* 0: TOKEN_NAME (string or null),
|
||||
* 1: TOKEN_TEXT (string),
|
||||
* 2: ESCAPED? (bool)
|
||||
|
||||
|
||||
Escaped refers to XML-escaping: because the end result of the token stream is a piece of XML, we need to keep track of whether or not the actual text is escaped.
|
||||
|
||||
The way that a filter actually works to manipulate text and add in extra highlighting is to embed XML directly into the string.
|
||||
|
||||
|
||||
== Examples ==
|
||||
|
||||
=== Changing the type of a token based on its content ===
|
||||
|
||||
A filter to map UPPER CASE IDENTIFIERS to 'constant' tokens:
|
||||
|
||||
{{{lang=php_snippet
|
||||
function upper_to_constant($token) {
|
||||
// check for this because it may have been mapped to a function or something already
|
||||
if ($token[0] === 'IDENT' && preg_match('/^[A-Z_][A-Z_0-9]{3,}$/', $token[1]))
|
||||
$token[0] = 'CONSTANT';
|
||||
return $token;
|
||||
}
|
||||
}}}
|
||||
|
||||
=== Changing the content of a token ===
|
||||
|
||||
A simple filter to highlight escape sequences in strings (i.e. a backslash followed by any character):
|
||||
|
||||
{{{lang=php_snippet
|
||||
function string_filter($token) {
|
||||
$token = LuminousUtils::escape_token($token);
|
||||
$token[1] = preg_replace('/ \\\\. /x',
|
||||
'<ESC>$0</ESC>', $token[1]);
|
||||
return $token;
|
||||
}
|
||||
}}}
|
||||
|
||||
*Note*: since we change the content of the string, we make sure the token is escaped first. LuminousUtils::escape_token() does this for us.
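To see why the order matters, here is a self-contained sketch. `sketch_escape_token` is a stand-in written for this example (it assumes escaping means `htmlspecialchars` on `&`, `<` and `>`); in a real filter you would call LuminousUtils::escape_token() instead.

```php
<?php
// Stand-in for the escaping step, assuming plain XML escaping of & < >.
function sketch_escape_token($token) {
  list($name, $text, $escaped) = $token;
  if (!$escaped) {
    $text = htmlspecialchars($text, ENT_NOQUOTES);
  }
  return array($name, $text, true); // now marked as escaped
}

function sketch_string_filter($token) {
  // escape first, so the '<' in the source text cannot be confused
  // with the '<ESC>' tags inserted afterwards
  $token = sketch_escape_token($token);
  $token[1] = preg_replace('/\\\\./', '<ESC>$0</ESC>', $token[1]);
  return $token;
}

// The token text is: "a<b\n" (a literal backslash followed by n)
$out = sketch_string_filter(array('STRING', '"a<b\\n"', false));
```

The literal `<` ends up as `&lt;`, while the inserted tags stay as real XML.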
|
||||
|
||||
=== Stream filters ===
|
||||
|
||||
For the purpose of creating a simple example, let's say you wanted to use a stream filter to do your string filtering. Assume the string_filter function (above) is defined:
|
||||
|
||||
{{{lang=php_snippet
|
||||
function string_stream_filter($tokens) {
|
||||
foreach($tokens as &$t) {
|
||||
$t = string_filter($t);
|
||||
}
|
||||
return $tokens;
|
||||
}
|
||||
}}}
|
||||
|
||||
== Adding your filter to your scanner ==
|
||||
|
||||
To use the above filters, in your scanner's constructor or init method, insert the following code:
|
||||
|
||||
{{{lang=php_snippet
|
||||
$this->add_filter('constant', // name of the filter
|
||||
'IDENT', // token the filter applies to
|
||||
'upper_to_constant' // reference to the filter
|
||||
);
|
||||
$this->add_stream_filter(
|
||||
'strings', // name of the filter
|
||||
'string_stream_filter' // reference to the filter
|
||||
);
|
||||
}}}
|
||||
|
||||
|
||||
|
||||
== Important stuff you should know ==
|
||||
|
||||
The filters are invoked in the last stage of highlighting by a scanner, i.e. directly before the final XML string is produced.
|
||||
|
||||
Stream filters are handled before individual filters.
|
||||
|
||||
Individual filters stack: you can have many bound to a single token type.
|
||||
|
||||
If you change the type of a token in an individual filter, any remaining filters for the original type will still be applied. So if you have a 'KEYWORD' token and a filter to change it to a 'COMMENT' token, it won't automatically inherit the COMMENT token's filters. But you can call them manually from your filter.
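The behaviour described above can be sketched as follows. This is only an illustration of the rule, with made-up filter names; Luminous's real dispatch code may be organised differently.

```php
<?php
// Illustration only: all function names here are invented.
function sketch_kw_to_comment($token) {
  $token[0] = 'COMMENT'; // change the token's type
  return $token;
}
function sketch_upper($token) {
  $token[1] = strtoupper($token[1]);
  return $token;
}
function sketch_comment_filter($token) {
  $token[1] = '[' . $token[1] . ']';
  return $token;
}

function sketch_apply_filters($token, $filters) {
  $original_type = $token[0]; // decided once, before any filter runs
  if (isset($filters[$original_type])) {
    foreach ($filters[$original_type] as $f) {
      $token = call_user_func($f, $token);
    }
  }
  return $token;
}

$filters = array(
  'KEYWORD' => array('sketch_kw_to_comment', 'sketch_upper'),
  'COMMENT' => array('sketch_comment_filter'),
);
$out = sketch_apply_filters(array('KEYWORD', 'if', false), $filters);
```

The token leaves as a 'COMMENT' with text 'IF': the second KEYWORD filter still ran, and the COMMENT filter (which would have added brackets) did not. To get the brackets, `sketch_kw_to_comment` would have to call `sketch_comment_filter` itself.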
|
||||
|
||||
If you insert XML into a token, make sure the token is escaped first (use LuminousUtils::escape_token to escape it)
|
||||
|
||||
Be careful with escaped tokens: try to avoid letting your regular expressions match XML tags.
|
||||
|
||||
Be very careful of multi-line XML strings. If you need to use this, you should split the XML tag to close at the end of each line and re-open after the line break, because otherwise it is difficult to apply line numbering in the HTML formatter. See LuminousUtils::tag_block() if there's any danger of this.
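As a sketch of what "close at the end of each line and re-open after the line break" means in practice (the helper below is hypothetical and does not reproduce the signature of LuminousUtils::tag_block()):

```php
<?php
// Hypothetical helper showing the idea of per-line tagging.
function sketch_tag_each_line($tag, $text) {
  $lines = explode("\n", $text);
  foreach ($lines as $i => $line) {
    if ($line !== '') { // don't wrap empty lines in empty tags
      $lines[$i] = "<$tag>$line</$tag>";
    }
  }
  return implode("\n", $lines);
}

$out = sketch_tag_each_line('ESC', "first\nsecond");
```

No tag spans a line break in the result, so a formatter can split the output on newlines without ending up with unbalanced tags.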
|
||||
|
||||
=== Predefined Filters ===
|
||||
|
||||
Luminous defines a number of filters and many of these are already bound to LuminousScanner (which you will subclass). Some of these might not apply to your language, in which case you can disable them with LuminousScanner::remove_filter($name) or LuminousScanner::remove_stream_filter($name).
|
||||
|
||||
|
||||
You should consult the source code to the constructor of LuminousScanner, but a possibly incomplete list is:
|
||||
|
||||
<table style='width:100%;text-align:center;'>
|
||||
<tr class='header'>
|
||||
<td>Token Name</td>
|
||||
<td>Rule Name</td>
|
||||
<td style='max-width:100px'>Description</td>
|
||||
</tr>
|
||||
<tr><td> N/A (stream) </td><td> *rule-map* </td><td> Renames token based on the LuminousScanner::$rule_tag_map map </td></tr>
|
||||
<tr><td> N/A (stream) </td><td> *oo-syntax* </td><td> Adds OO (object.property) highlighting using '.', '::' or '->'</td></tr>
|
||||
<tr><td> IDENT </td><td> *map-ident* </td><td> Renames identifiers based on the LuminousScanner::$ident_map map </td></tr>
|
||||
<tr><td> COMMENT </td><td> *comment-note* </td><td> Highlights 'NOTE', 'TODO', 'FIXME', etc in comments </td></tr>
|
||||
<tr><td> COMMENT </td><td> *comment-to-doc* </td><td> Tries to convert COMMENT to DOCCOMMENT and apply Javadoc-like tag highlighting </td></tr>
|
||||
<tr><td> STRING </td><td> *string-escape* </td><td> Highlights generic escape sequences in strings </td></tr>
|
||||
<tr><td> CHARACTER </td><td> *char-escape* </td><td> Highlights generic escape sequences in 'char' types </td></tr>
|
||||
<tr><td> REGEX </td><td> *pcre* </td><td> Highlight special characters in regular expression literals </td></tr>
|
||||
<tr><td> IDENT </td><td> *user-defs* </td><td> Tries to apply highlighting to identifier strings which have been marked as special during scanning (user defined classes, functions) </td></tr>
|
||||
<tr><td> IDENT </td><td> *constant* </td><td> Tries to convert upper case identifiers to 'CONSTANT' types </td></tr>
|
||||
<tr><td> IDENT </td><td> *clean-ident* </td><td> Remaps any remaining 'IDENT' type to the null type </td></tr>
|
||||
</table>
|
121
3rdparty/luminous/docs/site/hacking
vendored
Executable file
@ -0,0 +1,121 @@
|
||||
= Hacking =
|
||||
|
||||
\contents 2
|
||||
|
||||
== Intro ==
|
||||
|
||||
If you're interested in modifying Luminous to work a bit differently or do something new (or if you want to fix a bug), you're in the right place.
|
||||
|
||||
Firstly you'll need a local copy of Luminous to work on. You'll want to get the most recent development version from the [https://github.com/markwatkinson/luminous Git repository]:
|
||||
|
||||
`git clone git://github.com/markwatkinson/luminous.git`
|
||||
|
||||
*note*: if you're planning to contribute your changes to Luminous, you'll probably want to fork the project on GitHub. You can do this by [https://github.com/markwatkinson/luminous visiting the repository page] and pressing 'fork' (you will need to be logged in first).
|
||||
|
||||
The rest of this page should provide you with guidelines on how various features work, and an overview of the general process.
|
||||
|
||||
|
||||
== Specific details on common additions ==
|
||||
* Adding highlighting support for a new language: [[Writing-a-language-scanner]]
|
||||
* Add a new output format: [[Writing-a-formatter]]
|
||||
|
||||
|
||||
== How stuff works ==
|
||||
|
||||
Highlighting a string of source code is a fairly long winded process which goes something like this:
|
||||
|
||||
# User API receives some source code and a language name (or possibly a scanner instance)
|
||||
# User API looks up a relevant scanner from the language name (if it wasn't provided one)
|
||||
# User API looks at the settings, scanner, and source code to generate a unique cache ID, and asks the cache module to have a look at it
|
||||
# If it is cached, the cache returns it and we return the fully formatted, highlighted source code (break)
|
||||
# If it's not cached, we pass the source code into the scanner and tell it to work its magic
|
||||
# The scanner returns an XML string which we then pass into the relevant formatter
|
||||
# The formatter returns a string of fully formatted, highlighted code, which we return.
|
||||
|
||||
From this we can see the main separate elements of Luminous are:
|
||||
# The user API
|
||||
# The cache
|
||||
# The scanners (there is one of these for each supported language), and
|
||||
# The formatters
|
||||
|
||||
The language scanners are stored under languages/, while the rest of the source is under src/. The scanning infrastructure is under src/core/ and formatters are under src/formatters/.
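The highlighting steps listed earlier can be sketched roughly as below. Every name here is a placeholder invented for the illustration (the scanner and formatter are trivial stand-ins, and the scanner lookup in steps 1 and 2 is left out); none of it is Luminous's real internal API.

```php
<?php
// Placeholder scanner: wraps everything in one made-up tag.
function sketch_scanner($code) {
  return '<CODE>' . htmlspecialchars($code, ENT_NOQUOTES) . '</CODE>';
}
// Placeholder formatter: turns the XML string into final output.
function sketch_formatter($xml) {
  return '<pre>' . $xml . '</pre>';
}

function sketch_highlight($lang, $code, $settings, &$cache, &$scans) {
  // step 3: derive a cache ID from settings, language and source
  $id = md5(serialize(array($lang, $code, $settings)));
  // step 4: cached? then return the finished output straight away
  if (isset($cache[$id])) {
    return $cache[$id];
  }
  // steps 5 and 6: scan, producing an XML string
  $scans++;
  $xml = sketch_scanner($code);
  // step 7: format, store, return
  $cache[$id] = sketch_formatter($xml);
  return $cache[$id];
}

$cache = array();
$scans = 0;
$first  = sketch_highlight('c', 'x < y', array(), $cache, $scans);
$second = sketch_highlight('c', 'x < y', array(), $cache, $scans);
```

The second call returns the same string without invoking the scanner again.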
|
||||
|
||||
|
||||
== Testing ==
|
||||
|
||||
Luminous probably isn't as test-oriented as it should be, but it still has a fairly extensive test database that you should make use of if you change anything. The testing directory is 'tests/' in the git repository (this is not present in packaged versions).
|
||||
|
||||
There is a useful paste-interface (tests/interface.php) which can be accessed through a browser if you have a locally running PHP server, which you can use to quickly paste some code and see how it gets highlighted.
|
||||
|
||||
Most other testing scripts are command line PHP scripts, so you will at least need a command line PHP environment (on Ubuntu this is as simple as apt-get install php5-cli).
|
||||
|
||||
The important tests are most easily invoked from the runtests.py Python script:
|
||||
|
||||
{{{lang=plain
|
||||
$ python runtests.py --help
|
||||
Usage: runtests.py [OPTIONS]
|
||||
Valid options:
|
||||
--<test> where test may be: fuzz, regression, unit
|
||||
|
||||
--quiet Only print failures and warnings
|
||||
}}}
|
||||
|
||||
=== Unit tests ===
|
||||
|
||||
Unit tests perform basic low level testing of various modules' APIs. If you plan to change a particular function/method, you should ensure it is covered by the unit test (if not, create one).
|
||||
|
||||
=== Regression tests ===
|
||||
|
||||
The regression test comprises a large amount of real and contrived source code for most languages. Each source is paired with a file containing XML highlighting information for that file. The regression test consists of checking that the highlighting of each file matches what's stored in the expected file. The expected result is not necessarily 'correct'; it just represents a snapshot of what Luminous was doing when the file was generated. The point is to make it difficult for changes in highlighting to go undetected.
|
||||
|
||||
This should be run after making any changes to a scanner to see whether or not your change has had any unexpected results. If anything does change, a diff file between the expected and real output is generated.
|
||||
|
||||
See tests/regression/README for more information.
|
||||
|
||||
=== Fuzz tests ===
|
||||
|
||||
The idea of a fuzz test is to throw random(ish) data at a system to ascertain how resilient it is. Fuzz tests are important to Luminous because we don't want it to do something strange when it gets some invalid source code.
|
||||
|
||||
The fuzz tests check two things:
|
||||
# That Luminous actually halts (i.e. does not go into an infinite loop) in a given amount of time
|
||||
# That when stripped of the highlighting data, the output source string is equal to the input source string, i.e. that Luminous does not add or remove any extra data.
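The second check can be sketched like this. It is not the actual test code: the function name is made up, and it assumes the highlighted output is an XML string in which only `&`, `<` and `>` have been escaped.

```php
<?php
// Hypothetical sketch of the "nothing added, nothing removed" check.
function sketch_roundtrip_ok($input, $xml) {
  $plain = preg_replace('/<[^>]*>/', '', $xml);           // drop the tags
  $plain = htmlspecialchars_decode($plain, ENT_NOQUOTES); // undo escaping
  return $plain === $input;
}

$ok  = sketch_roundtrip_ok('if (a < b) {}',
                           '<KEYWORD>if</KEYWORD> (a &lt; b) {}');
// the final '}' has been lost here, so the check should fail
$bad = sketch_roundtrip_ok('if (a < b) {}',
                           '<KEYWORD>if</KEYWORD> (a &lt; b) {');
```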
|
||||
|
||||
There are two fuzz tests. One is fully random, and one distorts real source code from the regression database. Generally speaking, they should both be used but the latter is much faster and has so far been a lot more effective at identifying errors.
|
||||
|
||||
|
||||
== Contributing your additions ==
|
||||
|
||||
If you plan to write some extra stuff for Luminous and want to see it included, the easiest way to go about this is to follow the process on GitHub.
|
||||
|
||||
Luminous is stored in a git repository on GitHub. If you've never used GitHub, it allows you to 'fork' a repository (which means to create an independent copy of it, which you can work on), and then request that I 'pull' your repository and merge your changes into the Luminous repository.
|
||||
|
||||
GitHub has plenty of excellent [http://help.github.com/fork-a-repo/ documentation on how to do this].
|
||||
|
||||
|
||||
|
||||
|
||||
=== Guidelines for inclusion ===
|
||||
|
||||
Some of these things are more of a general guide to not making a giant mess in 'relaxed' languages like PHP, but generally, Luminous is written to these principles and so should any extensions be:
|
||||
|
||||
==== Tests ====
|
||||
* Your code should pass existing tests unless they are wrong (it's okay to regenerate regression tests if you've improved them). See below for advice on how to deal with PCRE errors on fuzz tests.
|
||||
* If you add functionality, add a unit test for your new function(s).
|
||||
* If you fix a problem in a language scanner, add in a test case in the regression database.
|
||||
* If you fix something outside of a language scanner, add a unit test.
|
||||
|
||||
==== Scanners ====
|
||||
* New scanners should be high quality before they're included: they should highlight correctly in as many cases as practically possible. This aim starts to become unrealistic in some modern dynamic languages (like Ruby) that have very complex grammars, but as a general guideline you should correctly detect/highlight things like:
|
||||
* String escape sequences ("I said \"hello\"")
|
||||
* Nested comments (if the language supports them, e.g. MATLAB, Haskell)
|
||||
* Complex, nestable string interpolation (e.g. Ruby, Groovy?). In these cases, we use 'sub-scanners' to recurse into the code.
|
||||
* Try to avoid reliance on easily breakable regular expressions. Instead, don't be scared to implement things yourself:
|
||||
* Don't use recursive regular expressions; they are too fragile and will crash PCRE (stack overflow) on nonsense code. See LuminousScanner::nestable_token() to help you.
|
||||
* Try to write your patterns using possessive modifiers where possible to avoid backtracking issues.
|
||||
* Using potentially long non-greedy patterns (e.g. a big multiline `.*?` to match a heredoc) can be risky for PCRE. It's best to split the pattern up and implement it in stages using Scanner's methods.
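As an example of the possessive-modifier advice, here is a string pattern written with `++` and `*+`. The pattern is just an illustration written for this page, not one taken from an existing scanner.

```php
<?php
// Possessive quantifiers (++ and *+) never give back what they matched,
// so a failed match is abandoned immediately instead of backtracking.
$possessive = '/"(?:[^"\\\\]++|\\\\.)*+"/';

// Subject text: say("I said \"hello\"") + 1
preg_match($possessive, 'say("I said \\"hello\\"") + 1', $m);

// An unterminated string: the pattern consumes the run of 'a's once,
// fails to find the closing quote, and gives up.
$unterminated = preg_match($possessive, '"' . str_repeat('a', 100000));
```

`$m[0]` holds the whole string literal including the escaped quotes, and the unterminated input simply yields no match.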
|
||||
==== General ====
|
||||
* Your code should be compatible with PHP 5.2, which means no closures or namespaces or gotos.
|
||||
* Globals are ugly; put procedural code in static classes.
|
||||
* Functions/variables which are exposed publicly to other modules should be documented with Doxygen (specify their aim, their parameters and their return value, and any special cases)
|
||||
* Follow the general naming and syntactic conventions, e.g. $a_variable not $aVariable.
|
||||
* Performance is always nice, but don't sacrifice code clarity for it!
|
39
3rdparty/luminous/docs/site/index
vendored
Executable file
@ -0,0 +1,39 @@
|
||||
=Documentation=
|
||||
|
||||
To see a map of the documentation area: [docs/map documentation sitemap].
|
||||
|
||||
\contents 2
|
||||
|
||||
==Quick Usage==
|
||||
|
||||
# Extract your archive to some directory (or clone the repo, or whatever). This should be inside your document root, as the style/ directory needs to be web-visible.
|
||||
# If you downloaded a development version from GitHub (as opposed to an archive from this site), remove the tests/ directory. This is not something you want to expose on a public machine.
|
||||
# Create a directory called 'cache' inside your luminous directory and make sure it is writeable to your server (this probably involves 777 permissions).
|
||||
# Now test everything is working by creating a new file, the hello world of highlighting:{{{lang=php
|
||||
<?php
|
||||
require_once '/path/to/luminous/luminous.php';
|
||||
echo luminous::head_html(); // outputs CSS includes, intended to go in <head>
|
||||
echo luminous::highlight('c', 'printf("hello world\n");');
|
||||
}}}
|
||||
# Point your browser at the page you just created and it should show a single line of highlighted source code.
|
||||
|
||||
== Problems? ==
|
||||
Check out the [[troubleshooting]] guide.
|
||||
|
||||
== Advanced Usage ==
|
||||
* Consult the [[cache]] page to use an SQL table as your cache, or learn more about the cache's behaviour.
|
||||
* Check the examples/ directory for a few examples of how you might use Luminous
|
||||
* Have a look at the [[User-API-Reference]] for setting up runtime configuration settings.
|
||||
|
||||
==Hacking==
|
||||
|
||||
If you want to change how Luminous works or add new features, check out the [[hacking]] section.
|
||||
If you want to contribute but are stuck for ideas (hint hint), check out the [[todo]] page.
|
||||
|
||||
== Local API Docs==
|
||||
|
||||
The packaged distros come with Doxygen API documentation in the docs/html/ directory.
|
||||
|
||||
These documents are intended to be a class/method specification. The documentation on this site is supposed to be a little more high level and it is not exhaustive; if you find it insufficient the Doxygen pages should provide more detail. It covers both the public callers' API and the internal scanning API.
|
||||
|
||||
A copy of these is held [!http://luminous.asgaard.co.uk/assets/luminous/docs/html/ online] for a recent development/git version (which might not be the same as _your_ version).
|
13
3rdparty/luminous/docs/site/map.meta
vendored
Executable file
@ -0,0 +1,13 @@
|
||||
index
|
||||
cache
|
||||
User-API-Reference
|
||||
troubleshooting
|
||||
hacking
|
||||
Writing-a-language-scanner
|
||||
simple-scanner
|
||||
stateful-scanner
|
||||
complex-scanner
|
||||
Scanning-API
|
||||
filters
|
||||
Writing-a-formatter
|
||||
todo
|
102
3rdparty/luminous/docs/site/simple-scanner
vendored
Executable file
@ -0,0 +1,102 @@
|
||||
# parent : Writing-a-language-scanner
|
||||
= Writing a simple scanner (with LuminousSimpleScanner) =
|
||||
|
||||
A simple scanner should be used when there are no state-transitions or awkward requirements to worry about.
|
||||
|
||||
For a simple scanner, the basic workflow is this:
|
||||
# override `init()`
|
||||
# Add tokens using add_pattern
|
||||
# If necessary, add overrides for individual tokens using `$this->overrides['TOKEN_NAME'] = function`
|
||||
|
||||
== Completely Automated ==
|
||||
|
||||
Here's a VERY simple and completely automated scanner for a small Python-style language:
|
||||
|
||||
{{{lang=php_snippet
|
||||
class MyScanner extends LuminousSimpleScanner {
|
||||
function init() {
|
||||
$this->add_pattern('COMMENT', '/#.*/');
|
||||
$this->add_pattern('STRING', '/"([^\\\\"]+|\\\\.)*("|$)/');
|
||||
$this->add_pattern('SHELL_CMD', '/`([^\\\\`]+|\\\\.)*(`|$)/');
|
||||
$this->add_pattern('IDENT', '/[a-z_]\w*/');
|
||||
|
||||
$this->add_identifier_mapping('KEYWORD', array('def', 'else', 'elif',
|
||||
'for', 'return', 'while'));
|
||||
$this->rule_tag_map = array(
|
||||
'SHELL_CMD' => 'FUNCTION'
|
||||
);
|
||||
}
|
||||
}
|
||||
}}}
|
||||
|
||||
When main() is called, the given tokens will be observed. Simples!
|
||||
|
||||
Notes:
|
||||
# Patterns are checked in order. That means the first-defined pattern has precedence if two patterns match (instead of the max-munch-rule).
|
||||
# If your given patterns don't fully describe the source code then segments will simply be recorded as a 'null' token.
|
||||
# The identifier mappings are a 'filter', which looks at anything recorded as an 'IDENT', and converts them into another token. If you don't specify an 'IDENT' pattern, this has no effect.
|
||||
# SHELL_CMD is a made-up token which isn't defined as a CSS class. We use this because it's more readable than calling it some unrelated token name, but we map it to 'FUNCTION' later.
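The identifier mapping in note 3 boils down to something like the following sketch. The function name and the layout of the map are assumptions made for this example; they are not copied from Luminous's source.

```php
<?php
// Hypothetical sketch of an identifier-mapping filter.
function sketch_map_ident($token, $ident_map) {
  // only tokens recorded as 'IDENT' are ever looked at
  if ($token[0] === 'IDENT' && isset($ident_map[$token[1]])) {
    $token[0] = $ident_map[$token[1]];
  }
  return $token;
}

// assumed layout: identifier text => token name
$ident_map = array();
foreach (array('def', 'else', 'elif', 'for', 'return', 'while') as $word) {
  $ident_map[$word] = 'KEYWORD';
}

$kw  = sketch_map_ident(array('IDENT', 'while', false), $ident_map);
$id  = sketch_map_ident(array('IDENT', 'counter', false), $ident_map);
$str = sketch_map_ident(array('STRING', '"while"', false), $ident_map);
```

'while' becomes a 'KEYWORD', 'counter' stays an 'IDENT', and the string is untouched because it was never recorded as an 'IDENT' in the first place.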
|
||||
|
||||
*Examples*: Java and C# (java.php and csharp.php in languages/)
|
||||
|
||||
|
||||
== With Overrides==
|
||||
|
||||
Let's say you've got a type that can't be matched by a simple regular expression. For the sake of example we'll use the obvious idea of a '/' as a regex delimiter and a division operator.
|
||||
|
||||
Insert this into your init:
|
||||
|
||||
{{{lang=php_snippet
|
||||
class MyScanner extends LuminousSimpleScanner {
|
||||
|
||||
function init() {
|
||||
$this->add_pattern('OPERATOR', '@[!%^&*\\\\-=+;:\\|,\\./?]+@');
|
||||
$this->add_pattern('SLASH', '%/%'); // special case
|
||||
// tokenizing these helps us figure out the slash
|
||||
$this->add_pattern('OPENER', '/[\\(\\[\\{]+/');
|
||||
$this->add_pattern('CLOSER', '/[\\)\\]\\}]+/');
|
||||
//... but they aren't real tokens, as far as highlighting is concerned.
|
||||
$this->rule_tag_map['OPENER'] = null;
|
||||
$this->rule_tag_map['CLOSER'] = null;
|
||||
|
||||
$this->overrides['SLASH'] = array($this, 'slash_override');
|
||||
}
|
||||
|
||||
}
|
||||
}}}
|
||||
|
||||
Now, when LuminousSimpleScanner finds it's at the 'SLASH' token, it will stop and call `$this->slash_override`. It expects that function to record and consume some string and will throw an exception if it doesn't (because it would be an infinite loop).
|
||||
|
||||
An override to disambiguate '/' might look something like this:
|
||||
|
||||
{{{lang=php_snippet
|
||||
class MyScanner extends LuminousSimpleScanner {
|
||||
|
||||
function slash_override($matches) {
|
||||
// to disambiguate it we go backwards over the token array and see what
|
||||
// was preceding it.
|
||||
$is_regex = false;
|
||||
for($i = count($this->tokens)-1; $i >= 0; $i--) {
|
||||
// A token is a tuple:
|
||||
list($name, $content, $escaped) = $this->tokens[$i];
|
||||
|
||||
if ($name === 'COMMENT' || $name === null) {
continue; // unimportant, ignore
}
// the first significant token decides: after an operator or an opener,
// a '/' must be starting a regex
$is_regex = ($name === 'OPERATOR' || $name === 'OPENER');
break;
|
||||
}
|
||||
if ($is_regex) {
|
||||
// get and consume the regex pattern
|
||||
$str = $this->scan('% / (?> [^\\\\/]+ | \\\\.)* ($|/[iogmx]*)%x');
|
||||
assert($str !== null); // this must have matched, else our regex is skewy
|
||||
$this->record($str, 'REGEX');
|
||||
} else {
|
||||
$this->record($this->get(), 'OPERATOR');
|
||||
}
|
||||
}
|
||||
}
|
||||
}}}
|
||||
|
||||
|
||||
*Examples*: languages/perl.php is a language which uses several overrides, to handle 'quote-like delimiters', heredoc, and regex/slash disambiguation.
|
40
3rdparty/luminous/docs/site/stateful-scanner
vendored
Executable file
@ -0,0 +1,40 @@
|
||||
# parent: Writing-a-language-scanner
|
||||
=Stateful Scanners (with LuminousStatefulScanner)=
|
||||
|
||||
Stateful scanners use a transition table. The LuminousStatefulScanner is an extension of LuminousSimpleScanner, so the approach is very similar and overrides are available.
|
||||
|
||||
Inside the scanner's init() method, we define a set of patterns. The patterns represent syntax rules. The patterns can be either complete, or a pair of delimiters. In the latter case, the pattern is split into its start and end delimiters, and it becomes 'stretchy' (transitions can occur within the state and they can 'stretch' the token).
|
||||
|
||||
The pattern names represent state names, which are referenced in the transition table. The initial state is a special state called 'initial'. If you omit the 'initial' key from the transition table, every state is a legal transition from initial.
|
||||
|
||||
== Simple Example ==
|
||||
|
||||
For the sake of simplicity, we'll consider the case of standard string escaping as a language.
|
||||
|
||||
In BNF(ish), our language looks like this:
|
||||
|
||||
{{{lang=bnf
|
||||
escape := "\\" <anything>
|
||||
string := '"' (<anything except '\\' or '"'> | escape)* '"'
|
||||
}}}
|
||||
|
||||
This maps fairly directly to a stateful scanner as so:
|
||||
|
||||
{{{lang=php_snippet
|
||||
class MyScanner extends LuminousStatefulScanner {
|
||||
|
||||
public function init() {
|
||||
$this->add_pattern('STRING', '/"/', '/"/');
|
||||
$this->add_pattern('ESCAPE', '/\\\\./');
|
||||
|
||||
$this->transitions = array(
|
||||
'initial' => array('STRING'),
|
||||
'STRING' => array('ESCAPE'),
|
||||
);
|
||||
}
|
||||
}
|
||||
}}}
|
||||
|
||||
In more 'real' usage, the transitions can nest as deeply as you like, so if you had a scanner which needed to handle balanced bracket/parenthetical groupings, the stateful scanner would be ideal.
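To illustrate how a transition table like the one above is read, here is a small stand-alone sketch. The function is hypothetical, and the rule for states that are missing from the table (no transitions allowed) is an assumption of this sketch; only the rule for a missing 'initial' key comes from the description on this page.

```php
<?php
// Hypothetical lookup: which states may be entered from $state?
function sketch_allowed($transitions, $patterns, $state) {
  if (isset($transitions[$state])) {
    return $transitions[$state];
  }
  if ($state === 'initial') {
    return $patterns; // no 'initial' key: every state is legal from initial
  }
  return array();     // assumption: unlisted states allow nothing
}

$patterns = array('STRING', 'ESCAPE');
$transitions = array(
  'initial' => array('STRING'),
  'STRING'  => array('ESCAPE'),
);

$from_initial = sketch_allowed($transitions, $patterns, 'initial');
$from_string  = sketch_allowed($transitions, $patterns, 'STRING');
$from_escape  = sketch_allowed($transitions, $patterns, 'ESCAPE');
// the same table without an 'initial' key
$open_initial = sketch_allowed(array('STRING' => array('ESCAPE')),
                               $patterns, 'initial');
```

So an escape sequence is only recognised inside a string, and a string can only start from the initial state.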
|
||||
|
||||
*Real examples*: see the LaTeX scanner (languages/latex.php) which observes a transition table for LaTeX's math mode.
|
26
3rdparty/luminous/docs/site/todo
vendored
Executable file
@ -0,0 +1,26 @@
|
||||
= TODO list =
|
||||
|
||||
This is a list of things I'd like to see Luminous do in future but which I have no immediate plans to implement. These are therefore things that could be picked up by a contributor, if anyone wishes.
|
||||
|
||||
== Luminous's internals ==
|
||||
|
||||
* Tokens:
|
||||
# Luminous internally represents a token stream as an [Writing-a-formatter XML string]. This comes from the scanner and is passed to the formatter. This is a leftover from early versions and isn't ideal: it's not strictly necessary and the string conversion/parsing is ugly and probably slows things down for the formatter. The token stream is stored using PHP data structures until the very last point before the scanner hands over control. It would be simple to remove this were it not for the fact that [filters filters] sometimes go into the token structures and start nesting other tokens, by embedding XML. Therefore a better way of handling hierarchical tokens is needed as well. If I recall correctly, the stateful scanner may have something along these lines.
|
||||
# The token structure in Luminous is pretty much non-existent. Tokens are just arbitrary strings. These could be pulled out into more central definitions and thereby form a half-way coherent specification of what CSS classes should be respected.
|
||||
# It would be good to structure them better; at the moment tokens are entirely individual, whereas sometimes tokens can be seen as a kind of hierarchy - e.g. most programming languages have different classes of keyword, such as ordinary keywords (e.g. 'if') and keyword operators (e.g. 'and'). These could be better transcribed into the CSS as .keyword and .keyword.operator.
|
||||
* The [cache cache] can use MySQL as a storage location, but no other RDBMS due to reliance on `INSERT IGNORE`. It may be that the queries or logic can be rewritten using standard SQL, or failing that, RDBMS specific queries and a settings parameter would be acceptable.
|
||||
* Performance and optimisation: Luminous is slow. It 'seems' slower than it should be, but I haven't managed to identify any easily removable bottlenecks. Therefore performance improvements would be welcome (but not at the expense of code readability or quality).
|
||||
|
||||
|
||||
== Languages ==
|
||||
|
||||
* Virtually all languages can be improved!
|
||||
* I'd like to see the Ruby scanner improved, or at least tested, by someone who actually understands Ruby's grammar (I've written about 10 lines of Ruby, ever).
|
||||
* The CSS scanner should ideally handle dialects like [http://sass-lang.com/ SASS] and [http://sandbox.pocoo.org/clevercss/ CleverCSS]. (Note: SASS apparently has a legacy syntax which makes it very similar to CleverCSS).
|
||||
|
||||
== Output ==
|
||||
* The line-numbered HTML output uses a somewhat overly complicated markup, utilising a table and having the line numbers and code in two adjacent cells. The reason for this is that it provides good control over both the code and numbering, and makes the numbering seem transparent (i.e. copying and pasting the code won't pick up the line numbers too). There may be better ways to achieve this, and the markup could probably be made cleaner.
|
||||
|
||||
== Other ==
|
||||
* Various mature syntax highlighting plugins exist for various blogging platforms, providing things like integration into the rich text editor. Most of these are hard coded to use a specific highlighter (SyntaxHighlighter and Geshi seem to be the most popular). License permitting it may be possible to abstract away their dependence on a particular highlighter.
|
||||
* The [/page/codeigniter-syntax-highlight-hook CodeIgniter plugin] ([https://github.com/markwatkinson/ci-syntax-highlight GitHub]) uses regular expressions to extract code blocks from the page's markup. We all know the dangers of parsing XML with regular expressions. A good alternative would be using xpath, however, it also recognises `[code] ... [/code]`, which is invisible to XML.
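To illustrate the trade-off in the last point, here is a minimal sketch using PHP's built-in DOM extension. The page markup and the `<pre class="code">` convention are assumptions made for the example, not necessarily what the CodeIgniter plugin looks for.

```php
<?php
// Hypothetical page markup; the <pre class="code"> convention is an assumption.
$html = '<html><body><p>Intro</p>'
      . '<pre class="code">echo 1 &lt; 2;</pre>'
      . '<p>[code]print "hi"[/code]</p></body></html>';

// Collect libxml warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// XPath finds real elements reliably and decodes entities for us.
$xpath = new DOMXPath($doc);
$blocks = array();
foreach ($xpath->query('//pre[@class="code"]') as $node) {
    $blocks[] = $node->textContent;
}

// [code] ... [/code] is ordinary text as far as the parser is concerned,
// so a regular expression is still needed to find those blocks.
preg_match_all('/\[code\](.*?)\[\/code\]/s', $html, $matches);
$bbcode_blocks = $matches[1];
```

A hybrid approach along these lines would use xpath for element-based blocks and keep a regular expression only for the `[code]` form.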
|
47
3rdparty/luminous/docs/site/troubleshooting
vendored
Executable file
@ -0,0 +1,47 @@
|
||||
#parent:index
|
||||
= Troubleshooting =
|
||||
|
||||
\contents 2
|
||||
|
||||
== Examples ==
|
||||
|
||||
If you have problems, point your browser at examples/example.php and see if it works. It should display a few different PHP highlights. This particular example allows you to toggle caching and will spew errors at you if there is a permissions problem with your cache.
|
||||
|
||||
== Plain/non-highlighted output ==
|
||||
|
||||
If the output you are seeing is semi-reasonable, but it doesn't seem to actually be highlighted, the stylesheets have probably not been included.
|
||||
|
||||
`luminous::head_html()` tries to guess the public URL to the style directory and may get it wrong in some circumstances (specifically those involving symbolic links, due to PHP limitations). You can verify that this is the problem by looking at your page's HTML source from your browser: there should be two <link> elements whose href attributes point to your style/ directory. If the path detection has failed, their href attribute will be ugly.
|
||||
|
||||
If the href URL is indeed wrong, then you can either include the CSS files by hand in your PHP/HTML file and omit the call to head_html(), or you can override the 'relative-root' setting: `luminous::set('relative-root', 'real/url/to/luminous/')` (before you call head_html()).
|
||||
|
||||
== Directory Structure ==
|
||||
|
||||
Some people seem to want to deploy their setup strangely, which might make things unnecessarily complicated and more likely to break. *Pages which need highlighting do not need to reside in the luminous directory*. Luminous is a library, not a framework, and should be left in a subdirectory somewhere and forgotten about.
|
||||
|
||||
The only modification of Luminous's directory you should perform is to create the cache.
|
||||
|
||||
Let's say you want to highlight some code from your index.php. An example directory structure would be this:
|
||||
|
||||
{{{lang=plain
|
||||
htdocs/
|
||||
index.php
|
||||
luminous/ -- don't edit the contents of this
|
||||
cache/ -- except to create this
|
||||
...
|
||||
}}}
|
||||
|
||||
and index.php should include the line:
|
||||
{{{lang=php_snippet
|
||||
include(dirname(__FILE__) . '/luminous/luminous.php');
|
||||
}}}
|
||||
|
||||
|
||||
== Cache errors ==
|
||||
|
||||
See the [cache cache] page.
|
||||
|
||||
|
||||
== Still doesn't work... ==
|
||||
|
||||
Please file a bug report on [!https://github.com/markwatkinson/luminous/issues/new the issue tracker] if you're on GitHub, or send an email to mark (at) asgaard co uk.
|
19
3rdparty/luminous/docs/site/usage
vendored
Executable file
@ -0,0 +1,19 @@
|
||||
==Quick Usage==
|
||||
|
||||
# Extract your archive to some directory (or clone the repo, or whatever). This should be inside your document root, as the style/ directory needs to be web-visible.
|
||||
# If you downloaded a development version from GitHub (as opposed to an archive from this site), remove the tests/ directory; it is not something you want to expose on a public machine.
|
||||
# Create a directory called 'cache' inside your luminous directory and make sure it is writable by your server (this probably involves 777 permissions).
|
||||
# Now test everything is working by creating a new file, the hello world of highlighting:{{{lang=php
|
||||
<?php
|
||||
require_once '/path/to/luminous/luminous.php';
|
||||
echo luminous::head_html(); // outputs CSS includes, intended to go in <head>
|
||||
echo luminous::highlight('c', 'printf("hello world\n");');
|
||||
}}}
|
||||
# Point your browser at the page you just created and it should show a single line of highlighted source code.
|
||||
|
||||
== Problems? ==
|
||||
Check out the [[troubleshooting]] guide.
|
||||
|
||||
== Advanced Usage ==
|
||||
* Check the examples/ directory for a few examples of how you might use Luminous
|
||||
* Have a look at the [[User-API-Reference]] for setting up runtime configuration settings.
|