S-Web42 v0.9.3


Author: Steffen (Daode) Nurpmeso
Contact: steffen@sdaoden.eu
Copyright: ISC license
Date: 1997 - 2005, 2010, 2012 - 2020
Version: 0.9.3
Status: Red and Hot!


S-Web42 is one more option to manage your website. It expands a directory hierarchy of input files, some of which may consist of a mixture of PIs, (uninterpreted) (X)HTML or XML markup and normal text. In those files PIs can be defined and undefined, and they may be defined to take arguments (think format strings), which can be, e.g., a handy way to create complex tables. perl(1) code can be evaluated; files can be included, also recursively; simple control statements can be used to conditionalize file content; and some simple MarkLo tags aid in making editing life a bit easier, too.

The s-web42 converter is a perl(1) script, i.e., it needs an installed Perl, version 5.8.1 or above. It also requires the author's S-SymObj module -- you may install it as a regular part of your perl(1) installation by issuing the command $ cpan S-SymObj. Alternatively you may get it from http://www.CPAN.org or clone the git repository from https://git.sdaoden.eu/scm/s-symobj.git (browse it at https://git.sdaoden.eu/browse/s-symobj.git).


s-web42 [-v[v]]
s-web42 [-v[v]] [--no-rc] --no-update-cache (or: --nuc)
s-web42 [-v[v]] [--no-rc] --force-rebuild [--nuc]
s-web42 [-v[v]] [--no-rc] --expand-one [FILE] (or: --eo [FILE])

perl -C -I/PATH/TO/SymObj.pm s-web42


Use --no-rc to suppress reading of an existing config.rc file. The --no-update-cache option suppresses the update of the cache database -- i.e., rerunning s-web42 will generate the very same output. With the --force-rebuild option an existing cache database is ignored, so that effectively everything is rebuilt from scratch; this mode may be combined with --no-update-cache.

The --expand-one option will read the program's standard input or FILE, if one was specified, expand and filter it just as described below, and write the resulting content to the program's standard output. This is an isolated and special mode in that none of the described website management actions are performed, except for reading in the optional configuration file, as described below.

The -v option can be used to gain some verbosity; using it twice will be even more verbose. If used in conjunction with --expand-one these messages will go to standard error instead. The trailing two examples show how to extend the perl(1) @INC path so that the required S-SymObj module will be found without installing it in a regular place, a task that often requires administrator privileges. And please be aware that no effort is put into parsing command line arguments: one needs to use the very format shown above.


S-Web42 is charset agnostic. It reads and writes files, and simply reuses the actual character encoding that perl(1) has chosen to use. It is therefore highly desirable to either use the -C command line option of perl(1) or to set the PERL5OPT environment variable to this value, because only then does perl(1) try to reflect the user's locale settings with respect to Unicode-awareness of its input and output! Please see the perlrun(1) manual for more documentation, and the above examples on how to do that in a POSIX-compatible, Bourne or Korn shell.

If S-Web42 detects that perl(1) does not use UTF-8 I/O even though the user's locale is UTF-8 aware it will complain loudly.


S-Web42 consists of only one file: s-web42. In the repository there is also the script test.sh, which is the unit test of S-Web42, and header, footer, template.html, hook and config.rc, but these form a usage example only (somewhat primitive, and rather identical to the author's very own website); they are not required for operation.

Once invoked, the converter script s-web42 requires the subdirectory site to exist in the current working directory, since that is used as the input tree. It will also assume it can use the filenames cache.dat (the cache database), cache.old (database state before last run), and, temporarily, *.tmp in there. The generated shell archive will always be w42-update.sh, and it will not have any executable bits set.

If the file config.rc is found in the working directory, and the --no-rc command line option has not been used, it will be read -- the Assignments of PI variables seen there will form the outermost context and will thus be inherited by all files under site (PIs can be overridden and undefined for anything deeper in the hierarchy only). Here, and only here, is it possible to specify some very Special PI variables.

The presence of a file hook is also recorded, and it will be used as a per-directory fallback hook in all those site subdirectories which do not provide their own. The optional per-directory hooks can be used to create and/or modify the contents of the directory (and of its subdirectories only) on the fly. It is a fatal error if such a hook is not an executable program or if it does not exit successfully. Hooks will be given one command line argument, and that is the current working directory from within which the hook is run.

Then the converter will enter the site directory and recursively parse the tree therein. An existing per-directory hook is run at that point, before deeper levels of the hierarchy are entered. Note that the directory entries SCCS, CVS and anything which starts with a dot (thus also .git and .RCS) are completely ignored by S-Web42, except that files may be included when placed in the WHITELIST variable of config.rc. Also, directories can only be created safely on the fly in deeper, not yet parsed hierarchy levels below the current directory.


S-Web42 will not handle paths with embedded quotation marks, i.e., neither " nor ' characters may be used for directory- and filenames. You should possibly be conservative about filenames in general, mostly in respect to embedded whitespace, as the implemented quoting rules are primitive, yet sufficient.

After the entire tree has been traversed like that once, the content of all directories will be reread, and the resulting tree will be compared against the version recorded in the cache database. Anything which is missing or which seems to be modified will be rebuilt, either as a bitwise copy of the source or as the result of S-Web42 filtering, as below. Filtering will be applied to all files whose name includes the string -w42, or, to be exact, whose name ends with a string that matches (.*?)(-w42(?:-(x|[icewpatsm]+))?(-new)?)$.

S-Web42 assumes that a file is modified if its modification timestamp is different from that recorded in the cache database (but see IGNORE_MODTIME), and if its MD5 checksum is different from the recorded one. One can force rebuilding of S-Web42-filtered files on a per-file basis by using a filename that includes the suffix -new, as in index.html-w42-new -- neither modification time nor input checksums will matter for such files, they will always be rebuilt. Input checksum of S-Web42-filtered files means that the configuration and the Assignments part have been fully expanded, but anything thereafter is treated bitwise.


Files included via the <?include?> PI are not checked, i.e., no dependency tracking is performed. The Assignments directive ?include? will however be resolved and therefore affects the input checksum that may cause file rebuilding.

Any file rebuilt non-bitwise is then checksummed again, and will only be part of the result if the MD5 checksum of the rebuilt target is different from the checksum stored in the cache database, but not otherwise. (The only exception to this rule is if the user uses the --force-rebuild command line option, as that will rebuild everything from scratch.)
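
The two checksum stages described above can be sketched in Python as follows. This is only an illustration of the decision logic as the text states it, not the code of s-web42 itself: the CacheEntry record and both function names are hypothetical, and the real cache database format is not documented here.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class CacheEntry:
    # Hypothetical record layout; the real cache.dat format may differ.
    mtime: int
    input_md5: str
    output_md5: str


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def needs_rebuild(mtime, data, cached, ignore_modtime=False, forced=False):
    """Stage 1: is the input considered modified?"""
    if forced or cached is None:          # -new suffix, --force-rebuild, new file
        return True
    if not ignore_modtime and mtime == cached.mtime:
        return False                      # same timestamp: checksum not consulted
    return md5_hex(data) != cached.input_md5


def belongs_in_archive(output, cached, force_rebuild=False):
    """Stage 2: does the rebuilt target differ from the recorded one?"""
    if force_rebuild or cached is None:
        return True
    return md5_hex(output) != cached.output_md5
```

With this reading, a file that was merely touched (new timestamp, same checksum) is not rebuilt, and a rebuilt file whose output did not change is left out of the shell archive.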

So, what will the result look like? One may be astonished, but the result will be generated as a shell archive with uuencode(1)d (and, optionally, COMPRESSed) members. These archives can either be used to update a local mirror ($ sh w42-update.sh --local TARGET-PATH) or as a sftp(1) batchfile ($ sh w42-update.sh --sftp TMP-PATH | sftp -b - user@host[:dir]; rm -rf TMP-PATH). In the latter case TMP-PATH must be some temporary path that can be used by the archive to unpack its contents therein (it will issue a mkdir TMP-PATH, i.e., create that directory as necessary).

During operation, the generated shell archive will try to ignore any errors, continuing its operation until all commands have been issued. E.g., in sftp(1) batch mode, the termination-on-error is suppressed by prefixing commands with a hyphen -. Any error, and some other informational messages will be logged to standard error, however.

File and directory removals will also be properly handled. However, if the cache database (a textfile) is lost, then the only solution is to delete the target directory manually and to rebuild it with a new from-scratch archive.

On filenames and -content

S-Web42 does not look into files unless their name ends with a special suffix. If that is not seen, files will be treated as bins of undefined binary content and handled bitwise. If, on the other hand, the filename ends with the suffix (.*?)(-w42(?:-(x|[icewpatsm]+))?(-new)?)$, then it will be subject to content filtering, as described in the rest of this document. Let us inspect that cryptic expression:

(.*?) -- Ignored part
S-Web42 does not care: simply name any file the way you want, with or without a file extension. It does not matter.
(-w42 ... )$ -- S-Web42 filter trigger
If, after the ignored part, the string -w42 is seen, as in index.html-w42, then this file is flagged as being subject to S-Web42 content filtering.


The trailing S-Web42 specific part will be stripped from the filename so that, e.g., an input file index.html-w42-new will produce an output file index.html.

(?:-(x|[icewpatsm]+))? -- Mode configuration
It is possible to fine tune the behaviour of S-Web42 content filtering and expansion by continuing the -w42 suffix with a hyphen - and then either the letter x or any combination of the letters icewpatsm. This is described in detail in the section About.Filter. An example of such a filename would be index.html-w42-cea.
(-new)? -- Forced rebuild trigger
If the suffix ends with a trailing -new part, then neither the modification time nor the file's input checksum will be used to decide whether a file has been modified or not. Instead it will always be rebuilt, and the decision whether to include the file in the result or not is based upon the checksum of the generated file. A filename example would be index.php-w42-x-new.
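
To see how the parts of that expression interact, the following Python sketch applies the very same pattern to a few of the filenames used in the text. Python's re module happens to accept the expression unchanged; the function name classify and its return layout are made up for this illustration.

```python
import re

# The suffix expression quoted in the text, unchanged.
W42_SUFFIX = re.compile(r"(.*?)(-w42(?:-(x|[icewpatsm]+))?(-new)?)$")


def classify(filename):
    """Return (output_name, mode_letters, forced), or None for bitwise files."""
    m = W42_SUFFIX.match(filename)
    if m is None:
        return None                       # no -w42 suffix: handled bitwise
    return m.group(1), m.group(3) or "", m.group(4) is not None


for name in ("index.html-w42", "index.html-w42-cea",
             "index.php-w42-x-new", "index.html-w42-new", "logo.png"):
    print(name, "->", classify(name))
```

Note that -new cannot be mistaken for a mode configuration, since the letter n is not part of the mode character set.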

Moreover, all files that are subject to S-Web42 content filtering (and config.rc, but this is a special case in that S-Web42 will complain if it contains anything else but the Assignments part) have to comply with a very specific content layout scheme, henceforth called an S-Web42 context: an optional Assignments part, followed by the Content part, which is enclosed by the <?begin?> and <?end?> PIs.



These filter operations will be performed on the Assignments part of all contexts which are subject to filtering. They will also be applied to the Content part unless the mode configuration suffix was -x or otherwise excluded these actions:

Drop of trailing whitespace (cannot be disabled)
A line's trailing whitespace is discarded.
Drop of introductory whitespace (disable mode: i)
A line's leading whitespace is discarded. This step will always be performed on continuation lines after escaped newlines, as below.
Handling of shell style comments (disable mode: c)
If the first non-whitespace character of a line is a number sign #, then this line is a comment and as such discarded.
Escaping of newlines (disable mode: e)
If the last character of a non-comment line is a backslash that is not itself escaped by a(n uneven number of) backslash(es), then the next line is joined with the current line after its leading whitespace has been discarded.
Wiping away empty lines (disable mode: w)
If a line is empty then remember it was there but ignore it otherwise.
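
A rough Python sketch of these five line filters, with all of them enabled, might look like this. It is a simplified model and makes assumptions the text does not confirm: a line that follows an escaped newline is joined without a comment check, runs of empty lines are collapsed to a single empty separator, and escaped backslashes are left untouched. The name filter_lines is hypothetical.

```python
def filter_lines(text):
    out = []
    pending = None                        # line waiting for its continuation
    for raw in text.split("\n"):
        line = raw.rstrip()               # trailing whitespace always goes
        if pending is None and line.lstrip().startswith("#"):
            continue                      # shell style comment
        line = line.lstrip()              # leading whitespace
        if pending is not None:
            line = pending + line         # join with the escaped line
            pending = None
        # An uneven number of trailing backslashes escapes the newline.
        trailing = len(line) - len(line.rstrip("\\"))
        if trailing % 2 == 1:
            pending = line[:-1]
            continue
        if line == "":
            if out and out[-1] != "":
                out.append("")            # remember the gap, once
            continue
        out.append(line)
    if pending is not None:
        out.append(pending)
    while out and out[-1] == "":
        out.pop()
    return out


sample = "Yo S-Web42.  \n\n\n# comment\n This LN \\\n      uses NL escaping.\n"
print(filter_lines(sample))
```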

These filter operations will be performed on the Content part unless the mode configuration excluded them:

PI expansion (disable mode: p)
Processing Instructions will be expanded.

These filter operations will be performed on the Content part unless the mode configuration suffix was -x or otherwise excluded these actions:

Automatic paragraphs (disable mode: a)

If a textblock is surrounded by empty lines it will be enclosed in a <p></p> pair unless the block seems to be enclosed in a tag, or unless so-called "mode-switching" PIs are used within it. I.e., no automatic paragraph will be provided for an otherwise perfectly legal textblock if an <?include?> directive is contained therein, or <?perl?> or even a <?pre?>. Automatic paragraphs are meant for human friendly editing of, well, paragraphs, not for fancy markup. By starting such a paragraph with a special trigger character sequence several different kinds of markup can be generated automatically:

= Text
Generates a heading; = generates a h1, == a h2 etc., and ====== generates a h6.
_ Text
Generates a blockquote.
* Text
Generates a bullet list.
DecimalDigits. Text
Generates a numbered list with an item that uses a value that equals "DecimalDigits".
@ Text1 @ Text2
Generates a definition list. "Text1" will be the content of the definition term, and "Text2" will form the body of the item.
-----
Five or more hyphen characters create a separating horizontal rule. This is a bit special because the textblock must solely consist of this single line.


== It is not a Wiki!

You would not believe what i saw:

* Cats

* Mice

* Birds

-----

_ Wow!

_ Or Wuff-Wuff!


<h2>It is not a Wiki!</h2>
<p>You would not believe what i saw:</p>
<hr />
<blockquote><p>Wow!</p><p>Or Wuff-Wuff!</p></blockquote>
Tagsoup joining (disable mode: t TODO: not yet implemented)

The DOM standard (http://www.w3.org/TR/DOM-Level-3-Core) is used (by browsers and such poor software) to transform site content to DOM objects. Unfortunately code like:


(may) result(s) in a useless DOM object covering the newline in between the two tags. To circumvent that S-Web42 tries to join tags like these together.

Whitespace normalization (disable mode: s)
Once the line content is fully expanded (leading and) trailing whitespace is removed (again) and multiple adjacent whitespace characters are squeezed to a single (ASCII) space character.
MarkLo expansion (disable mode: m)

Some MarkLo markers will be converted to markup. Expanded content is reevaluated until no more expansion is possible. MarkLo detection and expansion is neither performed across newline boundaries nor PI occurrences, and there is no possibility to escape MarkLo expansion (via backslash escaping for example) except by turning it off entirely. It is possible to embed a closing brace by escaping it like that, however:

\c{CONTENT} -> <tt>CONTENT</tt>
\i{CONTENT} -> <em>CONTENT</em>
\b{CONTENT} -> <strong>CONTENT</strong>
\u{CONTENT} -> <u>CONTENT</u>

# (These are a bit special, but nice to use)
\a{NAME}    -> <a name="NAME"></a>
\l{LINK}    -> <a href="LINK">LINK</a>

\i{I \b{really \u{{love\}} you}, baby!}
<em>I <strong>really <u>{love}</u> you</strong>, baby!</em>

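The nesting and brace escaping shown in the last example can be modelled with a small recursive parser. The following Python sketch is not the recursive regular expression that S-Web42 itself uses; it only reproduces the documented mapping for a single line without PIs, and leaves a marker with a forgotten closing brace as literal text. The function names are hypothetical.

```python
TAGS = {"c": "tt", "i": "em", "b": "strong", "u": "u"}


def _render(kind, inner):
    if kind == "a":
        return '<a name="%s"></a>' % inner
    if kind == "l":
        return '<a href="%s">%s</a>' % (inner, inner)
    return "<%s>%s</%s>" % (TAGS[kind], inner, TAGS[kind])


def _expand(s, i, top):
    out = []
    while i < len(s):
        ch = s[i]
        if ch == "\\" and not top and s.startswith("}", i + 1):
            out.append("}")               # escaped closing brace
            i += 2
        elif ch == "\\" and i + 2 < len(s) and s[i + 1] in "cibual" \
                and s[i + 2] == "{":
            inner, j = _expand(s, i + 3, False)
            if inner is None:             # unterminated: keep it literal
                out.append(ch)
                i += 1
            else:
                out.append(_render(s[i + 1], inner))
                i = j
        elif ch == "}" and not top:
            return "".join(out), i + 1    # end of the current marker
        else:
            out.append(ch)
            i += 1
    return ("".join(out), i) if top else (None, None)


def expand_marklo(line):
    return _expand(line, 0, True)[0]


print(expand_marklo(r"\i{I \b{really \u{{love\}} you}, baby!}"))
```

Only an unescaped closing brace ends a marker; an opening brace inside the content is plain text, which is why the example above needs to escape just one brace.
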
Finally a -w42 file content example:

WHO = Ziggy

Yo S-Web42.

# This is a comment line.
   # Yet another comment line
 This LN \
      s \i{NL} escaping.\\\\

<p>Yo S-Web42.</p><p>Ziggy. This LN uses <em>NL</em> escaping.\\</p>

But -- feel free:

<?begin?>\i{Hello}, S-Web42.<?end?>
<em>Hello</em>, S-Web42.

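The trigger sequences of the automatic paragraph filter described earlier in this section can be summarized by a small classifier. The Python sketch below only decides which kind of markup a single textblock would get; it does not generate the markup, join successive list items, or check for enclosing tags and mode-switching PIs. The whitespace rules after a trigger are an assumption, and classify_block is a made-up name.

```python
import re


def classify_block(block):
    """Map one textblock to (kind, payload); kinds are HTML element names."""
    block = block.strip()
    if re.fullmatch(r"-{5,}", block):
        return ("hr", "")
    m = re.match(r"(={1,6})\s+(.*)", block, re.S)
    if m:
        return ("h%d" % len(m.group(1)), m.group(2))
    m = re.match(r"@\s+(.*?)\s+@\s+(.*)", block, re.S)
    if m:
        return ("dl", (m.group(1), m.group(2)))
    m = re.match(r"(\d+)\.\s+(.*)", block, re.S)
    if m:
        return ("ol", (int(m.group(1)), m.group(2)))
    for trigger, kind in (("_", "blockquote"), ("*", "ul")):
        if block.startswith(trigger + " "):
            return (kind, block[2:].lstrip())
    return ("p", block)


for block in ("== It is not a Wiki!", "* Cats", "_ Wow!", "-----",
              "You would not believe what i saw:"):
    print(classify_block(block))
```
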

The content is prefixed by the (PI) variable assignment block.

var1 = content of var1
var2 ?= conditional assignment (if not yet defined)
var3 += content (assigned or) added to var3
var4 @= var4 will be an array, this is the first value
var4 @= the var4 array gains another value
var5 ?@= conditional array assignment (if not yet defined)
var5 @= append a member to the now anyway existent var5 array
var6 = <em>markup</em><?def foo<>stupid example?><?foo?><?undef foo?>
?include? = path

All PI variables defined like that may contain any content, including complete PIs and markup. There is no technical difference between these and stuff defined via def and defa (and defx), except for the possibility to include the PI start and close tags <? and ?>, respectively (as shown in the var6 example). S-Web42 PI and variable names may consist of alphabetical characters, digit characters and the hyphen (-). Note that case matters, just as usual for XML processing.

=
Assignment to variable.
+=
Assign to non-existent variable, else append value to it.
?=
Assign to variable, but only if it does not yet exist.
@=
Array assignment -- create array as necessary and push a value onto it.
?@=
Push value onto array, but only if that is to be newly created.
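
The effect of the assignment operators can be modelled with a plain dictionary. The Python sketch below is an illustration only: it assumes that += appends without inserting a separator, it accepts the underscore in names because the special variables use it, and it does not handle the ?include? directive. The names LINE and assign are hypothetical.

```python
import re

# name, operator, value; longer operators must be tried first.
LINE = re.compile(r"^([A-Za-z0-9_-]+)\s*(\?@=|\?=|\+=|@=|=)\s*(.*)$")


def assign(env, line):
    m = LINE.match(line)
    if m is None:
        raise ValueError("not an assignment: " + line)
    name, op, value = m.groups()
    if op == "=":
        env[name] = value
    elif op == "?=":
        env.setdefault(name, value)       # only if not yet defined
    elif op == "+=":
        env[name] = env[name] + value if name in env else value
    elif op == "@=":
        env.setdefault(name, []).append(value)
    elif op == "?@=":
        if name not in env:               # only if the array is new
            env[name] = [value]


env = {}
for line in ("var1 = content of var1",
             "var4 @= first value",
             "var4 @= another value",
             "var5 ?@= conditional member",
             "var5 @= appended member"):
    assign(env, line)
print(env)
```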

The ?include? directive can be used to include assignment directives from other files, also recursively. It is not allowed to start the Content section whilst doing so.


If the path starts with a tilde ~, that will be replaced by the value of the environment variable HOME. Likewise, if the path starts with a plus sign +, then that will be replaced with the content of the environment variable WEB42INC. Note it is really that simple.


By definition there is really no notion of a "chroot". One may leave site absolutely or relatively, just as desired.
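
Read literally, the prefix replacement for include paths is a plain string substitution. A minimal Python sketch under that assumption follows; how S-Web42 reacts to an unset environment variable is not documented here, so the sketch simply substitutes an empty string, and resolve_include is a made-up name.

```python
import os


def resolve_include(path, environ=os.environ):
    """Replace a leading ~ by $HOME and a leading + by $WEB42INC."""
    if path.startswith("~"):
        return environ.get("HOME", "") + path[1:]
    if path.startswith("+"):
        return environ.get("WEB42INC", "") + path[1:]
    return path


env = {"HOME": "/home/ziggy", "WEB42INC": "/srv/web42/inc/"}
print(resolve_include("~/site/header", env))
print(resolve_include("+header", env))
print(resolve_include("../../header", env))
```

If the substitution is indeed that simple, a path such as +header only resolves as intended when the value of WEB42INC ends with a slash.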


Everything in between the <?begin?> and <?end?> PIs is expanded and will thus produce real output. Line content is recursively expanded until no further expansion is possible.


Any content after the <?end?> PI is not parsed at all.

Special PI variables

There are a few PI variables which are treated in a special way, either because they can only be set in config.rc, or they are readonly PIs that are provided automatically, or because they are used by Processing Instructions (PIs) as content-injection hooks.

COMPRESS
Only recognized in config.rc. If used, it can be set to any of gzip, bzip2, xz and lzma; it thus specifies a un-/compression method to be used before uuencode(1)ing shell archive members. Note that the corresponding perl(1) module is lazy loaded upon request: at the time of this writing only gzip and bzip2 are shipped with a standard perl(1) installation.
IGNORE_MODTIME
Only recognized in config.rc. If set, it causes file modification times to be ignored when deciding whether a file has to be updated or not. May be necessary if git(1) is used as a source code control system.
WHITELIST
Only recognized in config.rc. A set of file globs of files to include in the output even if they would normally be ignored, e.g., because they start with a dot.

MODTIME_AUTC and friends
Readonly. Here _A stands for "array" and _S for "string". These are readonly PIs that correspond to the modification time of the currently processed (outermost) file, in UTC and LOCAL time, respectively. The order of the arrays is: 0=year, 1=month, 2=day, 3=hour, 4=minute, 5=second. The entry at index 6 is the string "UTC" for the UTC versions and the offset from UTC in the ISO 8601:2000 standard format (+hhmm or -hhmm) otherwise. The strings use the format "YYYY-MM-DD HH:MM:SS" and, again, the UTC version appends the string " UTC" whereas the local version appends the string " +-ZONEOFFSET".


If the Time::Piece perl(1) module cannot be loaded (it is believed to be a standard module since Perl version 5.10), then S-Web42 will only provide UTC values, even for the local versions. I.e., the local ones are only aliases, then.
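
The array and string layouts described above can be illustrated with Python's datetime module. In this sketch the dictionary keys are invented for the illustration (only MODTIME_AUTC is named in this text), and whether the real array members are numbers or zero-padded strings is an assumption.

```python
from datetime import datetime, timedelta, timezone


def modtime_vars(epoch, local_tz):
    utc = datetime.fromtimestamp(epoch, timezone.utc)
    loc = utc.astimezone(local_tz)
    fmt = "%Y-%m-%d %H:%M:%S"
    offset = loc.strftime("%z")           # +hhmm or -hhmm

    def array(dt, zone):
        # 0=year, 1=month, 2=day, 3=hour, 4=minute, 5=second, 6=zone
        return [dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second, zone]

    return {
        "array_utc": array(utc, "UTC"),
        "string_utc": utc.strftime(fmt) + " UTC",
        "array_local": array(loc, offset),
        "string_local": loc.strftime(fmt) + " " + offset,
    }


for key, value in modtime_vars(0, timezone(timedelta(hours=2))).items():
    print(key, "=", value)
```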

NOW_*
Readonly. Identical to MODTIME_AUTC and friends, as above, but these expand to the current time instead.
WWW_PREFIX, WWW_SUFFIX
Injection. Will be injected before and after expansions of href and hreft, respectively. Default to the empty string.

Processing Instructions (PIs)

What follows is the list of predefined processing instructions. PIs marked "paired" in the list below are special in that they need a closing end tag -- to, e.g., end the paired PI <?perl?> the PI <?perl end?> is necessary.

For def, defa (also via implicit array vivification, as below) and defx the same name restrictions apply to the introduced PI variable name as have been documented for Assignments. Note that the content that is assigned to PI variables created by those PIs may not contain PIs itself, of course. It is however possible to use the pseudo-tags <^ and ^> as aliases there, which will be automatically converted to <? and ?>, respectively, via a simple regular expression, once the PI variable is expanded; to create an empty tag <>, the alias <^> has been provided for completeness' sake. See def and defx for examples.


def
Define a PI which expands to its value part when used.

<?def var1<>varcontent, may contain <em>tagsoup</em>?>
<?def var2<>any <> content but PI start and end tags?>
<?def var3<><^def var4<>es^><^var4^><^undef var4^>?>
T<?var3?><?pi-if var4?>T
varcontent, may contain <em>tagsoup</em>
any <> content but PI start and end tags

defa
Define or extend a PI that serves as an array. Separate members are indicated by placing the empty tag <> in the value content (which is different from def, which would simply expand the empty tag). Individual array members may be accessed by "calling" the array with the desired member index (starting at 0) as an argument:

<?defa arrnam<>m 1?>
<?defa arrnam<>m 2?>
<?defa arrnam<>m 3<>m 4?>
<p><?arrnam 0?><?arrnam 1?>\
   <?arrnam 2?><?arrnam 3?></p>
<p>m 1m 2m 3m 4</p>

One may loop over an entire array by giving loop as the first argument. In this mode two additional, optional arguments may be given; the first will be injected before the value, the second after the value. E.g.:

<p><?arrnam loop<><b><></b>?></p>
<p><b>m 1</b><b>m 2</b><b>m 3</b><b>m 4</b></p>

If the first argument is instead one of unshift and push, then the remaining arguments are joined to a single one and inserted at the front/the back of the array, respectively, auto-vivifying the array as necessary. Likewise, an argument of one of shift and pop removes the first/the last member of the array, causing a log message in verbose mode if the array does not have any members. And an argument undef-empty will undef the array if it is empty. E.g.:

<?test push<>entry 1?>
<?test unshift<>entry 0?>
<?test push<>entry 2?>
<?test loop<><> ?><br />
<?test pop?>
<?test shift?>
<?test loop<><> ?><br />
<?test pop?>
<?test loop<><> ?><br />
<?test undef-empty?>
entry 0 entry 1 entry 2 <br />entry 1 <br /><br />
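
The array operations can be modelled with a Python list. The sketch below replays the example above and is an illustration only: array_pi is a made-up name, the way "remaining arguments are joined" is assumed to be plain concatenation, and the verbose log message for an empty array is left out.

```python
def array_pi(env, name, argstr):
    """Evaluate one array PI call; returns the text it expands to."""
    args = argstr.split("<>")
    cmd, rest = args[0], args[1:]
    if cmd in ("push", "unshift"):
        value = "".join(rest)             # assumption: joined without separator
        arr = env.setdefault(name, [])    # auto-vivify the array
        if cmd == "push":
            arr.append(value)
        else:
            arr.insert(0, value)
        return ""
    if cmd in ("pop", "shift"):
        arr = env.get(name, [])
        if arr:
            arr.pop() if cmd == "pop" else arr.pop(0)
        return ""
    if cmd == "undef-empty":
        if not env.get(name):
            env.pop(name, None)
        return ""
    if cmd == "loop":
        before = rest[0] if len(rest) > 0 else ""
        after = rest[1] if len(rest) > 1 else ""
        return "".join(before + v + after for v in env.get(name, []))
    return env[name][int(cmd)]            # plain member access by index


env, out = {}, []
for call in ("push<>entry 1", "unshift<>entry 0", "push<>entry 2", "loop<><> ",
             "pop", "shift", "loop<><> ", "pop", "loop<><> ", "undef-empty"):
    out.append(array_pi(env, "test", call))
    if call.startswith("loop"):
        out.append("<br />")
print("".join(out))
```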

defx
Define a PI which takes arguments (think format strings). Arguments are indicated by placing the empty tag <> in the value content -- those tags will be replaced by the corresponding user-supplied argument when used (which is different from def, which would simply expand the empty tag).

<?defx var2<>Expanded <> var <> content <>?>
   <?defx note<><em><></em>: <>.?>
<?defx subscription<><^note For subscribers only<^><>^>?>

<?var2 arg1<>arg2<>arg3?>
<?subscription nonexistent list?>
Expanded arg1 var arg2 content arg3
<em>For subscribers only</em>: nonexistent list.
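
The placeholder substitution of defx can be sketched in a few lines of Python. This models only the replacement of each empty tag by the argument at the same position; the conversion of the <^ ... ^> aliases is not covered, and using an empty string for a missing argument is an assumption. expand_defx is a hypothetical name.

```python
def expand_defx(template, argstr):
    """Replace the n-th empty tag <> in template by the n-th argument."""
    args = argstr.split("<>")
    parts = template.split("<>")
    out = [parts[0]]
    for n, tail in enumerate(parts[1:]):
        out.append(args[n] if n < len(args) else "")
        out.append(tail)
    return "".join(out)


print(expand_defx("Expanded <> var <> content <>", "arg1<>arg2<>arg3"))
print(expand_defx("<em><></em>: <>.", "For subscribers only<>nonexistent list"))
```
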
href, hreft, lref, lreft

These pure convenience PIs expand to (X)HTML hyperlinks; the first two should be used to create links which leave the site (see WWW_PREFIX and WWW_SUFFIX), the other two for site-local ones. The versions without the t suffix take one argument, the others will also create a title attribute and thus require two arguments.

<?href http://www.netbsd.org?>
-> <a href="http://www.netbsd.org">www.netbsd.org</a>
<?hreft http://www.opensource.org<><em>Lots</em> of licenses?>
-> <a href="http://www.opensource.org" title="Lots of licenses"
   ><em>Lots</em> of licenses</a>
ifdef, ifndef, else, fi

Simple conditional control statements which test for (non)existence of a variable (PI) and process the enclosed block only if the condition is true. (You may also "test" for 0, which evaluates as not defined.)

<?ifndef HOMEPAGE?>
 <?lreft index.html<>[HOME]?>
 Jo-ho hooo -- welcome on my homepage, dude!
include, xinclude, raw_include, frank_include

The PI <?include?> can be used to include a file that itself will be subject to the same expansion and filtering that is in use for the including file, i.e., it must form a valid S-Web42 context. Paths are interpreted relative to the source directory of the file which uses the include directive.

<?xinclude?> is similar to <?include?>, except that the included file is expected to (implicitly) consist of Content only, so that its inclusion may modify the PI environment of the current context.

The <?raw_include?> PI will simply include the given file raw and without any S-Web42 processing. <?frank_include?> will also include the given file raw, but it will process the lines and escape &, < and > characters, so that HTML parsers will be able to display the (rather) raw content as desired.

HOME = ../index.html
<?include ../../header?>

Hello World, v2.

<?frank_include /etc/passwd?>

<?include ../../footer?>

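The escaping that <?frank_include?> is described to perform corresponds to what Python's html.escape does when quote escaping is turned off. The following sketch only shows that character mapping on a line-by-line basis; it is not the perl(1) code of S-Web42, and frank_lines is a made-up name.

```python
import html


def frank_lines(text):
    """Escape &, < and > in every line, leaving quotes alone."""
    return [html.escape(line, quote=False) for line in text.splitlines()]


for line in frank_lines('if (a < b && c > d)\n  puts("<ok>")'):
    print(line)
```
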

If the path starts with a tilde ~, that will be replaced by the value of the environment variable HOME. Likewise, if the path starts with a plus sign +, then that will be replaced with the content of the environment variable WEB42INC. Note it is really that simple.


By definition there is really no notion of a "chroot". One may leave site absolutely or relatively, just as desired.

mode
This PI can be used to change the modes described in About.Filter on the fly; the required argument must either be %, in which case the previously active mode is restored, or a combination of the filter mode configuration characters. Mode changes are inherited by deeper contexts, but they will not be propagated to outer ones. Note that PI expansion mode cannot be disabled like that. Be aware: you should really know what you are doing when you use this PI.
perl, sh, xperl, xsh

These are paired statements which can be used to embed perl(1) or sh(1) code, respectively. The code will be executed in a subprocess, with its working directory set to that of the topmost context, i.e., the file for which output is currently being produced.

The output that the subprocess produces on its standard output channel will be subject to the same expansion and filtering that is in use for the surrounding context. In fact the output is expected to form a valid S-Web42 context except for the x-versions, which are expected to produce Content output (implicitly) only, and which thus modify the PI environment of the current context.

The code is subject to the normal S-Web42 processing before it is passed through to the subprocess that runs the interpreter. I.e., PI expansion can be used to "pass arguments" through. This raises the question how a valid S-Web42 context can be produced if PIs are expanded during code evaluation. Well, it turns out that S-Web42 injects four variables automatically:

PIS <?
PIE ?>

(Since it is not necessary to quote the closing ?> of a PI, $PIE is provided only for completeness' sake.) So -- unfortunately one has to perform the uncomfortable task of PI building via string concatenation to avoid unwanted data expansions. E.g.:

<?perl?>
my $gmd = gmtime;
my $lmd = localtime;
my $user = fetch_user();

print <<__EOT__;
USER = $user
${PIS}begin?>
${PIS}include +header?>
<h1>Hi ${PIS}USER${PIE}!</h1>

It is ${lmd}.
That is ${gmd} UTC!

${PIS}include +footer?>
${PIS}end?>
__EOT__
<?perl end?>

# This works:
$ echo '<?begin?>START<?sh?>' \
> 'printf "${PIS}BEGIN?>IN${PIS}END?>" | tr [:upper:] [:lower:]' \
> '<?sh end?>END<?end?>' | s-web42 --no-rc --eo

# An <?eval PERL-EXPR?> may at some time be a regular PI..
$ echo '<?begin?><?defx eval<><^xperl^><><^xperl end^>?>\
> <?eval my $t=gmtime; print $t?><?end?>' | s-web42 --no-rc --eo


The entire -- expanded -- script is read into memory before being passed to the interpreter that runs in the subprocess and will eval the script. These PIs cannot be nested (especially not with each other).

pi-if
If it is uncertain whether a PI (variable) exists, use this PI to "invoke" it, instead of the PI itself. I.e., pass arguments etc. just as you would if you used the PI directly, but give the name of the PI as the first argument. pi-if will not cause errors but only produce some log messages if the used PI does not exist.
pre, xpre, xcdata

The "preformatted" paired PIs. While these PIs are active none of the filters that have been documented in About.Filter are active except for the PI expansion filter (indeed they create a new context for their content). Since that is mostly of interest inside of HTML <pre> tags, the expansion of the <?xpre?> PI contains this tag already. And the <?xcdata?> PI additionally wraps the content in a CDATA section.


Leading whitespace on the line of the <?xpre end?> will of course be copied through to the output. This may look odd in an otherwise beautifully indented source.


undef
Undefine a user-defined array or PI.


This undefines in the current context only! I.e., if file1 defines the PI DOIT and then includes file2, then an <?undef DOIT?> from within file2 will not affect file1, but only file2 and deeper contexts.
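
One way to picture this scoping rule is a context that starts out with a copy of its parent's PI table. The Python sketch below models exactly the file1/file2 example from the note; the class Context and its methods are invented for the illustration and say nothing about how S-Web42 stores contexts internally. The x-variants, which modify the including context, are not modelled.

```python
class Context:
    """A context inherits the PIs of its parent by copying them."""

    def __init__(self, parent=None):
        self.pis = dict(parent.pis) if parent is not None else {}

    def define(self, name, value):
        self.pis[name] = value

    def undef(self, name):
        self.pis.pop(name, None)          # affects this context only

    def include(self):
        return Context(self)              # deeper context for an included file


file1 = Context()
file1.define("DOIT", "do it")
file2 = file1.include()
file2.undef("DOIT")
deeper = file2.include()
print("DOIT" in file1.pis, "DOIT" in file2.pis, "DOIT" in deeper.pis)
```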



  • The NOW_* series of PI variables has been added.
  • New PIs: <?frank_include?>, <?xcdata?>.
  • Fixed bug in the template header file.


  • Automatic paragraphs have been diversified and now support different target markups, like headings, blockquotes and lists.


  • Automatic paragraphs now also support definition lists.
  • Added anchor and hyperlink MarkLo support.


  • Thanks to the suggestion of Dave Mitchell from perl-porters the MarkLo regular expression is now proof against user input syntax errors and no longer enters a rather endless loop if a closing brace has been forgotten.


  • The unit tests should now work with all shells.
  • Automatic paragraphs are now able to join successive instances: e.g., X successive bullet list items are now joined into a single bullet list, instead of producing X bullet lists.


  • The automatic paragraph commit and tag v0.8.5 were premature. A bit better now.


  • The special PI variable WWW has been removed, instead there are now WWW_PREFIX and WWW_SUFFIX.
  • We are a bit more relaxed regarding uudecode(1) POSIX compatibility, and now give a -o /dev/stdout command line option instead of requiring that the in-stream /dev/stdout variant is properly understood.


  • Fix the test which was broken in v0.8.7 after not being updated to reflect the change to WWW_PREFIX and WWW_SUFFIX.
  • MarkLo is now implemented recursively, and closing braces can be embedded by escaping them with a backslash.


  • Bugfix! Or, at least that is what i think today, it may have been desired (we explicitly warn), but let's just not transport mode changes to outer contexts.


  • S-Web42 no longer needs external programs for creating the shell archive. (My performance increase was pretty dramatic.)

  • The <?mode?> PI has been rethought and using it for adjustments no longer works on a global basis, but changes will not affect outer contexts.

    Before that rewrite a long standing bug of <?mode?> had been fixed which mattered for false use cases where the reverting <?mode %?> had been forgotten.


  • Add support for variables with empty values in the assignment block, as well as for empty arrays.
  • Improve recursive expansion of <^> constructs within normal PIs. Especially handy for arrays. The shipped config templates make use of that and use <^TOPDIR^> prefix for URLs, so that an entire site can be driven with a single template instance.
  • We no longer ignore files starting with a dot when they are placed in the new WHITELIST variable of config.rc.


  • Improve recursive expansion of <^> constructs even further. This time with test case. Should now just do fine everywhere, but i am very far out of this perl code, on the other hand.


  • Add an ?include? directive for the Assignments section.


Thanks to Ian Abbott, who alluded to the missing <?xcdata?> and <?frank_include?> PIs (at least indirectly through statements on the tz AT iana DOT org mailing list).

Thanks to Dave Mitchell from perl-porters for giving the hint on backtracking that made the fancy recursive regular expression that is used for MarkLo proof against user syntax misses. (I never use and do not really understand these super-fancy extended regular expressions beyond assertions, but remembered the example from the documentation and was too lazy to write the hand-driven recursive parser for that, so i tried it.)

Future Directions

Add a Markdown compatible syntax block.

With automatic paragraphs too many flushes occur, and tagsoup joining is not yet implemented.

The conversion phase can be parallelized without much (in fact almost no) effort; add a special MAXJOBS or similar special config.rc PI variable to control this, then.

An <?eval PERL-EXPR?> could at some time become a regular PI that simply evaluates in the current context. Maybe lazily start a concurrent child and communicate with that for this purpose (pay once, when needed). If parallelized, one such evaluator for each worker thread/process.

Copyright (c) 1997 - 2022, Steffen Nurpmeso <steffen@sdaoden.eu>
@(#)code-web42.html-w42 1.38 2022-12-22T18:41:43+0000