EmPy 4.0 release announcement

Announcement

I’m pleased to announce the release of EmPy 4.0.

EmPy is a powerful, robust and mature templating system for inserting Python code in template text. EmPy takes a source document, processes it, and produces output. This is accomplished via expansions, which are signals to the EmPy system where to act and are indicated with markup. Markup is set off by a customizable prefix (by default the at sign, @). EmPy can expand arbitrary Python expressions, statements and control structures in this way, as well as a variety of additional special forms. The remaining textual data is sent to the output, allowing Python to be used in effect as a markup language.

EmPy also supports hooks, which can intercept and modify the behavior of a running interpreter; diversions, which allow recording and playback; filters, which are dynamic and can be chained together; and a dedicated user-customizable callback markup. The system is highly configurable via command line options, configuration files, and environment variables. An extensive API is also available for embedding EmPy functionality in your own Python programs.

EmPy also has a supplemental library for additional non-essential features (emlib), a documentation building library used to create this documentation (emdoc), and an extensive help system (emhelp) which can be queried from the command line with the main executable em.py (-h/--help, -H/--topics=TOPICS). The base EmPy interpreter can function with only the em.py/em file/module available.

EmPy can be used in a variety of roles, including as a templating system, a text processing system (preprocessing and/or postprocessing), a simple macro processor, a frontend for a content management system, a document annotation tool, a literate programming tool, a souped-up text encoding converter, a text beautifier (with macros and filters), and for many other purposes.

Getting the software

The current version of EmPy is 4.0.1.

The official URL for this Web site is http://www.alcyone.com/software/empy/.

The latest version of the software is available in a tarball here:
http://www.alcyone.com/software/empy/empy-latest.tar.gz.

The software can be installed through PIP via this shell command:

% python3 -m pip install empy
...

For information about upgrading from 3.x to 4.x, see
http://www.alcyone.com/software/empy/ANNOUNCE.html#changes.

Requirements

EmPy works with any modern version of Python. Python version 3.x is expected to be the default and all source file references to the Python interpreter (e.g., the bangpath of the .py scripts) use python3. EmPy also has legacy support for versions of Python going back all the way to 2.3, with special emphasis on 2.7 regardless of its end-of-life status. It has no dependency requirements on any third-party modules and can run directly off of a stock Python interpreter.

EmPy will run on any operating system with a full-featured Python interpreter; this includes, but is probably not limited to, Linux, Windows, and macOS (Darwin). Using EmPy requires knowledge of the Python language.

For more details, see http://www.alcyone.com/software/empy/README.html#requirements.

License

This software is licensed under BSD (3-Clause).


Changes

Major new features added

Here is a list of the major new features introduced in EmPy 4.0. See the documentation for more information.

Added markup

Inline comments: @*...*

EmPy now supports comments which can be embedded anywhere and do not consume the whole line.

Backquote literals: @`...`

EmPy now has a mechanism for quoting any literal text, including EmPy markup. Note that this replaces the old “repr” markup.

Chained if-then-else expressions: @(...?...!...?...!...)

If-then-else expressions can now be chained indefinitely.
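
In Python terms, a chained expression of the shape @(...?...!...?...!...) reads as if-then-if-then-else. The following is a rough equivalent using nested conditional expressions; the classify function and its values are hypothetical and only illustrate the evaluation order, not how EmPy implements it.

```python
def classify(n):
    # Reads as: if n < 0 then 'negative', else if n == 0 then 'zero',
    # else 'positive' (the shape of a chained if-then-else expression).
    return "negative" if n < 0 else ("zero" if n == 0 else "positive")

results = [classify(n) for n in (-1, 0, 1)]
```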

Functional expressions: @f{...}

Simple expressions have been extended to support functional expressions, which allow calling objects with arguments which are expanded EmPy markup (and thus strings), rather than Python expressions. The syntax can be repeated for multiple arguments, e.g., @f{argument1}{argument2}....

Full support for @[try] control with @[except ...], @[else], and @[finally]

All legal @[try] control markup equivalents of the Python try control structure are now supported.

Support for @[else] within @[while ...] control

All legal @[while ...] control markup equivalents of the Python while control structure are now supported.

@[with ...]...@[end with] control

The EmPy equivalent of the Python with control structure is now supported.

@[defined ...]...@[end defined] control

There is now a new control markup which allows testing for the existence of variables in the globals or (optionally) locals dictionaries.
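
The test this control markup performs can be thought of as a plain dictionary membership check. A minimal sketch, assuming a hypothetical helper named is_defined (this is not EmPy's own code):

```python
def is_defined(name, globals_dict, locals_dict=None):
    """True if name is in the globals or (optionally) the locals dictionary."""
    if locals_dict is not None and name in locals_dict:
        return True
    return name in globals_dict

checks = (
    is_defined("x", {"x": 1}),            # found in globals
    is_defined("y", {"x": 1}, {"y": 2}),  # found in locals
    is_defined("z", {"x": 1}, {}),        # found in neither
)
```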

Stringized significators: @%!... NL

Significators now have a “stringized” form, which allows their values to be unquoted strings, rather than arbitrary Python expressions.

Multiline significators: @%%...%% NL, @%%!...%% NL

There are now multiline forms of significators, as well as stringized variants.

Named escapes: @\^{...}

There is now an extension of the escape markup which allows specifying named escapes rather than having to use the ASCII/Unicode code point value, e.g., @\^{ESC} for the escape character.

Diacritics: @^...

There is now support for joining characters with Unicode combiners and normalizing the results, allowing the inclusion of accented characters without the need for a Unicode keyboard, e.g., @^e' is a lowercase E with an acute accent.
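
The underlying Unicode mechanism (join a base character with a combining character, then normalize) can be sketched with Python's standard unicodedata module. This illustrates the idea only and is not EmPy's implementation; the COMBINERS table and compose function are hypothetical names, and the single entry mirrors the @^e' example above.

```python
import unicodedata

# Hypothetical table mapping a diacritic code to a Unicode combining character.
COMBINERS = {
    "'": "\u0301",  # combining acute accent
}

def compose(base, codes):
    """Join a base character with combiners and normalize the result to NFC."""
    combined = base + "".join(COMBINERS[code] for code in codes)
    return unicodedata.normalize("NFC", combined)

accented = compose("e", "'")  # a single precomposed character, U+00E9
```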

Icons: @|...

There is now support for user-specified icons, a set of key-value pairs which can be used by the user for arbitrary means. A default set is included, e.g., @|:) represents the smiling face emoji.

Emojis: @:...: with third-party module support

There is now support for specifying emojis by name, as well as support for third-party emoji modules, e.g., @:VOLCANO: for the volcano emoji.

Other additions

Configuration objects

Instead of a primitive options dictionary, there is now a full-fledged Configuration object which encapsulates all the configurable behavior of an EmPy interpreter. This can be created separately and shared between interpreters; if no configuration is specified, a default one is created.

Full support for Unicode

Previously, “Unicode” support was awkward and had to be requested with the -u/--unicode command line option. Now it is seamless (whether using open in text mode or codecs.open in binary mode), and that command line option is no different from specifying --binary for binary output.

Full support for file buffering

Proper file buffering (none, line, fixed, full) is now fully supported for both input and output. The default fixed buffering size has now also been significantly increased.

sys.stdout proxy now reference counted

The sys.stdout proxy is now reference counted when multiple interpreters are in use, rather than needing one proxy per interpreter.
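
The general idea of a shared, reference-counted proxy can be sketched as follows. This is a simplified illustration under assumed names (StdoutProxy, install, uninstall), not the actual EmPy proxy: the proxy is installed on first use, later users only increment a count, and the real sys.stdout is restored when the last user leaves.

```python
import sys

class StdoutProxy:
    """Hypothetical stand-in for a shared, reference-counted stdout proxy."""

    def __init__(self, original):
        self.original = original
        self.count = 0

    def write(self, data):
        return self.original.write(data)

    def flush(self):
        self.original.flush()

_proxy = None

def install():
    """Install the proxy on first use; later calls only bump the count."""
    global _proxy
    if _proxy is None:
        _proxy = StdoutProxy(sys.stdout)
        sys.stdout = _proxy
    _proxy.count += 1

def uninstall():
    """Drop one reference; restore the real sys.stdout when none remain."""
    global _proxy
    _proxy.count -= 1
    if _proxy.count == 0:
        sys.stdout = _proxy.original
        _proxy = None
```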

Error handlers

There are now explicit error handlers which can be specified by the user to handle EmPy errors as they occur.

Hooks expanded and now can be used to override default interpreter behavior

Hooks have been significantly overhauled and have now been extended to allow return values for pre... and before... methods to override the standard behavior of the EmPy interpreter.
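
The override pattern can be sketched in plain Python. All names here (Hook, Recorder, preExample, dispatch) are hypothetical and chosen only to show the control flow: each hook is consulted first, and a true return value means the default processing is skipped.

```python
class Hook:
    """Hypothetical base hook; returning None leaves default behavior alone."""

    def preExample(self, text):
        return None

class Recorder(Hook):
    """Handles the event itself, so the default processing is skipped."""

    def __init__(self):
        self.seen = []

    def preExample(self, text):
        self.seen.append(text.upper())
        return True

def dispatch(hooks, text, default):
    """Run every hook first; skip the default if any hook returns true."""
    handled = False
    for hook in hooks:
        if hook.preExample(text):
            handled = True
    if handled:
        return None
    return default(text)

recorder = Recorder()
untouched = dispatch([Hook()], "abc", str.title)
overridden = dispatch([Hook(), recorder], "abc", str.title)
```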

Serious bugs fixed

Several serious bugs have been fixed, including a nasty O(n^2) complexity problem when parsing EmPy files containing statement markup containing many lines of text.

Full unit and system test suite

There is now a full-fledged unit and system test regimen available via the test.sh script.

Extensive builtin help system

There is now a help system (implemented with the emhelp module if present) for getting help from the command line.

Documentation rewritten and expanded

The documentation has been completely rewritten and expanded to be as comprehensive as possible.

Upgrading from 3.x to 4.0

EmPy 4.0 is largely compatible with 3.x, especially with regard to basic syntax, but some incompatibilities were necessary to move forward, particularly when using the embedding API. If you are upgrading from 3.x to 4.0, here are the changes in 4.0 which may affect you. See the documentation for more information.

Changed markup

“repr” markup replaced with backquote literal markup: @`...`

The “repr” markup has been removed and replaced with backquote literal markup. If you were relying on “repr” markup, call repr explicitly with the expression markup, e.g., @(repr(...)).

Removed literal close parenthesis, bracket and brace markup: @), @], @}

These served no real purpose and have been removed. Just use an actual close parenthesis, bracket or brace instead.

If-then-else expression no longer supports : for “else”: @(...?...!...)

The use of : for the else delimiter in extended expressions was previously deprecated and has been removed in EmPy 4.0; use ! instead.

In-place markup replaced with emoji markup; in-place markup is now @$...$...$

In-place markup has changed form; it is now @$...$...$. @:...: is now used for emoji markup.

Context line markup (@!...) no longer attempts to pre-adjust line

Specifying the context line via markup previously attempted to adjust the line number so that the next line was the one specified in the markup. This was error-prone and confusing; now no such attempt is made, and the context affected is the line containing the markup, rather than the next one.

Custom markup now parsed more sensibly: @<...>

Custom markup previously did not support contents containing a right angle bracket > except if it was preceded by a backslash; now custom markup is parsed by matching the same number of left angle brackets to start the markup that end it, allowing both left and right angle brackets to appear in its contents, e.g., @<<<This contains <angle brackets>.>>>.
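
The bracket-matching rule described above can be sketched as a small parser. This is a simplified illustration, not EmPy's actual parser; extract_custom is a hypothetical helper that takes the text immediately following the prefix.

```python
def extract_custom(text):
    """Return (contents, rest) for text that starts right after the prefix.

    Counts the opening run of '<' and ends the markup at the first run of
    the same number of '>'.
    """
    count = len(text) - len(text.lstrip("<"))
    if count == 0:
        raise ValueError("custom markup must start with '<'")
    closer = ">" * count
    end = text.find(closer, count)
    if end < 0:
        raise ValueError("unterminated custom markup")
    return text[count:end], text[end + count:]

contents, rest = extract_custom("<<<This contains <angle brackets>.>>> and more")
```

Here three opening brackets require three closing brackets, so the single brackets inside the contents pass through untouched.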

Other changes

Relicensed to BSD

EmPy 4.0 has been relicensed from LGPL to BSD. If your use of EmPy is affected by its license, you may need to take this into account.

Interpreter constructor and expand function call now require keyword arguments

The arguments of the Interpreter(...) constructor (and of expand) have changed over time, causing confusion. As of EmPy 4.0, the use of keyword arguments is required. This will generate a clear error the first time the change is encountered, but will cause no further problems even with additional changes.
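
The effect of requiring keyword arguments can be illustrated with a stand-in class. InterpreterSketch and its parameter names (output, globals) are assumptions made for this example and are not the real constructor signature; the point is that positional use fails immediately with a clear TypeError, while keyword use keeps working even if parameters are later added or reordered.

```python
class InterpreterSketch:
    """Hypothetical stand-in showing a keyword-only constructor."""

    def __init__(self, *, output=None, globals=None):
        self.output = output
        self.globals = {} if globals is None else globals

try:
    InterpreterSketch(None, {})  # positional arguments are rejected
    error = None
except TypeError as exc:
    error = exc

interp = InterpreterSketch(globals={"x": 1})  # keyword arguments are accepted
```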

Errors when calling expand now raise by default

When calling the standalone expand function, exceptions raised during the expansion are now passed up to the caller by default. To change this behavior so that the ephemeral interpreter handles the exception, set dispatcher to True.

Specifying locals when calling expand has changed

Previously, extra keyword arguments to the standalone expand function were treated as a locals dictionary. Since expand has been changed to use keyword arguments to specify all the arguments for compatibility reasons, locals now need to be specified with an (optional) locals dictionary argument.
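
The shape of this change can be shown with a toy function. expand_sketch is a hypothetical stand-in that merely evaluates a Python expression; it is not the real expand function, and only the locals keyword is taken from the description above.

```python
def expand_sketch(data, *, globals=None, locals=None):
    """Hypothetical stand-in: evaluate data as a Python expression."""
    scope_globals = {} if globals is None else globals
    scope_locals = {} if locals is None else locals
    return str(eval(data, scope_globals, scope_locals))

# Local variables are passed together in one dictionary argument.
result = expand_sketch("x + y", locals={"x": 1, "y": 2})

# Stray keyword arguments are no longer interpreted as local variables.
try:
    expand_sketch("x", x=1)
    rejected = False
except TypeError:
    rejected = True
```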

Use -d/--delete-on-error instead of “fully buffered files”

“Fully buffered files” was a method of deferring all output until the file was closed successfully to assist supporting the use of EmPy in a build system such as GNU Make. This was awkwardly named (it had nothing to do with actual file buffering) and was error-prone; use -d/--delete-on-error instead.
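
The motivation is that a build tool such as GNU Make treats a freshly written target as up to date, even if it holds only partial output from a failed run. The general delete-on-error pattern can be sketched as below; write_or_delete is a hypothetical helper, not the code behind the -d/--delete-on-error option.

```python
import os

def write_or_delete(path, produce):
    """Write the chunks from produce() to path; remove the file on failure."""
    try:
        with open(path, "w") as output:
            for chunk in produce():
                output.write(chunk)
    except Exception:
        # The file is closed by now; remove the partial output and re-raise.
        if os.path.exists(path):
            os.remove(path)
        raise
```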

Cleaned up environment variable names

The environment variable names have been cleaned up and expanded. If you are using environment variables with EmPy, check to see whether the variables you are using have changed.

Options dictionary replaced with full-fledged configurations

If you are using the options dictionary, this has been replaced with a Configuration class in EmPy 4.0.

Filter shortcuts removed and filter API revised

Filter “shortcuts” (special objects representing certain types of filters) are un-Pythonic and have been removed. Also, the API has changed to be more clear.

Contexts now track name, line, column, and character (Unicode code point) count

The identify method of the pseudomodule/interpreter now returns a 4-tuple (name, line number, column number, and character count), and formatted contexts now include three items separated by colons (name, line number, and column number). Custom context formats can be specified if desired.
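
The two shapes described above can be modeled as follows. The field names, the sample values, and the format_context helper are hypothetical; only the number and order of the items come from the description.

```python
from collections import namedtuple

# Hypothetical field names; the four items follow the description above.
Context = namedtuple("Context", ["name", "line", "column", "chars"])

def format_context(context):
    """Format as three colon-separated items: name, line and column."""
    return "%s:%d:%d" % (context.name, context.line, context.column)

context = Context("example.em", 12, 5, 340)
name, line, column, chars = context  # unpacks like any 4-tuple
formatted = format_context(context)
```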

Diversions API method names changed to be more clear

The API for diversions has been slightly changed, in particular distinguishing between methods which apply to diversions vs. diversion names.

Hook API completely revised; many hook events added

The hook API has been substantially revised and rationalized; it now supports overriding standard behavior.

Hook, Filter classes are now in emlib

The auxiliary Hook and Filter classes are now in a dedicated emlib module.

New emdoc module

There’s a new emdoc module used for creating this documentation.

Exposed global attributes on interpreter simplified; now only version, major, minor and compat

Previously, the interpreter exposed many auxiliary attributes; this has been simplified to just the ones that identify the running EmPy system.

Use argv interpreter attribute instead of args

Previously the interpreter exposed both argv and args attributes to represent EmPy script arguments, with argv corresponding to sys.argv (i.e., it includes the script name as argv[0]) and args being equivalent to argv[1:]. This was redundant and so args has been removed; use argv instead.
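
For code that used the old attribute, the migration is a single slice. The sample values below are hypothetical:

```python
# Hypothetical script arguments as an interpreter might expose them.
argv = ["script.em", "first", "second"]

script_name = argv[0]  # the script name, as with sys.argv
arguments = argv[1:]   # what the removed args attribute used to hold
```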

Full list of changes between EmPy 3.x and 4.0

  • Re-licensed from LGPL to BSD

  • Completely rewrote and expanded this documentation

  • Some serious O(n^2) inefficiencies in parsing and re-parsing after transient errors have been fixed

  • Some environment variable name cleanup

  • Added an optional library module for non-essential support classes (emlib)

  • Added a full-fledged, but optional, help system (emhelp)

  • Added a module to assist with generating (this) documentation (emdoc)

  • Interpreter constructor redesigned; recommend always using keyword arguments when creating an Interpreter

  • Configuration objects: If unknown configuration attributes are set or if invalid configurations are detected, the interpreter will raise a ConfigurationError; this replaces the under-used “options” concept from the interpreter

  • Configuration resource files: -c/--config-file=FILENAME

  • The “Unicode subsystem” backend (originally needed for seamless Unicode compatibility when Python 2 was released) was completely reworked; -u/--binary/--unicode still exists but now means nothing more than specifying --binary for binary output

  • Full support for selecting Unicode encodings and error handlers for both input files and output files; exceptions are raised in the event that incompatible options are detected

  • Added full support for specifying file buffering

  • Specifying no EmPy script on the command line, or using the -i/--interactive option, goes into interactive mode, which is always line-buffered

  • Cleaned up escape codes and added a few extensions

  • The context line token (@!...) no longer attempts to adjust the line number so that the following line is the specified line number; this was error prone and potentially confusing

  • The stdout proxy file object is now installed only when needed, is now reference counted, and is checked for consistency (calling out interfering interpreters)

  • Filter shortcuts have been removed as they were un-Pythonic; corresponding classes exist in emlib

  • Filters can now also be prepended

  • Context strings now include the filename, line number and the column number

  • Context identifiers (empy.identify()) are now a 4-tuple consisting of the filename, line number, column number, and character count

  • Added support for context formatting

  • Some interpreter methods were inconsistently handling the context stack; this was addressed and approved methods are now documented

  • “Fully buffered files” (itself something of a misnomer) have been removed; use -d/--delete-on-error instead

  • Added an option for no output (-q/--no-output)

  • Added -S/--string=STR option

  • Added -Q/--postprocess=FILENAME option

  • Added -G/--postfile=FILENAME option

  • Diversions now have a name attribute

  • Added support for diversions spooling to files

  • Pseudomodule routines which manipulate diversion names rather than diversions are now so named

  • Completely revised hook API

  • Added many hook events

  • Having a missing custom callback function is now an error by default

  • Added error handlers

  • Hooks named pre... or before... can return a true value to indicate they’ve handled the event and that the normal processing should be skipped

  • The ability to use : in if-then-else expression markup to delimit the “else” condition, which was previously deprecated, has been removed; use ! instead (e.g., @(...?...!...))

  • If-then-else expression markup has been extended so it can be chained indefinitely (if-then-if-then-…-else): @(...?...!...?...!...)

  • Newlines in expressions (say, due to word wrap) are replaced with spaces before evaluation if --no-replace-newlines is not specified

  • Removed literal close parenthesis, bracket, and brace markup: @), @], @}

  • Added inline comments: @*...*

  • Changed unnecessary “repr” markup to a backquote literal markup: @`...` (for the equivalent of “repr” markup, just use @repr(...))

  • Added functional expressions: @f{...}

  • Added multiline significators: @%%...%%

  • Added stringized significators @%!... and multiline stringized significators @%%!...%%

  • In-place markup was changed from @:...:...: to @$...$...$

  • Added more escape codes (@\...)

  • Added named escapes (@\^{...})

  • Added diacritic markup: @^...

  • Added icon markup: @|...

  • Added emoji markup: @:...:

  • Added support for all legal usages of @[finally] and @[else] within @[try]...@[end try] control markup

  • Added support for @[else] within @[while ...]...@[end while] control markup

  • Added new @[dowhile ...]...@[else]...@[end dowhile] control markup

  • Added new @[with ...]...@[end with] control markup

  • Added new @[defined ...]...@[end defined] control markup

  • Newlines are allowed within control markup; they are treated as spaces

  • Changed custom markup (@<...>) to be parsed more simply and usefully

  • Automatically swap markup symbols for an alternate prefix if there is a conflict (e.g., when specifying that the prefix is $ instead of @, @$...$...$ markup becomes $@...@...@)

  • Removed VERSION (EmPy version) interpreter attribute and replaced it with version; also added major, minor (detected Python version) and compat (list of compatibility features that needed to be enabled)

  • Removed Interpreter, Hook and Filter aliases as interpreter attributes; use em or emlib instead

  • Removed args interpreter attribute; use argv instead

  • Raise a ConsistencyError for problematic issues detected before and after running

  • Add system information reporting: -W/--info

  • Add a details system for submitting bug reports: -Z/--details

  • … And did a great deal of refactoring!