Writing a custom fix

It is easy to write new fixes and integrate them into your latexpp workflow. A fix operates on latexpp’s internal representation of the document, a structure of data nodes: it can alter nodes, remove selected nodes, or produce new nodes, and in this way it changes the LaTeX code of the document.

Quick start

Say you have a LaTeX document that you’d like to process with latexpp, and say that you feel the need to write a particular fix for this document. Let’s try to get you started in 30 seconds.

In the document’s folder, create a new folder, which we’ll call myfixes here. (This is your fix’s Python package folder; you can give it any valid Python package name.) In that folder, create an empty file called __init__.py. Finally, create your fix’s Python file, say mycustomfix.py, in the same folder and paste the following contents into it:

import logging
logger = logging.getLogger(__name__) # log messages

from pylatexenc.macrospec import MacroSpec, EnvironmentSpec
from pylatexenc import latexwalker

from latexpp.fix import BaseFix

class MyGreetingFix(BaseFix):
    r"""
    The documentation for my custom fix goes here.
    """

    def __init__(self, greeting='Hi there, %(name)s!'):
        self.greeting = greeting
        super().__init__()

    def specs(self, **kwargs):
        return dict(macros=[
            # tell the parser that \greet is a macro that takes a
            # single mandatory argument
            MacroSpec("greet", "{")
        ])

    def fix_node(self, n, **kwargs):

        if (n.isNodeType(latexwalker.LatexMacroNode)
            and n.macroname == 'greet'):

            # \greet{Someone} encountered in the document

            # Even if we declared the \greet macro to accept an
            # argument, it might happen in some cases that n.nodeargd
            # is None or has no arguments.  This happens, e.g. for
            # ``\newcommand{\greet}...``.  In such cases, leave this
            # \greet unchanged:
            if n.nodeargd is None or not n.nodeargd.argnlist:
                return None # no change

            # make sure arguments are preprocessed, too, and
            # then get the argument as LaTeX code:
            arg = self.preprocess_contents_latex(n.nodeargd.argnlist[0])
            
            # return the new LaTeX code to put in place of the entire
            # \greet{XXX} invocation.  Here, we use the string stored
            # in self.greeting.  We assume that this string contains
            # '%(name)s', which is replaced by the name of the person
            # to greet (the macro argument that we just got).  We use
            # Python's % operator for this because it's handy.

            # use logger.debug(), logger.info(), logger.warning(),
            # logger.error() to print out messages; debug() messages
            # are visible if latexpp is called with --verbose
            logger.debug("Creating greeting for %s", arg)

            # don't forget to use raw strings r'...' for latex code,
            # to avoid having to escape all the \'s
            return r'\emph{' + self.greeting % {"name": arg} + '}'

        return None

You can then use your new fix by adding to your lppconfig.yml:

...
fixes:
  ...
  - name: 'myfixes.mycustomfix.MyGreetingFix'
    config:
      greeting: "I've been expecting you, %(name)s."

In this way, whenever your document contains a macro instruction such as:

\greet{Mr. Bond}

it gets replaced by:

\emph{I've been expecting you, Mr. Bond.}
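The replacement text itself is produced by ordinary Python string formatting. The following is a minimal sketch of just that step, independent of latexpp and pylatexenc: GreetingDemo and its render() method are hypothetical stand-ins introduced here for illustration (they are not part of latexpp), and passing the config mapping as keyword arguments is an assumption about how the constructor gets called.

```python
# Hypothetical stand-in for MyGreetingFix.  It mimics only the constructor
# and the string substitution; it does not inherit from BaseFix and does
# not parse any LaTeX.
class GreetingDemo:
    def __init__(self, greeting='Hi there, %(name)s!'):
        self.greeting = greeting

    def render(self, arg):
        # same expression as the return statement of fix_node() above
        return r'\emph{' + self.greeting % {"name": arg} + '}'


# The mapping found under "config:" in lppconfig.yml, written as a dict.
config = {"greeting": "I've been expecting you, %(name)s."}

# Assumption: config items reach the constructor as keyword arguments.
demo = GreetingDemo(**config)
result = demo.render("Mr. Bond")

# Without any config, the default greeting of the constructor is used.
default_result = GreetingDemo().render("Mr. Bond")

print(result)
print(default_result)
```

Because the greeting goes through the % operator, a literal percent sign in a greeting string would have to be written as %%.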

To complete your quick start, here are some key points.

Key points

  • Any configuration items specified under config: in your lppconfig.yml file are passed directly as arguments to the fix class constructor. You can specify booleans, ints, strings, or even full data structures, all using standard YAML syntax.

  • Your fix class should inherit from latexpp.fix.BaseFix. Check the documentation of that class for the utilities you can use in your fix. (It can also inherit from latexpp.fix.BaseMultiStageFix, see further below.)

  • Perform transformations in the document by reimplementing the fix_node() method. The argument is a “node” in the document structure. The node is one of pylatexenc’s LatexNode document node subclasses (e.g., LatexMacroNode). (See also Implementation notes for pylatexenc usage.)

  • Make sure you always preprocess all child nodes, such as macro arguments and the environment body, so that fixes are also applied to them. As a general rule, whenever fix_node() returns something other than None, it is responsible for applying the fix to all child nodes of the current node. This can be done conveniently with self.preprocess_contents_latex() and self.preprocess_latex(), which directly return LaTeX code that you can insert into your replacement LaTeX code.

  • The parser assumes that a macro does not take any arguments unless it has been told about that macro in advance. The parser already knows about a set of standard LaTeX macros (e.g., \emph, \textbf, etc.). Specify further macros with their argument signatures by reimplementing the specs() method. (See the doc for specs() for more info. Also, it never hurts to specify a macro, even if it was already defined.)

  • If your fix needs multiple passes through the document, you should inherit from latexpp.fix.BaseMultiStageFix instead of BaseFix. In this case you can subdivide your fix into “stages,” which you define by subclassing latexpp.fix.BaseMultiStageFix.Stage for each stage in your fix process. Each stage object is itself a fix (meaning it indirectly inherits from BaseFix) on which you can reimplement fix_node() etc. The stages are run in sequence. The “parent” fix object manages the stages and can store data that is accessed and modified by the different stages.

    See the documentation for BaseMultiStageFix for more details, and check the fix latexpp.fixes.labels.RenameLabels for an example.

  • If you want your fix to work with latexpp pragmas, you should subclass latexpp.pragma_fix.PragmaFix instead. See the documentation for that class.

  • The preprocessor instance, available as self.lpp, exposes some methods that cover some common fixes’ special needs: