Module latexpp.preprocessor — the main preprocessor class

This module provides the main preprocessor engine.

class latexpp.preprocessor.LatexPreprocessor(*, output_dir='_latexpp_output', main_doc_fname=None, main_doc_output_fname=None, config_dir=None)

Main preprocessor class.

This class collects together various fixes and applies them to a LaTeX document or parts of the document.

Arguments:

  • output_dir is the folder where the resulting processed document should be placed.

  • main_doc_fname is the main document that we need to process.

  • main_doc_output_fname is the name to give to the processed main document inside the output_dir folder.

  • config_dir is the root directory (when using the command-line latexpp tool, this is where the lppconfig.yml resides). Relative paths specified to some helpers such as copy_file() are interpreted as relative to this directory.

The fixes can be installed directly via a configuration data structure with install_fixes_from_config() (for instance, as read from a lppconfig.yml file by a YAML reader), or fix instances can be installed manually with install_fix().

Initialization tasks should be run by calling initialize() after all fixes have been installed, but before execute_main() (or friends) are called.

The actual processing is performed by calling one of execute_main(), execute_file(), or execute_string(). These parse the corresponding LaTeX code into nodes and run all fixes.

After calling the execute_*() methods as required, you should call finalize() to finish the processing and carry out final tasks that the fixes need to do at the end. At this point the preprocessor also warns about files that are in the output directory but that weren’t generated by latexpp.
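The call order described above can be sketched as follows. This is a minimal sketch, not latexpp's own driver code: run_preprocessor and RecordingPreprocessor are hypothetical names introduced here, and the stand-in class only records method names so that the sequence can be checked without latexpp installed. With the real library you would presumably pass a LatexPreprocessor instance instead.

```python
class RecordingPreprocessor:
    """Hypothetical stand-in that records which lifecycle methods are called."""

    def __init__(self):
        self.calls = []

    def install_fixes_from_config(self, lppconfig_fixes):
        self.calls.append("install_fixes_from_config")

    def initialize(self):
        self.calls.append("initialize")

    def execute_main(self):
        self.calls.append("execute_main")

    def finalize(self):
        self.calls.append("finalize")


def run_preprocessor(lpp, lppconfig_fixes):
    """Drive a preprocessor-like object through the documented call order."""
    lpp.install_fixes_from_config(lppconfig_fixes)  # 1. install the fixes
    lpp.initialize()                                # 2. initialize once all fixes are installed
    lpp.execute_main()                              # 3. process the main document
    lpp.finalize()                                  # 4. let the fixes carry out their final tasks
    return lpp


demo = run_preprocessor(RecordingPreprocessor(), lppconfig_fixes=[])
print(demo.calls)
```

The helper only relies on the four method names documented here, so the same function should work with any object that exposes them.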

This preprocessor class also exposes several methods that are intended for individual fixes’ convenience. These are make_latex_walker(), create_subpreprocessor(), check_autofile_up_to_date(), register_output_file(), copy_file() and open_file(). See their documentation below.

Attributes:

parent_preprocessor

This attribute is used for sub-preprocessors. See create_subpreprocessor().

Methods:

install_fix(fix, *, prepend=False)

Register the given fix instance to be run after the fixes that are already installed (or before them if prepend=True).

The type of fix must be a subclass of latexpp.fix.BaseFix.
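The effect of prepend on the run order can be modelled with a plain list. This is only a sketch of the ordering semantics, assuming the fixes are kept in a simple sequence; the free function install_fix below is a hypothetical model and not the method's actual implementation, and the strings stand in for fix instances.

```python
def install_fix(fixes, fix, *, prepend=False):
    """Model of the documented ordering: append by default, insert first if prepend."""
    if prepend:
        fixes.insert(0, fix)   # runs before all fixes installed so far
    else:
        fixes.append(fix)      # runs after all fixes installed so far


fixes = []
install_fix(fixes, "A")
install_fix(fixes, "B")
install_fix(fixes, "C", prepend=True)
print(fixes)  # ['C', 'A', 'B']
```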

install_fixes_from_config(lppconfig_fixes)

Load all the fixes from the given configuration data structure. lppconfig_fixes is a list of dictionaries with keys ‘name’ and ‘config’, the same structure that you specify in the fixes: section of lppconfig.yml.

This automatically calls install_fix() for all the loaded fixes.
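As an illustration of the expected shape, the structure could look like the sketch below. The fix names and the some_option key are hypothetical placeholders, not fixes shipped with latexpp, and check_fixes_config is a helper invented here to make the shape explicit; treating a missing ‘config’ key as an empty dictionary is an assumption.

```python
# Hypothetical fix names and options, for illustration only.
lppconfig_fixes = [
    {"name": "mypackage.myfixes.RemoveDrafts", "config": {}},
    {"name": "mypackage.myfixes.RenameFigures", "config": {"some_option": True}},
]


def check_fixes_config(lppconfig_fixes):
    """Check that each entry is a dict with a string 'name' and a dict 'config'."""
    for entry in lppconfig_fixes:
        if not isinstance(entry, dict):
            raise TypeError(f"expected a dict, got {type(entry).__name__}")
        if not isinstance(entry.get("name"), str):
            raise ValueError("each fix entry needs a string 'name'")
        if not isinstance(entry.get("config", {}), dict):
            raise ValueError("'config' must be a dict of fix options")
    return True


print(check_fixes_config(lppconfig_fixes))
```

A structure of this shape is what would then be handed to install_fixes_from_config().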

initialize()

Perform essential initialization tasks.

Must be called after all fixes are installed, but before execute_main() is called.

finalize()

Call the finalize() routine on all fixes. This gives each fix the opportunity to complete any remaining tasks after the document has been processed.

Must be called after execute_main() is called.

execute_main()

Main execution routine. Call this to process the main document with all our installed fixes.

execute_file(fname, *, output_fname, omit_processed_by=False)

Process an input file named fname, apply all the fixes, and write the output to output_fname. The output file name output_fname is relative to the output directory.

Unless omit_processed_by is set to True, the output file will start with a brief comment stating that it was the result of preprocessing by latexpp.

execute_string(s, *, pos=0, input_source=None, omit_processed_by=False)

Parse the string s as LaTeX code, apply all installed fixes, and return the preprocessed LaTeX code.

The input_source argument is a short descriptive string of the source of the LaTeX content for error messages (e.g., the file name).

Unless omit_processed_by is set to True, the returned code will start with a brief comment stating that it was the result of preprocessing by latexpp.

preprocess(nodelist)

Run all the installed fixes on the given list of nodes nodelist.

make_latex_walker(s)

Create a pylatexenc.latexwalker.LatexWalker instance that is initialized to parse the string s.

Returns an instance of a customized version of pylatexenc.latexwalker.LatexWalker. The custom latex walker adds some functionality to the node classes generated by the latex walker. See Implementation notes for pylatexenc usage for more information.

In short, fix classes should never create pylatexenc.latexwalker.LatexWalker instances directly; they should use this method to create a latex walker.

create_subpreprocessor(*, lppconfig_fixes=None)

Create a sub-preprocessor (or child preprocessor) of this preprocessor.

Sub-preprocessors are used in some fixes to apply a separate set of fixes, for instance to parts of the document. (See, e.g., latexpp.fixes.regional_fix.Apply or latexpp.fixes.usepackage.InputLocalPkgs.)

A sub-preprocessor is itself an instance of LatexPreprocessor. You install fixes (or load the fixes from a config data structure), initialize() it, run execute_*() methods as required, then finalize() it.
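From inside a fix, the sub-preprocessor steps listed above might be chained as in the following sketch. The function process_fragment and both stub classes are hypothetical and exist only so that the example runs without latexpp; the stub returns its input unchanged, whereas a real sub-preprocessor would apply its fixes.

```python
class StubSubPreprocessor:
    """Hypothetical stand-in for a sub-preprocessor; records the calls it receives."""

    def __init__(self):
        self.calls = []

    def initialize(self):
        self.calls.append("initialize")

    def execute_string(self, s, *, pos=0, input_source=None, omit_processed_by=False):
        self.calls.append("execute_string")
        return s  # a real sub-preprocessor would return the fixed LaTeX code

    def finalize(self):
        self.calls.append("finalize")


class StubPreprocessor:
    """Hypothetical stand-in for the parent preprocessor."""

    def create_subpreprocessor(self, *, lppconfig_fixes=None):
        self.last_sub = StubSubPreprocessor()
        return self.last_sub


def process_fragment(lpp, latex, lppconfig_fixes):
    """Apply a separate set of fixes to one fragment of LaTeX code."""
    sub = lpp.create_subpreprocessor(lppconfig_fixes=lppconfig_fixes)
    sub.initialize()
    result = sub.execute_string(latex, omit_processed_by=True)
    sub.finalize()
    return result


parent = StubPreprocessor()
fragment = process_fragment(parent, r"\emph{hello}", lppconfig_fixes=[])
print(parent.last_sub.calls)
```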

check_autofile_up_to_date(autotexfile, *, what_to_run='(pdf)latex')

autotexfile is a file automatically generated by LaTeX in the original directory (e.g., .aux, .bbl).

This function raises an error if autotexfile doesn’t exist, and generates a warning if its modification time stamp is earlier than that of the main TeX file.

Arguments:

  • what_to_run. If the auxiliary file autotexfile does not exist, the error that is raised tells the user to run what_to_run first. By default, what_to_run='(pdf)latex'.
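The check described above amounts to an existence test followed by a comparison of modification times. The sketch below reproduces that logic with the standard library only; check_up_to_date is a hypothetical free function, and the exception type, the message wording, and the boolean return value are assumptions rather than latexpp's actual behavior.

```python
import os
import tempfile
import warnings


def check_up_to_date(autotexfile, main_doc_fname, *, what_to_run="(pdf)latex"):
    """Raise if autotexfile is missing; warn if it is older than the main document."""
    if not os.path.exists(autotexfile):
        raise FileNotFoundError(
            f"{autotexfile} does not exist; please run {what_to_run} first"
        )
    if os.path.getmtime(autotexfile) < os.path.getmtime(main_doc_fname):
        warnings.warn(
            f"{autotexfile} is older than {main_doc_fname}; "
            f"you may need to re-run {what_to_run}"
        )
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    main = os.path.join(tmp, "main.tex")
    aux = os.path.join(tmp, "main.aux")
    for path in (main, aux):
        with open(path, "w") as f:
            f.write("")
    # Make the .aux file 100 seconds older than the main document.
    os.utime(main, (1_000_000_100, 1_000_000_100))
    os.utime(aux, (1_000_000_000, 1_000_000_000))

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        stale_ok = check_up_to_date(aux, main)

    try:
        check_up_to_date(os.path.join(tmp, "main.bbl"), main, what_to_run="bibtex")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```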

register_output_file(fname)

Take note that the given file fname is part of the output of this latexpp run. The file name fname should be relative to self.output_dir.

The point of this is that the preprocessor will inspect the output directory at the end of the whole process and will emit a warning if it finds any file that wasn’t generated by latexpp. This method is how a fix can tell the preprocessor that it is responsible for a specific new file in the output and that that file should not be part of the “foreign files warning”.
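The idea behind the “foreign files warning” can be sketched as a comparison between the files found in the output directory and the set of registered names. The function find_foreign_files is a hypothetical illustration of that comparison, not latexpp's implementation.

```python
import os
import tempfile


def find_foreign_files(output_dir, registered):
    """Return the files under output_dir (as relative paths) that were never registered."""
    registered = {os.path.normpath(name) for name in registered}
    foreign = []
    for dirpath, _dirnames, filenames in os.walk(output_dir):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), output_dir)
            if rel not in registered:
                foreign.append(rel)
    return sorted(foreign)


with tempfile.TemporaryDirectory() as output_dir:
    for name in ("main.tex", "notes.txt"):
        with open(os.path.join(output_dir, name), "w") as f:
            f.write("")
    registered = {"main.tex"}  # only main.tex was registered as an output file
    foreign = find_foreign_files(output_dir, registered)
    print(foreign)
```

In this model, a fix that writes notes.txt itself would add that name to the registered set, which is the role register_output_file() plays.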

copy_file(source, destfname=None)

Copy the file specified by source (either an absolute path, or a path relative to config_dir) to the output directory, and rename it to destfname. If destfname is a path, it must be a relative path that stays inside the output directory.

The file is registered as an output file, i.e., you don’t need to call register_output_file() for this file.
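The path handling described for copy_file() can be approximated with the standard library as follows. This is a sketch under stated assumptions: copy_into_output is a hypothetical function, the registered set stands in for the preprocessor's bookkeeping, and falling back to the source's base name when destfname is None is a guess, not documented behavior.

```python
import os
import shutil
import tempfile


def copy_into_output(source, config_dir, output_dir, destfname=None, registered=None):
    """Copy source (absolute, or relative to config_dir) into output_dir as destfname."""
    src = source if os.path.isabs(source) else os.path.join(config_dir, source)
    # Assumption: without destfname, keep the base name of the source file.
    dest_rel = destfname if destfname is not None else os.path.basename(source)
    dest = os.path.join(output_dir, dest_rel)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    shutil.copy2(src, dest)
    if registered is not None:
        registered.add(dest_rel)  # models the automatic output-file registration
    return dest_rel


with tempfile.TemporaryDirectory() as tmp:
    config_dir = os.path.join(tmp, "project")
    output_dir = os.path.join(tmp, "_latexpp_output")
    os.makedirs(config_dir)
    with open(os.path.join(config_dir, "fig1.pdf"), "w") as f:
        f.write("dummy")

    registered = set()
    copy_into_output("fig1.pdf", config_dir, output_dir, "figs/fig-01.pdf", registered)
    copied_exists = os.path.isfile(os.path.join(output_dir, "figs", "fig-01.pdf"))
```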

open_file(fname, **kwargs)

Open the file fname for reading and return a handle to the open file. It should be used as a context manager, for example “with lpp.open_file(xxx) as f:”.

(Use this function instead of open() directly so that the fixes can be integrated more easily in the tests with mock inputs.)
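The benefit for testing can be illustrated with a small stand-in. In this sketch, FakePreprocessor and count_lines are hypothetical names: the fake serves file contents from a dictionary, so code that reads through open_file() can be exercised without touching the disk.

```python
import io


class FakePreprocessor:
    """Hypothetical test double whose open_file() serves in-memory contents."""

    def __init__(self, files):
        self.files = files

    def open_file(self, fname, **kwargs):
        # io.StringIO supports the context-manager protocol, like a real file handle.
        return io.StringIO(self.files[fname])


def count_lines(lpp, fname):
    """Example of fix-side code that reads a file only through lpp.open_file()."""
    with lpp.open_file(fname) as f:
        return sum(1 for _ in f)


fake = FakePreprocessor({"chapter1.tex": "\\section{Intro}\nHello.\n"})
n_lines = count_lines(fake, "chapter1.tex")
print(n_lines)  # 2
```

Because count_lines never calls open() itself, the same code should run unchanged against a real preprocessor object.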