Introduction

If you are looking for a how-to guide on changing headers and footers, please visit our help page: this article instead explains the low-level mechanisms used by TeX engines, and LaTeX, to produce headers and footers.

  • Note (update): This article explores the core LaTeX mark mechanism in use prior to June 2022, at which time a new and more versatile implementation of LaTeX’s mark handling was introduced. See Overview of LaTeX’s new mark mechanism for more details. For reasons of backward compatibility, LaTeX continues to support its legacy mark mechanism. Article material discussing the behavior of TeX engines is unaffected by these changes to LaTeX.

To provide useful background/context we start with an overview of some relevant low-level areas of TeX-based typesetting—including temporary (internal) content storage, page-breaking and construction. Discussions have been streamlined/simplified by considering a typeset page which contains a “body” of text plus some headers and footers. Page items such as figures (floats) and footnotes are not addressed because they involve a complex TeX mechanism called insertions, which is far outside the scope of this article.

We’ll start with the basic notion of nodes: the fundamental building blocks used by TeX engines for temporary (internal) content storage.

Some notes on storing content: TeX nodes

As the underlying TeX engine processes your LaTeX code to produce typeset material, such as paragraphs, tables and mathematics, it needs to store that content temporarily in memory. To do so, TeX engines use a sequence of so-called nodes, which you can think of as variably-sized “chunks” of computer (device) memory, linked together to form a list (see this Wikipedia article on linked data structures).

Schematic representation of how TeX engines store content using linked nodes

To represent the fundamental elements of TeX-based typesetting—such as characters, glue, boxes, penalties, kerns and marks (to name but a few)—TeX engines use different types of node. Some nodes, such as those for boxes, need to store more data compared to simpler node types such as character nodes; consequently, some nodes are larger than others—requiring more bytes to store them in memory. When reading about TeX you may encounter terms such as character nodes, glue nodes, mark nodes and so forth: you now know these are just names given to chunks of memory allocated to store data representing an element of TeX’s typesetting.

Why we’re here: understanding mark nodes

Of the many node types, the one we need to discuss is the mark node, which is created by the \mark command:

\mark{stuff to store}

The \mark command does not directly produce any typeset content: its argument stuff to store is saved to memory for later use and a mark node is created to contain the memory location of stuff to store. As we’ll see, the \mark command was created specifically to assist with producing headers and footers.

Mark nodes created by \mark commands are “embedded” into the page content ready for use during the final stages of page composition when the TeX engine is ready to add headers and footers to the typeset page (body). Typically, stuff to store will contain material such as page numbers and chapter or section titles/numbers—the building blocks of headers and footers.

ε-TeX adds the \marks command

Version 2 of ε-TeX, an extension to Knuth’s original TeX, was published in 1998 and provides new capabilities together with enhancements to existing features. Today, most of ε-TeX’s enhancements have been incorporated into the mainstream TeX engines: pdfTeX, LuaTeX and XeTeX—see the -etex command line option of TeX engines which enables use of ε-TeX extensions.

Knuth’s original TeX software provides the \mark command which creates just 1 “class”, or “type” of mark node. Your document could include many \mark commands but they all create mark nodes of the same “type” or “class”—there’s no built-in mechanism to group or classify them. This restriction was removed by an ε-TeX enhancement: the \marks command which is supported by pdfTeX, LuaTeX and XeTeX:

\marks n {stuff to store}

where n is an integer (0 <= n < 32768) that determines the mark node class.

Notes:

  • writing \marks 0 (i.e., n=0) is synonymous with using TeX’s original \mark command
  • ε-TeX also introduced mark variables (see later) for each class (n) of mark: \firstmarks n, \topmarks n and \botmarks n. Selected classes of mark variables could be used to provide mark data for specific typesetting tasks.
  • LuaTeX increased the maximum value of n to 65535, enabling \(2^{16}\) mark classes
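
As a minimal sketch of mark classes, the following document inserts marks in two classes and reads them back in the page header. The class numbers 10 and 11 are arbitrary choices made for illustration, and the header definition is an assumption of this sketch rather than anything a document class provides.

\documentclass{article}
% \makeatletter gives '@' category code 11 so it can be used in command names
\makeatletter
% Header: the first class-10 mark on the page at the left,
% the last class-11 mark on the page at the right
\renewcommand{\@oddhead}{\firstmarks 10\hfil\botmarks 11 }
\makeatother
\begin{document}
\marks 10 {Class 10: first}\marks 11 {Class 11: first}%
Some text.
\marks 10 {Class 10: second}\marks 11 {Class 11: second}%
More text.
\end{document}

Because all four marks fall on the same page, the header should show Class 10: first at the left and Class 11: second at the right: each class is tracked independently of the other.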

Reasons for TeX’s use of marks

To understand why (and how) TeX engines use the \mark (or \marks) command to produce headers and footers we need to appreciate some “quirks” of TeX engine mechanisms for constructing pages—finding page breaks and shipping out completed pages to a PDF file. In reality, those page-construction processes are complex but the core principles/concepts can be simplified to provide sufficient background explanations which aid our understanding of marks.

As a TeX engine completes typesetting individual content items such as paragraphs, tables or mathematics, it calls an internal routine (a function called build_page(...)) which tries to add those newly typeset items to the page currently being constructed in memory. When that new material is added to the page being built, TeX also checks whether that newly-added content has made the current page become “suitably full”. If the page is full TeX can create a page break—sending that page’s worth of content for final processing (“packaging”) and subsequent output to the PDF file.

One key feature of the final stage of page packaging (composition) is something called an output routine which is, in effect, a user-defined sequence of commands used to package-up the page of content to make it ready for sending to the PDF file. One example activity of “packaging-up” is to add headers and footers, but there are numerous ways to use output routines—for example, see this article in TUGboat. Output routines are a fairly complex area of TeX so we won’t go into detail here: for present purposes just remember that the final stage of page composition involves using some user-defined sequence of commands (collectively called an output routine).

But let’s consider what could happen when TeX completes typesetting a lengthy paragraph of text, breaking it into a sequence of individual typeset lines. TeX will want to add that newly-typeset paragraph, line by line, to the page currently being built but, perhaps, only some of those lines can fit on the current page. Note that by “page” we are referring to the main text area (page body), which does not include other page elements such as headers and footers: those are added later by the output routine.

After several paragraph lines have been added to the current page, it could become “suitably full” and time for TeX to create a page break. However, we have some “leftover” material because only part of the paragraph made it to the current page: the “excess paragraph content” is saved for later, ready for creation of the next page. The following graphic illustrates this idea: part of the paragraph is on the current page and part is held back for the next page.

Schematic representation of a paragraph straddling a page break

This paragraph example demonstrates that a TeX engine will often process (typeset) pieces of content which cannot, in their entirety, fit on the current page—because that content reaches beyond the point at which a page break becomes necessary. Recall that TeX engines find linebreaks by inputting, then typesetting, whole paragraphs and, unlike some applications, do not adopt a line-by-line approach to typesetting paragraphs. In other words, TeX typesets the entire paragraph whether or not it fits on the current page (that’s worked out later). Incidentally, the principle of “typeset-in-full then add to the current page” holds true for other types of TeX content, including tables which are fully read-in from the .tex file and typeset in their entirety prior to any attempt at adding them to the current page.

For our paragraph example, the complete text of that paragraph, including any commands within that text, has been fully processed—meaning that TeX’s typesetting has reached a certain location in your .tex file. The “current internal state” of the TeX engine, such as the value of variables or stored parameter settings, will reflect the material, including any commands, processed up to the current location—the end of the paragraph.

Although the complete paragraph has been typeset, some of it cannot be accommodated by the current page, resulting in “excess paragraph content” being saved for the next page. Within the LaTeX code (text plus commands) which produced the “excess paragraph content”, the TeX engine could have called any number of commands and macros which changed values and variables stored internally by the TeX engine—i.e., made changes to TeX’s “current state”.

The following diagram illustrates an extremely important TeX “quirk”, one which impacts several aspects of TeX-based typesetting. Looking at the right-hand side of the diagram you see a representation of nodes comprising all the content currently stored in memory: the light grey area contains nodes forming the current page and the dark grey area is typeset content destined for the next page. This scenario reflects our paragraph example: the dark grey area represents the “excess paragraph content”: lines of text held back because they don’t fit on the current page.

Schematic representation of the node location where TeX finds a page break, also showing content held back for the next page

Recall that once TeX has a “suitably full” page (body) it passes that content to something called an output routine which is a user-defined sequence of commands. Among other things, the output routine adds headers and footers to the page body, finalising the page in readiness for shipping it out to the PDF file.

Our graphic (above) represents the instant at which TeX has chosen to make a page break at some particular node, such as \baselineskip glue between typeset lines, and is ready to begin packaging the current page using the output routine (a sequence of built-in commands and user-defined macros). However, in addition to the content comprising the current page, TeX has already processed, and stored in memory, additional material from your .tex file. That extra material extends beyond the page break and could have included commands which changed important page-related values or variables, affecting TeX’s “current internal state”, which includes values of internal parameters or data stored by macros.

When TeX starts to package the current page, its “current internal state” is not defined by everything typeset up to the page-break location because TeX has already gone past that point in your file and processed additional material. If you consider the location in your file corresponding to where the page break takes place, and the location TeX has actually reached in your .tex file, the two corresponding internal states of TeX, including stored data and the value of variables, could be very different. The current state of TeX’s typesetting is usually ahead of where the page break occurs; i.e., TeX's typesetting activities are not synchronised with the process of outputting typeset pages via the output routine.

Hypothetical example: how to get the wrong headers

The following diagram illustrates hypothetical commands \myheader and \myfooter used to store the desired text of headers and footers; for example, using basic definitions such as

\newtoks\headertoks% a new token list variable for the header text
\newtoks\footertoks% a new token list variable for the footer text
% Use \global to ensure stored header or footer data is 
% accessible everywhere, including commands in the 
% output routine
\newcommand{\myheader}[1]{\global\headertoks={#1}}
\newcommand{\myfooter}[1]{\global\footertoks={#1}}

in which \headertoks and \footertoks are token lists used to store the text of headers and footers.

In addition, our hypothetical example also has an output routine which obtains the header and footer text from the current values of \headertoks and \footertoks, as set by \myheader and \myfooter. As shown in the diagram, within the content destined for the current page, \myheader and \myfooter set the appropriate (correct) values. However, \myheader and/or \myfooter could also be called within a paragraph which straddles a page break. Because TeX has to process the entire paragraph before the page is output, those later calls overwrite the header/footer text saved in “TeX’s internal state”, producing values which are not correct for the current page. When the output routine produces the header and footer they will contain text intended for the next page.

Graphic depicting incorrect headers and footers produced if the output routine uses TeX macros only to retrieve header and footer text
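
The problem can be reproduced with a small test document. The following is a minimal sketch, assuming the article class, a deliberately small page, and hypothetical helper macros \filler and \fillers which exist only to produce a paragraph long enough to straddle a page break.

\documentclass{article}
% Use a conveniently small page so the paragraph soon reaches a page break
\usepackage[paperheight=10cm,paperwidth=12cm,textwidth=10cm]{geometry}
\newtoks\headertoks% token list variable for the header text
\newcommand{\myheader}[1]{\global\headertoks={#1}}
\makeatletter
% The header prints whatever \headertoks holds at the time the page is output
\renewcommand{\@oddhead}{\the\headertoks\hfil\thepage}
\makeatother
% Hypothetical helpers: filler text used only to lengthen the paragraph
\newcommand{\filler}{This sentence is filler text used to lengthen the paragraph. }
\newcommand{\fillers}{\filler\filler\filler\filler\filler\filler}
\begin{document}
\myheader{Header intended for page 1}%
\fillers\fillers\fillers\fillers
\myheader{Header intended for page 2}%
\filler
\end{document}

The second \myheader is processed while TeX is still reading the paragraph, before any page has been output, so the header on page 1 should already read Header intended for page 2.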

TeX’s asynchronous behaviour: \mark command to the rescue

TeX’s typesetting (content construction) activities are not synchronised with the final process of outputting a typeset page via commands contained in the output routine. When it is time to output a page, the actual internal state of TeX is ahead of any state it might have had at the location of the page break—where it typeset the final piece of content on the current page being output.

In The TeXbook, Donald Knuth, TeX’s creator, describes this situation as

... TeX’s output routine lags behind its page-construction activities

Because of this unsynchronized relationship, “special facilities” in the form of the marks mechanism, applied through the \mark (or \marks n) command, are required to ensure headers and footers contain material relevant to the actual page being output. The \mark command embeds mark nodes into the page content, storing the memory location of content which can be used for the host page’s header and footer.

The author of the well-known and respected book TeX by Topic describes the \mark mechanism as

... the main mechanism through which the output routine can obtain information about the contents of the currently broken-off page, in particular its top and bottom.

The \mark command is a mechanism designed to circumvent the asynchronous nature of TeX’s page-breaking and output routine algorithm.

Schematic representation of mark data (nodes) embedded in the page body content
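
As a minimal sketch of the mechanism at work, the following test document stores header text in marks and lets the header read \botmark, the last mark on the page being output. The small page size and the filler macros \filler and \fillers are assumptions made purely for this illustration; LaTeX documents would normally use the higher-level commands discussed later in this article rather than calling \mark directly.

\documentclass{article}
% Use a conveniently small page so the paragraph soon reaches a page break
\usepackage[paperheight=10cm,paperwidth=12cm,textwidth=10cm]{geometry}
\makeatletter
% The header prints the last mark found on the page being output
\renewcommand{\@oddhead}{\botmark\hfil\thepage}
\makeatother
% Hypothetical helpers: filler text used only to lengthen the paragraph
\newcommand{\filler}{This sentence is filler text used to lengthen the paragraph. }
\newcommand{\fillers}{\filler\filler\filler\filler\filler\filler}
\begin{document}
\mark{Header intended for page 1}%
\fillers\fillers\fillers\fillers
\mark{Header intended for page 2}%
\filler
\end{document}

Each mark travels with the typeset content in which it was embedded, so the second mark ends up on page 2 along with the final lines of the paragraph. The header on page 1 should therefore read Header intended for page 1, even though TeX has processed the whole paragraph before outputting that page.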

Pre-output checks: looking for mark nodes

After the TeX engine has determined the location of a page break, it calls an internal routine (called fire_up()) which does a lot of pre-processing prior to the output routine completing page composition and writing the finished page to the PDF file. An important part of those pre-processing steps is to set the value of three global mark variables: \botmark, \topmark and \firstmark which are used to provide data for constructing headers and footers. Of course, a page can contain multiple mark nodes but these are filtered to produce the three mark variables which act as follows:

  • \botmark is the mark in effect at the current page break; i.e., the last mark seen on the current page
  • \topmark is the value of \botmark from the previous page
  • \firstmark is the first mark on the current page; if the page contains no marks, both \firstmark and \botmark take the value of \topmark

The following schematic shows the basic principles of pre-processing the typeset page content to determine values for the three global mark variables \botmark, \topmark and \firstmark:

Schematic representation of basic principles used to determine values for the three global mark variables \botmark, \topmark and \firstmark
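
The three mark variables can be observed directly by printing them in the header. The following is a minimal sketch, assuming the article class and a header definition invented for this illustration; the marks A, B and C are arbitrary.

\documentclass{article}
\makeatletter
% Print all three mark variables in the header of each page
\renewcommand{\@oddhead}{top=\topmark, first=\firstmark, bot=\botmark\hfil}
\makeatother
\begin{document}
\mark{A}Text following mark A. \mark{B}Text following mark B.
\newpage
This page contains no marks.
\newpage
\mark{C}Text following mark C.
\end{document}

Following the rules listed above, the headers should read top=, first=A, bot=B on page 1; top=B, first=B, bot=B on page 2 (which has no marks of its own, so all three variables carry the last mark of page 1); and top=B, first=C, bot=C on page 3.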

The next section uses some of the explanations provided above to outline how LaTeX typesets headers and footers.

LaTeX’s classes: page styles and headers/footers

When you write \documentclass{class} where class might be article, book, report or letter, LaTeX loads a file named class.cls; for example, book.cls, article.cls and so forth. Those .cls files contain LaTeX code to implement class-specific variations of various commands and features expected to be present in, and provided by, all document classes—the behaviours shared by all document classes. Examples might include class-specific page layout margins together with class-specific versions of commands, such as \section, \subsection and so forth.

Another important element of .cls files is class-specific implementation of standard page styles which define the headers and footers of documents produced using a particular class and page style. For example:

  • book.cls and article.cls each contain implementations of the page styles headings and myheadings
  • letter.cls contains definitions for page styles headings, empty, plain and one called firstpage
  • report.cls contains definitions for the page styles headings and myheadings

Implementations of page styles often depend on the twoside option of \documentclass—whether the document is single- or double-sided—so that headers and footers are defined appropriately for left-facing (even) pages and right-facing (odd) pages.

Document classes may rely on default definitions of the plain or empty page styles provided elsewhere in LaTeX—external to the .cls file being used.

More on page styles

When you change the style of a page using \pagestyle{somestyle} or \thispagestyle{somestyle}, LaTeX expects to find an internal command named \ps@somestyle, usually provided by the document class .cls file or perhaps a package such as fancyhdr. Incidentally, by “internal command” we mean a command that is part of the LaTeX source code and whose name contains an @ symbol, which ordinarily prevents casual use (due to its category code).
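
The name lookup can be sketched with a hypothetical command, here called \mypagestyle, which builds the command name ps@somestyle from its argument and executes it. This is a simplified illustration only: LaTeX’s own \pagestyle does more, such as checking that the requested style exists.

% Hypothetical simplified equivalent of \pagestyle:
% \csname...\endcsname constructs the command \ps@<style> and executes it
\newcommand{\mypagestyle}[1]{\csname ps@#1\endcsname}
% Example: executes \ps@empty, the empty page style
\mypagestyle{empty}

Because the command name is constructed by \csname, there is no need to change the category code of @ here.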

The definition of \ps@somestyle is responsible for implementing the features/behaviours of that page style, which includes providing definitions of:

  • commands used during the final stages of page composition (to add headers and footers, see later in the article):
    • \@oddhead
    • \@oddfoot
    • \@evenhead
    • \@evenfoot
  • commands to insert section-related mark data for use in generating document headers and footers which contain section titles, numbers and similar. For example, the headings page style (within book.cls) defines the following mark-producing commands:
    • \chaptermark
    • \sectionmark

We will further explore mark-generating commands, such as \chaptermark, but just by way of example: when you write \chapter{chapter title} the \chaptermark command is also executed to insert chapter-related mark data for subsequent production of page headers.

An anatomy of header and footer commands based on page style

This section explores some of the core LaTeX commands/processes used to create headers and footers—drawing on explanations or concepts presented earlier in this article.

LaTeX’s “model” of headers and footers

The content of LaTeX’s (default) headers and footers is based on a two-tier hierarchy of document sections: a “higher level” section containing multiple “lower level” sections. For the book and article classes this translates to:

  • book class: “higher level” sections are produced by \chapter commands and “lower level” sections via \section, reflecting chapters containing many sections.
  • article class (assuming two-sided documents): “higher level” sections are produced using \section commands and “lower level” ones via \subsection, reflecting multiple subsections appearing within the same section.

LaTeX applies its “model” of headers and footers through its sectioning commands which “inject” mark nodes (via the \mark command) to store data relevant to the document section being created. For example, each time you write \section{some section title} a mark node will be created, containing data which reflects some section title and the section number.

LaTeX uses a system of marks that contain data of the form {left}{right}, ultimately generated via \mark commands

\mark{{left}{right}}

encompassed within layers of macros designed to shield users from the lower-level details—which are explored in subsequent sections of this article.

Exploring the book class

When the book.cls file is loaded, one of its final actions is to execute \pagestyle{headings} which sets the default page style to headings, so we’ll look at that page style in more detail.

As noted above, the features/properties of the headings page style, as implemented by the book class, will be defined by the (internal) command \ps@headings contained in the file book.cls. The precise definition of \ps@headings depends on whether the document is single- or double-sided (default for the book class).

“Mark commands” and “mark data”

For two-sided documents produced by the book class, the definition of \ps@headings creates (defines) two “mark commands”: \chaptermark and \sectionmark which produce “mark data”, of the form {left}{right}, for use in headers and footers. Those mark commands reflect LaTeX’s model of headers and footers—based on the hierarchy of document sections:

  • the “higher level” section command (here, \chapter) is used to provide “mark data” for left-facing (even-numbered) page headers. When you start a new chapter, by writing \chapter{chapter title}, \chaptermark is called to create “mark data” which reflects that new chapter—containing its chapter title and number.
  • the “lower level” section command (here, \section) is used to provide “mark data” for right-facing (odd-numbered) pages. When you write \section{section title} the \sectionmark command will be called to create “mark data” which reflects the new section—containing its section title and number. Any mark data provided by the previous \chapter is unaffected by any subsequent \section commands.

\markboth and \markright

LaTeX’s section-specific mark commands, such as \chaptermark and \sectionmark, utilise two further commands called \markboth and \markright that provide the actual mark data for headers and footers. \markboth and/or \markright can include commands to style header or footer text, such as making it uppercase.

For documents produced using book class defaults (double-sided, headings page style), the mark-generating commands work as follows:

  • \chaptermark generates mark data using \markboth
  • \sectionmark generates mark data using \markright

\markboth and \markright are macros which, ultimately, use the primitive (built-in, low-level) \mark command to actually insert marks generated from within sectioning commands.

\markboth takes the form

  • \markboth{left}{right}: inserts a mark (node) using \mark{{left}{right}} resulting in a mark node containing a pair of braced values {left}{right}.

\markright takes the form

  • \markright{newright}: inserts a mark (node) using \mark{{currentleft}{newright}}; i.e., it changes the current value of right to newright but the current value of left (i.e., currentleft) is unchanged, resulting in a mark node containing a pair of braced values {currentleft}{newright}.

The behaviour of \markboth and \markright supports the two-tier section hierarchy, such as a document containing multiple sections within each chapter.
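
The effect of a sequence of these commands can be traced as follows. The titles are made up for illustration, and each comment shows the pair of braced values stored in the mark node created by that command, as described above.

\markboth{Chapter 1}{}   % mark contains {Chapter 1}{}
\markright{Section 1.1}  % mark contains {Chapter 1}{Section 1.1}
\markright{Section 1.2}  % mark contains {Chapter 1}{Section 1.2}
\markboth{Chapter 2}{}   % mark contains {Chapter 2}{}

Note how each \markright carries the current left value forward, whereas \markboth replaces both values.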

By way of examples for the (two-sided) book document class (using the headings page style):

  • \chaptermark uses \markboth to set:
    • a {left} field which typesets CHAPTER <number>. <chapter title> in uppercase.
    • a {right} field which is blank ({})
  • \sectionmark uses \markright to set the {right} mark field to typeset <section number>. <section title>, also in uppercase.

To understand why \sectionmark uses \markright we can observe that each new \section command should:

  • insert a new mark containing data for this \section (such as its title and number).
  • but not affect mark data values for the current \chapter in which that particular \section command appears

For these reasons, the \section command calls \sectionmark which uses \markright to create a \section-related mark: it updates the {right} mark-data field but does not affect the current {left} mark-data field value which was set by the current \chapter.
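
By way of illustration, mark commands with this behaviour could be written as shown below. This is a simplified sketch for use after \documentclass{book}, not the code contained in book.cls, which includes additional tests (for example, on whether chapters and sections are numbered).

% Simplified sketch: set {left} to the chapter details and clear {right}
\renewcommand{\chaptermark}[1]{%
  \markboth{\MakeUppercase{\chaptername\ \thechapter. #1}}{}}
% Simplified sketch: set {right} to the section details, leaving {left} unchanged
\renewcommand{\sectionmark}[1]{%
  \markright{\MakeUppercase{\thesection. #1}}}

In both cases #1 is the title passed on by the sectioning command.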

Note: the LaTeX source code contains the advisory:

The marking commands work reasonably well for right marks ‘numbered within’ left marks—e.g., the left mark is changed by a \chapter command and the right mark is changed by a \section command. However, it does produce somewhat anomalous results if 2 \markboth’s occur on the same page.

Commands for the output routine

The \ps@headings command implements definitions of the following commands

  • \@oddhead
  • \@oddfoot
  • \@evenhead
  • \@evenfoot

which are used by the output routine to add headers and footers during the final stage of page composition. For example, book.cls sets \@oddfoot and \@evenfoot to the value \@empty, which is defined as \def\@empty{}, thus \ps@headings produces empty footers on left- and right-facing pages:

\let\@oddfoot\@empty
\let\@evenfoot\@empty

The headers are defined as:

\def\@evenhead{\thepage\hfil\slshape\leftmark}%
\def\@oddhead{{\slshape\rightmark}\hfil\thepage}%

where

  • \leftmark extracts the {left} value from a {left}{right} mark pair
  • \rightmark extracts the {right} value from a {left}{right} mark pair
  • \thepage outputs the current page number

In essence, these implementations of \@evenhead and \@oddhead produce the following results:

  • \@evenhead: the page number appears on the left of the header and the content of \leftmark, styled with font command \slshape, is output at the right of the header—the intervening white space is provided by the very flexible \hfil glue.
  • \@oddhead: the content of \rightmark, styled with font command \slshape, appears at the left of the header and the current page number is output at the right of the header—again, the intervening white space is provided by \hfil glue.

However, a key question remains: which {left}{right} mark pair is used by \leftmark and \rightmark? In other words, from where do they obtain their mark-data values?

As noted above, during the final page-composition stages, any marks contained in the page being output are “filtered” and used to set the value of three global mark variables: \botmark, \topmark and \firstmark. Due to the structure of LaTeX’s marks, each of those 3 mark variables will eventually contain a value of {left}{right} for some pair of values {left} and {right} and it is these which provide the actual {left}{right} mark pair (fields) for \leftmark and \rightmark:

  • \leftmark extracts the left value from the {left}{right} mark pair (fields) provided by \botmark
  • \rightmark extracts the right value from the {left}{right} mark pair provided by \firstmark
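
The extraction step can be sketched using two-argument macros which each keep one of the two braced values. In this minimal example the names \pickleft, \pickright and \storedpair are hypothetical: \storedpair stands in for the content of a mark variable such as \botmark or \firstmark.

\documentclass{article}
% Hypothetical selectors: each receives {left}{right} as its two arguments
\newcommand{\pickleft}[2]{#1}
\newcommand{\pickright}[2]{#2}
% Stand-in for the content of a mark variable
\newcommand{\storedpair}{{CHAPTER 1. BASICS}{1.1. NODES}}
\begin{document}
% \expandafter expands \storedpair first, exposing the two braced values
Left: \expandafter\pickleft\storedpair

Right: \expandafter\pickright\storedpair
\end{document}

This typesets Left: CHAPTER 1. BASICS followed by Right: 1.1. NODES.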

Note \markboth and \markright can contain commands to style the header and footer text—such as conversion to uppercase.

Examples using lower-level LaTeX commands

The following examples demonstrate making changes to page styles or headers and footers by redefining some of the low-level (internal) LaTeX commands discussed in this article (ones which contain the @ symbol). We are not advocating this method for changing headers and footers, but it is available to those who need it (e.g., package authors). The preferred solution to changing headers and footers is using the fancyhdr package, which is discussed in an Overleaf help article.

Defining a minimal page style

The following example defines a new, extremely minimal, page style called demostyle which uses static text to define the headers and footers and does not rely on sectioning commands (\chapter, \section etc) to set the header and footer content.

The example starts by changing the category code of the @ character to 11 so that it can be used within macro names. The \ps@demostyle command implements our minimal page style by redefining the commands used (in the output routine) to produce the headers and footers: \@oddhead, \@oddfoot, \@evenhead and \@evenfoot. Note the twoside option in our \documentclass declaration—which uses the article class.

\documentclass[twoside]{article}
\catcode`@=11
\newcommand{\ps@demostyle}{%
\renewcommand\@oddfoot{\hfil The odd-page footer\hfil}%
\renewcommand\@evenfoot{\hfil The even-page footer\hfil}%
\renewcommand\@evenhead{\thepage\hfil The even-page header}%
\renewcommand\@oddhead{The odd-page header\hfil\thepage}}
\catcode`@=12
\title{Demonstrating a page style}
\author{Overleaf}
\date{August 2022}
\begin{document}
\pagestyle{demostyle}
\maketitle
\newpage
\section{Introduction}
\newpage
\section{More material}
\end{document}

Changing headers for the book class

The following example changes the headers for the book class by redefining \@oddhead and \@evenhead so that left- and right-facing page headers contain the current page number and chapter title. Note how the second \chapter command uses the optional short-form chapter title [The short title] which now provides the header text, instead of the long title.

\section commands have no effect on header content because our header redefinitions use only \leftmark which obtains details of the current chapter. To access details of the current section we’d need to use \rightmark in the definition of \@oddhead and/or \@evenhead.

\documentclass{book}
% NB: category code of '@' temporarily changed to 11
% to enable its use in command names
\catcode `@=11
% Let \@oddfoot and \@evenfoot be equivalent to \@empty
% to make them blank (empty)
\let\@oddfoot\@empty\let\@evenfoot\@empty
% Redefine \@oddhead and \@evenhead
\renewcommand{\@oddhead}{\leftmark\hfil\thepage}
\renewcommand{\@evenhead}{\thepage\hfil\leftmark}
\catcode `@=12
% Use a conveniently small page size
\usepackage[paperheight=16cm,paperwidth=12cm,textwidth=10cm]{geometry}
\title{Memoirs of a \TeX{} user}
\author{Overleaf}
\begin{document}
\frontmatter
\maketitle
This is frontmatter which uses Roman numerals.
\mainmatter
\chapter{Where do I start?}
Chapter 1: A short chapter.
\newpage
\section{In the beginning...}
A section.
\chapter[The short title]{A chapter with a very long title, making it unsuitable for headers}
\newpage
\section{Another section}
With little content.
\newpage
\section{What, another section?}
Also with little content.
\end{document}

Using ε-TeX’s extended marks commands

The book class uses the following definitions of \@evenhead and \@oddhead to produce headers:

\def\@evenhead{\thepage\hfil\slshape\leftmark}
\def\@oddhead{{\slshape\rightmark}\hfil\thepage}

where the commands \leftmark and \rightmark are defined as

\def\leftmark{\expandafter\@leftmark\botmark\@empty\@empty}
\def\rightmark{\expandafter\@rightmark\firstmark\@empty\@empty}

Using ε-TeX’s extended commands, and the equivalence between ε-TeX’s mark class 0 and the original TeX commands, we can:

  • rewrite \leftmark to replace \botmark with ε-TeX’s \botmarks0 equivalent, and
  • rewrite \rightmark to replace \firstmark with ε-TeX’s \firstmarks0 equivalent

These redefinitions produce

\renewcommand{\leftmark}{\expandafter\@leftmark\botmarks0\relax\@empty\@empty}
\renewcommand{\rightmark}{\expandafter\@rightmark\firstmarks0\relax\@empty\@empty}

Note the use of \relax to terminate TeX’s hunt for further digits after the class number 0; you can also replace \relax with a space to act as the terminator.

The following example produces output identical to that produced by the original definitions of \leftmark and \rightmark.

\documentclass{book}
% Redefine \leftmark and \rightmark to use 
% \botmarks0 and \firstmarks0 respectively
% NB: category code of '@' temporarily changed to 11
% to enable its use in command names
\catcode `@=11
\renewcommand{\leftmark}{\expandafter\@leftmark\botmarks0\relax\@empty\@empty}
\renewcommand{\rightmark}{\expandafter\@rightmark\firstmarks0\relax\@empty\@empty}
\catcode `@=12
% Use a conveniently small page size
\usepackage[paperheight=16cm,paperwidth=12cm,textwidth=10cm]{geometry}
\title{Memoirs of a \TeX{} user}
\author{Overleaf}
\begin{document}
\frontmatter
\maketitle
This is frontmatter which uses Roman numerals.
\mainmatter
\chapter{Where do I start?}
Chapter 1: A short chapter.
\newpage
\section{In the beginning...}
A section.
\newpage
Another section.
\end{document}


Notes on ε-TeX

As we noted above, ε-TeX introduces the \marks command:

\marks n {stuff to store}

which extends the original \mark feature of Knuth’s TeX engine and is available in all three mainstream TeX engines: pdfTeX, LuaTeX and XeTeX.
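As a brief illustration of the syntax, the fragment below inserts marks into two different classes. The mark text and the choice of class 7 are placeholders chosen for this sketch, and the fragment is assumed to be used within the body of a document processed by an engine with the ε-TeX extensions.

```latex
% Class 0 is the class that the original \mark command writes to,
% so these two lines store equivalent data
\mark{{left text}{right text}}
\marks0{{left text}{right text}}
% Class 7 (an arbitrary choice for illustration) is independent of class 0
\marks7{{other left text}{other right text}}
% The class number ends at the first non-digit token;
% here that token is the opening brace
```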

ε-TeX’s implementation also introduces new global mark variables \firstmarks n, \botmarks n and \topmarks n: one for each class n. These class-based mark variables are determined using the mechanisms outlined above, and can all be used in the production of headers and footers.

You can use any of the \(2^{15}\) = 32768 mark classes, ranging from 0 to 32767, within redefinitions of \leftmark or \rightmark. That assumes you have provided (inserted) suitable mark data for your chosen mark class, n, via \marks n{{left}{right}}.

Note that LuaTeX provides \(2^{16}\) = 65536 mark classes, with n ranging from 0 to 65535.

Example using marks class 10

This example demonstrates using mark class 10 (an arbitrary choice) to create document headers. The sample document typesets 6 pages which contain the sequence of marks presented in the worked example at the end of this article. A study of that example will show why the TeX engine has selected a particular mark (\(\alpha\), \(\beta\), \(\gamma\) or \(\delta\)) for use in each of the headers.

We start by defining a convenience macro, \domark, which uses \marks 10 to insert mark data that follows the LaTeX {left}{right} convention:

\newcommand{\domark}[2]{\marks 10{{#1}{#2}}}

Marks of class 10 are made accessible to LaTeX's header and footer mechanism by redefining \leftmark and \rightmark—with a bit of extra info added in:

\renewcommand{\leftmark}{\expandafter\@leftmark\botmarks10 \@empty\@empty{} (via \texttt{\string\botmarks10})}
\renewcommand{\rightmark}{\expandafter\@rightmark\firstmarks10 \@empty\@empty{} (via \texttt{\string\firstmarks10})}

After inserting a number of (class 10) marks, the TeX engine (here, pdfTeX) obligingly sets the global mark variables for class 10 (\botmarks10, \topmarks10 and \firstmarks10); \leftmark and \rightmark read \botmarks10 and \firstmarks10, respectively, to produce the headers.

\documentclass{book}
% A short command to use marks with class 10
% and following LaTeX's mark structure {left}{right}
\newcommand{\domark}[2]{\marks 10{{#1}{#2}}}
% Redefine \leftmark and \rightmark to use 
% \botmarks10 and \firstmarks10 respectively
\catcode`@=11
\renewcommand{\leftmark}{\expandafter\@leftmark\botmarks10 \@empty\@empty{} (via \texttt{\string\botmarks10})}
\renewcommand{\rightmark}{\expandafter\@rightmark\firstmarks10 \@empty\@empty{} (via \texttt{\string\firstmarks10})}
\catcode`@=12
\title{Demonstrating \(\varepsilon\)-\TeX’s enhanced marks}
\author{Overleaf}
\date{August 2022}
\begin{document}

Page 1: No marks added so all mark variables remain in their initialized state: empty (NULL).

\newpage
Page 2: $\alpha$-mark added to this page via

\verb|\domark{$\alpha$-left}{$\alpha$-right}|\domark{$\alpha$-left}{$\alpha$-right}

\newpage
Page 3: No new marks added to this page.

\newpage
Page 4: $\beta$-mark followed by $\gamma$-mark added to this page via

\verb|\domark{$\beta$-left}{$\beta$-right}|

\verb|\domark{$\gamma$-left}{$\gamma$-right}|.
\domark{$\beta$-left}{$\beta$-right}
\domark{$\gamma$-left}{$\gamma$-right}

\newpage
Page 5: $\delta$-mark added to this page via

\verb|\domark{$\delta$-left}{$\delta$-right}|
\domark{$\delta$-left}{$\delta$-right}

\newpage
Page 6: No marks added to this page.
\end{document}


The following graphic shows the headers produced by this example:

Graphic showing the output from LaTeX code using ε-TeX’s extended marks commands

  • Note: The header on page 1 is created at a time when all three mark variables for class 10 (\botmarks10, \topmarks10 and \firstmarks10) are empty. The end result is that \rightmark (for page 1) inserts just the last header fragment: (via \texttt{\string\firstmarks10}), as shown in the graphic above.

A worked example to show how TeX engines determine values for \botmark, \topmark and \firstmark

It feels appropriate to replicate Knuth’s use of double dangerous-bend signs (image courtesy of this website) because the material is somewhat low-level and “peeks under the hood”, although we hope it may be of interest to intrepid readers wishing to understand a few more details.

The following discussion is closely based on the last example on page 258 of The TeXbook. Here, we have edited and re-typeset that example for those without access to The TeXbook:

An edited and re-typeset version of an example from page 258 of The TeXbook

The table presented in this graphic is reproduced from one provided by Knuth but, like the original, there is no explanation of how it was derived. Here we’ll address that to provide additional details which show how those values of \botmark, \topmark and \firstmark were determined—obtaining these details required exploring the source code of a TeX engine!

The process of setting values for \botmark, \topmark and \firstmark can be summarised in 3 short pseudocode fragments as shown in the following diagram and subsequent explanations:

  1. Pre-content check
  2. Content loop
  3. Post-content check

Note:

  • The tests in step 1 “Pre-content check” and step 3 “Post-content check” are performed once per page.
  • Step 2 “Content loop” involves TeX scanning through the page content searching for mark nodes, and using them to set the mark-variable values as shown.

Image showing pseudocode versions of code used to set values for the three global mark variables used by TeX engines

Each time TeX finds a suitable page break it sends the content of that page for final processing, performing these (three) mark-processing steps for every page in your document. The processing steps labelled 1, 2 and 3 are referenced in the final diagram within this section.

Note that:

  • before the first page has been processed, all three global mark variables \botmark, \topmark and \firstmark have been initialised to be empty (NULL).
  • the final values of \botmark, \topmark and \firstmark determined at the end of the previous page become the (input) values for processing the current page.

The following code fragments are pseudocode derived (summarized) from actual TeX engine source code. The goal is to provide a concise code summary that helps to demonstrate the key principles involved.

Step 1: Pre-content check
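Before TeX looks through the actual content of the current page, it checks whether the \botmark value carried over from the previous page is non-empty (not NULL):

if (\botmark != NULL)
{
   // carry the previous page's last mark over to this page
   \topmark = \botmark;
   // start this page assuming it contains no \mark nodes
   \firstmark = NULL;
}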

In this test, if \botmark (from the previous page) is not empty:

  • \topmark for the current page is set to the previous page’s \botmark value
  • \firstmark, for the current page (being processed), is set to empty (NULL): start this page assuming there are no \mark nodes.

Note that this test is performed once for each page being processed.

Step 2: Content loop
Next, TeX loops over the page content; part of that loop includes looking for \mark nodes:

if (type(node) == mark_node)
{
   if (\firstmark == NULL)
   {
      // first mark seen on this page
      \firstmark = node(text);
   }
   // always record the most recent mark seen on this page
   \botmark = node(text);
}

This test is applied to every mark node contained in the current page. On entry to the loop \firstmark is always empty: either Step 1 has just reset it or no mark has yet been seen in the document. As a result, the very first mark found on the current page becomes the value of both \firstmark and \botmark. Any subsequent \mark node detected whilst executing this loop will not change \firstmark, because it is no longer empty (NULL), but it does update \botmark, whose purpose is to store the last \mark node seen on this page.

Step 3: Post-content check
Once it has looped over the page content, TeX applies the following final test, which is also performed once per page:

if ((\topmark != NULL) && (\firstmark == NULL))
{
   // no marks on this page: reuse the mark carried over from the previous page
   \firstmark = \topmark;
}

  • If \topmark is not empty (NULL) we know, from Step 1, that \botmark for the previous page was not empty (NULL).
  • If \firstmark is still empty then we haven’t seen any \mark nodes because the test in Step 2 wasn’t triggered; consequently, \firstmark is set to the value of \topmark, which is also the value of \botmark from the previous page.

After sequential application of these three steps for all 6 pages of our example, the values of \botmark, \topmark and \firstmark are derived using transitions shown in the diagram below where:

  • light blue rounded rectangles are the final mark variable values for each page
  • light grey rounded squares represent interim values that arise during processing
  • labelled arrows represent changes to mark variable values made by steps 1, 2 or 3 in the diagram above; an arrow labelled “no change” indicates a value that none of the steps altered.

The following diagram shows how the three fragments of pseudocode generate the final values of \botmark, \topmark and \firstmark listed in the table contained on page 258 of The TeXbook.

A transition diagram to accompany the example from page 258 of the TeXbook, showing internal transitions as the global mark variables are assigned their values
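For readers who would like to observe these values directly, the following is a minimal sketch of a test document. It assumes the article class, a hypothetical page style named showmarks (defined here, not part of LaTeX) and raw \mark commands rather than LaTeX's {left}{right} convention. It also assumes the document contains no floats, footnotes or marginal notes, because such material can cause extra invocations of LaTeX's output routine and make \topmark unreliable.

```latex
\documentclass{article}
\makeatletter
% Hypothetical page style "showmarks": each header prints the three
% class 0 mark variables as they stand when that page is built
\newcommand{\ps@showmarks}{%
  \def\@oddhead{\footnotesize
    top: [\topmark]\hfil first: [\firstmark]\hfil bot: [\botmark]}%
  \let\@evenhead\@oddhead
  \def\@oddfoot{\hfil\thepage\hfil}%
  \let\@evenfoot\@oddfoot}
\makeatother
\pagestyle{showmarks}
\begin{document}
Page 1: no marks.
\newpage
Page 2: $\alpha$-mark.\mark{$\alpha$}
\newpage
Page 3: no marks.
\newpage
Page 4: $\beta$-mark then $\gamma$-mark.\mark{$\beta$}\mark{$\gamma$}
\newpage
Page 5: $\delta$-mark.\mark{$\delta$}
\newpage
Page 6: no marks.
\end{document}
```

If the three steps behave as described above, page 1 should show three empty pairs of brackets, page 2 an empty \topmark with \(\alpha\) for \firstmark and \botmark, page 3 \(\alpha\) in all three positions, page 4 \(\alpha\), \(\beta\) and \(\gamma\) for \topmark, \firstmark and \botmark respectively, page 5 \(\gamma\), \(\delta\) and \(\delta\), and page 6 \(\delta\) in all three positions.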
