Sunday, March 27, 2011

LaTeX's failure with floats

It's probably fairly uncontroversial to say that floats are one of the main areas where LaTeX performs poorly in comparison to WYSIWYG editors. The basic complaint is that floats just don't go where we want them.

To compensate, people often use the h float specifier, or H from the float package, to say, “place the figure here!” This is often a poor approach since it's rarely clear where “here” really is. It leads to moving the figure code around between paragraphs, trying to find a reasonable place to put it. Since this is a bad idea, I'm not going to focus on it. Instead, I'm going to talk about real floating material.

Part of the problem is that TeX produces output one page at a time. Once a page is finished, it is shipped out (i.e., written to the DVI or PDF file) and not touched again. What this means for LaTeX is that by the time it has seen the \begin{figure} (or other floating material), it has already finished with all of the pages before it. So it performs a complicated interaction with the output routine, which tries to place the figure (subject to the specifiers) on the current page. If that fails, the figure gets stuffed onto a defer list to be inserted later.

Often, what we really want is for the image to go onto the previous page since that leads to better overall placement. But since LaTeX cannot handle that, we're forced to move the image code ourselves, just like we had to do with the H specifier.

One partial solution is to put each float in a separate file, say floatfoo.tex, and then move \input{floatfoo} around until a reasonable placement is found. This is not entirely satisfactory since we are still stuck with a guess-and-check procedure.
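Here is a minimal sketch of that workflow. The figure body is only a placeholder, and I'm using filecontents in the preamble purely so the example fits in one file; in practice floatfoo.tex would just be a separate file on disk.

```latex
\documentclass{article}

% Written out as floatfoo.tex on the first run (only if it does not exist yet).
\begin{filecontents}{floatfoo.tex}
\begin{figure}[tp]
  \centering
  \fbox{Placeholder for the real figure}
  \caption{A figure kept in its own file.}
\end{figure}
\end{filecontents}

\begin{document}
First paragraph.

\input{floatfoo}% move this one line to try a different placement

Second paragraph.
\end{document}
```

The only thing that ever moves is the single \input line, which at least keeps the guessing from cluttering the main text.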

What I would like is a solution that allows the author to specify a page number (and position) and have the float placed there, if at all possible. I haven't fully thought through what I'd like in an interface, so here are some thoughts about requirements and challenges.
  • The interface should work with twocolumn documents at the very least and it would be better if it supported the multicol package. Something like
    might be nice.
  • There are tokenization issues so it is probably not acceptable to tokenize the body of the float, store it somewhere, and then reproduce it when needed since category codes will be assigned at tokenization time. This almost certainly requires writing the output to another file.
  • One idea is to use the filecontents environment to write the body of the figure to a separate file with appropriate \if... guards. I'm envisioning something like
    being written to \jobname.figs as
    Then, for each page, \jobname.figs is \input in a manner similar to \afterpage or \AtBeginPage. This would need something extra for twocolumn documents.
  • There's the question of trying to keep figures in order if some are specified with particular page requirements and others are not.
  • There's an issue if a float depends on a macro being defined but the page specifier puts it before the definition. I don't see how to get around that.
  • There's an issue with trying to work with other floating environments such as the excellent lstlisting from the listings package.
I'm sure there are more issues I haven't considered. This does seem doable though.
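To illustrate the tokenization issue from the list above, here is a small plain TeX sketch (the macro name \stored is made up for the example). Once text has been read into a macro, its category codes are frozen, so a later \catcode change has no effect on it.

```latex
% Plain TeX; compile with `tex` or `pdftex`.
\def\stored{x~y}  % `~' is tokenized now, as an active character (a tie)
\catcode`\~=12    % from here on, `~' is read as an ordinary character
Stored: \stored\par   % the stored token is still active, so this is a tie
Fresh: x~y            % read under the new catcode: character 126 of the font
\bye
```

A float body that relies on changed category codes (verbatim material, for instance) would suffer in the same way if it were stored as tokens, which is why writing it to a file and reading it back looks like the safer route.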

Sunday, March 20, 2011

Knuth quote V

Somewhat mysteriously, in the middle of the chapter on macros in The TeXbook, Knuth defines \rhead—the macro he uses to keep track of the running headline. The definition itself is a little odd in that when \rhead is executed, it globally redefines \rhead to be almost the same text sans the definition.
\def\rhead{Chapter \chapno: Definitions (aka Macros)% my little joke
  \gdef\rhead{Chapter \chapno: Definitions (also called Macros)}}
However, it is the comment here that interests me. What is his joke? My best guess is that he used macros to change the running headline after the page on which this \def appears. He never calls attention to this change in the text and never explains the joke.
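The self-redefining trick is easy to try in isolation. This is a small plain TeX sketch with a made-up macro name, \greeting, not anything from The TeXbook itself.

```latex
% Plain TeX; compile with `tex` or `pdftex`.
\def\greeting{Pleased to meet you.%
  \gdef\greeting{Good to see you again.}}
\greeting\par   % first use: prints the first text, then redefines itself
\greeting\par   % every later use: prints the second text
\greeting
\bye
```

Because the first definition contains a \gdef, the macro is not safe to use inside an \edef until after it has been executed once.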

Saturday, March 19, 2011

Random numbers in TeX

Recent versions of pdfTeX contain primitives for generating random integers.
  • \pdfuniformdeviate num generates a uniformly distributed random integer in the range [0, num).
  • \pdfnormaldeviate generates a normally distributed random integer with mean 0 and “a unit of 65536”. (I've never seen unit used that way, so I'm not sure exactly what the manual means.)

These are both expandable and can be used if you need random numbers for some reason. Here's one toy example that generates coinflips with a biased coin.
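\newcount\n \n=0
\newcount\heads \heads=0
\newcount\tails \tails=0
\loop\ifnum\n<1000 % flip 1000 times
        \ifnum\pdfuniformdeviate 1000<327 % heads with probability 0.327
                H \advance\heads by 1
        \else
                T \advance\tails by 1
        \fi
        \advance\n by 1
\repeat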

Heads: \number\heads\par
Tails: \number\tails
This generates 1000 coin flips with a bias of p = 0.327. It prints the results of each coin flip as well as counting the number of heads and tails.
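Since the primitives are expandable, they can also be used inside \numexpr or an \edef. Here's a small sketch of a die roll; the macro names \die and \frozen are made up for the example, and it assumes a pdfTeX format with the e-TeX extensions enabled (needed for \numexpr).

```latex
% Plain pdfTeX; compile with `pdftex`.
\pdfsetrandomseed 42   % optional: fix the seed so runs are repeatable
% \pdfuniformdeviate 6 yields 0..5, so adding 1 gives a roll in 1..6.
\def\die{\number\numexpr\pdfuniformdeviate 6 + 1\relax}
Three rolls: \die, \die, \die.

\edef\frozen{\die}% expandable, so \edef stores one concrete roll
Frozen roll: \frozen, \frozen.
\bye
```

Each use of \die produces a fresh number, while \frozen repeats the single value that was generated when the \edef was executed.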

Sunday, March 13, 2011

Knuth quote IV

When quoting Lamport in The TeXbook about writing Greek letters being as easy as writing “... as easy as $\pi$”, Knuth cites the book as LaTeX Document Preparation System. He comments,
Note: the final manual has a slightly different wording on p43.
It's now called "LaTeX: A Document Preparation System" (1986)
But I decided to cite the original, partly because I have
no smallcaps sans-serif `A' to match the new LaTeX logo!