Saturday, May 5, 2012

et al., i.e., e.g., etc.

Only time for a short post. Foreign words are frequently italicized in English; however, et al., i.e., e.g., and etc. probably should not be. Italicizing them draws unnecessary attention to the words, and modern style manuals tell you not to.

Of more relevance to the TeX world: make sure the space after the period is an interword space unless the abbreviation actually ends the sentence. The width of the space is controlled by a variety of factors in TeX, the most important of which is the \spacefactor. LaTeX provides \@ to reset the space factor, and a control space \ can be used to produce a normal space. See TeX by Topic for details.
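
As a minimal sketch of both cases (the sentences are invented purely for illustration):

```latex
\documentclass{article}
\begin{document}
% A control space after the abbreviation's period gives a normal
% interword space instead of the wider end-of-sentence space.
Knuth et al.\ cover this, as do several later books, e.g.\ TeX by Topic.

% A period right after a capital letter is not treated as ending the
% sentence, so \@ before the period restores end-of-sentence spacing.
The probe was built by NASA\@. It launched a year later.
\end{document}
```

Under \frenchspacing the two kinds of space have the same width, so neither fix changes the output there.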

Saturday, July 2, 2011

Using TeX to merge \input from top level files

This is silly. There are better ways to do it. Still, it was sort of fun to write.
\endlinechar=-1
\newread\in
\newwrite\out
\message{Please enter input file name: }
\read16to\inname
\openin\in=\inname \relax
\ifeof\in
        \immediate\write16{Failed to open \inname.}
        \expandafter\end
\fi
\message{Please enter output file name: }
\read16to\outname
\immediate\openout\out=\outname \relax
\begingroup
\catcode`@0
\catcode`(1
\catcode`)2
\catcode`\{12
\catcode`\}12
\catcode`I12
\catcode`N12
\catcode`P12
\catcode`U12
\catcode`T12
\catcode`\\12
@lowercase(
        @gdef@dosplitline#1\INPUT{#2}#3@splitsentinal(@def@ante(#1)@def@file(#2)@def@post(#3))
        @gdef@splitline(@expandafter@dosplitline@line\INPUT{@sentinal}@splitsentinal)
)
@endgroup
\def\splitpost{\expandafter\dosplitline\post\splitsentinal}
\def\sentinal{\sentinal}
\catcode`\%12
\def\processline{
        \ifx\file\sentinal
                \immediate\write\out{\ante}
                \let\temp\relax
        \else
                \immediate\write\out{\ante%}
                \let\temp\processline
                \copyfile
                \splitpost
                \ifx\empty\ante
                        \ifx\file\sentinal
                                \let\temp\relax
                        \fi
                \fi
        \fi
        \temp
}
\newread\f
\def\copyfile{
        \openin\f=\file\relax
        \ifeof\f
                \immediate\write16{Failed to open \file. Continuing.}
        \else
                \begingroup
                \loop
                        \readline\f to\line
                        \unless\ifeof\f
                        \immediate\write\out{\line}
                \repeat
                \endgroup
                \closein\f
        \fi
}

\loop
        \readline\in to\line
        \unless\ifeof\in
        \splitline
        \processline
\repeat
\closein\in
\immediate\closeout\out
\end

Friday, June 3, 2011

Inverted pyramid typesetting

University thesis committees are fairly well known in the typesetting world for having the most absurd requirements. One example: heading text must be typeset centered, no more than 4.5 in wide, with each line shorter than the one before it.

This is stupid. Still, I've got a solution, based on egreg's:
\newcommand\stupid[1]{%
        \vbox{%
                \hsize=4.5in
                \parindent=0pt
                \leftskip=0pt plus.5fil
                \rightskip=0pt plus-0.5fil
                \parfillskip=0pt plus1fil
                \emergencystretch=1in
                \parshape6
                0.00in 4.50in
                0.25in 4.00in
                0.50in 3.50in
                0.75in 3.00in
                1.00in 2.50in
                1.25in 2.00in
                \huge
                \bfseries
                \strut
                #1%
        }%
}
The \parshape specifies what to do for the first 6 lines by giving pairs of numbers. The first in the pair is the indentation and the second is the line length. The settings of \leftskip, \rightskip, and \parfillskip come from TeX by Topic and are used to center the last line.
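
To see the pair semantics in isolation, here is a small sketch with made-up dimensions and filler text. It is not part of the macro above, just an illustration of how \parshape reads its arguments.

```latex
\documentclass{article}
\begin{document}
% Three (indentation, length) pairs: line 1 is 4in wide, line 2 is
% indented 0.5in and 3in wide, line 3 is indented 1in and 2in wide.
% Lines after the third reuse the last pair.
\parshape 3
  0in   4in
  0.5in 3in
  1in   2in
\noindent
This filler text exists only to produce enough lines to show the shape.
This filler text exists only to produce enough lines to show the shape.
This filler text exists only to produce enough lines to show the shape.
\par
% \parshape is reset at the end of the paragraph, so this one is normal.
This paragraph is set at the full text width again.
\end{document}
```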

Thursday, May 26, 2011

Typesetting on a grid 1: heightrounded

One thing that I dislike about LaTeX's output—especially in two columns—is that lines of prose are not typeset on a grid. I'm hoping to do a series of posts on little things that can be done to improve the situation. (There is the grid package, but I've never had it work for me.)

One thing that really stood out like a sore thumb to me, especially with two columns and long paragraphs, is that there is frequently some small extra space between paragraphs. The reason for this is quite simple. Often one has fixed margin and leading requirements, say a 1" margin on each side and a 12 pt leading. At 72.27 pt per inch and an 8.5" x 11" paper, one cannot get an integral number of lines of text per page. In fact, with a 1" margin and a 12 pt leading, one can fit 54 lines of text on the page with 2.43 pt to spare.
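
Spelled out, the arithmetic behind those numbers looks like this (a worked version of the figures above; the align* environment and \text assume amsmath is loaded):

```latex
\begin{align*}
  11\,\text{in} \times 72.27\,\text{pt/in} &= 794.97\,\text{pt}
    && \text{paper height} \\
  794.97\,\text{pt} - 2 \times 72.27\,\text{pt} &= 650.43\,\text{pt}
    && \text{height left after 1 in margins} \\
  \lfloor 650.43 / 12 \rfloor &= 54
    && \text{whole lines at 12 pt leading} \\
  650.43\,\text{pt} - 54 \times 12\,\text{pt} &= 2.43\,\text{pt}
    && \text{left over}
\end{align*}
```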

There are two things that can be done about this. The first is to change \topskip so that the top line of each page is moved down. The second is to change the \textheight so that the bottom of the page is moved up. The right way to do the second is to use the heightrounded option from the geometry package. Since it will be useful later, we will also set \topskip to \baselineskip. In essence, we will be doing both at once.

In the preamble, this looks something like the following.
\topskip=\baselineskip
\usepackage[margin=1in,heightrounded]{geometry}
Note that we need to change \topskip before we use geometry since it uses the value of \topskip that is in force when the option is loaded. (Alternatively, one can use \geometry{heightrounded} to use the new value, for example if geometry has already been loaded.)
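
One way to check the result is to write the relevant lengths to the log. The following is only a sketch of such a check; the twocolumn option and the body text are arbitrary choices for the test document. As I read the geometry documentation, heightrounded should leave \textheight minus \topskip as a whole multiple of \baselineskip.

```latex
\documentclass[twocolumn]{article}
\topskip=\baselineskip
\usepackage[margin=1in,heightrounded]{geometry}
\begin{document}
% Write the resulting lengths to the terminal and log for inspection.
\typeout{topskip=\the\topskip, baselineskip=\the\baselineskip,
  textheight=\the\textheight}
Some text.
\end{document}
```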

Sunday, March 27, 2011

LaTeX's failure with floats

It's probably fairly uncontroversial to say that floats are one of the main areas where LaTeX performs poorly in comparison to WYSIWYG editors. The basic complaint is that floats just don't go where we want them.

To compensate for this, people often use the h float specifier or H from the float package to say, “place the figure here!” This is often a poor approach since there's no real idea of where here really is. This leads to moving the code around between paragraphs, trying to find a reasonable place to put it. Since this is a bad idea, I'm not going to focus on it. Instead, I'm going to talk about real floating material.

Part of the problem is that TeX produces output one page at a time. Once a page is finished, it is shipped out (i.e., written to the dvi or pdf) and not touched again. What this means for LaTeX is that by the time it has seen the \begin{figure} (or other floating material), it has already finished with all of the pages before it. LaTeX then goes through a complicated interaction with the output routine to try to place the figure (subject to the specifiers) on the current page. If that fails, the figure gets stuffed onto a defer list to be inserted later.

Often, what we really want is for the image to go onto the previous page since that leads to better overall placement. But since LaTeX cannot handle that, we're forced to move the image code ourselves, just like we had to do with the H specifier.

One partial solution is to put each float in a separate file, floatfoo.tex, and then move \input{floatfoo} around until a reasonable placement is found. This is not entirely satisfactory since we still have this guess-and-check procedure.
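
A self-contained sketch of that arrangement follows. It uses the filecontents environment only so the example fits in one file; in practice floatfoo.tex would be an ordinary file on disk. The \rule is a placeholder standing in for a real graphic, and the paragraph text is invented.

```latex
% Writes floatfoo.tex next to the main file (if it does not exist yet).
\begin{filecontents}{floatfoo.tex}
\begin{figure}[tb]
  \centering
  \rule{4cm}{3cm}
  \caption{bar}
  \label{fig:foo}
\end{figure}
\end{filecontents}
\documentclass{article}
\begin{document}
First paragraph of the section.

% Move this single line between paragraphs to try other placements.
\input{floatfoo}

Second paragraph, which refers to Figure~\ref{fig:foo}.
\end{document}
```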

What I would like is a solution that allows the author to specify a page number (and position) and have the float placed there, if at all possible. I haven't fully thought through what I'd like in an interface, so here are some thoughts about requirements and challenges.
  • The interface should work with twocolumn documents at the very least and it would be better if it supported the multicol package. Something like
    \begin{figure}[page=4,column=2,position=tb]
    might be nice.
  • There are tokenization issues so it is probably not acceptable to tokenize the body of the float, store it somewhere, and then reproduce it when needed since category codes will be assigned at tokenization time. This almost certainly requires writing the output to another file.
  • One idea is to use the filecontents environment to write the body of the figure to a separate file with appropriate \if... guards. I'm envisioning something like
    \begin{figure}[page=4,position=tb]
        \centering
        \includegraphics{foo}
        \caption{bar}
        \label{fig:foo}
    \end{figure}
    being written to \jobname.figs as
    \ifnum4=\count0
    \begin{figure}[tb]
        \centering
        \includegraphics{foo}
        \caption{bar}
        \label{fig:foo}
    \end{figure}
    \fi
    Then, for each page, \jobname.figs is \input in a manner similar to \afterpage or \AtBeginPage. This would need something extra for twocolumn documents.
  • There's the question of trying to keep figures in order if some are specified with particular page requirements and others are not.
  • There's an issue if a float depends on a macro being defined but the page specifier puts it before the definition. I don't see how to get around that.
  • There's an issue with trying to work with other floating environments such as the excellent lstlisting from the listings package.
I'm sure there are more issues I haven't considered. This does seem doable though.

Sunday, March 20, 2011

Knuth quote V

Somewhat mysteriously, in the middle of the chapter on macros in The TeXbook, Knuth defines \rhead—the macro he uses to keep track of the running headline. The definition itself is a little odd in that when \rhead is executed, it globally redefines \rhead to be almost the same text sans the definition.
\def\rhead{Chapter \chapno: Definitions (aka Macros)% my little joke
  \gdef\rhead{Chapter \chapno: Definitions (also called Macros)}}
However, it is the comment here that interests me. What is his joke? My best guess is that he used macros to change the running headline after the page on which this \def appears. He never calls attention to this change in the text and never explains the joke.

Saturday, March 19, 2011

Random numbers in TeX

Recent versions of pdfTeX contain primitives for generating random integers.
  • \pdfuniformdeviate num generates a uniformly distributed random integer in the range [0, num).
  • \pdfnormaldeviate generates a normally distributed random integer with mean 0 and “a unit of 65536”. (I've never seen unit used that way, so I'm not sure exactly what the manual means.)

These are both expandable and can be used if you need random numbers for some reason. Here's one toy example that generates coin flips with a biased coin.
\def\coinflip#1{%
        \ifnum#1>\pdfuniformdeviate1000
                H%
        \else
                T%
        \fi
}
\tt
\parindent=0pt
\raggedright
\newcount\n \n=0
\newcount\heads \heads=0
\newcount\tails \tails=0
\loop\ifnum\n<1000
        \if\coinflip{327}H%
                \advance\heads by 1
                H
        \else
                \advance\tails by 1
                T
        \fi
        \advance\n by 1
\repeat

\vskip\baselineskip
\rm
$p=0.327$\par
Heads: \number\heads\par
Tails: \number\tails
\bye
This generates 1000 coin flips with a bias of p = 0.327. It prints the result of each coin flip and counts the number of heads and tails.
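
The same primitive can be shifted into another range with \numexpr. Here is a small sketch of a six-sided die; the macro name \dieroll and the seed value are made up for the example, and it needs pdfTeX (or pdfLaTeX) with e-TeX extensions. \pdfsetrandomseed is there only so that repeated runs should give the same rolls.

```latex
\documentclass{article}
\begin{document}
% Fixing the seed should make the run repeatable; remove this line to
% let pdfTeX pick its own seed.
\pdfsetrandomseed 2011
% \pdfuniformdeviate 6 yields an integer in [0, 6), so adding 1 gives 1..6.
\newcommand\dieroll{\number\numexpr 1 + \pdfuniformdeviate 6\relax}
Three rolls: \dieroll, \dieroll, \dieroll.
\end{document}
```

Since everything in \dieroll is expandable, it can also be used inside \ifnum tests in the same way as \coinflip above.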