Wednesday, January 12, 2011

Better \mbox and \fbox

As currently implemented in LaTeX, \mbox and \fbox (among others) have a bizarre limitation: their arguments cannot change category codes. One practical consequence of this is that \verb is not allowed in the arguments.

This limitation is completely artificial and with a little care can be removed. To see why this limitation exists, it's helpful to take a look at the implementation of \mbox.
\long\def\mbox#1{\leavevmode\hbox{#1}}
From this, it becomes clear why category codes cannot be changed. TeX tokenizes a macro's argument as it reads it, so by the time \mbox places #1 inside the \hbox, every character of the argument already has a fixed category code.
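
A minimal sketch of the problem, assuming the default category codes of the LaTeX format (the choice of & is just for illustration): changing a category code inside \mbox comes too late, while the same code inside a bare \hbox works.
% Inside \mbox the & was already read as an alignment tab (catcode 4)
% when \mbox grabbed its argument, so the \catcode change cannot affect it:
%   \mbox{\catcode`\&=12 &}   % error: Misplaced alignment tab character &
% With a bare \hbox nothing is read ahead. The & is tokenized only after
% the \catcode assignment has been executed, so it is typeset as a character.
\leavevmode\hbox{\catcode`\&=12 &}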

The fix for this is simple and has two consequences.
\def\bettermbox{\leavevmode\hbox}
Here, we see that \bettermbox takes no arguments, so in \bettermbox{foo} the macro expands to \leavevmode\hbox and TeX then reads {foo} directly as the contents of the box. The first consequence is that the box contents are not tokenized before they are executed, so we are free to write
\bettermbox{\verb!&^%$#!}
The second consequence is slightly more subtle. We can give box specifications such as to 3in or spread 12pt before the left brace.
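
For example, assuming the definition of \bettermbox above, both of the following should work. The text inside the boxes is only a placeholder.
% a box exactly 3in wide; the \hfil stretches to fill it
\bettermbox to 3in{left\hfil right}
% a box 12pt wider than its natural width
\bettermbox spread 12pt{a\hfil b}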

Okay, fixing \mbox was easy, but what about something more complicated like \fbox? First we need to look at its definition.
\long\def\fbox#1{%
        \leavevmode
        \setbox\@tempboxa\hbox{%
                \color@begingroup
                \kern\fboxsep
                {#1}%
                \kern\fboxsep
                \color@endgroup
        }%
        \@frameb@x\relax
}
The actual framing of the box happens in \@frameb@x. Again, the argument is tokenized when it need not be. Unfortunately, this time it's not immediately obvious how to proceed, because code has to run both before and after the box contents. The trick is to use \afterassignment to insert code just after the opening brace of the \hbox (and before the tokens from \everyhbox, if any, are inserted) and then to use \aftergroup to insert, after the closing brace, the code that closes the box and typesets it using \@frameb@x.

Here is the code.
\def\betterfbox{%
        \leavevmode
        \afterassignment\bfb@i
        \setbox\@tempboxa\hbox
}
\def\bfb@i{%
        \color@begingroup
        \kern\fboxsep
        \bgroup
        \aftergroup\bfb@ii
}
\def\bfb@ii{%
        \kern\fboxsep
        \color@endgroup
        \egroup
        \@frameb@x\relax
}
If we expand this all out, we see that \betterfbox{foo} becomes
\leavevmode
\setbox\@tempboxa\hbox{%
        \color@begingroup
        \kern\fboxsep
        \bgroup
        foo}%
        \kern\fboxsep
        \color@endgroup
\egroup
\@frameb@x\relax
The only difference between this and \fbox is in the group delimiters. Here the \hbox is opened by { and closed by \egroup, and the group around foo is opened by \bgroup and closed by }, whereas in \fbox all four are explicit brace tokens.
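
To try this out, here is a sketch of a complete test file. It assumes the article class and wraps the definitions in \makeatletter and \makeatother, which they need outside the kernel because the names contain @.
\documentclass{article}
\makeatletter  % the definitions use internal names containing @
\def\betterfbox{%
        \leavevmode
        \afterassignment\bfb@i  % run \bfb@i just after the { of the \hbox
        \setbox\@tempboxa\hbox
}
\def\bfb@i{%
        \color@begingroup
        \kern\fboxsep
        \bgroup                 % this group is closed by the user's }
        \aftergroup\bfb@ii      % run \bfb@ii right after that }
}
\def\bfb@ii{%
        \kern\fboxsep
        \color@endgroup
        \egroup                 % closes the \hbox opened by the user's {
        \@frameb@x\relax        % draw the frame around \@tempboxa
}
\makeatother
\begin{document}
\betterfbox{\verb!&^%$#!}
\end{document}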

So did the LaTeX team make the right choice? I'm not sure. The original definitions are certainly clearer. A general rule of thumb I try to follow when writing TeX code that others might use is to delay tokenization as long as possible.

As a final point, this is much easier to do with environments than macros because the code that ends the box can just go in the \endfoo macro. LaTeX makes extensive use of this.
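
As a rough sketch of the environment approach, here is a framed-box environment built from the same pieces as \fbox. The name fboxenv is made up for this example, and the \ignorespaces and \unskip are my own additions to trim spaces at the edges of the contents.
\documentclass{article}
\makeatletter
\newenvironment{fboxenv}{%
        \leavevmode
        \setbox\@tempboxa\hbox\bgroup  % open the box; nothing is read ahead
        \color@begingroup
        \kern\fboxsep
        \ignorespaces
}{%
        \unskip
        \kern\fboxsep
        \color@endgroup
        \egroup                        % close the box from the end code
        \@frameb@x\relax               % draw the frame around \@tempboxa
}
\makeatother
\begin{document}
\begin{fboxenv}\verb!&^%$#!\end{fboxenv}
\end{document}
No \afterassignment or \aftergroup is needed here because the closing code simply lives in the end part of the environment.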
