# (m4.info) Changeword

Changing the lexical structure of words
=======================================

The macro `changeword' and all associated functionality are
experimental.  It is only available if the `--enable-changeword'
option was given to `configure' at GNU `m4' installation time.
The functionality might change or even go away in the future.
Please direct your comments about it the same way you would for bugs.

A file being processed by `m4' is split into quoted strings, words
(potential macro names) and simple tokens (any other single character).
Initially a word is defined by the following regular expression:

     [_a-zA-Z][_a-zA-Z0-9]*

Using `changeword', you can change this regular expression.  Relaxing
`m4''s lexical rules might be useful (for example) if you wanted to
apply translations to a file of numbers:

     changeword(`[_a-zA-Z0-9]+')
     define(1, 0)
     1
     =>0
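
To make the three token classes concrete, here is a rough Python model
of the split described above.  It is a sketch only, not how `m4' is
implemented: the name `tokenize', the two pattern constants and the
handling of nested quotes are assumptions made for illustration,
comments and macro arguments are ignored, and Python's `re' syntax
stands in for the regular expression syntax that `m4' uses.

```python
import re

# The default word rule quoted above, and the relaxed rule from the example.
DEFAULT_WORD = r"[_a-zA-Z][_a-zA-Z0-9]*"
RELAXED_WORD = r"[_a-zA-Z0-9]+"


def tokenize(text, word_pattern=DEFAULT_WORD):
    """Split text into ('quoted', s), ('word', s) and ('simple', c) tokens.

    Illustrative model only: quotes are the default ` and ', which nest.
    """
    word = re.compile(word_pattern)
    tokens = []
    i = 0
    while i < len(text):
        if text[i] == "`":
            # Scan to the matching close quote, tracking nesting depth.
            depth, j = 1, i + 1
            while j < len(text) and depth:
                if text[j] == "`":
                    depth += 1
                elif text[j] == "'":
                    depth -= 1
                j += 1
            tokens.append(("quoted", text[i:j]))
            i = j
            continue
        m = word.match(text, i)
        if m and m.end() > i:
            # A word is a potential macro name.
            tokens.append(("word", m.group(0)))
            i = m.end()
        else:
            # Anything else is a single-character simple token.
            tokens.append(("simple", text[i]))
            i += 1
    return tokens
```

Under the default rule, `tokenize("1")' yields a simple token, so `1'
can never name a macro.  Under `RELAXED_WORD' the same input yields a
word, which is what lets the example above define a macro called `1'.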

Tightening the lexical rules is less useful, because it will
generally make some of the builtins unavailable.  You could use it to
prevent accidental calls to builtins, for example:

     define(`_indir', defn(`indir'))
     changeword(`_[_a-zA-Z0-9]*')
     esyscmd(foo)
     _indir(`esyscmd', `ls')

Because `m4' constructs its words a character at a time, there is a
restriction on the regular expressions that may be passed to
`changeword'.  This is that if your regular expression accepts `foo',
it must also accept `f' and `fo'.
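
One way to sanity-check a candidate expression against this
restriction is to test every non-empty prefix of a sample word.  The
helper below is a minimal sketch under that assumption; the name
`rejected_prefixes' is hypothetical, and Python's `re' module is used
in place of the regular expression syntax accepted by `changeword', so
the pattern would need translating before being handed to `m4'.

```python
import re


def rejected_prefixes(pattern, sample):
    """Return the non-empty prefixes of sample that pattern does not accept."""
    rx = re.compile(pattern)
    return [
        sample[:n]
        for n in range(1, len(sample) + 1)
        if rx.fullmatch(sample[:n]) is None
    ]


# The default word rule accepts every prefix of "foo".
ok = rejected_prefixes(r"[_a-zA-Z][_a-zA-Z0-9]*", "foo")

# A rule that insists on an embedded underscore rejects the early prefixes,
# so it would not satisfy the restriction.
bad = rejected_prefixes(r"[a-z]+_[a-z]+", "foo_bar")
```

An empty result for a representative set of words is only evidence that
the expression satisfies the restriction, not a proof.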

`changeword' has another function.  If the regular expression
supplied contains any bracketed subexpressions, then text outside the
first of these is discarded before symbol lookup.  So:

     changecom(`/*', `*/')
     changeword(`#\([_a-zA-Z0-9]*\)')
     #esyscmd(ls)

`m4' now requires a `#' mark at the beginning of every macro
invocation, so one can use `m4' to preprocess shell scripts without
getting `shift' commands swallowed, and plain text without losing
various common words.
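
The effect of the bracketed subexpression on symbol lookup can be
modelled in a few lines of Python.  This is an illustrative sketch
only: `lookup_name' is a hypothetical helper, and the group syntax is
Python's `(...)' rather than the form written in the `changeword'
call above.

```python
import re


def lookup_name(pattern, token):
    """Return the name that would be looked up for token, or None.

    If pattern has a bracketed subexpression, only the text matched by
    the first one is used; otherwise the whole word is used.
    """
    m = re.fullmatch(pattern, token)
    if m is None:
        return None  # token is not a word under this rule
    return m.group(1) if m.re.groups else m.group(0)


PREFIXED_WORD = r"#([_a-zA-Z0-9]*)"

with_mark = lookup_name(PREFIXED_WORD, "#esyscmd")   # the '#' is discarded
without_mark = lookup_name(PREFIXED_WORD, "shift")   # not a word at all
```

In this model `#esyscmd' is looked up as `esyscmd', while a bare
`shift' is never treated as a word and so passes through untouched.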

`m4''s macro substitution is based on text, while TeX's is based on
tokens.  `changeword' can throw this difference into relief.  For
example, here is the same idea represented in TeX and `m4'.  First, the
TeX version:

     \def\a{\message{Hello}}
     \catcode`\@=0
     \catcode`\\=12
     @a
     @bye

Then, the `m4' version:

     define(a, `errprint(`Hello')')
     changeword(`@\([_a-zA-Z0-9]*\)')
     @a
     =>errprint(Hello)

In the TeX example, the first line defines a macro `a' to print the
message `Hello'.  The second line defines @ to be usable instead of \
as an escape character.  The third line defines \ to be a normal
printing character, not an escape.  The fourth line invokes the macro
`a'.  So, when TeX is run on this file, it displays the message `Hello'.

When the `m4' example is passed through `m4', it outputs
`errprint(Hello)'.  The reason for this is that TeX does lexical
analysis of a macro definition when the macro is *defined*.  `m4' just
stores the text, postponing the lexical analysis until the macro is
*used*.
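
The following toy model shows why the stored text is no longer
recognized.  It is a sketch under heavy simplifying assumptions, not a
description of `m4' internals: quoting and macro arguments are ignored,
`expand' and its parameters are hypothetical names, a builtin is
represented by a `<builtin NAME>' marker instead of being run, and
Python's `re' syntax replaces the syntax used by `changeword'.

```python
import re

DEFAULT_WORD = r"[_a-zA-Z][_a-zA-Z0-9]*"
PREFIXED_WORD = r"@([_a-zA-Z0-9]*)"


def expand(text, macros, builtins, word_pattern):
    """Expand text, rescanning each expansion under the CURRENT word rule."""
    rx = re.compile(word_pattern)
    out = []
    pos = 0
    while pos < len(text):
        m = rx.match(text, pos)
        if not m or m.end() == pos:
            # Not a word: copy one simple token to the output.
            out.append(text[pos])
            pos += 1
            continue
        # Text outside the first subexpression is dropped before lookup.
        name = m.group(1) if rx.groups else m.group(0)
        if name in macros:
            # Push the stored text back onto the input; it is only
            # analysed now, with whatever word rule is in force.
            text = macros[name] + text[m.end():]
            pos = 0
        elif name in builtins:
            out.append("<builtin " + name + ">")
            pos = m.end()
        else:
            out.append(m.group(0))
            pos = m.end()
    return "".join(out)


macros = {"a": "errprint(Hello)"}   # body stored as plain text
builtins = {"errprint"}

before = expand("a", macros, builtins, DEFAULT_WORD)
after = expand("@a", macros, builtins, PREFIXED_WORD)
```

With the default rule the body is rescanned and `errprint' is found.
With the `@' rule the body contains no `@', so nothing in it counts as
a word and the text `errprint(Hello)' is copied straight through.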

You should note that using `changeword' will slow `m4' down by a
`