
(m4.info) Changeword

 
 Changing the lexical structure of words
 =======================================
 
      The macro `changeword' and all associated functionality is
      experimental.  It is only available if the `--enable-changeword'
      option was given to `configure' at GNU `m4' installation time.
      The functionality might change or even go away in the future.
      *Do not rely on it*.  Please send comments about it the same
      way you would report bugs.
 
    A file being processed by `m4' is split into quoted strings, words
 (potential macro names) and simple tokens (any other single character).
 Initially a word is defined by the following regular expression:
 
      [_a-zA-Z][_a-zA-Z0-9]*
 
    Using `changeword', you can change this regular expression.  Relaxing
 `m4''s lexical rules might be useful (for example) if you wanted to
 apply translations to a file of numbers:
 
      changeword(`[_a-zA-Z0-9]+')
      =>
      define(1, 0)
      =>
      1
      =>0
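
    The effect of relaxing the word rule can be modelled outside `m4'.
 The following Python sketch is only an illustration, not `m4''s
 implementation: the names `split_tokens' and `translate' are
 hypothetical, macros take no arguments, and quoted strings and
 comments are ignored.  It splits input into words and single
 characters with a given word regular expression, then replaces each
 word that names a macro.

```python
import re

DEFAULT_WORD = r"[_a-zA-Z][_a-zA-Z0-9]*"  # the initial word regex
RELAXED_WORD = r"[_a-zA-Z0-9]+"           # the regex given to changeword


def split_tokens(text, word_regex):
    """Split text into ("word", s) and ("char", c) tokens.

    Simplified model (an assumption): no quoted strings, no comments.
    """
    word = re.compile(word_regex)
    tokens = []
    pos = 0
    while pos < len(text):
        m = word.match(text, pos)
        if m and m.end() > pos:
            tokens.append(("word", m.group(0)))
            pos = m.end()
        else:
            # Anything that does not start a word is a one-character token.
            tokens.append(("char", text[pos]))
            pos += 1
    return tokens


def translate(text, word_regex, macros):
    """Replace every word that names a macro; copy everything else."""
    return "".join(
        macros.get(value, value) if kind == "word" else value
        for kind, value in split_tokens(text, word_regex)
    )
```

    With the initial rule, `translate("1 + x1", DEFAULT_WORD, {"1": "0"})'
 leaves the digit alone because `1' is not a word.  With the relaxed
 rule the same call using `RELAXED_WORD' treats `1' as a word and
 replaces it, while `x1' is still a single, different word.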
 
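    Tightening the lexical rules is less useful, because it will
 generally make some of the builtins unavailable.  You could use it to
 prevent accidental calls to builtins.  In the following example, only
 words beginning with `_' are recognized, so `esyscmd(foo)' is copied
 to the output unchanged, while `_indir' still reaches the builtin: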
 
      define(`_indir', defn(`indir'))
      changeword(`_[_a-zA-Z0-9]*')
      esyscmd(foo)
      _indir(`esyscmd', `ls')
 
    Because `m4' constructs its words a character at a time, there is a
 restriction on the regular expressions that may be passed to
 `changeword': every non-empty prefix of an accepted word must itself
 be accepted.  For example, if your regular expression accepts `foo',
 it must also accept `f' and `fo'.
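
    A candidate expression can be spot-checked against this restriction
 before it is handed to `changeword'.  The Python sketch below is an
 assumption-laden aid, not part of `m4': the helper name
 `rejected_prefixes' is hypothetical, and the pattern must be written
 in Python's regular expression syntax, which differs from the syntax
 `m4' uses (for instance, groups are `(...)' rather than `\(...\)').
 It tests one sample word at a time, so an empty result does not prove
 that the expression is safe for every word.

```python
import re


def rejected_prefixes(word_regex, sample):
    """Return the non-empty proper prefixes of sample the regex rejects.

    An empty list means this sample does not violate the restriction.
    """
    pattern = re.compile(word_regex)
    if not pattern.fullmatch(sample):
        raise ValueError("sample is not accepted by the regex: %r" % sample)
    return [
        sample[:i]
        for i in range(1, len(sample))
        if not pattern.fullmatch(sample[:i])
    ]
```

    For the initial word rule, `rejected_prefixes("[_a-zA-Z][_a-zA-Z0-9]*",
 "foo")' returns an empty list.  An expression such as `[a-z]+_[a-z]+'
 accepts `foo_bar' but rejects `f', `fo', `foo' and `foo_', so it does
 not satisfy the restriction.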
 
    `changeword' has another function.  If the regular expression
 supplied contains any grouped subexpressions (written `\(...\)'), then
 text outside the first of these is discarded before symbol lookup.  So:
 
      changecom(`/*', `*/')
      changeword(`#\([_a-zA-Z0-9]*\)')
      #esyscmd(ls)
 
    `m4' now requires a `#' mark at the beginning of every macro
 invocation, so one can use `m4' to preprocess shell scripts without
 getting `shift' commands swallowed, and plain text without losing
 various common words.
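
    The lookup rule can be pictured with a small Python model.  This is
 a sketch under stated assumptions, not `m4''s code: `lookup_name' is a
 hypothetical helper, and the pattern is the Python spelling of the
 expression above, with `(...)' in place of `\(...\)'.  If the
 expression has a group, the text matched by the first group is the
 name that is looked up; otherwise the whole match is used.

```python
import re

# Python spelling of the m4 regex `#\([_a-zA-Z0-9]*\)'.
HASH_WORD = r"#([_a-zA-Z0-9]*)"


def lookup_name(word_regex, token):
    """Return the name used for symbol lookup, or None if token is no word."""
    m = re.fullmatch(word_regex, token)
    if m is None:
        return None
    # With at least one group, text outside the first group is discarded.
    return m.group(1) if m.re.groups >= 1 else m.group(0)
```

    Under this model `lookup_name(HASH_WORD, "#esyscmd")' yields
 `esyscmd', while `lookup_name(HASH_WORD, "shift")' yields `None',
 which is why a bare `shift' in a shell script is left untouched.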
 
    `m4''s macro substitution is based on text, while TeX's is based on
 tokens.  `changeword' can throw this difference into relief.  For
 example, here is the same idea represented in TeX and `m4'.  First, the
 TeX version:
 
      \def\a{\message{Hello}}
      \catcode`\@=0
      \catcode`\\=12
      @a
      @bye
 
 Then, the `m4' version:
 
      define(a, `errprint(`Hello')')
      changeword(`@\([_a-zA-Z0-9]*\)')
      @a
 
    In the TeX example, the first line defines a macro `a' to print the
 message `Hello'.  The second line defines @ to be usable instead of \
 as an escape character.  The third line defines \ to be a normal
 printing character, not an escape.  The fourth line invokes the macro
 `a'.  So, when TeX is run on this file, it displays the message `Hello'.
 
    When the `m4' example is passed through `m4', it outputs
 `errprint(Hello)'.  The reason for this is that TeX does lexical
 analysis of a macro definition when the macro is *defined*.  `m4' just
 stores the text, postponing the lexical analysis until the macro is
 *used*.  By then `errprint' no longer matches the new word rule, so
 it is not recognized as a macro name.
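
    The `m4' side of this difference can be imitated with a toy
 expander in Python.  It is a deliberately reduced model, not `m4''s
 algorithm: `expand' is a hypothetical name, macros take no arguments,
 there is no quoting, and recursive definitions would loop forever.
 What it does keep is the relevant property: replacement text is
 stored unanalysed and is scanned only when used, under whatever word
 rule is in force at that moment.

```python
import re


def expand(text, word_regex, macros):
    """Expand macros in text, rescanning each expansion with word_regex."""
    pattern = re.compile(word_regex)
    out = []
    pos = 0
    while pos < len(text):
        m = pattern.match(text, pos)
        if m and m.end() > pos:
            # Text outside the first group, if any, is dropped for lookup.
            name = m.group(1) if pattern.groups else m.group(0)
            if name in macros:
                # Push the stored text back and lex it now, not earlier.
                text = macros[name] + text[m.end():]
                pos = 0
                continue
            out.append(m.group(0))
            pos = m.end()
        else:
            out.append(text[pos])
            pos += 1
    return "".join(out)
```

    Given `macros = {"a": "greet!", "greet": "Hello"}', the call
 `expand("a", "[_a-zA-Z][_a-zA-Z0-9]*", macros)' rescans `greet!' and
 expands `greet' in turn.  With the word rule `@([_a-zA-Z0-9]*)', the
 call `expand("@a", ...)' still expands `a', but the stored text
 `greet!' now contains no word at all and is copied out as it stands,
 just as `errprint(Hello)' is in the `m4' example.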
 
    You should note that using `changeword' will slow `m4' down by a
 factor of about seven.
 