 

(gawk.info) Leftmost Longest

 
 How Much Text Matches?
 ======================
 
    Consider the following example:
 
      echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
 
    This example uses the `sub' function (which we haven't discussed yet;
 *note Built-in Functions for String Manipulation: String Functions.)
 to make a change to the input record. Here, the regexp `/a+/' indicates
 "one or more `a' characters," and the replacement text is `<A>'.
 
    The input contains four `a' characters.  What will the output be?
 In other words, how many is "one or more"--will `awk' match two, three,
 or all four `a' characters?
 
    The answer is, `awk' (and POSIX) regular expressions always match
 the leftmost, _longest_ sequence of input characters that can match.
 Thus, in this example, all four `a' characters are replaced with `<A>'.
 
      $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
      -| <A>bcd
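
    The "leftmost" part of the rule takes priority over the "longest"
 part. The following sketch is an illustration only; the input string
 `abaaaa' is a made-up sample, not taken from the example above. The
 match starts at the first position where any match is possible, even
 though a longer run of `a' characters occurs later in the record:

```shell
# Hypothetical sample input: one `a', then `b', then a run of four `a's.
# The leftmost possible match is the single leading `a', so only that
# character is replaced; the longer run further right is left alone.
leftmost=$(echo abaaaa | awk '{ sub(/a+/, "<A>"); print }')
echo "$leftmost"    # prints: <A>baaaa
```

    "Longest" only decides between candidate matches that begin at the
 same leftmost position.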
 
    For simple match/no-match tests, this is not so important. But when
 doing text matching and substitutions with the `match', `sub', `gsub',
 and `gensub' functions, it is very important.  *Note Built-in Functions
 for String Manipulation: String Functions, for more information on
 these functions.  Understanding this principle is also important for
 regexp-based record and field splitting (*note How Input is Split into
 Records: Records, and also *note Specifying How Fields are Separated:
 Field Separators).
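
    As a rough sketch of how the rule shows up in these functions and in
 field splitting, the commands below can be tried in a shell. The shell
 variable names and the input `x---y--z' are assumptions chosen for
 illustration; they do not come from the manual.

```shell
# match() sets RSTART (start position) and RLENGTH (length of the match).
# The match begins at character 1 and covers all four `a' characters.
span=$(echo aaaabcd | awk '{ if (match($0, /a+/)) print RSTART, RLENGTH }')
echo "$span"      # prints: 1 4

# gsub() returns the number of substitutions made. The whole run of
# `a' characters is one match, so there is one substitution, not four.
count=$(echo aaaabcd | awk '{ n = gsub(/a+/, "<A>"); print n, $0 }')
echo "$count"     # prints: 1 <A>bcd

# With a regexp field separator, each run of dashes is matched as a
# whole, so the hypothetical record below splits into three fields.
fields=$(echo 'x---y--z' | awk 'BEGIN { FS = "-+" } { print NF }')
echo "$fields"    # prints: 3
```

    If `-+' matched only a single dash at a time, the record would
 instead contain empty fields between consecutive dashes.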
 