(gawk.info) Leftmost Longest
Info Catalog
(gawk.info) Case-sensitivity
(gawk.info) Regexp
(gawk.info) Computed Regexps
How Much Text Matches?
======================
Consider the following example:
     echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
This example uses the `sub' function (which we haven't discussed yet;
see Built-in Functions for String Manipulation: String Functions)
to make a change to the input record. Here, the regexp `/a+/' indicates
"one or more `a' characters," and the replacement text is `<A>'.
The input contains four `a' characters. What will the output be?
In other words, how many is "one or more"--will `awk' match two, three,
or all four `a' characters?
The answer is, `awk' (and POSIX) regular expressions always match
the leftmost, _longest_ sequence of input characters that can match.
Thus, in this example, all four `a' characters are replaced with `<A>'.
     $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
     -| <A>bcd
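The rule has two parts, and a short sketch may help separate them. It assumes a POSIX-compliant `awk' is on the search path; the shell variable names `span' and `leftmost' are illustrative only and do not come from the manual. The first command uses `match' to report where the match begins and how long it is; the second shows that "leftmost" is decided before "longest."

```shell
# Where does /a+/ match? RSTART is the start position, RLENGTH the length.
span=$(echo aaaabcd | awk '{ if (match($0, /a+/)) print RSTART, RLENGTH }')
echo "$span"        # 1 4

# "Leftmost" takes priority over "longest": the shorter run of a's that
# starts earlier is chosen over the longer run later in the record.
leftmost=$(echo xaabaaaa | awk '{ sub(/a+/, "<A>"); print }')
echo "$leftmost"    # x<A>baaaa
```

In the second command the four-character run at the end of the record is left untouched, because a match starting at the second character is found first.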
For simple match/no-match tests, this is not so important. But when
doing text matching and substitutions with the `match', `sub', `gsub',
and `gensub' functions, it is very important. See Built-in Functions
for String Manipulation: String Functions, for more information on
these functions. Understanding this principle is also important for
regexp-based record and field splitting (see How Input is Split into
Records: Records, and also Specifying How Fields are Separated:
Field Separators).
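As a rough illustration of why this matters for splitting and for global substitution, consider the sketch below. It again assumes a POSIX-compliant `awk'; the variable names `nfields' and `result' are made up for this example.

```shell
# A regexp field separator of ":+" consumes the whole run of colons,
# so the record splits into two fields rather than four.
nfields=$(echo 'a:::b' | awk -F':+' '{ print NF }')
echo "$nfields"     # 2

# gsub also takes the longest match at each position: the four a's are
# replaced once, not four times. The return value counts substitutions.
result=$(echo aaaabcd | awk '{ n = gsub(/a+/, "<A>"); print n, $0 }')
echo "$result"      # 1 <A>bcd
```

If the separator matched only one colon at a time, the first command would see empty fields between the colons; the longest-match rule is what prevents that.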