The lookup tables technique proposed by a seder in 1996,
for reference.
---------- Forwarded message ----------
Date: Tue, 19 Nov 1996 05:34:17 -0500 (EST)
From: Greg Ubben <gsu@romulus.ncsc.mil>
To: af137@freenet.toronto.on.ca
Subject: Part 1: Using lookup tables with s///
Fellow seders,
Because the sort/delimit/number script I posted last week is so
complicated, it's going to take far more words than code to explain
it. So I'm going to try to approach the explanation in at least three
parts. I'll first go over a general technique used in both the sort
and the counter, then I'll explain how the counter works (since it is
simpler than the sort), then barring any unforeseen pianos, I'll explain
the sort last. My approach to explaining things can be rather lengthy
as I try to generalize a lot, and also go over the many alternative
ways that things could be done, so that you understand the trade-offs
and learn more than just this one silly problem. Hopefully the depth
will be a good compromise for everyone. If you have questions about
anything or need more explanation, you can e-mail me and I'll
elaborate on that in the next part.
The first thing to note is that both the sort and the counter
algorithms use "lookup tables", just like the case-conversion method
which I described in one of the first newsletters. Lookup tables
rely on using the powerful \( \) and \d (\1,\2, etc.) back-reference
operators in the s/// (substitute) command -- in particular, the fact
that you can use the \d later on in the same pattern itself to find
another instance of a previously matched string.
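As a small illustration of a back-reference used inside the pattern itself, here is a sketch with two one-liners. The input strings and the replacement text are made up for this example and are not part of the original script.

```shell
# \1 inside the pattern must match the same text that \(.\) just captured,
# so this finds the first doubled character. The brackets only make the
# match visible.
echo 'hello' | sed 's/\(.\)\1/[\1\1]/'
# prints: he[ll]o

# The second instance does not have to be adjacent: .* skips the text
# in between the captured word and its later repetition.
echo 'cat dog cat' | sed 's/^\([a-z]*\) .* \1$/first and last word: \1/'
# prints: first and last word: cat
```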
Basically, you first append the lookup table to the pattern space.
Then you need some kind of * pattern between the \(key\) you're looking
up and the lookup table, to skip over the text in between. You can
think of this * as the search operator. Then once you've looked it up,
you usually want to get something back from the table, so you have
another \(\) next to the \d. (You can equate this to looking something
up in an associative array in awk.) Then you just put the things back
in the right order. You may need additional \(\) to save additional
portions of the pattern space that you need to put back. You can either
choose to put the lookup table back to be used again next time, or you
can delete it and add it back next time you do another lookup.
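The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the "code: X" line format, the color table, the "~" marker (assumed never to occur in the input) and the function name lookup_color are all invented here. It uses one extra \(\) to save the text in front of the key, and it deletes the table after the lookup rather than keeping it.

```shell
# Hypothetical example: expand a one-letter color code at the end of a line.
lookup_color() {
  # 1. append the lookup table, preceded by a "~" marker
  # 2. \1 = text before the key, \2 = the key, .* skips ahead to ";key=",
  #    \3 = the value found in the table; put back \1 and \3 only
  # 3. if the key was not in the table, step 2 did not match, so the
  #    table is still there and has to be removed
  sed -e 's/$/~;r=red;g=green;b=blue;/' \
      -e 's/\(.*: \)\(.\)~.*;\2=\([^;]*\);.*/\1\3/' \
      -e 's/~;.*//'
}

echo 'code: g' | lookup_color
# prints: code: green
```

To keep the table for the next lookup instead, one could capture it in a further \(\) and emit it again in the replacement.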
Here's a short example to help tie the above together. Say you
have a single digit in the pattern space and you want to convert that
to the alphabetic name of the digit using the table lookup method
(vs using 10 substitute commands). For example, convert "5" to "five".
Here's one way to do it:
# append the lookup table
s/$/0zero1one2two3three4four5five6six7seven8eight9nine/
# look up the key (digit) and replace it with the value
s/\(.\).*\1\([^0-9]*\).*/\2/
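To try the two commands from the shell, they can be wrapped as below. The function name digit_name is made up for this sketch, and it assumes each input line holds exactly one digit and nothing else.

```shell
# Hypothetical wrapper around the two s/// commands shown above.
digit_name() {
  sed -e 's/$/0zero1one2two3three4four5five6six7seven8eight9nine/' \
      -e 's/\(.\).*\1\([^0-9]*\).*/\2/'
}

echo 5 | digit_name
# prints: five

printf '%s\n' 0 9 | digit_name
# prints: zero, then nine, one per line
```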
Take some time to unders