[ < < < Home ]
[ < < Reference Start ]
[ < Reference Contents ]
[ < Previous=PDGREPPE Position ]
[ Next=PDGREPPE Byte Types > ]
A single-byte match tests for an exact byte, a byte in a
[set] or range of bytes, or ANY (.) byte.
To pass, there must be a byte at or after the current point
of data under test.
The point of data under test advances by one byte when a
byte test passes, unless the "!" NOT repeat operator is used.
. skip over next byte, matches Any byte
pdgreppe -1 -Hjc "." file_id.diz
..."." finds, on each line, the first match (option -1) of
ANY byte.
Any byte "." is often used for "stringy" effects to bind
areas together in an elastic way.
pdgreppe -Hjc "\u.*\u" file_id.diz
..."\u.*\u" finds Zero or More "*" of Any "." bytes
between and including two UpperCase "\u" bytes.
Note:
ANY "." or DOT does not normally match the bytes of a
newline sequence "\N" or "\r\n". With option -x, however,
it will match either "\r" or "\n" in a newline.
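The note above can be modelled in a few lines of Python. This is only a sketch of the behaviour as described, not PDGREPPE's own code. The names dot_matches and x_option are hypothetical stand-ins for "." and option -x.

```python
# Hypothetical model of ANY "." as described in the note above.
NEWLINE_BYTES = {0x0D, 0x0A}  # "\r" and "\n"


def dot_matches(byte, x_option=False):
    """Return True if "." would accept this byte value (0..255)."""
    if not 0 <= byte <= 255:
        raise ValueError("not a byte value")
    if byte in NEWLINE_BYTES:
        # Newline bytes are matched only when option -x is in effect.
        return x_option
    return True


if __name__ == "__main__":
    print(dot_matches(ord("A")))             # an ordinary byte
    print(dot_matches(0x0A))                 # "\n" without -x
    print(dot_matches(0x0A, x_option=True))  # "\n" with -x
```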
Byte Classes [<set>] [^<set>] [<set1>^<set2>]
Byte Classes (or Sets or [Ranges]) give the fastest way to
determine whether a byte is or is not present in a <set> of
bytes.
[<set>]        byte in <set> of bytes
[\<type>]      byte in <type> of \<bytes>
[<A>-<Z>]      byte in <A> to <Z>
[\<A>\<Z>]     byte in multiple types \<A> and \<Z>
A set or range starts with a "[", followed by some specific
bytes, and ends with a "]", like [abc] to find the byte
a or b or c.
pdgreppe -1 -Hjc "[\C]" file_id.diz
..."[\C]" finds some single Consonants "\C" in set
[bcdfghjklmnpqrstvwxzB-DF-HJ-NP-TV-XZ].
("\C" is a Type Byte that represents any of the consonant
bytes in Byte Class:
[bcdfghjklmnpqrstvwxzB-DF-HJ-NP-TV-XZ] as detailed later.)
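To see which bytes the consonant class above names, the class body can be expanded by hand or with a short Python sketch. The helper expand_class is hypothetical and handles only plain bytes and <A>-<Z> ranges. It does not handle escapes, TYPE bytes, or "^" and "!" negation.

```python
# Sketch: expand a simple class body such as "a-cx" into the set of
# byte values it names. Only plain bytes and "<A>-<Z>" ranges are
# handled; escapes, TYPE bytes and negation are out of scope here.
def expand_class(body):
    members = set()
    i = 0
    while i < len(body):
        # "<A>-<Z>" form: a byte, a "-", and a closing byte.
        if i + 2 < len(body) and body[i + 1] == "-":
            lo, hi = ord(body[i]), ord(body[i + 2])
            members.update(range(lo, hi + 1))
            i += 3
        else:
            members.add(ord(body[i]))
            i += 1
    return members


# The class body quoted in the text for "\C".
CONSONANTS = expand_class("bcdfghjklmnpqrstvwxzB-DF-HJ-NP-TV-XZ")

if __name__ == "__main__":
    print("".join(sorted(chr(b) for b in CONSONANTS)))
```

Note that the class as written contains neither "y" nor "Y".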
"Notness" ranges or sets of bytes NOT appearing:
[^<set>]           byte NOT in <set> of bytes
[^\<type>]         byte NOT in <type> of \<bytes>
[^<A>-<Z>]         byte NOT in <A> to <Z>
[^\<A>\<Z>]        byte NOT in multiple types \<A> and \<Z>
[<set1>^<set2>]    byte in <set1> but NOT <set2> of bytes
pdgreppe -1 -Hjc "[^\C]" file_id.diz
..."[^\C]" finds some bytes that are NOT "^" single
Consonants.
pdgreppe -1 -Hjc "[!\C]" file_id.diz
..."[!\C]" also finds some bytes that are NOT "!" single Consonants.
The same applies to these:
[!<set>]           same as [^<set>]
[!<A>-<Z>]         same as [^<A>-<Z>]
[<set1>!<set2>]    same as [<set1>^<set2>]
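A "Notness" range can be pictured as the complement of a set over all 256 byte values. The Python sketch below illustrates that idea only. The name negate is hypothetical, and whether PDGREPPE also excludes newline bytes from a negated range is not stated here, so the sketch ignores that question.

```python
# Sketch: [^<set>] (or [!<set>]) modelled as "every byte value
# that is not in <set>".
ALL_BYTES = set(range(256))


def negate(members):
    """Return the byte values that are NOT in members."""
    return ALL_BYTES - set(members)


vowels = {ord(c) for c in "aeiou"}
not_vowels = negate(vowels)

if __name__ == "__main__":
    print(ord("a") in not_vowels)  # a vowel
    print(ord("b") in not_vowels)  # not a vowel
```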
Mixed Ranges:
[<set1>^<set2>]    byte in <set1> but NOT <set2> of bytes
[<set1>!<set2>]    same as [<set1>^<set2>]
Mixed [Ranges] are those that have an allowable set of
bytes on the left side and an unallowable set on the right
side. They can help specify exclusions to a [Range] more
easily.
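In set terms, a mixed [Range] behaves like a set difference: the bytes on the left side minus the bytes on the right side. The Python sketch below shows this with consonants as the allowable set and capital letters as the unallowable set. The name mixed_range is hypothetical, and capital letters are assumed to be ASCII "A" to "Z".

```python
# Sketch: [<set1>^<set2>] modelled as set1 minus set2.
def mixed_range(allowed, excluded):
    """Bytes in allowed that are not in excluded."""
    return set(allowed) - set(excluded)


# Consonants as listed for the consonant class (no "y" or "Y").
consonants = {ord(c) for c in "bcdfghjklmnpqrstvwxzBCDFGHJKLMNPQRSTVWXZ"}
# Assumption: capital letters are ASCII "A".."Z".
uppercase = set(range(ord("A"), ord("Z") + 1))

lower_consonants = mixed_range(consonants, uppercase)

if __name__ == "__main__":
    print("".join(sorted(chr(b) for b in lower_consonants)))
```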
pdgreppe -1 -Hjc "[\C^\u]" file_id.diz
..."[\C^\u]" finds some single Consonants "\C" that are NOT "^"
capital letters "\u".
pdgreppe -1 -Hjc "[\C!\u]" file_id.diz
..."[\C!\u]" also finds some single Consonants "\C" that are NOT
"!" capital letters "\u".
MAGIC bytes like "$[.(){}<>#%/;=?*+&|~`'@:" do not have any
special significance in a [Range]. But once a [Range] has ended
with "]", they regain their magic capabilities.
MAGIC bytes "!^-,]" DO HAVE SPECIAL SIGNIFICANCE in a
[Range]. To search for them with a range, precede them with an
escape byte within the range specification.
e.g.
[\!\^\-\,\]]
will search for any one of
!^-,]
as a byte.
If option -E is used, they should be used "as is",
e.g.
\[!^-,]\]
will search for any one of
!^-,]
as a byte, like the preceding example. Here \[ and \] are
escaped so that they act as range start "[" and range stop "]".
In mixed [Ranges] such as [<set1>^<set2>] or [<set1>!<set2>],
the "^" or "!" that starts the negative part of the [Range] can
only be used ONCE.
A matching "]" to end a [Range] started with "[" MUST be
part of a pattern.
To make entry of standard byte sets in a [Range] easy, a TYPE
byte (described under Byte Types) can also be used.
Also, ranges can have elements specified by escaped number
values like \097 or \x61 for the byte 'a'.
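Both forms above name the same byte because 97 in decimal equals 61 in hexadecimal. The Python sketch below checks that arithmetic. The function parse_numeric_escape is hypothetical, and reading \097 as a decimal value is an inference from the text, which gives both forms for the byte 'a'.

```python
# Sketch: turn an escaped number such as \097 or \x61 into a byte
# value. Assumption: plain digits are decimal, "x" introduces hex.
def parse_numeric_escape(token):
    if not token.startswith("\\") or len(token) < 2:
        raise ValueError("expected a backslash escape")
    body = token[1:]
    if body[0] == "x":
        value = int(body[1:], 16)  # hexadecimal form, e.g. \x61
    else:
        value = int(body, 10)      # decimal form, e.g. \097
    if not 0 <= value <= 255:
        raise ValueError("value does not fit in one byte")
    return value


if __name__ == "__main__":
    print(chr(parse_numeric_escape(r"\097")))
    print(chr(parse_numeric_escape(r"\x61")))
```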
Other:
Every [Range] uses up at least 256 bytes of memory the first
time it occurs in a pattern and a few more bytes for every exact
repeat of itself. Using lots of different ranges can put a load
on the program, but using lots of similar ranges has minimal
effect. A pattern can use up to 256 Ranges.
© Intelligence Services 1987 - 2008
GPO Box 9, ADELAIDE SA 5001, AUSTRALIA
EMAIL : intlsvs@gmail.com