
Intelligence Services

PDGREPPE Single Bytes

A single-byte match tests one byte: against an exact byte,
against a [set] or range of bytes, or against ANY "." byte.

For a byte test to pass, there must be a byte at, or after,
the current point of data under test.

If a byte test passes, the point of data under test advances
by one byte, unless the "!" NOT repeat operator is used.

 .	skip over next byte, matches Any byte


	pdgreppe -1 -Hjc "." file_id.diz


..."." finds the first match (option -1) of ANY byte on each
line.

Any byte "." is often used for "stringy" effects to bind
areas together in an elastic way.


	pdgreppe -Hjc "\u.*\u" file_id.diz


..."\u.*\u" finds Zero or More "*" of Any "." bytes
between and including two UpperCase "\u" bytes.

Note:

ANY "." or DOT does not normally find characters
within a newline sequence "\N" or "\r\n".  With option -x,
however, it will find either "\r" or "\n" in a newline.

Byte Classes	[<set>]		[^<set>]	[<set1>^<set2>]

Byte Classes (or Sets or [Ranges]) give the fastest way to
determine whether a byte is or is not present in a byte
<set>.

 [<set>]	byte	in <set> of bytes
 [\<type>]	byte	in <type> of \<bytes>
 [<A>-<Z>]	byte	in <A> to <Z>
 [\<A>\<Z>]	byte	in multiple types \<A> and \<Z>

A set or range starts with a "[", followed by some specific
bytes, and ends with a "]".  For example,

	[abc]

finds the byte "a" or "b" or "c".


	pdgreppe -1 -Hjc "[\C]" file_id.diz


..."[\C]" finds some single Consonants "\C", that is, bytes in
the set

[bcdfghjklmnpqrstvwxzB-DF-HJ-NP-TV-XZ].

("\C" is a Type Byte that represents any of the consonant
bytes in this Byte Class, as detailed later.)
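
As an illustration only, the consonant Byte Class above can be
expanded into its member bytes.  The sketch below is Python, not
PDGREPPE code.  The helper name expand_class is hypothetical, and
it handles only literal bytes and <A>-<Z> ranges (no escapes and
no "^", "!" or ",").

```python
def expand_class(body: str) -> set:
    """Expand a class body such as 'a-cx' into the byte values it covers.

    Hypothetical helper: only literal bytes and <A>-<Z> ranges are
    handled; escapes, "^", "!" and "," are left out of this sketch.
    """
    members = set()
    i = 0
    while i < len(body):
        lo = ord(body[i])
        # A "-" with a byte on each side marks an inclusive range.
        if i + 2 < len(body) and body[i + 1] == "-":
            hi = ord(body[i + 2])
            members.update(range(lo, hi + 1))
            i += 3
        else:
            members.add(lo)
            i += 1
    return members


# The class body quoted in the text for "\C".
CONSONANTS = expand_class("bcdfghjklmnpqrstvwxzB-DF-HJ-NP-TV-XZ")

print(len(CONSONANTS))          # number of bytes in the class
print(ord("y") in CONSONANTS)   # "y" is not listed in the class
```

Expanded this way, the class covers 40 bytes, and neither "y" nor
"Y" is among them.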

"Notness" ranges or sets of bytes NOT appearing:

 [^<set>]	byte NOT	in <set> of bytes
 [^\<type>]	byte NOT	in <type> of \<bytes>
 [^<A>-<Z>]	byte NOT	in <A> to <Z>
 [^\<A>\<Z>]	byte NOT	in multiple types \<A> and \<Z>

 [<set1>^<set2>] byte	in <set1> but NOT <set2> of bytes


	pdgreppe -1 -Hjc "[^\C]" file_id.diz


..."[^\C]" finds some bytes that are NOT "^" single
Consonants.


	pdgreppe -1 -Hjc "[!\C]" file_id.diz


..."[!\C]" also finds some bytes that are NOT "!" single
Consonants.

The same applies to these:

 [!<set>]	 same as [^<set>]
 [!<A>-<Z>]	 same as [^<A>-<Z>]
 [<set1>!<set2>] same as [<set1>^<set2>]

Mixed Ranges:

[<set1>^<set2>]	byte	in <set1> but NOT <set2> of bytes

[<set1>!<set2>] same as [<set1>^<set2>]

Mixed [Ranges] are those that have an allowable set of
bytes on the left side and an unallowable set on the right
side.  They can help specify exclusions to a [Range] more
easily.


	pdgreppe -1 -Hjc "[\C^\u]" file_id.diz


..."[\C^\u]" finds some single Consonants "\C" that are NOT "^"
capital letters "\u".
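
A mixed [Range] behaves like a set difference: bytes allowed by
the left side, minus bytes listed on the right side.  The Python
sketch below models "[\C^\u]" that way.  It assumes that "\C" is
the consonant class listed earlier and that "\u" covers the ASCII
capital letters A-Z; the function name mixed_class is
hypothetical.

```python
import string

# Assumption: "\C" is the consonant class listed in this section,
# and "\u" covers the ASCII capital letters A-Z.
LOWER_CONSONANTS = "bcdfghjklmnpqrstvwxz"
consonants = {ord(c) for c in LOWER_CONSONANTS + LOWER_CONSONANTS.upper()}
capitals = {ord(c) for c in string.ascii_uppercase}


def mixed_class(allowed: set, excluded: set) -> set:
    """Bytes in <set1> but NOT in <set2>, as in [<set1>^<set2>]."""
    return allowed - excluded


result = mixed_class(consonants, capitals)
print(bytes(sorted(result)).decode("ascii"))  # prints "bcdfghjklmnpqrstvwxz"
```

Only the lowercase consonants remain, which is what the
description of "[\C^\u]" leads one to expect.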


	pdgreppe -1 -Hjc "[\C!\u]" file_id.diz


..."[\C!\u]" also finds some single Consonants "\C" that are NOT
"!" capital letters "\u".

MAGIC bytes like "$[.(){}<>#%/;=?*+&|~`'@:" do not have any
special significance in a [Range].  But once a [Range] has ended
with "]", they regain their magic capabilities.

MAGIC bytes "!^-,]" DO HAVE SPECIAL SIGNIFICANCE in a
[Range].  To search for them within a range, precede each of
them with an escape byte inside the range specification.

e.g.

	[\!\^\-\,\]]

will search for any one of

	!^-,]

as a byte.

If option -E is used, they should be used "as is"

e.g.

	\[!^-,]\]

will search for any one of

	!^-,]

as a byte, like the preceding example.  Here "\[" and "\]" are
escaped so that they act as range start "[" and range stop
"]".

In mixed [Ranges]

	[<set1>^<set2>] or [<set1>!<set2>]

the "^" or "!" that starts the negative part of the [Range]
can only be used ONCE.

Every [Range] started with "[" MUST have a matching "]" in the
pattern to end it.

To make standard byte sets easy to enter in a [Range], a TYPE
byte (described later) can also be used.

Also, ranges can have elements specified by escaped number
values like \097 or \x61 for the byte 'a'.
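
The two example escapes name the same byte: 97 in decimal is 61
in hexadecimal, which is 'a'.  The Python sketch below decodes
just these two forms.  It assumes that digits after "\" are
decimal and that "\x" introduces hexadecimal, as the example
suggests; the function name decode_numeric_escape is
hypothetical, and any other forms PDGREPPE may accept are not
covered.

```python
def decode_numeric_escape(token: str) -> int:
    """Turn an escape such as r'\\097' or r'\\x61' into a byte value.

    Assumption: plain digits are decimal and an "x" prefix means
    hexadecimal, as inferred from the example in the text.
    """
    if not token.startswith("\\") or len(token) < 2:
        raise ValueError("not an escaped number: %r" % token)
    digits = token[1:]
    if digits[0] in "xX":
        value = int(digits[1:], 16)
    else:
        value = int(digits, 10)
    if not 0 <= value <= 255:
        raise ValueError("not a single byte: %r" % token)
    return value


print(decode_numeric_escape(r"\097") == ord("a"))
print(decode_numeric_escape(r"\x61") == ord("a"))
```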

Other:

Every [Range] uses up at least 256 bytes of memory the first
time it occurs in a pattern and a few more bytes for every exact
repeat of itself.  Using lots of different ranges can put a load
on the program, but repeating the same range many times has
minimal effect.  A pattern can use up to 256 Ranges.
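
A cost of 256 bytes per [Range] is what a one-entry-per-byte
lookup table would need, and such a table also explains the fast
membership test.  The Python sketch below shows that general
technique.  It is not taken from PDGREPPE's source, and the names
get_table and byte_in_class are hypothetical.

```python
_tables = {}  # hypothetical cache: one table per distinct set of bytes


def get_table(members: bytes) -> bytearray:
    """Return a 256-entry table with 1 at each member byte value.

    An exact repeat of the same set reuses the cached table, so it
    costs almost nothing extra.
    """
    key = bytes(sorted(set(members)))
    table = _tables.get(key)
    if table is None:
        table = bytearray(256)
        for b in key:
            table[b] = 1
        _tables[key] = table
    return table


def byte_in_class(table: bytearray, data: bytes, pos: int) -> bool:
    """Test the byte at pos; fails if there is no byte at that position."""
    return 0 <= pos < len(data) and table[data[pos]] == 1


abc = get_table(b"abc")
print(byte_in_class(abc, b"xbz", 1))       # "b" is in [abc]
print(byte_in_class(abc, b"xbz", 0))       # "x" is not
print(get_table(b"abc") is abc)            # repeat shares one table
```

The test itself is a single index into the table, whatever the
size of the set.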


© Intelligence Services 1987 - 2008   GPO Box 9,   ADELAIDE SA 5001,   AUSTRALIA
EMAIL   :   intlsvs@gmail.com