{n, k}
matches between n and k repetitions, inclusive. If k is
omitted, at least n repetitions are matched. If the interval has
the form
{n}
exactly n repetitions are matched.
The examples below will clarify this.
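The three interval forms described above can be sketched in Python as follows. This is only an illustration, not part of the query processor: the helper names interval_bounds and repetitions_ok are hypothetical, and writing the "k omitted" form as {n,} is an assumption.

```python
import re

# Hypothetical helper pattern: accepts "{n,k}", "{n,}" (assumed spelling
# of the form with k omitted) and "{n}", with optional spaces.
_INTERVAL = re.compile(r"\{\s*(\d+)\s*(,\s*(\d*)\s*)?\}")

def interval_bounds(spec):
    """Return (minimum, maximum) repetitions; maximum is None if unbounded."""
    m = _INTERVAL.fullmatch(spec.strip())
    if m is None:
        raise ValueError(f"not an interval: {spec!r}")
    n = int(m.group(1))
    if m.group(2) is None:       # form {n}: exactly n repetitions
        return n, n
    if m.group(3) == "":         # k omitted: at least n repetitions
        return n, None
    return n, int(m.group(3))    # form {n,k}: between n and k repetitions

def repetitions_ok(count, spec):
    """True if `count` repetitions satisfy the interval `spec`."""
    lo, hi = interval_bounds(spec)
    return count >= lo and (hi is None or count <= hi)
```

For example, repetitions_ok(8, "{0,8}") holds, while repetitions_ok(9, "{0,8}") does not.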
[Boolean expression]
that is, an attribute expression is a boolean expression surrounded by
square brackets.
attribute_name operator string
where attribute_name is either word or
pos (in this demo version),
operator is either the "match" operator
= or the non-match operator !=, and
string is either a plain string or a (POSIX egrep)
regular expression, enclosed in double quotes in either case.
In the full version, there are some additional forms of a boolean expression, but these are not available here.
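To make the form attribute_name operator string concrete, here is a minimal Python sketch of how such an expression might be evaluated against a single token. The function name matches and the representation of a token as a dictionary are hypothetical. Python's re module stands in for POSIX egrep syntax, and the sketch assumes that the expression must match the whole attribute value.

```python
import re

def matches(token, attribute, operator, pattern):
    """Evaluate `attribute operator "pattern"` against one token.

    `token` is a dict such as {"word": "confused", "pos": "VBN"}.
    Assumption: the pattern has to match the entire attribute value.
    """
    hit = re.fullmatch(pattern, token[attribute]) is not None
    if operator == "=":      # match operator
        return hit
    if operator == "!=":     # non-match operator
        return not hit
    raise ValueError(f"unknown operator: {operator!r}")

token = {"word": "confused", "pos": "VBN"}
print(matches(token, "word", "=", "confus.*"))   # word begins with confus
print(matches(token, "pos", "!=", "NN.*"))       # pos is not a noun tag
```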
[word="confus.*"];
but if only the word attribute and the match operator
= are used, this can be abbreviated to
"confus.*";
"on"%c;
(Diacritic-insensitive search is possible in a similar way.)
"confus.*" []* "by";
This query finds all sequences consisting of a word beginning with confus,
followed by any number of arbitrary tokens ([]*), and then
followed by the word by. The match-any operator [] must not be
the first expression in a query.
"confus.*" []* "by" within 10;
The within statement is always the very last statement
directly preceding the semicolon, that is, it must not be followed by any
more attribute expressions. Alternatively, the previous query can be expressed
with the interval operator:
"confus.*" []{0,8} "by";
Here, at most 8 words may lie between confus and
by, so the whole match spans at most 10 tokens. Additionally, each
structural attribute defined for a corpus may be used to limit the search
space (in this demo corpus, the structural attribute s for "sentence" is
provided):
"confus.*" []* "by" within s;
The within s statement requires that the whole match lies within
one sentence. The name of the structural attribute may be preceded by a
number, in which case the match must lie within that number of sentences.
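The correspondence between within 10 and []{0,8} in the queries above follows from counting the two outer words as part of the match: the span is the gap plus 2. The Python sketch below illustrates this on a list of word forms. It is a simplified illustration with a hypothetical function find_matches, it lists every candidate pair rather than reproducing the query processor's own matching strategy, and it uses Python's re module in place of POSIX egrep syntax.

```python
import re

def find_matches(words, first, last, max_gap=None, within=None):
    """Return (start, end) index pairs (end inclusive) where words[start]
    matches `first`, words[end] matches `last`, and the limits hold.

    max_gap: maximum number of tokens between the two words, as in []{0,max_gap}
    within:  maximum number of tokens in the whole match, as in `within N`
    """
    results = []
    for i, word in enumerate(words):
        if not re.fullmatch(first, word):
            continue
        for j in range(i + 1, len(words)):
            if not re.fullmatch(last, words[j]):
                continue
            gap = j - i - 1     # tokens strictly between the two words
            span = j - i + 1    # tokens in the whole match
            if max_gap is not None and gap > max_gap:
                continue
            if within is not None and span > within:
                continue
            results.append((i, j))
    return results

words = "she was confused by the noise made by them".split()
print(find_matches(words, "confus.*", "by", within=10))   # [(2, 3), (2, 7)]
print(find_matches(words, "confus.*", "by", max_gap=8))   # [(2, 3), (2, 7)]
```

Both calls return the same pairs, because a match of at most 10 tokens leaves room for at most 8 tokens between its first and last word.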
<s> [pos="NPS"] []* [pos="VB.*"];
Here, a proper noun (singular) must occur at the beginning of a sentence,
followed by an arbitrary number of unspecified words, and finally followed
by a verb.
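The two uses of the structural attribute s described above, as a search space limit (within s) and as a sentence-start anchor (<s>), can be sketched in Python. The representation is an assumption made for illustration only: sentences are given as a list of inclusive (first, last) token index pairs that cover the corpus without overlapping, and both function names are hypothetical.

```python
def within_sentences(start, end, sentences, limit=1):
    """True if the match from token `start` to token `end` (inclusive)
    touches at least one and at most `limit` sentences."""
    touched = [k for k, (first, last) in enumerate(sentences)
               if first <= end and start <= last]
    return 0 < len(touched) <= limit

def starts_sentence(start, sentences):
    """True if token `start` is the first token of some sentence."""
    return any(first == start for first, _ in sentences)

# Two sentences: tokens 0..4 and tokens 5..9.
sentences = [(0, 4), (5, 9)]
print(within_sentences(2, 4, sentences))            # inside the first sentence
print(within_sentences(3, 6, sentences))            # crosses a boundary
print(within_sentences(3, 6, sentences, limit=2))   # allowed with a limit of 2
print(starts_sentence(5, sentences))                # token 5 opens a sentence
```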