> "in" @[pos="DT"] [lemma="case"];
shown in bold font in KWIC display
> [pos="DT"] (@[pos="JJ.*"] ","?){2,} [pos="NNS?"];
> A = [pos="DT"] @[pos="JJ"]? [pos="NNS?"];
> size A;
> size A target;
> sort by attribute on start point .. end point ;
both start point and end point are specified as an anchor, plus an optional offset in square brackets; for instance, match[-1] refers to the token before the start of the match, matchend to the last token of the match, matchend[1] to the first token after the match, and target[-2] to the position two tokens to the left of the target anchor
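The anchor-plus-offset notation can be illustrated with a small Python sketch. This is not part of CQP: the helper `resolve` and the `anchors` dictionary are hypothetical names introduced here, and the sketch ignores clipping at corpus boundaries.

```python
import re

# Hypothetical helper (not a CQP function): resolve an anchor expression
# such as "match[-1]" or "matchend" to a corpus position.
ANCHOR_RE = re.compile(r"^(match|matchend|target|keyword)(?:\[(-?\d+)\])?$")

def resolve(expr, anchors):
    """Return the corpus position for expr, or None if the anchor is unset."""
    m = ANCHOR_RE.match(expr)
    if m is None:
        raise ValueError(f"not an anchor expression: {expr!r}")
    name = m.group(1)
    offset = int(m.group(2) or 0)   # no brackets means offset 0
    base = anchors.get(name)
    if base is None or base < 0:    # an unset anchor is treated as undefined
        return None
    return base + offset

# Example positions for one match of three tokens with a target in the middle.
anchors = {"match": 1924977, "matchend": 1924979, "target": 1924978}

print(resolve("match[-1]", anchors))    # token before the start of the match
print(resolve("matchend[1]", anchors))  # first token after the match
print(resolve("target[-2]", anchors))   # two tokens to the left of the target
print(resolve("keyword", anchors))      # None: keyword anchor not set
```

The `None` result for an unset anchor mirrors the caveat that the target anchor should only appear in a sort key when it is always defined.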
NB: the target anchor should only be used in the sort key when it is always defined
> [pos="DT"] [pos="JJ"]{2,} [pos="NNS?"];
> sort by word %cd on match[1] .. matchend[-1];
> sort by word %cd on match[-1] .. match[-42];
whereas the reverse option sorts on the left context by character
> sort by word %cd on match[-42] .. match[-1] reverse;
> sort by word %cd;
> set ExternalSort on;
> sort by word %cd;
> set ExternalSort off;
> count by lemma on match[1] .. matchend[-1];
> A = "behind" @[pos="JJ"]? [pos="NNS?"];
> dump A;
> dump A 9 14;
(10th to 15th match)
the four columns correspond to the match, matchend, target and keyword (see Section 3.6) anchors; a value of -1 means that the anchor has not been set:
1019887 1019888 -1 -1
1924977 1924979 1924978 -1
1986623 1986624 -1 -1
2086708 2086710 2086709 -1
2087618 2087619 -1 -1
2122565 2122566 -1 -1
note that a previous sort or count command affects the ordering of the rows (so that the n-th row corresponds to the n-th line in a KWIC display obtained with cat)
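As an illustration of the four-column dump format, the following Python sketch reads such a table back into named fields. The names `Row` and `parse_dump` are hypothetical and not part of CQP; the sample rows are the ones shown above, and -1 is mapped to `None` to mark an unset anchor.

```python
from typing import NamedTuple, Optional

class Row(NamedTuple):
    match: int
    matchend: int
    target: Optional[int]    # None when the dump shows -1
    keyword: Optional[int]   # None when the dump shows -1

def parse_dump(text):
    """Parse whitespace-separated dump rows; skip lines without 4 fields."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        m, me, t, k = (int(f) for f in fields)
        rows.append(Row(m, me, None if t == -1 else t, None if k == -1 else k))
    return rows

sample = """\
1019887 1019888 -1 -1
1924977 1924979 1924978 -1
1986623 1986624 -1 -1
2086708 2086710 2086709 -1
2087618 2087619 -1 -1
2122565 2122566 -1 -1
"""

rows = parse_dump(sample)
with_target = [r for r in rows if r.target is not None]
print(len(rows), "rows,", len(with_target), "with a target anchor")
```

Only the rows where the optional adjective was matched carry a target position, which is why the target anchor is unsafe as a sort key for this query.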
the output of a dump command can be written (>) or appended (>>) to a file; if the first character of the filename is |, the output is sent to the pipe consisting of the following command(s); use the following trick to display the distribution of match lengths in the query result A:
> A = [pos="DT"] [pos="JJ.*"]* [pos="NNS?"];
> dump A > "| gawk '{print $2 - $1 + 1}' | sort -nr | uniq -c | less";
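For readers without gawk at hand, the same computation can be sketched in Python. The function name `length_distribution` is hypothetical; the sketch assumes its input lines are dump rows (for example, read from a file written with `dump A > "file"`) and uses the six sample rows shown earlier as stand-in data.

```python
from collections import Counter

def length_distribution(lines):
    """Count match lengths (matchend - match + 1), longest length first."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue                      # skip blank or malformed lines
        start, end = int(fields[0]), int(fields[1])
        counts[end - start + 1] += 1      # same formula as the gawk script
    # Descending by length, like `sort -nr | uniq -c`.
    return sorted(counts.items(), reverse=True)

sample = [
    "1019887 1019888 -1 -1",
    "1924977 1924979 1924978 -1",
    "1986623 1986624 -1 -1",
    "2086708 2086710 2086709 -1",
    "2087618 2087619 -1 -1",
    "2122565 2122566 -1 -1",
]

for length, n in length_distribution(sample):
    print(f"{n:7d} {length}")             # count first, then length
```

The output columns follow the `uniq -c` layout: the number of matches, then the match length in tokens.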