Online CQP Demos
Search the English sample corpus DICKENS online.
CQP query - simple query - corpus tools
The DICKENS corpus is a collection of novels by Charles Dickens, including
A Christmas Carol, David Copperfield, Dombey and Son, Great Expectations, Hard Times,
Master Humphrey's Clock, Nicholas Nickleby, Oliver Twist, Our Mutual Friend, Sketches by Boz,
A Tale of Two Cities, The Old Curiosity Shop, The Pickwick Papers, and Three Ghost Stories.
This corpus amounts to a total of 3.4 million running words. It has been part-of-speech tagged
and lemmatised with the TreeTagger, using its standard English parameter file.
In addition, noun phrases (NP) and prepositional phrases (PP) were annotated with a
parser developed by Helmut Schmid.
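
To make the annotation layers described above more concrete, here is a minimal Python sketch of how token-level annotations (word form, part-of-speech tag, lemma) and phrase-level annotations (noun phrases as spans of token positions) might be represented and searched. The sentence, the tags, the `Token` class, and both helper functions are hypothetical illustrations; they do not reflect the actual storage format of the corpus or the query language used by the demo.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Token:
    word: str   # surface form as it appears in the text
    pos: str    # part-of-speech tag (illustrative Penn-style tags)
    lemma: str  # base form of the word


# Hypothetical hand-annotated sentence, not actual corpus data.
TOKENS: List[Token] = [
    Token("The", "DT", "the"),
    Token("old", "JJ", "old"),
    Token("ghosts", "NNS", "ghost"),
    Token("haunted", "VBD", "haunt"),
    Token("the", "DT", "the"),
    Token("house", "NN", "house"),
]

# Noun phrases stored as inclusive (start, end) token positions.
NP_SPANS: List[Tuple[int, int]] = [(0, 2), (4, 5)]


def positions_with(tokens: List[Token], attribute: str, value: str) -> List[int]:
    """Return the positions of all tokens whose attribute equals value."""
    return [i for i, tok in enumerate(tokens) if getattr(tok, attribute) == value]


def phrases_with_lemma(
    tokens: List[Token], spans: List[Tuple[int, int]], lemma: str
) -> List[str]:
    """Return the text of every span containing a token with this lemma."""
    hits = set(positions_with(tokens, "lemma", lemma))
    return [
        " ".join(tok.word for tok in tokens[start:end + 1])
        for start, end in spans
        if any(start <= pos <= end for pos in hits)
    ]


if __name__ == "__main__":
    print(positions_with(TOKENS, "pos", "DT"))            # [0, 4]
    print(phrases_with_lemma(TOKENS, NP_SPANS, "ghost"))  # ['The old ghosts']
```

Searching by lemma rather than by word form is what lets the plural "ghosts" match a query for "ghost", and keeping phrases as position spans allows phrase-level and token-level annotations to be combined in a single lookup.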
Query German parliamentary debates (BUNDESTAG) online.
CQP query - simple query - corpus tools
The BUNDESTAG corpus contains Hansards of the German Parliament (Bundestag)
covering the parliamentary term from 1994 to 1997.
This corpus amounts to a total of 5.7 million running words. It has been
annotated with a rich variety of linguistic information. The token-level annotations comprise
part-of-speech tags (TreeTagger),
lemmata, and morpho-syntactic information (both IMSLex).
In addition, a partial phrase-structure analysis was performed with the YAC chunker
developed by Hannah Kermes.