ICARUS: Interactive platform for Corpus Analysis and Research tools, University of Stuttgart
- Markus Gärtner and Gregor Thiele
ICARUS is a search and visualization tool that primarily targets dependency trees. It lets the user search dependency treebanks under a variety of constraints, including searches for particular subtrees. A central design goal is to let the user switch back and forth between a high-level, aggregated view of the search results and browsing of individual corpus instances.
The only requirement for running ICARUS is a Java runtime environment, version 7 or later, installed on the host computer.
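As a rough illustration of checking this requirement, the sketch below extracts the major version from the text that `java -version` prints. The function names `java_major_version` and `meets_icarus_requirement` are hypothetical helpers for this example and are not part of ICARUS. The sketch assumes the usual output shape, where releases before Java 9 report themselves as `1.x` (so `1.7.0_80` is Java 7) and later releases report the major version first.

```python
import re


def java_major_version(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    # Matches e.g.  java version "1.7.0_80"  or  openjdk version "11.0.2"
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    # Releases before Java 9 use the 1.x scheme: 1.7 means Java 7.
    if first == 1 and second is not None:
        return int(second)
    return first


def meets_icarus_requirement(version_output, minimum=7):
    """True if the reported Java version is at least `minimum` (hypothetical helper)."""
    major = java_major_version(version_output)
    return major is not None and major >= minimum
```

Note that `java -version` writes to standard error rather than standard output, so a caller using `subprocess.run(["java", "-version"], capture_output=True, text=True)` would pass the `stderr` field of the result to these functions.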
ICARUS ships with only a small sample treebank (9 sentences of automatically parsed Wikipedia content). Additional small sample treebanks in CoNLL 09 format can be downloaded from the CoNLL 2009 Shared Task website. You can also try the dependency version of the German TiGer treebank, which can be downloaded here.
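For readers unfamiliar with the CoNLL 09 format mentioned above, the following minimal sketch reads such a file outside of ICARUS. It is not ICARUS code, and the names `read_conll09` and `dependency_edges` are hypothetical. It assumes the standard CoNLL 2009 layout: one token per line, tab-separated columns, a blank line between sentences, and `HEAD` value `0` for the root.

```python
# Fixed columns of the CoNLL 2009 shared task format. Any columns after PRED
# are argument columns (APRED), one per predicate in the sentence.
CONLL09_COLUMNS = [
    "ID", "FORM", "LEMMA", "PLEMMA", "POS", "PPOS", "FEAT", "PFEAT",
    "HEAD", "PHEAD", "DEPREL", "PDEPREL", "FILLPRED", "PRED",
]


def read_conll09(lines):
    """Group tab-separated token lines into sentences (lists of token dicts)."""
    sentences = []
    current = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split("\t")
        token = dict(zip(CONLL09_COLUMNS, fields))
        token["APREDS"] = fields[len(CONLL09_COLUMNS):]
        current.append(token)
    if current:
        sentences.append(current)
    return sentences


def dependency_edges(sentence):
    """Return (head form, relation, dependent form) triples using the HEAD column."""
    forms = {tok["ID"]: tok["FORM"] for tok in sentence}
    # HEAD "0" has no matching token ID and is reported as the artificial ROOT.
    return [
        (forms.get(tok["HEAD"], "ROOT"), tok["DEPREL"], tok["FORM"])
        for tok in sentence
    ]
```

A file can be passed directly, for example `read_conll09(open(path, encoding="utf-8"))`, where `path` points to one of the downloaded sample treebanks. The sketch uses the `HEAD` and `DEPREL` columns; the `PHEAD` and `PDEPREL` columns hold the automatically predicted counterparts.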
ICARUS is open source and freely available under the GNU General Public License, version 3.
Binaries that run out of the box are available here. Note that ICARUS requires Java version 7 or later to run. The distribution includes sources, required third-party libraries and a small tutorial for getting started.
Since version 1.1.3, the above distribution no longer contains a documentation folder. The entire set of javadocs has been moved to a separate zip file, available here.
Please send questions and bug reports to firstname.lastname@example.org.
A tutorial on getting started with ICARUS can be downloaded here (it is also included in the binary distribution).
The documentation, which includes some tutorials and videos, can be found in our wiki. Note that this documentation is intended for end users. Developers are advised to check out the javadocs zip file linked in the Download section.
2016-10-20: Version 1.3.7 released. Overhaul of several search features and the coreference graph visualization. New reader implementations for general tabular formats, making it easier to access data that does not fit into existing (fixed) formats.
2015-07-20: Version 1.2.6 released. New plugin to integrate search and visualization features for text and speech data. Annotations on syllable level can be accessed together with the existing syntax and coreference modules. The new module provides very fine-grained audio playback for multi-modal corpora and introduces a novel export function for search results.
2015-04-23: Version 1.2.5 released. Large number of small fixes. Search UI got an overhaul to the Cell-Editor. Several search constraints received small changes to their semantics and/or features (e.g. edges representing precedence relations now honor the 'distance' constraint). A new allocation reader for the coreference plugin was added that supports using data in the CoNLL 2012 format as allocations.
2014-07-08: Version 1.1.4 released. Small fixes for a couple of minor bugs in the search perspective and coreference visualization. The most notable new feature is the ability to save and restore entire searches. Currently this covers searches on dependency and coreference data, but not error mining. Both the save and restore functions are available in the search manager view.
2014-06-17: Version 1.1.3 released. A couple of minor adjustments in preparation for the ACL demo session. Improved quantitative breakdowns in the error mining result visualizations. The default distribution no longer contains javadoc files. These have been moved into a separate zip file available in the Download section.
2014-04-24: Version 1.1.2 released. A couple of minor adjustments in preparation for the EACL demo session.
2014-04-10: Version 1.1.0 released. Besides a large number of minor performance improvements, UI overhauls and bugfixes, the new version features 2 additional plugins dedicated to interactive error mining and incorporation of coreference annotated data.
2014-01-14: Version 1.0.5 released. Several minor bug fixes related to the search engine and visualization in general.
2013-11-28: Version 1.0.4 released. Large number of small fixes. New distribution now contains sources and javadoc resources.
2013-10-24: Version 1.0.2 released. Minor bug fixes, including a bug that caused the search engine to crash on certain platforms.
2013-09-13: Version 1.0.1 released. Minor bug fixes.
- Markus Gärtner, Katrin Schweitzer, Kerstin Eckart and Jonas Kuhn. Multi-modal Visualization and Search for Text and Prosody Annotations. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing: System Demonstrations, Beijing, China, July 27--29, 2015. [pdf] [poster] [bibtex] [code]
- Katrin Schweitzer, Markus Gärtner, Arndt Riester, Ina Rösiger, Kerstin Eckart, Jonas Kuhn, Grzegorz Dogil. Analysing automatic descriptions of intonation with ICARUS. In INTERSPEECH-2015, 319-323, Dresden, Germany, September 6--10, 2015. [pdf] [poster]
- Markus Gärtner, Anders Björkelund, Gregor Thiele, Wolfgang Seeker, and Jonas Kuhn. Visualization, Search, and Error Analysis for Coreference Annotations. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, Maryland, June 23--25, 2014. [pdf] [poster] [bibtex] [code]
- Gregor Thiele, Wolfgang Seeker, Markus Gärtner, Anders Björkelund and Jonas Kuhn. A Graphical Interface for Automatic Error Mining in Corpora. In Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics, Gothenburg, Sweden, April 28--30, 2014. [pdf] [poster] [bibtex] [code]
- Markus Gärtner, Gregor Thiele, Wolfgang Seeker, Anders Björkelund and Jonas Kuhn. ICARUS – An Extensible Graphical Search Tool for Dependency Treebanks. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Sofia, Bulgaria, August 5--7, 2013. [pdf] [poster] [bibtex] [code]