rCAT – Relational Character Analysis Tool
- Type
-
Tool
- Authors
-
Florian Barth, Evgeny Kim, Sandra Murr, Roman Klinger
- Description
-
A web version of rCAT is available.
The web version is currently being deployed. GitHub page of the tool: https://github.com/kimikadze/web-rcat
You can still use the desktop version of the program. Please follow the instructions below to get the tool running.
IMPORTANT: The desktop version of the program is outdated, as the web version includes features not implemented in the desktop version. The descriptions of the features and the user interface on GitHub and on this page may differ.
macOS Users:
- Install LaTeX (http://www.tug.org/mactex/). When asked whether to install missing packages automatically, select "Yes".
- Download the Mac version of rCAT from the Download section below.
- Unzip rCAT to your Home directory (the directory that contains folders such as Documents, Downloads, and Pictures, and whose name corresponds to your user name).
- Open the "rcat" folder, find the "rcat" executable file, and double-click it.
- The tool launches. See the "Working with the tool" section below.
Windows Users:
- Install LaTeX (http://www.tug.org/texlive/acquire-netinstall.html) for Windows. Run the installer and select the simple download method (big).
- Download the Windows version of rCAT from the Download section below.
- Unzip rCAT.
- Open the "rcat" folder, find the "rcat" executable file (file type "Application"), and double-click it.
- The tool launches. See the "Working with the tool" section below.
Linux Users:
Download the tool from the Download section below. Once all dependencies are installed, the tool can be started from the tool directory with the command python rcat.py
Install all the dependencies.
Major dependencies (skip to the other dependencies if you already have these):
- Install Graphviz, a visualization package (http://www.graphviz.org/Download.php).
- Install LaTeX (https://www.latex-project.org/get/).
Other dependencies:
- pip install pylatex
- pip install numpy
- pip install nltk
- pip install stanfordcorenlp
- pip install graphviz
- pip install wordcloud
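Before launching the tool, it can help to confirm that the Python packages listed above are importable. The following is a minimal sketch and not part of rCAT: the helper name missing_modules is hypothetical, and it assumes that each package is imported under the same name it is installed with. It does not check the Graphviz or LaTeX installations themselves.

```python
import importlib.util

# Import names of the Python packages listed above
# (assumption: import name equals pip name for each of them).
RCAT_PACKAGES = ["pylatex", "numpy", "nltk", "stanfordcorenlp", "graphviz", "wordcloud"]


def missing_modules(names):
    """Return the names that cannot be found in the current Python environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(RCAT_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed Python packages are importable.")
```

Any name reported as missing can be installed with the corresponding pip command from the list above.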
Then open a Python environment and run:
- import nltk
- nltk.download()
A window opens: select "popular packages" and download them. Close the window once the download has finished.
Working with the tool
The GUI has the following elements:
Book: click Open to select the text you want to analyze. By default, the tool opens the app's internal directory containing Goethe's novel ("Die Leiden des jungen Werthers.txt"). Select this file and click Open, or select any other text on your hard drive.
Characters: click Open to select the file with character names. By default, the tool opens the app's internal directory containing the list of characters for Goethe's "Die Leiden des jungen Werthers" ("Goethe_Werther_characters.txt"). Select this file and click Open.
- The file with character names should be formatted as follows: each line starts with the canonical name of a single character, followed by that character's aliases, separated by tabs. Each character and his/her list of aliases must be entered on a separate line.
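To illustrate the character file format, here is a minimal sketch of how such a file could be read. It is not taken from the rCAT source: the function name parse_character_lines is hypothetical, and the aliases in the sample are invented for illustration.

```python
from typing import Dict, Iterable, List


def parse_character_lines(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each canonical name to its list of aliases.

    Each non-empty line holds a canonical name followed by
    zero or more tab-separated aliases.
    """
    characters: Dict[str, List[str]] = {}
    for raw in lines:
        # Split on tabs and drop empty fields (e.g. from trailing tabs).
        fields = [field.strip() for field in raw.rstrip("\n").split("\t")]
        fields = [field for field in fields if field]
        if not fields:
            continue  # skip blank lines
        characters[fields[0]] = fields[1:]
    return characters


# Illustrative content only; the aliases are made up for this example.
sample = [
    "Werther\tWerthers\n",
    "Lotte\tCharlotte\tLottchen\n",
    "Albert\n",
]
print(parse_character_lines(sample))
```

A character without aliases, like the last sample line, simply maps to an empty list.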
Distance measure: integer. The number of words between mentions of two characters up to which the mentions are considered to be in proximity. Default is 8.
Context measure 1: integer. How many words before the mention of the first character to include in the contextual analysis. Default is 5.
Context measure 2: integer. How many words after the mention of the second character to include in the contextual analysis. Default is 5.
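The interplay of the distance measure and the two context measures can be sketched as follows. This is an illustration under stated assumptions, not rCAT's actual implementation: it assumes single-token character names, counts the words strictly between two mentions, and uses the hypothetical function name cooccurrence_contexts.

```python
from typing import List


def cooccurrence_contexts(
    tokens: List[str],
    name_a: str,
    name_b: str,
    distance: int = 8,
    before: int = 5,
    after: int = 5,
) -> List[List[str]]:
    """Collect a context window for every pair of mentions in proximity.

    Two mentions are in proximity when at most `distance` words lie
    between them. The window starts `before` words ahead of the first
    mention and ends `after` words behind the second mention.
    """
    positions_a = [i for i, token in enumerate(tokens) if token == name_a]
    positions_b = [i for i, token in enumerate(tokens) if token == name_b]
    contexts = []
    for i in positions_a:
        for j in positions_b:
            first, second = min(i, j), max(i, j)
            if second - first - 1 <= distance:
                start = max(0, first - before)
                end = min(len(tokens), second + after + 1)
                contexts.append(tokens[start:end])
    return contexts


tokens = "a b Werther c d Lotte e f".split()
print(cooccurrence_contexts(tokens, "Werther", "Lotte", distance=2, before=1, after=1))
```

In this toy sentence two words separate the mentions, so the pair counts as a co-occurrence with distance=2 but not with distance=1.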
Remove stop words (y/n)?: whether to remove stop words from the contextual analysis. Default is Yes.
Write a csv file for Gephi? (y/n): whether to output a CSV file for later use in the Gephi software.
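For orientation, an edge list that Gephi can import as a spreadsheet might be written as in the sketch below. The column layout (Source, Target, Weight) is an assumption about a typical Gephi edge table and is not confirmed to be the layout rCAT produces; the function name write_gephi_edges and the weights are hypothetical.

```python
import csv
import io


def write_gephi_edges(edges, handle):
    """Write (source, target, weight) tuples as a CSV edge list."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])  # assumed header layout
    for source, target, weight in edges:
        writer.writerow([source, target, weight])


# Made-up example weights, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_gephi_edges([("Werther", "Lotte", 12), ("Lotte", "Albert", 7)], buffer)
print(buffer.getvalue())
```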
Word clouds to show: integer. Show only the top n word clouds for each character and each character pair. Word clouds for characters are sorted by the degree of the character; word clouds for character pairs are sorted by edge weight. Default is 5.
Segments: integer. Number of segments into which the book should be split to track the word field development of the story. Default is 10.
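Splitting a text into a fixed number of segments can be sketched as follows. This is one plausible approach, assuming the text is already tokenized into words and that segments should be as equal in length as possible; how rCAT itself segments the book is not specified here, and the function name split_into_segments is hypothetical.

```python
from typing import List


def split_into_segments(tokens: List[str], n_segments: int = 10) -> List[List[str]]:
    """Split tokens into n_segments consecutive, nearly equal parts."""
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    base, extra = divmod(len(tokens), n_segments)
    segments, start = [], 0
    for k in range(n_segments):
        # The first `extra` segments receive one additional token.
        size = base + (1 if k < extra else 0)
        segments.append(tokens[start:start + size])
        start += size
    return segments


words = "one two three four five six seven eight nine ten".split()
print(split_into_segments(words, 3))
```

Counting word field hits in each segment then yields one value per segment, which can be plotted to follow the development across the story.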
Analyze with word fields: there are two ways in which you can provide word fields.
- Single category: One plain text file with one word per line. The tool will then use these words to characterize characters and relations between characters, and plot the development of this word field in a single plot.
- Multi-category: A folder with multiple files structured as described above. File names correspond to the categories of the word fields. The tool will plot the development of these word fields in multiple plots. Warning: multi-category word clouds are not currently supported.
Sample word fields are located in the "word_fields" folder within the app. The folder "multiple_cat_emotions" contains multiple categories of emotions.
Important! If you select the Single category radio button, you must click Select file, and if you select the Multi-category radio button, you must click Select folder. Leave the field blank if you don't need word field analysis.
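The two word field layouts described above can be read with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and lowercasing the words before matching is an assumption rather than documented rCAT behavior.

```python
from pathlib import Path
from typing import Dict, Iterable, Set


def load_word_field(path) -> Set[str]:
    """Read a single-category word field: one word per line."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def load_word_fields(folder) -> Dict[str, Set[str]]:
    """Read a multi-category folder; each file name (without extension) is a category."""
    return {
        path.stem: load_word_field(path)
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    }


def count_hits(tokens: Iterable[str], field: Set[str]) -> int:
    """Count how many tokens belong to the given word field."""
    return sum(1 for token in tokens if token.lower() in field)
```

For example, load_word_fields applied to a folder of category files returns one set of words per category, and count_hits can then be applied to each segment of the text.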
Book language: choose between English, German, or Middle High German. This is needed to select the correct stop word removal method.
Run: run the program.
Click the Run button.
The program analyzes the text and outputs a PDF file named "relations.pdf" to the same directory.
- Reference
-
submitted to the German Digital Humanities Conference
- Download
-
The tool is available for the following operating systems:
Roman Klinger
Prof. Dr., Visiting Professor
Evgeny Kim
Former doctoral researcher