Word Use

Frequency and TF-IDF

To investigate how words are used in the corpus, I counted unique lemmata. The graph shows TF-IDF representations of the lemmatized texts projected into 2D space with UMAP.

Hovering over a text shows its top 10 words by frequency and by TF-IDF ranking.
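
The difference between the two rankings can be illustrated with a minimal sketch in plain Python. This is not the code behind the graph: the function name `top_terms`, the toy documents, and the weighting (relative term frequency times the log of inverse document frequency, one common TF-IDF variant) are all assumptions made for illustration.

```python
import math
from collections import Counter


def top_terms(docs, k=10):
    """Rank lemmata per document by raw frequency and by TF-IDF.

    docs: dict mapping a document name to its list of lemmata (hypothetical input shape).
    Returns a dict mapping each name to (top_by_frequency, top_by_tfidf).
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each lemma appears.
    df = Counter()
    for lemmata in docs.values():
        df.update(set(lemmata))

    result = {}
    for name, lemmata in docs.items():
        counts = Counter(lemmata)
        total = len(lemmata)
        # Assumed weighting: tf = count / document length, idf = log(N / df).
        tfidf = {
            w: (c / total) * math.log(n_docs / df[w])
            for w, c in counts.items()
        }
        by_freq = [w for w, _ in counts.most_common(k)]
        # Ties are broken alphabetically so the ordering is deterministic.
        by_tfidf = sorted(tfidf, key=lambda w: (-tfidf[w], w))[:k]
        result[name] = (by_freq, by_tfidf)
    return result


# Toy example: "the" is frequent everywhere, so it tops the frequency list
# but scores zero under TF-IDF.
toy_docs = {
    "a": ["the", "logos", "the", "soul"],
    "b": ["the", "city", "the", "law"],
}
rankings = top_terms(toy_docs, k=2)
```

A lemma that occurs in every document gets an inverse document frequency of zero under this weighting, which is why the two top-10 lists for the same text can look quite different.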

Use of Terms on a Group and Individual Level

Phrases

To investigate which phrases the authors use most often, I took the most frequently occurring 3-, 5-, and 7-grams in each work. The figure shows the phrases and how many times each occurred.
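
Counting n-grams of this kind can be sketched in a few lines of plain Python. The function name `most_common_ngrams` and the toy token list are hypothetical, and the sketch assumes each work is already tokenized into a list of strings; the actual preprocessing used for the figures may differ.

```python
from collections import Counter


def most_common_ngrams(tokens, n, k=10):
    """Return the k most frequent n-grams in a token list with their counts."""
    # Zipping n shifted copies of the list yields every run of n consecutive tokens.
    ngrams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in ngrams).most_common(k)


# Toy example with a single "work" (assumed input).
tokens = "a b c d a b c".split()
for n in (3, 5, 7):
    print(n, most_common_ngrams(tokens, n, k=3))
```

Longer n-grams repeat far less often than short ones, so the counts for 7-grams are typically much lower than for 3-grams in the same work.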

Use of Phrases in Works (3-grams)
Use of Phrases in Works (5-grams)
Use of Phrases in Works (7-grams)