Word Use
Frequency and TF-IDF
To investigate how words are used in the corpus, I counted unique lemmata. The graph shows the TF-IDF representations of the lemmatized texts, projected into 2D space with UMAP.
Hover over a text to see its top 10 words ranked by raw frequency and by TF-IDF score.
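The two rankings can differ because TF-IDF down-weights lemmata that appear in many texts. Here is a minimal sketch of how both top-word lists could be computed, using only the Python standard library. The function name top_words, the input layout (a dict from title to a list of lemmata), and the smoothed IDF formula are assumptions for illustration, not necessarily what the graph uses. The UMAP projection is left out.

```python
import math
from collections import Counter


def top_words(docs, k=10):
    """Return {title: (top k by frequency, top k by TF-IDF)}.

    docs is a hypothetical input: a dict mapping each title to a list
    of lemmata, already produced by a lemmatizer.
    """
    n_docs = len(docs)

    # Document frequency: the number of texts each lemma appears in.
    df = Counter()
    for lemmata in docs.values():
        df.update(set(lemmata))

    result = {}
    for title, lemmata in docs.items():
        tf = Counter(lemmata)
        # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
        tfidf = {
            word: count * (math.log((1 + n_docs) / (1 + df[word])) + 1)
            for word, count in tf.items()
        }
        by_freq = [word for word, _ in tf.most_common(k)]
        by_tfidf = sorted(tfidf, key=tfidf.get, reverse=True)[:k]
        result[title] = (by_freq, by_tfidf)
    return result


# Toy corpus: "god" occurs in both texts, "sea" only in the first.
toy = {"a": ["god"] * 4 + ["sea"] * 3, "b": ["god", "war"]}
rankings = top_words(toy, k=2)
```

In the toy corpus, "god" is the most frequent lemma in text "a", but "sea" ranks first by TF-IDF because it occurs in no other text.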
Phrases
To investigate which phrases the authors use most often, I took the most frequently occurring 3-, 5-, and 7-grams in each work. The figure shows these phrases and how many times each occurred.
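Counting n-grams amounts to sliding a window of n tokens over each work and tallying the resulting sequences. A minimal sketch follows, assuming each work is already split into a list of tokens. The function name top_ngrams and the choice to join tokens with spaces are illustrative, and whether the counts behind the figure were taken over raw tokens or lemmata is not stated here.

```python
from collections import Counter


def top_ngrams(tokens, n, k=10):
    """Return the k most frequent n-grams in tokens as (phrase, count) pairs."""
    # zip over n shifted copies of the token list yields every window of
    # n consecutive tokens; it stops at the shortest copy, so a text with
    # fewer than n tokens yields no n-grams.
    windows = zip(*(tokens[i:] for i in range(n)))
    counts = Counter(" ".join(window) for window in windows)
    return counts.most_common(k)


# Toy example; a real work would be far longer.
tokens = "sing o muse sing o muse".split()
for n in (3, 5, 7):
    print(n, top_ngrams(tokens, n, k=3))
```

Longer n-grams repeat far less often than short ones, so counts for 7-grams will usually be much lower than for 3-grams in the same work.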