Style
The corpus can be investigated from a number of stylistic perspectives.
Vocabulary Richness (All Words)
The vocabulary richness of each fable was calculated on the lemmatized work, both as a plain type-token ratio and with moving windows of size 10 and 50.
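A minimal sketch of the two measures, assuming each fable is already available as a list of lemma strings. The function names are hypothetical, and the windowed variant is read here as the moving-average type-token ratio (the mean of the ratios of all contiguous windows); the original analysis may aggregate the windows differently.

```python
from typing import Sequence


def type_token_ratio(lemmas: Sequence[str]) -> float:
    """Plain type-token ratio: distinct lemmas divided by total lemmas."""
    if not lemmas:
        return 0.0
    return len(set(lemmas)) / len(lemmas)


def moving_window_ttr(lemmas: Sequence[str], window: int) -> float:
    """Mean type-token ratio over every contiguous window of `window` lemmas.

    Assumption: a text shorter than the window falls back to the plain ratio.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(lemmas) < window:
        return type_token_ratio(lemmas)
    ratios = [
        len(set(lemmas[i:i + window])) / window
        for i in range(len(lemmas) - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    fable = ["a", "b", "a", "c"]  # placeholder lemmas, not corpus data
    print(type_token_ratio(fable))
    for size in (10, 50):
        print(size, moving_window_ttr(fable, size))
```

Because the plain ratio falls as a text grows longer, the windowed variant makes fables of different lengths easier to compare.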
Vocabulary Richness (Noun, Adj, Verb)
Restricted to nouns, adjectives, and verbs, the vocabulary richness of each fable was calculated on the lemmatized work, both as a plain type-token ratio and with moving windows of size 10 and 50.
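The restriction to content words can be sketched as a filter applied before the ratio is computed. This assumes each token is a (lemma, UPOS) pair and that only the tags NOUN, ADJ, and VERB are kept. Whether proper nouns (PROPN) or auxiliaries (AUX) were included is not stated, so they are excluded here. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumption: only these three UPOS tags count as content words.
CONTENT_UPOS = {"NOUN", "ADJ", "VERB"}


def content_lemmas(tagged: Sequence[Tuple[str, str]]) -> List[str]:
    """Keep the lemmas whose UPOS tag is NOUN, ADJ, or VERB."""
    return [lemma for lemma, upos in tagged if upos in CONTENT_UPOS]


def type_token_ratio(lemmas: Sequence[str]) -> float:
    """Plain type-token ratio: distinct lemmas divided by total lemmas."""
    if not lemmas:
        return 0.0
    return len(set(lemmas)) / len(lemmas)


if __name__ == "__main__":
    # Placeholder tokens, not corpus data.
    tagged = [("fox", "NOUN"), ("and", "CCONJ"), ("crow", "NOUN"),
              ("see", "VERB"), ("fox", "NOUN")]
    print(type_token_ratio(content_lemmas(tagged)))
```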
UPOS Tags
UPOS tags were tallied across all fables, without stop-word removal or lemmatization.
The most frequent 3-grams, 5-grams, and 7-grams of UPOS tags were also counted for each work.
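The tag tallies and n-gram counts can be sketched with `collections.Counter`, assuming each work is available as a flat list of UPOS tag strings. The function names are hypothetical, and in this sketch n-grams run across sentence boundaries, which may differ from the original analysis.

```python
from collections import Counter
from typing import Sequence, Tuple


def upos_counts(tags: Sequence[str]) -> Counter:
    """Tally how often each UPOS tag occurs."""
    return Counter(tags)


def upos_ngrams(tags: Sequence[str], n: int) -> Counter:
    """Count every contiguous run of n UPOS tags."""
    if n <= 0:
        raise ValueError("n must be positive")
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


if __name__ == "__main__":
    # Placeholder tag sequence, not corpus data.
    tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "VERB", "CCONJ"]
    print(upos_counts(tags).most_common())
    for n in (3, 5, 7):
        print(n, upos_ngrams(tags, n).most_common(3))
```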
Lengths
The length of each fable (number of tokens), average token length, mean sentence length, and mean distance between occurrences of καὶ were calculated for each work.
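The four length measures can be sketched as follows, assuming a fable is a list of sentences and each sentence is a list of token strings. Several choices here are assumptions rather than the documented method: token length is counted in NFC-normalized characters, the distance between two occurrences of καὶ is the difference of their token positions, and καὶ is matched after stripping accents so that the grave and acute forms both count. All names are hypothetical.

```python
import unicodedata
from typing import Dict, List, Optional, Sequence

KAI_BARE = "\u03ba\u03b1\u03b9"  # the letters kappa, alpha, iota without accents


def is_kai(token: str) -> bool:
    """True if the token is the conjunction, ignoring case and accents."""
    decomposed = unicodedata.normalize("NFD", token.lower())
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return bare == KAI_BARE


def length_measures(sentences: Sequence[Sequence[str]]) -> Dict[str, Optional[float]]:
    """Token count, mean token length, mean sentence length, mean gap between conjunctions."""
    tokens: List[str] = [tok for sent in sentences for tok in sent]
    n_tokens = len(tokens)
    chars = sum(len(unicodedata.normalize("NFC", tok)) for tok in tokens)
    positions = [i for i, tok in enumerate(tokens) if is_kai(tok)]
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return {
        "n_tokens": n_tokens,
        "mean_token_length": chars / n_tokens if n_tokens else 0.0,
        "mean_sentence_length": n_tokens / len(sentences) if sentences else 0.0,
        # None when the conjunction occurs fewer than twice, so no gap exists.
        "mean_kai_gap": sum(gaps) / len(gaps) if gaps else None,
    }
```

Returning `None` for the gap when the conjunction occurs fewer than twice keeps such fables distinguishable from fables with a genuine gap of zero or one.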
A 3D plot is provided for exploration.
καὶ
Number of tokens vs. occurrences of καὶ
The length of fables (number of tokens) and number of occurrences of καὶ were calculated for each work.
καὶ Richness
The καὶ richness was calculated on the lemmatized work, both as a plain καὶ-token ratio and with moving windows of size 10 and 50.
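A minimal sketch of the καὶ-token ratio, assuming each fable is a list of lemma strings. The windowed variant here returns one ratio per contiguous window and leaves any averaging to the caller, since the source does not say how the windows were aggregated. Matching the lemma after stripping accents is also an assumption, and the function names are hypothetical.

```python
import unicodedata
from typing import List, Sequence

KAI_BARE = "\u03ba\u03b1\u03b9"  # the letters kappa, alpha, iota without accents


def is_kai(lemma: str) -> bool:
    """True if the lemma is the conjunction, ignoring case and accents."""
    decomposed = unicodedata.normalize("NFD", lemma.lower())
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return bare == KAI_BARE


def kai_token_ratio(lemmas: Sequence[str]) -> float:
    """Occurrences of the conjunction divided by the total number of lemmas."""
    if not lemmas:
        return 0.0
    return sum(is_kai(lemma) for lemma in lemmas) / len(lemmas)


def moving_window_kai_ratio(lemmas: Sequence[str], window: int) -> List[float]:
    """Ratio for every contiguous window; empty if the text is shorter than the window."""
    if window <= 0:
        raise ValueError("window must be positive")
    flags = [is_kai(lemma) for lemma in lemmas]
    return [
        sum(flags[i:i + window]) / window
        for i in range(len(flags) - window + 1)
    ]
```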