Style
The corpus can be investigated from a number of stylistic perspectives.
Vocabulary Richness (All Words)
The vocabulary richness of each text was calculated on the lemmatized work, both as a plain type-token ratio and with moving windows of size 10, 50, 500, and 1000.
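The two measures can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes that "moving windows" means the mean type-token ratio over every run of consecutive lemmas of the given size, and the function names and the sample lemma list are hypothetical.

```python
def type_token_ratio(lemmas):
    """Plain type-token ratio: distinct lemmas divided by total lemmas."""
    if not lemmas:
        return 0.0
    return len(set(lemmas)) / len(lemmas)


def moving_window_ttr(lemmas, window):
    """Mean type-token ratio over every window of `window` consecutive lemmas.

    Assumption: a text shorter than the window falls back to the plain ratio.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(lemmas) < window:
        return type_token_ratio(lemmas)
    ratios = [
        len(set(lemmas[i:i + window])) / window
        for i in range(len(lemmas) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Hypothetical lemmatized text, for illustration only.
lemmas = ["ὁ", "ἀλώπηξ", "καί", "ὁ", "κόραξ", "καί", "ὁ", "τυρός"]
plain = type_token_ratio(lemmas)
windowed = moving_window_ttr(lemmas, window=4)
```

The windowed value is less sensitive to text length than the plain ratio, which tends to fall as a text grows longer.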
Vocabulary Richness (Noun, Adj, Verb)
The vocabulary richness of each fable was calculated on the lemmatized work, restricted to nouns, adjectives, and verbs, both as a plain type-token ratio and with moving windows of size 500 and 1000.
Vocabulary Richness (Others)
The vocabulary richness of each fable was calculated on the lemmatized work, both as a plain type-token ratio and with moving windows of size 500 and 1000. The POS tags used were ADV, INTJ, ADP, CCONJ, SCONJ, DET, PART, and PRON.
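Restricting the richness measure to a part-of-speech subset amounts to filtering the lemmas by UPOS tag before computing the ratio. The sketch below assumes each token is available as a (lemma, UPOS) pair. The names CONTENT_TAGS, OTHER_TAGS, and ttr_for_tags, and the sample data, are hypothetical.

```python
# Tag sets taken from the section headings and the list of tags above.
CONTENT_TAGS = {"NOUN", "ADJ", "VERB"}
OTHER_TAGS = {"ADV", "INTJ", "ADP", "CCONJ", "SCONJ", "DET", "PART", "PRON"}


def ttr_for_tags(tagged, tags):
    """Type-token ratio over the lemmas whose UPOS tag is in `tags`."""
    lemmas = [lemma for lemma, upos in tagged if upos in tags]
    if not lemmas:
        return 0.0
    return len(set(lemmas)) / len(lemmas)


# Hypothetical (lemma, UPOS) pairs, for illustration only.
tagged = [
    ("ὁ", "DET"), ("ἀλώπηξ", "NOUN"), ("ὁράω", "VERB"),
    ("ὁ", "DET"), ("κόραξ", "NOUN"), ("καί", "CCONJ"),
    ("λέγω", "VERB"), ("ὁ", "DET"), ("κόραξ", "NOUN"),
]
content_ttr = ttr_for_tags(tagged, CONTENT_TAGS)
other_ttr = ttr_for_tags(tagged, OTHER_TAGS)
```

The same filtered lemma list can be passed to a moving-window variant for the window sizes 500 and 1000.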
UPOS Tags
UPOS tags were tallied in all texts, without removing stop words and without lemmatization.
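A tally of this kind can be sketched with a counter over the tag sequence. The relative shares are an addition for illustration, since they make texts of different lengths comparable. The function name and the sample tag sequence are hypothetical.

```python
from collections import Counter


def tally_upos(tags):
    """Count how often each UPOS tag occurs in a sequence of tags."""
    return Counter(tags)


# Hypothetical tag sequence for one text, for illustration only.
tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "CCONJ", "VERB", "DET", "NOUN"]
counts = tally_upos(tags)
# Share of each tag among all tokens of the text.
shares = {tag: n / len(tags) for tag, n in counts.items()}
```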
The most frequent 3-grams, 5-grams, and 7-grams of UPOS tags were also counted for each work.
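Counting UPOS n-grams can be sketched by sliding a window of n tags over the tag sequence. This illustration treats each work as one continuous sequence, which is an assumption: the text does not say whether n-grams may cross sentence boundaries. The function name and sample data are hypothetical.

```python
from collections import Counter


def upos_ngrams(tags, n):
    """Count every run of `n` consecutive UPOS tags in the sequence."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


# Hypothetical tag sequence for one work, for illustration only.
tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "CCONJ", "VERB", "DET", "NOUN"]
trigrams = upos_ngrams(tags, 3)
most_frequent = trigrams.most_common(1)
```

Calling the same function with n set to 5 or 7 gives the longer patterns. A sequence shorter than n yields an empty counter.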
Lengths
The length of each text (number of tokens), the mean token length, and the mean sentence length were calculated for each work.
3D plot for exploration.
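The three length measures can be sketched as follows, assuming each work is available as a list of sentences and each sentence as a list of token strings. Measuring token length in characters after NFC normalization, and sentence length in tokens, are assumptions of this sketch. The function name is hypothetical.

```python
import unicodedata


def length_measures(sentences):
    """Return token count, mean token length, and mean sentence length.

    `sentences` is a list of sentences, each a list of token strings.
    Token length is counted in characters after NFC normalization, so that
    precomposed and decomposed Greek diacritics are counted the same way.
    """
    tokens = [unicodedata.normalize("NFC", tok) for sent in sentences for tok in sent]
    if not tokens:
        return {"n_tokens": 0, "mean_token_length": 0.0, "mean_sentence_length": 0.0}
    return {
        "n_tokens": len(tokens),
        "mean_token_length": sum(len(tok) for tok in tokens) / len(tokens),
        "mean_sentence_length": len(tokens) / len(sentences),
    }


# Hypothetical two-sentence work, for illustration only.
work = [["ἀλώπηξ", "καὶ", "κόραξ"], ["κόραξ", "ἐκάθητο"]]
measures = length_measures(work)
```

Each work then yields one point with three coordinates, which is the shape of data a 3D scatter plot needs.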
καὶ
Number of tokens vs. occurrences of καὶ
The length of each text (number of tokens) and the number of occurrences of καὶ were calculated for each work.
καὶ Richness
The καὶ richness was calculated on the lemmatized work, both as a plain καὶ-token ratio and with moving windows of size 10 and 50.
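One way to read these measures is sketched below. It assumes the καὶ-token ratio is the number of occurrences of the lemma καί divided by the number of tokens, and that the moving-window variant is the same share computed inside each window of consecutive lemmas. Both readings are assumptions, as are the function names and the sample data.

```python
import unicodedata


def _nfc(text):
    """Normalize to NFC so that different encodings of the accent compare equal."""
    return unicodedata.normalize("NFC", text)


def kai_ratio(lemmas, target="καί"):
    """Share of lemmas equal to `target` among all lemmas."""
    if not lemmas:
        return 0.0
    target = _nfc(target)
    return sum(1 for lemma in lemmas if _nfc(lemma) == target) / len(lemmas)


def kai_window_ratios(lemmas, window, target="καί"):
    """Share of `target` inside each window of `window` consecutive lemmas.

    Assumption: a text shorter than the window yields a single overall ratio.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    target = _nfc(target)
    hits = [1 if _nfc(lemma) == target else 0 for lemma in lemmas]
    if not hits:
        return []
    if len(hits) < window:
        return [sum(hits) / len(hits)]
    count = sum(hits[:window])
    ratios = [count / window]
    # Slide the window one lemma at a time, updating the running count.
    for i in range(window, len(hits)):
        count += hits[i] - hits[i - window]
        ratios.append(count / window)
    return ratios


# Hypothetical lemmatized text, for illustration only.
lemmas = ["ὁ", "ἀλώπηξ", "καί", "ὁ", "κόραξ", "καί", "ὁ", "τυρός"]
overall = kai_ratio(lemmas)
per_window = kai_window_ratios(lemmas, window=4)
```

The per-window series shows where καί clusters within a work, which the single overall ratio cannot show.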