Style
The corpus can be investigated from a number of stylistic perspectives.
Vocabulary Richness
The vocabulary richness of each fable was calculated on the lemmatized work, using both the plain type-token ratio and the type-token ratio over moving windows of size 10 and 50.
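The two measures can be illustrated with a minimal Python sketch. It assumes the moving-window variant averages the type-token ratio over every window of consecutive lemmas, which the text does not spell out. The function names and the fallback for texts shorter than the window are hypothetical choices made for illustration.

```python
def type_token_ratio(lemmas):
    """Plain type-token ratio: distinct lemmas divided by total lemmas."""
    if not lemmas:
        return 0.0
    return len(set(lemmas)) / len(lemmas)


def moving_window_ttr(lemmas, window):
    """Mean type-token ratio over every run of `window` consecutive lemmas."""
    if len(lemmas) < window:
        # Assumption: use the plain ratio when the text is shorter than the window.
        return type_token_ratio(lemmas)
    ratios = [
        type_token_ratio(lemmas[i:i + window])
        for i in range(len(lemmas) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Toy lemma sequence; a real fable would supply its lemmatized tokens here.
lemmas = ["a", "b", "a", "c", "a", "b"]
print(type_token_ratio(lemmas))
for size in (10, 50):
    print(size, moving_window_ttr(lemmas, size))
```

Because the plain ratio falls as a text grows longer, the windowed variant makes fables of different lengths easier to compare.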
UPOS Tags
UPOS tags were tallied for every fable, without stop-word removal or lemmatization.
The most frequent n-grams of UPOS tags, for n from 2 to 4, were also counted for each work.
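As a rough illustration, the tag tallies and n-gram counts could be computed as in the Python sketch below. It assumes each work is available as a flat list of UPOS tags in reading order and that n-grams may span sentence boundaries. Neither detail is stated in the text, and the function names are hypothetical.

```python
from collections import Counter


def upos_counts(tags):
    """Tally how often each UPOS tag occurs in one work."""
    return Counter(tags)


def upos_ngram_counts(tags, n):
    """Count every run of n consecutive UPOS tags in one work."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


# Toy tag sequence standing in for a tagged fable.
tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "PUNCT"]
print(upos_counts(tags).most_common(3))
for n in (2, 3, 4):
    print(n, upos_ngram_counts(tags, n).most_common(3))
```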
Lengths
The length of each fable (number of tokens), the mean token length, and the mean sentence length were calculated for each work. Sentences were obtained by splitting the text on punctuation, defined here as full stops (.) and Greek question marks (;). Commas and elevated dots were not treated as sentence boundaries. Metrical feet, metre, and stanzas were not taken into account.
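The splitting rule and the three length measures can be sketched in Python as follows. Several details are assumptions rather than the documented procedure: tokens are taken to be whitespace-separated strings with commas and elevated dots stripped, token length is measured in characters, and sentence length is measured in tokens. The names `split_sentences` and `length_stats` are hypothetical.

```python
import re

# Sentence delimiters named in the text: full stop and Greek question mark.
# The Greek question mark may be encoded as ";" (U+003B) or as U+037E.
SENTENCE_END = re.compile(r"[.;\u037e]")


def split_sentences(text):
    """Split on full stops and Greek question marks, dropping empty pieces."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def length_stats(text):
    """Return token count, mean token length, and mean sentence length."""
    sentences = []
    for raw in split_sentences(text):
        # Assumption: whitespace tokenization; commas and elevated dots
        # (U+00B7, U+0387) are stripped but never end a sentence.
        tokens = [t.strip(",\u00b7\u0387") for t in raw.split()]
        tokens = [t for t in tokens if t]
        if tokens:
            sentences.append(tokens)
    all_tokens = [t for sent in sentences for t in sent]
    if not all_tokens:
        return {"n_tokens": 0, "mean_token_length": 0.0, "mean_sentence_length": 0.0}
    return {
        "n_tokens": len(all_tokens),
        "mean_token_length": sum(len(t) for t in all_tokens) / len(all_tokens),
        "mean_sentence_length": len(all_tokens) / len(sentences),
    }


# Toy input: three sentences of two tokens each.
print(length_stats("ab cd, ef. gh ij; kl."))
```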
Vocabulary Richness (Noun, Adj, Verb)
The vocabulary richness of each fable was also calculated on the lemmatized work restricted to nouns, adjectives, and verbs, using both the plain type-token ratio and the type-token ratio over moving windows of size 10 and 50.
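Assuming this variant restricts the calculation to lemmas tagged NOUN, ADJ, or VERB, the filtering step might look like the Python sketch below. Whether the three classes were pooled or measured separately is not stated, so the sketch shows both. The input format (a list of lemma and UPOS pairs) and all names are hypothetical.

```python
CONTENT_UPOS = ("NOUN", "ADJ", "VERB")


def type_token_ratio(lemmas):
    """Plain type-token ratio: distinct lemmas divided by total lemmas."""
    return len(set(lemmas)) / len(lemmas) if lemmas else 0.0


def lemmas_with_upos(tagged, allowed):
    """Keep only the lemmas whose UPOS tag is in `allowed`."""
    return [lemma for lemma, upos in tagged if upos in allowed]


# Toy tagged fable: (lemma, UPOS) pairs in reading order.
tagged = [
    ("λέγω", "VERB"), ("ὁ", "DET"), ("ἀλώπηξ", "NOUN"),
    ("λέγω", "VERB"), ("καλός", "ADJ"), ("ἀλώπηξ", "NOUN"),
]

# Pooled over the three classes.
pooled = type_token_ratio(lemmas_with_upos(tagged, CONTENT_UPOS))

# Separately for each class.
per_class = {
    upos: type_token_ratio(lemmas_with_upos(tagged, (upos,)))
    for upos in CONTENT_UPOS
}
print(pooled, per_class)
```

The moving-window ratio can be applied to the filtered lemma list in the same way as to the full text.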