Software
At Aarhus NLP we regularly engage in developing software, primarily for research or educational purposes. The following Python packages were either partly or entirely developed by our group.
MTEB | Evaluation toolkit for text and image embeddings, including model implementations, datasets and various benchmarks. |
Scandinavian Embedding Benchmark | A Scandinavian Benchmark for evaluating document embeddings |
EuroEval | An evaluation benchmark for Scandinavian and Germanic language models, covering natural language understanding and generation. |
Turftopic | A unified framework for topic modelling with transformer models. | |
stormtrooper | Zero-shot and few-shot learning with large language models. | |
topicwizard | Model-agnostic, interactive topic model interpretation framework. | |
DaCy | A state-of-the-art Danish NLP pipeline for spaCy |
OdyCy | General-purpose NLP pipelines for Ancient Greek | |
TextDescriptives | A Python library for calculating a large variety of metrics from text |
embedding-explorer | Interactively explore your embeddings with semantic graphs and clustering. | |
neofuzz | Blazing fast fuzzy and semantic text search with the power of machine learning. | |
Augmenty | A structured augmentation library for augmenting both texts and their annotations |
Asent | An educational library for performing transparent sentiment analysis | |
tweetopic | Blazing fast implementations of short-text topic models. | |
glovpy | The fastest and lightest Python package for training GloVe word embeddings |