Software

At Aarhus NLP we regularly engage in developing software, primarily for research or educational purposes. The following Python packages were either partly or entirely developed by our group.

MTEB Evaluation toolkit for text and image embeddings, including model implementations, datasets and various benchmarks.
Scandinavian Embedding Benchmark A benchmark for evaluating document embeddings in the Scandinavian languages.
EuroEval An evaluation benchmark for Scandinavian and Germanic language models, covering natural language understanding and generation.
Turftopic A unified framework for topic modelling with transformer models.
stormtrooper Zero-shot and few-shot learning with large language models.
topicwizard Model-agnostic, interactive framework for interpreting topic models.
DaCy The state-of-the-art Danish NLP pipeline for spaCy.
OdyCy General-purpose NLP pipelines for Ancient Greek.
TextDescriptives A Python library for calculating a large variety of metrics from text
embedding-explorer Interactively explore your embeddings with semantic graphs and clustering.
neofuzz Blazing fast fuzzy and semantic text search with the power of machine learning.
Augmenty A structured augmentation library for augmenting both texts and their annotations.
Asent An educational library for performing transparent sentiment analysis
tweetopic Blazing fast implementations of short-text topic models.
glovpy The fastest and lightest Python package for training GloVe word embeddings