


This module includes functions for scoring models applied to a spaCy corpus.

dacy.score.score.no_misc_getter(doc, attr)[source]#

A utility getter for scoring entities without including MISC.

  • doc (Doc) – a spaCy Doc

  • attr (str) – attribute to be extracted
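
As a rough illustration of what a getter of this shape does, the sketch below filters MISC spans out of whatever is stored under the requested attribute. It is not DaCy's source code: `drop_misc_getter` and `FakeSpan` are hypothetical names, and the stand-in objects only mimic the `label_` attribute of a spaCy span so that the example runs without a model.

```python
from collections import namedtuple
from types import SimpleNamespace

# Stand-ins for spaCy objects, used only so the sketch runs without a model.
# A real spaCy Span exposes its label as `label_`, and a Doc exposes `ents`.
FakeSpan = namedtuple("FakeSpan", ["text", "label_"])


def drop_misc_getter(doc, attr):
    """Return the spans stored under `attr` on `doc`, skipping MISC spans.

    Hypothetical re-implementation for illustration; not DaCy's source.
    """
    return [span for span in getattr(doc, attr) if span.label_ != "MISC"]


doc = SimpleNamespace(
    ents=[
        FakeSpan("Mette", "PER"),
        FakeSpan("Aarhus", "LOC"),
        FakeSpan("Roskilde Festival", "MISC"),
    ]
)
kept = drop_misc_getter(doc, "ents")
print([span.text for span in kept])
```

A getter with this `(doc, attr)` signature is the kind of callable that spaCy's `Scorer.score_spans` accepts through its `getter` argument, which is presumably how entity scores without MISC are obtained.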



Return type


dacy.score.score.score(corpus, apply_fn, score_fn=['token', 'pos', 'ents', 'dep'], augmenters=[], k=1, nlp=None, **kwargs)[source]#

Scores a model's performance on a given corpus, optionally with augmentations applied to it.

  • corpus (Corpus) – A spaCy Corpus

  • apply_fn (Union[Callable, Language]) – A wrapper function for the model you wish to score. The function should take in a list of spaCy Examples (Iterable[Example]) and output a tagged version of it (Iterable[Example]). A spaCy pipeline (Language) can be provided as is.

  • score_fn (list[Union[Callable[[Iterable[Example]], dict], str]], optional) – A list of scoring functions, each of which takes in a list of examples (Iterable[Example]) and returns a dictionary of performance scores. Five strings are also valid: “ents” for measuring the performance of entity spans, “pos” for measuring the performance of fine-grained (tag_acc) and coarse-grained (pos_acc) POS tags, “token” for measuring the performance of tokenization, “dep” for measuring the performance of dependency parsing, and “nlp” for measuring the performance of all components in the specified nlp pipeline. Defaults to [“token”, “pos”, “ents”, “dep”].

  • augmenters (list[Callable[[Language, Example], Iterable[Example]]], optional) – A spaCy-style augmenter, or a list thereof, to be applied to the corpus. Defaults to [], indicating no augmenters.

  • k (int, optional) – Number of times the augmentation should be run and the performance on the corpus tested. Defaults to 1.

  • nlp (Optional[Language], optional) – A spaCy processing pipeline, used for loading (calling) the corpus. If None, an empty Danish pipeline is used. Defaults to None.


Returns a pandas DataFrame containing the performance metrics.

Return type



>>> import dacy
>>> from dacy.datasets import dane
>>> from dacy.score.score import score
>>> from spacy.training.augment import create_lower_casing_augmenter
>>> test = dane(splits=["test"])
>>> nlp = dacy.load("da_dacy_small_tft-0.0.0")
>>> scores = score(test, augmenters=[create_lower_casing_augmenter(0.5)],
...                apply_fn=nlp)
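
Besides the built-in strings, score_fn is documented as accepting callables that take an iterable of examples and return a dictionary of scores. The following is a minimal sketch of such a callable, assuming only that each example exposes `predicted` and `reference` objects that support `len()`, as spaCy's Example and Doc do. The name `token_count_agreement` and the stand-in examples are hypothetical and exist only to keep the sketch runnable without a model.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def token_count_agreement(examples: Iterable[Any]) -> Dict[str, float]:
    """Share of examples whose predicted and reference docs have equally many tokens.

    Relies only on `example.predicted` and `example.reference` supporting len(),
    which spaCy's Example and Doc objects do.
    """
    examples = list(examples)
    if not examples:
        return {"token_count_agreement": 0.0}
    hits = sum(len(ex.predicted) == len(ex.reference) for ex in examples)
    return {"token_count_agreement": hits / len(examples)}


# Stand-in examples (lists of tokens instead of spaCy Docs) to keep this runnable.
fake_examples = [
    SimpleNamespace(predicted=["Jeg", "bor", "her"], reference=["Jeg", "bor", "her"]),
    SimpleNamespace(predicted=["Hej", "!"], reference=["Hej!"]),
]
result = token_count_agreement(fake_examples)
print(result)
```

Given the documented type of score_fn, a callable like this could presumably be mixed with the built-in names, for example `score_fn=["ents", token_count_agreement]`.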



Contains functions for testing the performance of models on varying input length.

dacy.score.input_length.n_sents_score(n_sents, apply_fn, dataset='dane', split='test', score_fn=['token', 'pos', 'ents', 'dep'], verbose=True, **kwargs)[source]#

Scores the performance of a given model on examples consisting of a given number of sentences.

  • n_sents (Union[int, list[int]]) – The number of sentences each scored example should contain. If a list is given, performance is measured for each value.

  • apply_fn (Callable) – A wrapper function for the model you wish to score. The function should take in a spaCy Example and output a tagged version of it.

  • dataset (str, optional) – Which dataset should this be applied to. Possible options include “dane”. Defaults to “dane”.

  • split (str, optional) – Which splits of the dataset should be used. Possible options include “train”, “dev”, “test”, “all”. Defaults to “test”.

  • score_fn (list[Union[str, Callable]], optional) – A list of scoring functions, each of which takes in a list of examples and returns a dictionary of the form {“score_name”: score}. The following strings are also valid: “ents” for measuring the performance of entity spans, “pos” for measuring the performance of POS tags, “token” for measuring the performance of tokenization, “dep” for measuring the performance of dependency parsing, and “nlp” for measuring the performance of all components in the specified nlp pipeline. Defaults to [“token”, “pos”, “ents”, “dep”].

  • verbose (bool, optional) – Toggles the verbosity of the function. Defaults to True.

  • kwargs (dict) – arguments to be passed to dataset or the score function.


Returns a pandas DataFrame containing the performance metrics.
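
To make the idea of "examples of a given number of sentences" concrete, the sketch below groups a list of sentences into chunks of n_sents. How DaCy actually assembles these examples is not described in this reference, so the sketch assumes consecutive, non-overlapping groups and drops an incomplete trailing group; `group_sentences` is a hypothetical helper, not part of the library.

```python
from typing import Iterator, List, Sequence


def group_sentences(sentences: Sequence[str], n_sents: int) -> Iterator[List[str]]:
    """Yield consecutive, non-overlapping groups of `n_sents` sentences.

    A trailing group with fewer than `n_sents` sentences is dropped so that
    every group has exactly the requested length.
    """
    if n_sents < 1:
        raise ValueError("n_sents must be at least 1")
    for start in range(0, len(sentences) - n_sents + 1, n_sents):
        yield list(sentences[start:start + n_sents])


sentences = ["Det regner.", "Jeg bliver hjemme.", "Vi ses i morgen.", "Farvel.", "Tak."]
for n in (1, 2, 5):
    groups = list(group_sentences(sentences, n))
    print(n, len(groups))
```

Scoring each group size separately is what allows performance to be compared across input lengths, which is the stated purpose of this module.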

Return type
