Hate Speech#
Hate speech in text is often defined as text that expresses hate or encourages violence towards a person or group based on characteristics such as race, religion, sex, or sexual orientation.
DaCy currently does not include its own tools for hate-speech analysis, but incorporates existing state-of-the-art models for Danish. The hate-speech model used in DaCy is trained by DaNLP and consists of two models: one for detecting whether a text contains hate speech and one for classifying the type of hate speech.
| Name | Creator | Domain | Output Type | Model Type |
|---|---|---|---|---|
Other models for Hate Speech detection
There exist other models for Danish hate-speech detection. We have chosen the BERT offensive model as it obtains a reasonable trade-off between performance and speed and includes a model for classifying the type of hate speech.
Usage#
To add the hate speech models to your pipeline simply run:
import dacy
import spacy
nlp = spacy.blank("da") # create an empty pipeline
# add the hate speech models
nlp.add_pipe("dacy/hatespeech_detection")
nlp.add_pipe("dacy/hatespeech_classification")
This will add two extensions to the Doc object, `is_offensive` and `hate_speech_type`. These show whether a text is offensive and, if it is, what type of hate speech it contains. Both also come with a `*_prob` suffix if you want to examine the probabilities of the models.
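The exact payload of the `*_prob` extensions depends on the spacy-wrap version in use; as a minimal sketch, assuming each extension holds a dict with a probability array and the matching labels (the values below are illustrative, not real model output), the most probable label can be recovered like this:

```python
import numpy as np

# Hypothetical structure of a `*_prob` extension value:
# a probability per label, aligned with a list of label names.
is_offensive_prob = {
    "prob": np.array([0.9, 0.1]),          # illustrative probabilities
    "labels": ["offensive", "not offensive"],
}

# the predicted label is the one with the highest probability
best = is_offensive_prob["labels"][int(np.argmax(is_offensive_prob["prob"]))]
print(best)
```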
Let’s look at an example using the model:
texts = ["senile gamle idiot", "hej har du haft en god dag"]

# apply the pipeline
docs = nlp.pipe(texts)

for doc in docs:
    # print model predictions
    print(doc._.is_offensive)
    # print the type of hate speech if the text is offensive
    if doc._.is_offensive == "offensive":
        print("\t", doc._.hate_speech_type)
offensive
sprogbrug
not offensive
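Building on the predictions printed above, here is a small post-processing sketch in plain Python (no DaCy required; the tuples simply mirror the example output) that groups the offensive texts by hate-speech type:

```python
from collections import defaultdict

# (text, is_offensive label, hate-speech type) tuples
# mirroring the example output above
predictions = [
    ("senile gamle idiot", "offensive", "sprogbrug"),
    ("hej har du haft en god dag", "not offensive", None),
]

# collect offensive texts under their hate-speech type
by_type = defaultdict(list)
for text, label, hs_type in predictions:
    if label == "offensive":
        by_type[hs_type].append(text)

print(dict(by_type))
```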