Rewire supports open-source research with Dynabench
Make it, break it, fix it, repeat!
Dynabench is an open-source research platform dedicated to dynamic evaluation and training of AI through human-and-model-in-the-loop dataset creation. This means annotators seek to create ‘adversarial’ examples that an AI model will misclassify. A team of annotators plays against the model in real time, creating, labelling, and perturbing content in consecutive rounds of data collection. Employing human adversaries to fool the model, then retraining it between rounds, iteratively exposes and corrects weaknesses in the model.
Building a model, breaking it, then fixing it mimics the ‘cat-and-mouse’ game of content moderation, where perpetrators repeatedly seek weaknesses in the system to evade detection. Repurposing this approach on the side of the moderation system helps to pre-empt such weaknesses, producing models that are harder to evade.
How does adversarial training work?
Image source: Kirk (2022)
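The make it, break it, fix it loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not Dynabench's or Rewire's actual code: the token-counting "model", the function names (`train`, `predict`, `run_rounds`), the label names, and the placeholder examples are all assumptions made for the sketch. It also assumes that every annotator submission, not just the ones that fool the model, is added to the training data before retraining.

```python
from collections import Counter

LABELS = ("hateful", "not_hateful")  # hypothetical label names


def train(examples):
    """Toy 'model': count how often each token appears under each label."""
    counts = {label: Counter() for label in LABELS}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def predict(model, text):
    """Predict 'hateful' only if hateful token counts outweigh the rest."""
    tokens = text.lower().split()
    hateful = sum(model["hateful"][t] for t in tokens)
    benign = sum(model["not_hateful"][t] for t in tokens)
    return "hateful" if hateful > benign else "not_hateful"


def run_rounds(seed_data, annotator_rounds):
    """Make it, break it, fix it: retrain after each round of submissions."""
    data = list(seed_data)
    model = train(data)  # make it
    fooled_per_round = []
    for submissions in annotator_rounds:
        # Break it: count the examples the current model misclassifies.
        fooled = [(t, y) for t, y in submissions if predict(model, t) != y]
        fooled_per_round.append(len(fooled))
        # Fix it: add the new round of data and retrain.
        data.extend(submissions)
        model = train(data)
    return model, fooled_per_round


# Placeholder content only; 'badword' stands in for an abusive term.
seed = [("badword people", "hateful"), ("nice people", "not_hateful")]
rounds = [
    [("b4dword folks", "hateful")],        # perturbed spelling evades the model
    [("b4dword folks again", "hateful")],  # same trick after retraining
]
model, fooled_per_round = run_rounds(seed, rounds)
print(fooled_per_round)  # [1, 0]
```

In this toy run the perturbed spelling fools the model in the first round but not in the second, because the model was retrained on the first round's data in between. A real system would use a learned classifier and many annotators per round, but the control flow is the same.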
Contribute to Dynabench.
Get involved with the hate speech task by testing your models with Dynabench data, using the task for your own research, or even submitting your own models to Dynabench. We welcome all contributions to this growing research community!
Read our team’s research on Dynabench!
Start working with Rewire’s technology today
by scheduling a demo with our team.
Or, if you’d like to chat, contact us directly.
We look forward to working with you!