Speaker: Stephanie Eckman, University of Maryland
Abstract
The instruments used to collect training data for machine learning models have many similarities to web surveys, such as the provision of a stimulus and fixed response options.
Survey methodologists know that item and response option wording and ordering, as well as annotator effects, impact survey data.
Our previous research showed that these effects also occur when collecting annotations for model training.
Our new study builds on those results, exploring how instrument structure and annotator composition impact models trained on the resulting annotations.
Using previously annotated Twitter data on hate speech, we collect annotations with five versions of an annotation instrument, randomly assigning annotators to versions.
We then train ML models on each of the five resulting datasets.
By comparing model performance across the instruments, we aim to understand:
- whether the way annotations are collected affects the predictions and errors of the trained models;
- which instrument version leads to the most efficient model.
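The study design above — randomly assigning annotators to one of five instrument versions so that each version yields its own training dataset — can be sketched as follows. This is a minimal illustration, not the authors' code; the annotator IDs, the helper name `assign_versions`, and the seed are all hypothetical.

```python
import random
from collections import Counter

def assign_versions(annotator_ids, n_versions=5, seed=42):
    """Randomly assign each annotator to one instrument version.

    Hypothetical helper: each annotator sees exactly one of the
    n_versions instrument variants, so the annotations they produce
    contribute to that version's training dataset only.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible assignment
    return {a: rng.randrange(n_versions) for a in annotator_ids}

# Hypothetical pool of 100 annotators
annotators = [f"ann_{i:03d}" for i in range(100)]
assignment = assign_versions(annotators)

# Each version's group of annotators defines one of the five datasets
counts = Counter(assignment.values())
```

A separate model would then be trained on each of the five resulting datasets and compared on held-out performance.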
In addition, we expand upon our earlier findings that annotators' demographic characteristics impact the annotations they make. Our results emphasize the importance of careful annotation instrument design.
About the speaker
Stephanie Eckman is a Researcher and Data Scientist at the University of Maryland's Social Data Science Center.
She has a PhD in Statistics and Methodology and has worked in survey research for more than 20 years.
The event is free to attend; simply register your details to receive a unique Zoom Webinar link.
Please note that you will be required to register for the event using an email address linked to a valid Zoom account.
Attendance at City events is subject to our terms and conditions.