BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//California Center for Population Research - ECPv6.15.14//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:California Center for Population Research
X-ORIGINAL-URL:https://ccpr.ucla.edu
X-WR-CALDESC:Events for California Center for Population Research
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T090000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20260308T100000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20261101T090000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20251015T120000
DTEND;TZID=America/Los_Angeles:20251015T150000
DTSTAMP:20260502T070021
CREATED:20250805T175751Z
LAST-MODIFIED:20250805T175751Z
UID:10000934-1760529600-1760540400@ccpr.ucla.edu
SUMMARY:Workshop: Brandon Stewart\, Princeton University\, "Using Large Language Model Annotations for the Social Sciences: A General Framework of Using Predicted Variables in Statistical Analyses"
DESCRIPTION:Biography: Brandon Stewart is Associate Professor of Sociology at Princeton University where he is also affiliated with the Office of Population Research and numerous other centers on campus. He currently serves as the Co-Editor-in-Chief of Political Analysis and Associate Editor at Sociological Methods & Research. His work spans several areas of computational social science with a focus on text as data and causal inference.\n\n“Using Large Language Model Annotations for the Social Sciences: A General Framework of Using Predicted Variables in Statistical Analyses”\n\nAbstract: Social scientists use automated annotation methods\, such as supervised machine learning and\, more recently\, large language models (LLMs)\, that can predict labels and generate text-based variables. While such predicted text-based variables are often analyzed as if they were observed without errors\, we show that ignoring prediction errors in the automated annotation step leads to substantial bias and invalid confidence intervals in downstream analyses\, even if the accuracy of the automated annotations is high\, e.g.\, above 90%. We propose a framework of design-based supervised learning (DSL) that can provide valid statistical estimates\, even when predicted variables contain non-random prediction errors. DSL employs a doubly robust procedure to combine predicted labels and a smaller number of expert annotations. DSL allows scholars to apply advances in LLMs to social science research while maintaining statistical validity. We illustrate its general applicability using two applications where the outcome and independent variables are text-based.
URL:https://ccpr.ucla.edu/event/workshop-brandon-stewart-princeton-university-using-large-language-model-annotations-for-the-social-sciences-a-general-framework-of-using-predicted-variables-in-statistical-analyses/
LOCATION:Room 4240A\, 4th Floor\, Public Affairs Building\, 337 Charles Young Dr.\, Los Angeles\, CA 90095
CATEGORIES:CCPR Workshop
END:VEVENT
END:VCALENDAR