Bayesian Statistical Modeling Using Stan

4240 Public Affairs Bldg

Daniel Lee. June 23, 2015, 10:00 AM-12:00 PM, 4240 Public Affairs Building. Stan is an open-source Bayesian inference tool with interfaces in R, Python, Matlab, Julia, Stata, and the command […]

Aude Hofleitner, Facebook

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

"Inferring and understanding travel and migration movements at a global scale"

Abstract: Despite extensive work on the dynamics and outcomes of large-scale migrations, timely and accurate estimates of population movements do not exist. While censuses, surveys, and observational data have been used to measure migration, estimates based on these data sources are constrained by their inability to detect unfolding migrations and by their lack of temporal and demographic detail. In this study, we present a novel approach for generating estimates of migration that can measure movements of particular demographic groups across country lines.

Specifically, we model migration as a function of long-term moves across countries using aggregated Facebook data. We demonstrate that this methodological approach can be used to produce accurate measures of past and ongoing migrations, both short-term patterns and long-term changes in residence. Several case studies confirm the validity of our approach, and highlight the tremendous potential of information obtained from online platforms to enable novel research on human migration events.

Reproducibility of Statistical Results

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

Presented by: Mark S. Handcock (Professor, Statistics), Jeffrey B. Lewis (Professor, Political Science), and Marc A. Suchard (Professor, Biomathematics, Biostatistics and Human Genetics). Reproducibility is one of the main principles […]

Betsy Sinclair, Washington University in St Louis

314 Royce Hall, 340 Royce Dr, Los Angeles, CA, United States

"Electronic Homestyle: Tweeting Ideology"

Abstract: Ideal points are central to the study of political partisanship and an essential component of our understanding of legislative and electoral behavior. We employ automated text analysis on tweets from Members of Congress to estimate their ideal points using Naive Bayes classification and Support Vector Machine classification. We extend these tools to estimate the proportion of partisan speech used in each legislator's tweets. We demonstrate an association between these measurements, existing ideal point measurements, and district ideology.

Rick Dale, University of California, Merced

314 Royce Hall, 340 Royce Dr, Los Angeles, CA, United States

"Quantifying the dynamics of multimodal communication with multimodal data."

*Presented by the Center for Social Statistics

Abstract: Human communication is built upon an array of signals, from body movement to word selection. The sciences of language and communication tend to study these signals individually. However, natural human communication uses all these signals together simultaneously, and in complex social systems of various sizes. It is an open puzzle to uncover how this multimodal communication is structured in time and organized at different scales. Such a puzzle includes analysis of two-person interactions. It also involves an understanding of much larger systems, such as communication over social media at an unprecedentedly massive scale.

Collaborators and I have explored communication across both of these scales, and I will describe examples in the domain of conflict. For example, we've studied conflict communication in two-person interactions using video analysis of body and voice dynamics. At the broader scale, we have also used large-scale social media behavior (Twitter) during a massively shared experience of conflict, the 2012 Presidential Debates. These projects reveal the importance of dynamics. In two-person conflict, for example, signal dynamics (e.g., body, voice) during interaction can reveal the quality of that interaction. In addition, collective behavior on Twitter can be predicted even by simple linear models using debate dynamics between Obama and Romney (e.g., one interrupting the other).

The collection, quantification, and modeling of multitemporal and multivariate datasets hold much promise for new kinds of interdisciplinary collaborations. I will end by discussing how they may guide new theoretical directions for pursuing the organization and temporal structure of multimodality in communication.

Ilan H. Meyer & Mark S. Handcock, UCLA

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

"Innovative Sampling Approaches for Hard to Reach Populations: Design of a National Probability Study of Lesbians, Gay Men, Bisexuals, and Transgender Peoples and Network Sampling of Hard to Reach Populations"


Speakers:

Ilan H. Meyer, Williams Distinguished Senior Scholar for Public Policy at the Williams Institute

Mark S. Handcock, Professor of Statistics at UCLA and Director of the Center for Social Statistics


Description:


Come for the exciting seminar, then stay for the free lunch and discussion: a seminar led by Ilan H. Meyer, followed immediately by a Brown Bag Lunch led by Mark S. Handcock.

Dr. Meyer is Principal Investigator of the Generations and TransPop Surveys. Generations is a survey of a nationally representative sample of three generations of lesbians, gay men, and bisexuals. TransPop is the first national probability sample survey of transgender individuals in the United States. Both studies attempt to obtain large, nationally representative samples of hard-to-reach populations. Dr. Meyer will review sampling issues with LGBT populations and speak on the importance of measuring the population health of LGBT people and on the considerations underlying the design of a national probability survey.

From a contrasting perspective, the field of Survey Methodology is facing many challenges. The general trend of declining response rates is making it harder for survey researchers to reach their intended population of interest using classical survey sampling methods.

In the follow-up Brown Bag Lunch, led by Mark S. Handcock, participants will discuss statistical challenges and approaches to sampling hard-to-reach populations. Transgender people, for example, are a rare and stigmatized population. If the transgender community exhibits networked social behavior, then network sampling methods may be useful approaches that complement classical survey methods. Participants are encouraged to share ideas on statistical methods for surveys.

West Coast Experiments Conference, UCLA 2017

Covel Commons UCLA

The tenth annual West Coast Experiments Conference will be held at UCLA on Monday, April 24 and Tuesday, April 25, 2017, preceded by in-depth methods training workshops on Sunday, April 23. The conference […]

Shahryar Minhas, Duke University

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

The Center for Social Statistics presents: "Predicting the Evolution of Intrastate Conflict: Evidence from Nigeria" (URL: http://css.stat.ucla.edu/event/shahryar-minhas/). The endogenous nature of civil conflict has limited scholars' abilities to draw clear inferences […]

Fragile Families Challenge: Getting Started Workshop

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

“Fragile Families Challenge: Getting Started Workshop.” Ian Lundberg, Ph.D. Student, Sociology and Social Policy, Princeton University. The Fragile Families Challenge is a scientific mass collaboration that combines predictive modeling, causal inference, and […]

James Robins, Harvard University

Room 33-105 CHS Building 650 Charles E Young Drive South, Los Angeles, CA, United States

The UCLA Departments of Epidemiology, Biostatistics, and Statistics and the Center for Social Statistics present: Causal Methods in Epidemiology: Where has it got us and what can we expect in the […]

Sander Greenland, UCLA Department of Epidemiology

The UCLA Department of Statistics and the Center for Social Statistics present: Statistical Significance and Discussion of the Challenges of Avoiding the Abuse of Statistical Methodology. Sander Greenland will offer […]

Hadley Wickham, RStudio

The UCLA Department of Statistics and the Center for Social Statistics present: Programming data science with R & the tidyverse. Tidy evaluation is a new framework for non-standard evaluation that […]

Rob Warren, University of Minnesota

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

"When Should Researchers Use Inferential Statistics When Analyzing Data on Full Populations?"

Abstract: Many researchers uncritically use inferential statistical procedures (e.g., hypothesis tests) when analyzing complete population data—a situation in which inference may seem unnecessary. We begin by reviewing and analyzing the most common rationales for employing inferential procedures when analyzing full population data. Two common rationales—having to do with handling missing data and generalizing results to other times and/or places—either lack merit or amount to analyzing sample (not population) data. Whether it is appropriate to use inferential procedures depends on whether researchers are analyzing sample or population data and on whether they seek to make causal or descriptive claims. When doing descriptive research, the distinction between sample and population data is paramount: Inferential statistics should only be used to analyze sample data (to account for sampling variability) and never to analyze population data. When doing causal research, the distinction between sample data and population data is unimportant: Inferential procedures can and should always be used to distinguish (for example) robust associations from those that may have come about by chance alone. Crucially, using inferential procedures to analyze population data to make descriptive claims can lead to incorrect substantive conclusions—especially when population sizes and/or effect sizes are small.

Yu Xie, Princeton

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

"Heterogeneous Causal Effects: A Propensity Score Approach"

Abstract: Heterogeneity is ubiquitous in social science. Individuals differ not only in background characteristics, but also in how they respond to a particular treatment. In this presentation, Yu Xie argues that a useful approach to studying heterogeneous causal effects is through the use of the propensity score. He demonstrates the use of the propensity score approach in three scenarios: when ignorability is true, when treatment is randomly assigned, and when ignorability is not true but there are valid instrumental variables.

Jake Bowers, University of Illinois at Urbana-Champaign

Franz Hall 2258A

"Rules of Engagement in Evidence-Informed Policy: Practices and Norms of Statistical Science in Government"

Abstract: Collaboration between statistical scientists (data scientists, behavioral and social scientists, statisticians) and policy makers promises to improve government and the lives of the public. And the data and design challenges arising from governments offer academics new chances to improve our understanding of both extant methods and behavioral and social science theory. However, the practices that ensure the integrity of statistical work in the academy — such as transparent sharing of data and code — do not translate neatly or directly into work with governmental data and for policy ends. This paper proposes a set of practices and norms that academics and practitioners can agree on before launching a partnership so that science can advance and the public can be protected while policy can be improved. This work is at an early stage. The aim is a checklist or statement of principles or memo of understanding that can be a template for the wide variety of ways that statistical scientists collaborate with governmental actors.

Erin Hartman, University of California Los Angeles

CCPR Seminar Room 4240 Public Affairs Building, Los Angeles, CA, United States

Covariate Selection for Generalizing Experimental Results

Researchers are often interested in generalizing the average treatment effect (ATE) estimated in a randomized experiment to non-experimental target populations. Researchers can estimate the population ATE without bias if they adjust for a set of variables affecting both selection into the experiment and treatment heterogeneity. Although this separating set has a simple mathematical representation, it is often unclear how to select this set in applied contexts. In this paper, we propose a data-driven method to estimate a separating set. Our approach has two advantages. First, our algorithm relies only on the experimental data. As long as researchers can collect a rich set of covariates on experimental samples, the proposed method can inform which variables they should adjust for. Second, we can incorporate researcher-specific data constraints. When researchers know certain variables are unmeasurable in the target population, our method can select a separating set subject to such constraints, if one is feasible. We validate our proposed method using simulations, including naturalistic simulations based on real-world data.

Co-Sponsored with The Center for Social Statistics