Welcome
You've reached Dan Pemstein's web site. I'm a PhD candidate in Political Science at the University of Illinois and a fellow in the Center for the Study of Democratic Institutions at Vanderbilt University. Previously, I was a visiting fellow at the Institute for Quantitative Social Science at Harvard University. My dissertation examines how inter-institutional information asymmetries affect policy outcomes and legislative politics in the European Union. My other current projects investigate how career ambition and party organization interact to determine legislative behavior and explore the selection processes underlying recorded roll call votes in parliaments. Additionally, I am a co-author of both the Scythe Statistical Library, an open source C++ library for statistical computation, and RSNL, an R package for the statistical analysis of natural language. Finally, I am a co-developer of the Unified Democracy Scores, a project that synthesizes the contributions of other scholars to produce a composite democracy scale, accompanied by estimates of measurement uncertainty.
Research
You can find preprints and reprints of my publications, copies of my current working papers/presentations, and some data below. Take a look at my CV if you want more detail.
(P)reprints
-
Political Ambition and Legislative Behavior in the European Parliament
(with Stephen Meserve and William Bernhard)
(2009) Journal of Politics 71(3): 1015-1032
-
The Scythe Statistical Library: An Open Source C++ Library for Statistical Computation
(with Kevin Quinn and Andrew Martin)
Forthcoming in the Journal of Statistical Software
Working Papers and Presentations
-
Democratic Compromise: A Latent Variable Analysis of Ten Measures of Regime Type
(with James Melton and Stephen Meserve)
-
Predicting Roll Calls with Legislative Text
-
Political Ambition and Legislative Effort in the European Parliament
(with William Bernhard)
-
Who Goes to Europe: Strategic Candidate Nomination to the European Parliament
(with Stephen Meserve and William Bernhard)
-
Election Timing and Financial Market Behavior
(with William Bernhard and David Leblang)
-
Strategy and Selection in Nominating Women Candidates
(with William Bernhard)
Data
-
Unified Democracy Scores
(with James Melton and Stephen Meserve)
A set of measures that leverage the efforts of a variety of experts to provide a composite scale of democracy, accompanied by estimates of measurement uncertainty. The scores are available for virtually every country in the world from 1946 through 2000.
Teaching
I'm not teaching anything at the moment, but here's what I've taught in the past:
-
Introduction to Comparative Politics
Computing
Software
I write a lot of code. This includes statistical software, other research-related code, and some projects that are just for fun. I don't update this section as often as I should, and some of this stuff is pretty dated, but C++/Scythe implementations of various Bayesian estimators should end up here eventually.
-
The Scythe Statistical Library
A C++ library for statistical computation co-authored with Kevin M. Quinn (Harvard University) and Andrew D. Martin (Washington University). Scythe includes a suite of matrix manipulation functions, a suite of pseudo-random number generators, and a suite of numerical optimization routines. Scythe sits under the hood of a number of R packages, most notably MCMCpack, and has been used in published work in fields ranging from political science to molecular ecology, dentistry, and earth sciences. Like most of my software projects, Scythe is free software.
-
RSNL
An R package for the statistical analysis of unstructured textual data, co-authored with Anthony Fader, Gary King, and Kevin Quinn. RSNL provides a suite of methods for common natural language tasks (e.g. tokenization, stemming, token transformation and filtering, part-of-speech tagging, and topic modeling) and a collection of extensible S4 objects (e.g. tokenizers, stemmers, transforms, filters, taggers, and topic model objects) to use when carrying out these operations. Furthermore, RSNL's toolset allows the user to keep track of relationships between chunks of text—and decompositions and summaries of those bits of text—throughout the analysis process, using an object-view model to provide multiple, concurrent, representations of an underlying text collection.
-
A Perl module for viewing and modifying the info and comment fields of audio files encoded in the Ogg Vorbis compressed audio format.
-
A Perl module wrapping the libao cross-platform audio library.
Snippets
-
This tutorial on the UDS website demonstrates how to use posterior samples from MCMC in subsequent analyses. The example is intended to explain how to use the Unified Democracy Scores correctly, but is generally applicable. So, for example, if you want to use Clinton-Jackman-Rivers or Martin-Quinn style ideal point estimates in your research and want to correctly incorporate posterior uncertainty in the ideal point estimates into your inferences, this tutorial shows you how, using Stata.
-
Here's a set of slides (src) I put together on creating presentations with Prosper for a UI polisci grad student seminar on LaTeX. Note that I don't necessarily recommend Prosper (and I don't use it myself these days); I would suggest FoilTeX for basic slides and Beamer for more PowerPoint-like presentations.
-
Here are some more examples of presentations in FoilTeX (pdf, src) and Beamer (pdf, src) that I briefly presented at a more recent seminar (September 2007). Also, here's some R code for computing a simple binomial MLE with bootstrapped confidence intervals that I presented at the same event.
-
Here's a bit of code (raw src) that plays an Ogg Vorbis file in Perl, using the Ogg::Vorbis::* and Audio::Ao modules. For reference, here's a similar program (raw src) in plain old C.
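For a feel of the binomial MLE with bootstrapped confidence intervals mentioned in the seminar item above, here's a minimal Python sketch. It is not the R code from that seminar, just one common way to do it: the MLE is the sample proportion of successes, and the interval is a percentile bootstrap. The function name, the made-up data, the number of replicates, and the seed are all illustrative assumptions.

```python
import random


def binomial_mle_bootstrap(outcomes, n_boot=2000, alpha=0.05, seed=0):
    """Return (p_hat, lower, upper) for a sequence of 0/1 outcomes.

    p_hat is the binomial MLE (the sample proportion of successes); the
    interval is a percentile bootstrap interval. All names and defaults
    here are illustrative assumptions.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one observation")
    p_hat = sum(outcomes) / n

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    for _ in range(n_boot):
        # Resample n observations with replacement and re-estimate p.
        resample = rng.choices(outcomes, k=n)
        estimates.append(sum(resample) / n)
    estimates.sort()

    # Percentile interval: the alpha/2 and 1 - alpha/2 empirical quantiles.
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return p_hat, estimates[lo_idx], estimates[hi_idx]


if __name__ == "__main__":
    data = [1] * 30 + [0] * 70  # made-up data: 30 successes in 100 trials
    p_hat, lower, upper = binomial_mle_bootstrap(data)
    print(f"MLE = {p_hat:.3f}, 95% bootstrap CI = [{lower:.3f}, {upper:.3f}]")
```

The percentile interval is the simplest bootstrap interval to explain; other variants (for example, bias-corrected intervals) may behave better when the estimated proportion is close to 0 or 1.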
The photograph at the top of the page is the work of Michael Spry and is distributed under the same license as this web page. The author authorized me to make modifications to the original photograph. Unless specifically accompanied by a license, all source code on this page is distributed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license.