Earlier this week, we sent out a tweet as sort of a fishing expedition: “Political scientists: if you’ve collected precinct-level election results in your work & are willing to share, get in touch!”

Mike Sances did. An assistant professor of political science at the University of Memphis, Sances let us know that as part of his dissertation research he had obtained precinct-level results files for many counties in Illinois. In some cases, the files went back more than a decade. Even better, he was willing to share them with us.

Thanks to Sances, and to the National Science Foundation, which funded his research, we’ve begun posting the files – many PDF documents but some HTML and text files – to our Github repository for Illinois sources.

We won’t have every county, nor every election, but this generous contribution to the project is exactly the sort of thing we were hoping to see when we asked. While OpenElections will be processing results data for state and federal races, we’re also posting these files to allow people to find local race results as well – and to avoid requesting the same materials that Sances did for his research. These are public documents; they should be available to the public.

And we’re renewing our call to political scientists: if you have precinct-level election results, either in original format or that you’ve digitized, we’d love to hear from you about making them a part of OpenElections.



Covering Oregon

January 30, 2016

We’ve spent a lot of time in Oregon. Well, not in Oregon, physically, but searching for, obtaining and parsing election results from the state. County-level data is easy to find: the Secretary of State’s office provides electronic PDFs of results for elections dating back to 2004. Thanks to Tabula, those were easy to convert so that we could load them into our database.


Source: http://www.oregon.gov/DHS/CHILDREN/FOSTERPARENT/PublishingImages/map.png

But precinct-level data was, as we’ve said before, a whole other story. We’ve written about specific aspects of this process, but here is the complete story about how we published what is, to the best of our knowledge, the first freely-available set of statewide precinct-level election results for 2012 and 2014 elections in Oregon.

We began gathering Oregon data in early May 2015, starting with the county-level data and then moving on to individual counties. We asked about other sources of precinct-level data, but never found a freely available one. The state does not maintain precinct-level results data. When we turned to the counties, we found a few that provided machine-readable results (mostly in PDF format), including the largest counties (Multnomah, Clackamas and Marion among them). Parsing those results was by no means simple, but the files were consistent enough to allow us to grab precinct results.

But for the bulk of Oregon’s 36 counties, here was our process:

  1. Email county clerk’s office asking about the availability of results. Sometimes we called.
  2. Request results for 2000-2014, if available. If not, request as much as they had.
  3. For some, pay a fee for the results, ranging from $37.50 (Union County) to $222.75 (Tillamook County, just for 2010-2014).
  4. Get the results, typically PDF image files that would require OCR to convert into machine-readable data. In some cases, we were sent paper via the postal service.
  5. Create election and county-specific files.

Several volunteers (some listed here) helped us with the conversion process, which got the data converted faster. All told, we spent more than $1,000 on Oregon precinct-level results, and for that we got a wide range of files covering most, but not all, of the elections we sought.

We built a matrix showing the county coverage that we have, something that we’ll be doing for other states and putting on openelections.net as well. In many counties, getting pre-2008 results was either impossible or cost-prohibitive. In one case, Crook County, the precinct results from the 2010 general and 2012 primary elections are no longer available. To repeat: no government entity has precinct-level results for Crook County for those two elections. They are missing from the public record.
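For readers who want to track coverage in a similar way, here is a minimal sketch of a county-by-election coverage matrix in Python. It is not the code behind our matrix, and the function names are made up for illustration. The two missing Crook County elections come from the paragraph above; the two elections marked as available are assumptions added only so the example has something to show.

```python
from collections import defaultdict


def coverage_matrix(available, counties, elections):
    """Map each county to {election: True/False} based on what we hold."""
    have = defaultdict(set)
    for county, election in available:
        have[county].add(election)
    return {c: {e: e in have[c] for e in elections} for c in counties}


def render(matrix, elections):
    """Return a plain-text table, one row per county."""
    lines = ["county | " + " | ".join(elections)]
    for county, row in matrix.items():
        cells = ["yes" if row[e] else "no" for e in elections]
        lines.append(county + " | " + " | ".join(cells))
    return "\n".join(lines)


elections = ["2010 general", "2012 primary", "2012 general", "2014 general"]
# Assumption for illustration: these two are treated as available.
# The 2010 general and 2012 primary are the ones reported missing above.
available = [("Crook", "2012 general"), ("Crook", "2014 general")]

matrix = coverage_matrix(available, ["Crook"], elections)
print(render(matrix, elections))
```

Keeping the matrix as plain dictionaries makes it easy to dump to CSV or JSON later, which is one reasonable way to publish it alongside the results files.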

Almost every county clerk we spoke with or emailed was friendly and wanted to help us. Many offered results for free online, and we happily downloaded those files. Others had results but have not posted them. Many of them appear to use the same software to produce election results reports but continue to print out those electronic reports and scan them, reducing legibility.

Oregon is, sadly, not an isolated case. There are other states where obtaining precinct-level results is a painstaking effort, even when counties and states produce electronic versions of political maps and voter files. Election results, in too many cases, are a weak link in our civic infrastructure.

So we’ll be back in touch with our Oregon clerks at the end of this year or early next year, asking for more election results. We’ll know what to expect this time, and it should be a less-expensive and shorter process. But on both counts, the progress isn’t enough.


The Price of OpenElections

December 25, 2015

As we wrap up OpenElections’ work in 2015, we’d like to give you an update on how we’ve spent not only our time but also the money that we’ve gotten, particularly from the John S. and James L. Knight Foundation’s Knight News Challenge. Most of the money that we’ve spent since mid-2013 has gone to salaries for our project manager and a single developer. Neither of the project’s co-founders has been paid for working on OpenElections, and we’ve tried to keep our operations pretty lean.

While our initial grant funding from Knight is nearly exhausted, we’ve made good progress and will keep going. In the past few months we’ve added a few more states (Louisiana, Missouri and Virginia) and we have volunteers working on Wisconsin, Georgia and Oregon, among others. We’ve revised our volunteer documentation to make it easier to understand what we’re doing and how you can help.

In most states, getting county-level data isn’t too much of a problem; that data is usually available online, if not always in native electronic formats. County-level data is usually freely available as well, but we’ve always wanted to develop a resource that can offer precinct-level results where they are available. Here’s why: while counties can be homogeneous, precincts are smaller, more distinct political units that lend themselves to more sophisticated analysis. Candidates and their campaigns care about precinct results. Journalists and researchers should do the same.

Some states make precinct-level data available for free, which is a great service to the public. They include Louisiana, Maryland, Wyoming, West Virginia, Virginia, North Carolina and Florida, among others. Some states, like Pennsylvania, Colorado and Utah, charge a nominal fee for precinct results. But for other states, precinct results are only available county-by-county, and that takes both time and money. We’ve written about Oregon in the past, and we’d like to offer it as an example of the price of precinct results.

The bad news is that it’s not uniform, even within a state. In Oregon we’ve spent more than $1,000 to obtain precinct-level results covering elections 2000-2014, although in many cases we don’t have all of those years. Some counties literally don’t have results to give us before 2010. Crook County was unable to find precinct results for the 2010 general and 2012 primary elections, while a number of counties don’t have elections from before 2008. In other cases, price was a factor: we’ll only have precinct results for 2010-2014 from Tillamook County because the clerk there charged us $222.75 for results for those years. Lake County charges $50 an hour for pulling the results files and another $0.25 a page for copying them. We’ve yet to receive those results, so we don’t know what the final cost for Lake will be.

The good news is that when we do request election results that aren’t freely available online, we’re posting them on our Github site in state-specific repositories. That way other organizations or individuals won’t have to repeat our processes and/or pay for results that we’ve already gotten. We want you to use what we’ve gathered, whether that’s CSV files or original PDFs. That’s our holiday gift to you. We’ll be back at it in 2016, when there are more elections coming up, we hear.

Offline Election Results

September 7, 2015

In some states, getting election results data is pretty easy (looking at you, Maryland, Wyoming and Florida, to name three). In others, it’s a matter of going county-by-county, as we’ve previously written. But in most of these cases, we’ve been dealing with results files that are available online.

But what about cases where the results aren’t on the Internet?


The Oregon Paper Trail

June 14, 2015


A few weeks ago, we began requesting precinct-level election results from counties in Oregon. The Secretary of State maintains county-level results, typically in electronic PDFs, so to get down to precinct level we need to ask county clerks across the state. Many of them post precinct results on their sites, but some don’t, so we emailed a few to ask for results from 2000-2014. In doing so, we were prepared to pay reasonable fees for them, as the Oregon Revised Statutes permit.

Local officials were quick to get back to us in every case, and their responses were straightforward. Here’s an example, from Art Harvey, the Josephine County clerk and recorder:

The reports you are interested in are available in PDF format.

The cost would be $10.00 per election.

Other counties charged fees ranging from $25 (Umatilla County) to $45 (Wasco County) to $86.50 (Linn County, which sent us paper print-outs of election results that we’ll be scanning). And then there’s Tillamook County, where Tassi O’Neil, the county clerk there, has set a price of $664 for PDF copies of precinct-level results for elections from 2000-2014.

We wondered how that price was calculated, so we asked. Ms. O’Neil responded:

The fee for each election is $3.75 locate fee and then .25 cents per page.  That is the fee if it is a paper copy or if we send it in a PDF.  That is the charge that the Oregon Revised Statues say that we can/or should charge.

That is true, but there are two points here: One is that members of the public are being charged for pages of an electronic document. There are no paper copies involved here. The other is that the Oregon Revised Statutes also say this:

The custodian of any public record may furnish copies without charge or at a substantially reduced fee if the custodian determines that the waiver or reduction of fees is in the public interest because making the record available primarily benefits the general public.

As OpenElections is a non-profit effort dedicated to publishing machine-readable election results that can be freely used by anyone, we’re pretty sure that our project primarily benefits the general public. We’ve asked for such a waiver or reduction of the $664 and are awaiting a reply. Oregon law also permits us to appeal a denial of a fee waiver or reduction to the Attorney General, and we will be pursuing that option should it become necessary.
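To make the fee structure concrete, here is a small Python sketch of the formula Ms. O’Neil described: a $3.75 locate fee per election plus a per-page charge. It assumes that “.25 cents per page” means $0.25 per page, and the page counts in the example are hypothetical, since we don’t know the real ones. The function names are ours, for illustration only.

```python
from decimal import Decimal

LOCATE_FEE = Decimal("3.75")  # per election, as quoted by the clerk
PER_PAGE = Decimal("0.25")    # assumption: ".25 cents" read as $0.25 a page


def election_fee(pages):
    """Fee for one election's results with the given number of pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return LOCATE_FEE + PER_PAGE * pages


def total_fee(pages_per_election):
    """Sum the fee across several elections."""
    return sum((election_fee(p) for p in pages_per_election), Decimal("0"))


# Hypothetical page counts, not figures from Tillamook County.
print(election_fee(100))     # 28.75
print(total_fee([100, 40]))  # 42.50
```

Because the per-page charge applies to PDFs as well as paper, the total scales with the length of the report rather than with the work of producing the copy.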

In the meantime, we’ve been converting Oregon PDF results to CSVs and will continue to do so. There are plenty of ways for you to contribute to that effort, and we welcome any suggestions or advice on our dealings with Oregon officials.

For OpenElections volunteers coming to NICAR in Atlanta next month, we’ve got a challenge for you: help us tackle Georgia election results.

As we did last year in Baltimore, OpenElections will hold an event on Sunday, March 8, with the goal of writing scrapers and parsers to load 2000-2014 election results from the Peach State, and we’re looking for some help. It’s a great way to get familiar with the project and see what our processes are.

Georgia offers some different tasks, from scraping HTML results to using our Clarify library to parse XML from more recent elections. So we’re looking for people who have some familiarity with Python and election results, but we’re happy to help guide those new to the process, too. Thanks to our volunteers, we’ve already got a good record of where election result data is stored by the state.

Here’s how the process will work: we’ll start by reviewing the state of the data – what’s available for which elections – and then start working on a datasource file that connects that data to our system. After that, we’ll begin writing code to load results data, using other states as our models. As part of that process, we’ll pre-process HTML results into CSVs that we store on Github.

If you’re interested in helping out, there are two things to do: first, let us know by emailing openelections@gmail.com or on Twitter at @openelex. Second, take the time to set up the development environment on your laptop following the instructions here. We’re looking forward to seeing you in Atlanta!

Scraping Nevada

January 28, 2015

ICYMI, Derek Willis wrote a piece for Source about his experience scraping Nevada precinct results. Check it out!

When we released our initial dashboard for downloading election results in July, we wanted to make it easy for anyone to grab CSV files of raw results with just a browser. We’ve continued adding states to our results site, the latest being North Carolina, Florida and — for a few elections — Mississippi. Pennsylvania will be on the way soon.

But we wanted our results to be usable by developers as well, and we’re taking advantage of GitHub to help make that easier. Each time that we publish raw results data, which hasn’t been standardized beyond geography, we publish it first to a GitHub repository for that state. For example, you can find a repository for Mississippi results that can be cloned and/or accessed via API, avoiding manual downloads. The naming convention for the repositories is the same: openelections-results-{state}, and you might find partial results for states that don’t yet appear on the download map (like Iowa) because they’re still in progress.
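As a rough illustration of that naming convention, the Python sketch below builds a repository name and a clone URL from a state abbreviation. Two things here are assumptions rather than facts stated in this post: that {state} is filled in with a lowercase two-letter postal abbreviation, and that the repositories live under a GitHub account named openelections. The function names are hypothetical.

```python
def results_repo_name(state):
    """Build a repository name following openelections-results-{state}."""
    abbrev = state.strip().lower()
    # Assumption: {state} is a two-letter postal abbreviation such as "ms".
    if len(abbrev) != 2 or not abbrev.isalpha():
        raise ValueError(f"expected a two-letter state abbreviation, got {state!r}")
    return f"openelections-results-{abbrev}"


def clone_url(state, owner="openelections"):
    """Build an HTTPS clone URL; the default owner is an assumption."""
    return f"https://github.com/{owner}/{results_repo_name(state)}.git"


print(results_repo_name("MS"))  # openelections-results-ms
print(clone_url("ia"))
```

With a URL like that in hand, a plain git clone is enough to pull every published CSV for a state in one step.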


Using GitHub has two advantages for us: it maintains a history of published changes, of course, and GitHub Pages also provides a filesystem for storing the raw CSVs that power the results site downloads. And should we need to move the CSV downloads to another location, we can do that, too. All of this underscores our commitment to using existing standards and practices rather than inventing new ones.

So if you were looking for election results CSVs as part of your holiday plans, we’ve got two ways to get them. Enjoy, and Happy New Year!

By Derek Willis

OpenElections is nearly two years old, and we’re not nearly done yet. In most states we still have a lot of work to do.

As our initial funding from the Knight Foundation winds down, we wanted to provide an update on where the project is and our plans going forward. The first thing to know is this: OpenElections is here to stay. Our timetable has expanded, and we’re looking at other sources of money to boost our capacity to process election results data, but the work we’ve done so far and your interest in it has convinced us of the need.

When we started, Serdar and I had between us years of experience working with election results data in multiple formats. We both worked at news organizations that routinely dealt with different types of data and various election systems.

We still have been surprised by the diversity of results that we’ve found. States like Pennsylvania, North Carolina and Florida have consistent and reliable data across time. Other states, like Arkansas, Colorado and Washington, have different formats and systems depending on the year. Then there are states like Mississippi and New York, which have required significant investments of time and effort.

In practice, that has meant a lot of work within individual states in order to load and process data from 2000 onward. Those efforts have taken more time than we anticipated, for two reasons. First, we have found that states have switched the systems and software they use to publish election results, in some cases multiple times in the past 15 years. We have found some abstractions – we released a separate library to handle states that use Clarity’s software – but in many cases this meant writing several different custom parsers for a single state.

Second, machine readable data is not a universal standard, and for many states it is a recent addition to their practices. This isn’t a criticism as much as it is a statement of reality. Officials from nearly every state we’ve been in contact with have been helpful and even supportive of the project. But we’re also not too far removed from all-paper elections, either.

In response to these factors, we’ve made some adjustments. The main one is to publish “raw” results data from states even before we standardize offices, candidates and parties. We think having election results in a fairly consistent format across a number of years is pretty useful, so we’re not going to wait until everything is done to release that. This week we’ve published raw results in North Carolina, Florida, Pennsylvania and (for recent elections) Mississippi. You can download these from our site or clone them from GitHub depending on your needs. We’ll continue to follow that path as we work on standardization.

Along the way we’ve been very fortunate to have had contributions from volunteers, who both gathered information about the state of election results and also contributed code to the project. We can’t thank all of you enough for your interest and contributions. This would be a much longer road without them, and we hope that you’ll stay involved.

We’d also like to recognize the people who have lived this project with us for most of the past two years. Geoff Hing has been the main point of contact for web development volunteers and has written the bulk of the code that powers the results loading and data display portions of the project. Geoff began a new job at The Chicago Tribune this week, although he’ll still be involved with OpenElections as a volunteer. We’re extremely grateful for his efforts.

Many more of you have emailed with or spoken to Sara Schnadt, the project manager for OpenElections. She’ll be with us through the end of the year as we plan our next steps, and her organizational skills, creative thinking and ability to wrangle two co-founders living on separate coasts have made OpenElections possible.

Investigative Reporters & Editors, a source of training and inspiration for journalists for decades, has made things easy for us by handling the accounting and grant management tasks. Both Serdar and I are proud to be “graduates” of IRE, and we’re thankful for their support of OpenElections.

The goal of OpenElections – to provide access to machine-readable, standardized election results – remains the same as when we began. The path to reach that goal is now a lot clearer than it was two years ago, and with your help we’ve learned a lot about how to get there. We’ll keep moving forward, and invite you to stay involved.



Introducing Clarify

November 26, 2014

An Open Source Elections-Data URL Locator and Parser from OpenElections

By Geoff Hing and Derek Willis, for Knight-Mozilla OpenNews Source Learning


State election results are like snowflakes: each state—often each county—produces its own special website to share the vote totals. For a project like OpenElections, that involves having to find results data and figuring out how to extract it. In many cases, that means scraping.

But in our research into how election results are stored, we found that a handful of sites used a common vendor: Clarity Elections, which is owned by SOE Software. States that use Clarity generally share a common look and features, including statewide summary results, voter turnout statistics, and a page linking to county-specific results.

The good news is that Clarity sites also include a “Reports” tab that has structured data downloads in several formats, including XML, XLS, and CSV. The results data are contained in .ZIP files, so they aren’t particularly large or unwieldy. But there’s a catch: the URLs aren’t easily predictable. Here’s a URL for a statewide page:


The first numeric segment—15261 in this case—uniquely identifies this election, the 2010 primary in Kentucky. But the second numeric segment—30235—represents a subpage, and each county in Kentucky has a different one. Switch over to the page listing the county pages, and you get all the links. Sort of.

The county-specific links, which lead to pages that have structured results files at the precinct level, actually involve redirects, but those secondary numeric segments in the URLs aren’t resolved until we visit them. That means doing a lot of clicking and copying, or scraping. We chose the latter path, although that presents some difficulties as well. Using our time at OpenNews’ New York Code Convening in mid-November, we created a Python library called Clarify that provides access to those URLs containing structured election results data and parses the XML version of it. We’re already using it in OpenElections, and now we’re releasing it for others who work in states that use Clarity software.
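To show what pulling those two numeric segments out of a URL might look like, here is a minimal Python sketch using only the standard library. It is not how Clarify itself works. The URL layout is an assumption: the example uses a placeholder host and a made-up path shape, with only the two ID values (15261 and 30235) taken from the text above. The class and function names are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class ClarityIds(NamedTuple):
    election_id: str  # first numeric segment: identifies the election
    subpage_id: str   # second numeric segment: identifies the subpage


def numeric_segments(url):
    """Return the first two all-digit path segments of a URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    numeric = [p for p in parts if p.isdigit()]
    if len(numeric) < 2:
        raise ValueError(f"expected two numeric path segments in {url!r}")
    return ClarityIds(numeric[0], numeric[1])


# Placeholder host and assumed path layout, for illustration only.
example = "https://example.com/KY/15261/30235/en/summary.html"
ids = numeric_segments(example)
print(ids.election_id, ids.subpage_id)  # 15261 30235
```

Parsing is the easy half. The second segment for each county is only known after following the redirect, so a real tool still has to request each county link to discover it.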

See full piece on Source Learning