Offline Election Results

September 7, 2015

In some states, getting election results data is pretty easy (looking at you, Maryland, Wyoming and Florida, to name three). In others, it’s a matter of going county-by-county, as we’ve previously written. But in most of these cases, we’ve been dealing with results files that are available online.

But what about cases where the results aren’t on the Internet?


The Oregon Paper Trail

June 14, 2015


A few weeks ago, we began requesting precinct-level election results from counties in Oregon. The Secretary of State maintains county-level results, typically in electronic PDFs, so to get down to precinct level we need to ask county clerks across the state. Many of them post precinct results on their sites, but some don’t, so we emailed a few to ask for results from 2000-2014. In doing so, we were prepared to pay reasonable fees for them, as the Oregon Revised Statutes permit.

Local officials were quick to get back to us in every case, and their responses were straightforward. Here’s an example, from Art Harvey, the Josephine County clerk and recorder:

The reports you are interested in are available in PDF format.

The cost would be $10.00 per election.

Other counties charged fees ranging from $25 (Umatilla County) to $45 (Wasco County) to $86.50 (Linn County, which sent us paper print-outs of election results that we’ll be scanning). And then there’s Tillamook County, where Tassi O’Neil, the county clerk there, has set a price of $664 for PDF copies of precinct-level results for elections from 2000-2014.

We wondered how that price was calculated, so we asked. Ms. O’Neil responded:

The fee for each election is $3.75 locate fee and then .25 cents per page.  That is the fee if it is a paper copy or if we send it in a PDF.  That is the charge that the Oregon Revised Statues say that we can/or should charge.

That is true, but there are two points here: One is that members of the public are being charged for pages of an electronic document. There are no paper copies involved here. The other is that the Oregon Revised Statutes also say this:

The custodian of any public record may furnish copies without charge or at a substantially reduced fee if the custodian determines that the waiver or reduction of fees is in the public interest because making the record available primarily benefits the general public.

As OpenElections is a non-profit effort dedicated to publishing machine-readable election results that can be freely used by anyone, we’re pretty sure that our project primarily benefits the general public. We’ve asked for such a waiver or reduction of the $664 and are awaiting a reply. Oregon law also permits us to appeal a denial of a fee waiver or reduction to the Attorney General, and we will be pursuing that option should it become necessary.

In the meantime, we’ve been converting Oregon PDF results to CSVs and will continue to do so. There are plenty of ways for you to contribute to that effort, and we welcome any suggestions or advice on our dealings with Oregon officials.

For OpenElections volunteers coming to NICAR in Atlanta next month, we’ve got a challenge for you: help us tackle Georgia election results.

As we did last year in Baltimore, OpenElections will hold an event on Sunday, March 8, with the goal of writing scrapers and parsers to load 2000-2014 election results from the Peach State, and we’re looking for some help. It’s a great way to get familiar with the project and see what our processes are.

Georgia offers some different tasks, from scraping HTML results to using our Clarify library to parse XML from more recent elections. So we’re looking for people who have some familiarity with Python and election results, but we’re happy to help guide those new to the process, too. Thanks to our volunteers, we’ve already got a good record of where election result data is stored by the state.

Here’s how the process will work: we’ll start by reviewing the state of the data – what’s available for which elections – and then start working on a datasource file that connects that data to our system. After that, we’ll begin writing code to load results data, using other states as our models. As part of that process, we’ll pre-process HTML results into CSVs that we store on Github.
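The datasource step above can be sketched in miniature. This is a hypothetical illustration, not the project's actual code: the field names and the double-underscore filename convention (e.g. a name like `20141104__ga__general__precinct.csv`) are assumptions about how OpenElections standardizes raw result files; the real conventions live in the project's documentation.

```python
def standardized_filename(date, state, race_type, reporting_level=None):
    """Build a standardized raw-results filename,
    e.g. '20141104__ga__general__precinct.csv' (convention assumed)."""
    parts = [date.replace("-", ""), state.lower(), race_type]
    if reporting_level:
        parts.append(reporting_level)
    return "__".join(parts) + ".csv"

def build_datasource(source_files):
    """Map each (election metadata, source URL) pair to a target filename,
    the kind of mapping a state's datasource file produces."""
    return [
        {
            "source_url": url,
            "target": standardized_filename(
                meta["date"], meta["state"], meta["race_type"], meta.get("level")
            ),
        }
        for meta, url in source_files
    ]
```

With a mapping like this in hand, a loader can fetch each source URL, pre-process it, and write the result to its standardized target name.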

If you’re interested in helping out, there are two things to do: first, let us know by emailing or on Twitter at @openelex. Second, take the time to set up the development environment on your laptop following the instructions here. We’re looking forward to seeing you in Atlanta!

Scraping Nevada

January 28, 2015

ICYMI, Derek Willis wrote a piece for Source about his experience scraping Nevada precinct results. Check it out!

When we released our initial dashboard for downloading election results in July, we wanted to make it easy for anyone to grab CSV files of raw results with just a browser. We’ve continued adding states to our results site, the latest being North Carolina, Florida and — for a few elections — Mississippi. Pennsylvania will be on the way soon.

But we also wanted our results to be usable by developers, and we’re taking advantage of GitHub to help make that easier. Each time we publish raw results data (which hasn’t been standardized beyond geography), we publish it first to a GitHub repository for that state. For example, you can find a repository for Mississippi results that can be cloned and/or accessed via API, avoiding manual downloads. The naming convention for the repositories is consistent: openelections-results-{state}, and you might find partial results for states that don’t yet appear on the download map (like Iowa) because they’re still in progress.
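Because the repository names follow a single convention, raw-file URLs are predictable. Here is a minimal sketch; the branch name and the in-repo file paths are assumptions about the layout, so check the repository itself before relying on them.

```python
BASE = "https://raw.githubusercontent.com/openelections"

def results_repo(state):
    """Return the results repository name for a state abbreviation,
    following the openelections-results-{state} convention."""
    return "openelections-results-" + state.lower()

def raw_csv_url(state, filename, branch="master"):
    """Build a raw-download URL for a CSV in a state's results repository.
    The branch and in-repo path are assumptions about the layout."""
    return "/".join([BASE, results_repo(state), branch, filename])
```

A URL built this way can be fetched directly over HTTP; alternatively, just clone the whole repository with git.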


Using GitHub has two advantages for us: it helps maintain a history of published changes, and GitHub Pages provides a filesystem for storing the raw CSVs that power the results site downloads. And should we need to move the CSV downloads to another location, we can do that, too. All of this underscores our commitment to using existing standards and practices rather than inventing new ones.

So if you were looking for election results CSVs as part of your holiday plans, we’ve got two ways to get them. Enjoy, and Happy New Year!

By Derek Willis

OpenElections is nearly two years old, and we’re not nearly done yet. In most states we still have a lot of work to do.

As our initial funding from the Knight Foundation winds down, we wanted to provide an update on where the project is and our plans going forward. The first thing to know is this: OpenElections is here to stay. Our timetable has expanded, and we’re looking at other sources of money to boost our capacity to process election results data, but the work we’ve done so far and your interest in it has convinced us of the need.

When we started, Serdar and I had between us years of experience working with election results data in multiple formats. We both worked at news organizations that routinely dealt with different types of data and various election systems.

Still, we have been surprised by the diversity of results that we’ve found. States like Pennsylvania, North Carolina and Florida have consistent and reliable data across time. Other states, like Arkansas, Colorado and Washington, have different formats and systems depending on the year. Then there are states like Mississippi and New York, which have required significant investments of time and effort.

In practice, that has meant a lot of work within individual states in order to load and process data from 2000 onward. Those efforts have taken more time than we anticipated, for two reasons. First, we have found that states have switched the systems and software they use to publish election results, in some cases multiple times in the past 15 years. We have found some abstractions – we released a separate library to handle states that use Clarity’s software – but in many cases this meant writing several different custom parsers for a single state.

Second, machine readable data is not a universal standard, and for many states it is a recent addition to their practices. This isn’t a criticism as much as it is a statement of reality. Officials from nearly every state we’ve been in contact with have been helpful and even supportive of the project. But we’re also not too far removed from all-paper elections, either.

In response to these factors, we’ve made some adjustments. The main one is to publish “raw” results data from states even before we standardize offices, candidates and parties. We think having election results in a fairly consistent format across a number of years is pretty useful, so we’re not going to wait until everything is done to release that. This week we’ve published raw results in North Carolina, Florida, Pennsylvania and (for recent elections) Mississippi. You can download these from our site or clone them from GitHub depending on your needs. We’ll continue to follow that path as we work on standardization.

Along the way we’ve been very fortunate to have had contributions from volunteers, who both gathered information about the state of election results and also contributed code to the project. We can’t thank all of you enough for your interest and contributions. This would be a much longer road without them, and we hope that you’ll stay involved.

We’d also like to recognize the people who have lived this project with us for most of the past two years. Geoff Hing has been the main point of contact for web development volunteers and has written the bulk of the code that powers the results loading and data display portions of the project. Geoff began a new job at The Chicago Tribune this week, although he’ll still be involved with OpenElections as a volunteer. We’re extremely grateful for his efforts.

Many more of you have emailed with or spoken to Sara Schnadt, the project manager for OpenElections. She’ll be with us through the end of the year as we plan our next steps, and her organizational skills, creative thinking and ability to wrangle two co-founders living on separate coasts have made OpenElections possible.

Investigative Reporters & Editors, a source of training and inspiration for journalists for decades, has made things easy for us by handling the accounting and grant management tasks. Both Serdar and I are proud to be “graduates” of IRE, and we’re thankful for their support of OpenElections.

The goal of OpenElections – to provide access to machine-readable, standardized election results – remains the same as when we began. The path to reach that goal is now a lot clearer than it was two years ago, and with your help we’ve learned a lot about how to get there. We’ll keep moving forward, and invite you to stay involved.



Introducing Clarify

November 26, 2014

An Open Source Elections-Data URL Locator and Parser from OpenElections

By Geoff Hing and Derek Willis, for Knight-Mozilla OpenNews Source Learning


State election results are like snowflakes: each state—often each county—produces its own special website to share the vote totals. For a project like OpenElections, that involves having to find results data and figuring out how to extract it. In many cases, that means scraping.

But in our research into how election results are stored, we found that a handful of sites used a common vendor: Clarity Elections, which is owned by SOE Software. States that use Clarity generally share a common look and features, including statewide summary results, voter turnout statistics, and a page linking to county-specific results.

The good news is that Clarity sites also include a “Reports” tab that has structured data downloads in several formats, including XML, XLS, and CSV. The results data are contained in .ZIP files, so they aren’t particularly large or unwieldy. But there’s a catch: the URLs aren’t easily predictable. Here’s a URL for a statewide page:

The first numeric segment—15261 in this case—uniquely identifies this election, the 2010 primary in Kentucky. But the second numeric segment—30235—represents a subpage, and each county in Kentucky has a different one. Switch over to the page listing the county pages, and you get all the links. Sort of.
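That two-segment pattern can be captured in a couple of lines. This sketch assumes Clarity's historical path layout of /{state}/{election}/{subpage}/..., using an illustrative URL built from the ids quoted above (the real hostname and path may differ):

```python
from urllib.parse import urlparse

def clarity_ids(url):
    """Extract the election id and subpage id from a Clarity results URL,
    assuming a path of the form /{STATE}/{election_id}/{subpage_id}/..."""
    segments = urlparse(url).path.strip("/").split("/")
    election_id, subpage_id = segments[1], segments[2]
    return election_id, subpage_id
```

The hard part, as described below, isn't parsing these ids out of a URL; it's discovering the county-level subpage ids in the first place.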

The county-specific links, which lead to pages that have structured results files at the precinct level, actually involve redirects, but those secondary numeric segments in the URLs aren’t resolved until we visit them. That means doing a lot of clicking and copying, or scraping. We chose the latter path, although that presents some difficulties as well. Using our time at OpenNews’ New York Code Convening in mid-November, we created a Python library called Clarify that provides access to those URLs containing structured election results data and parses the XML version of it. We’re already using it in OpenElections, and now we’re releasing it for others who work in states that use Clarity software.

See full piece on Source Learning


By Derek Willis

Opening election data isn’t just an American thing. Across Africa, organizations are at work gathering election results and voter data to make better tools and systems that help inform citizens about the political process.

I was a participant in a workshop organized by the Global Network of Domestic Election Monitors, which provides training and support for groups that monitor elections around the world. The three-day workshop, held in Johannesburg, South Africa, in September brought together more than a dozen representatives of organizations from across Africa as well as officials from the Electoral Commission of South Africa.

Governments in Africa publish their official results in a variety of formats, but most provide either electronic PDFs or CSV files. During the workshop, we discussed what else defines election data – in many parts of Africa, that includes not only voting locations but also details about observers, the security situation and the integrity of the voter roll. What I heard from participants like James Mwirima of Citizens’ Watch-IT in Uganda, Tidiani Togola of Mali and Chukwudera Bridget Okeke of TMG Nigeria was that election data was about so much more than the results.

Using OpenElections as an example, we talked about dealing with difficult-to-parse data, and even showed off the powers of Tabula for converting PDF tables into CSV files using Zimbabwean election results from 2013. Under the guidance of organizers Meghan Fenzel and Sunila Chilukuri of the National Democratic Institute, we worked on summarizing and visualizing voter registration data using Google Fusion Tables and Excel.

Since most African countries have a single national election authority, results are often collected and published in a single location. South Africa, for example, publishes detailed results data in several formats and breakdowns, including by voting district. Some of the United States may want to take note: there’s a CSV download as well.

What I found at the workshop were election monitoring organizations that wanted to use modern tools to quickly and accurately assess elections in their countries. Nigeria already has a robust effort preparing for elections next year.

A few times I was asked about the possibility of extending OpenElections outside the United States. While we’ve got our hands full with the variety of formats and results data that 50 state systems produce, there’s nothing I’d like to see more than our work being used in other places. That’s why I stressed the importance of publishing your code and data, not only so others can build upon them but so that people can see your work and evaluate its accuracy and integrity. Our elections – wherever they are – demand no less.

At ONA14 in Chicago in late September we unveiled the new OpenElections data download interface. We presented at the Knight Foundation’s Knight Village during their office hours for featured News Challenge projects, as well as during a lightning talk. OpenElections’ Geoff Hing and Sara Schnadt showed off their handiwork based on in-depth discussions and feedback from many data journos. The crowd at ONA was receptive, and the people we talked to were keen to start having access to the long-awaited data from the first few states.


As you can see from the data map view above, there are only three states that have data available so far. These are Maryland, West Virginia and Wyoming, for which you can download ‘raw’ data. For our purposes, this means that you can get official data at the most common results reporting levels, with the most frequently used fields identified but without any further standardization. We will have ‘raw’ data on all the states in the next few months, and will work on having fully cleaned and standardized data on all the states after this initial process is complete.


As things progress, you will see updates to both the map view and the detailed data view where you can see the different reporting levels that have data ready for download so far.


A pink download icon indicates available data, and a grey icon indicates that data exists for a particular race at a particular reporting level, but that we don’t yet have it online.

The race selection tool at the top of the page includes a visualization that gives an overview of all the races in our timespan, and a slider for selecting a date range to review races in the download table. For states like Maryland (shown in the full page-view above), there are only two races every two years so this slider isn’t so crucial, but for states like Florida (directly above), this slider can be useful.

We encourage you to take the interface for a spin, and tell us what you think! And, if you would like to help us get more data into this interface faster, and you are fairly canny with Python, we would love to hear from you. You can learn more about what this would entail here.

An Interview with The Pew Charitable Trusts’ Jared Marcotte

OE: You come to civic infrastructure work via previous experience in corporate technology. Can you give me a little background on the Voting Information Project, how you became involved with it, and why this work is so interesting to you?

JM: Voting Information Project (VIP) is a partnership between The Pew Charitable Trusts and Google that started in 2008. Both organizations realized that voters were having difficulty finding the answers to common elections-related questions, such as “where do I vote,” “what’s on my ballot,” and “how do I navigate the elections process”. The project encourages states to publish public information pertaining to elections–geopolitical boundaries, polling locations and early voting sites, local and state election official contact information, ballot information, and other related data–in a standardized format allowing Google to make the data available through the Google Civic Information API. Our goal is to lower the barrier of access to this information, making it easier for elections officials to concentrate on running the elections.

I’ve worked at some great companies over the years, but my work, though challenging and interesting, felt a bit disconnected. I wanted to do more to potentially solve societal problems, which led me to VIP. I’d always found elections cerebral but daunting, since it was difficult to find the information I needed to cast an informed vote. Voting is one of the most important activities in civic life, so this project fulfilled my desire to “improve the world,” so to speak. Early last year, David Becker, Director of Elections Initiatives, offered me the opportunity to manage VIP at Pew. Considering how much I loved the project, it was easy to say yes.

OE: VIP is a collaboration between The Pew Charitable Trusts and Google’s Civic Innovation project. How does this work, and what resources do each entity bring to the table?

JM: Since providing election information is a distributed data problem–meaning the data we require is held in different databases across departments and, sometimes, jurisdictions–Pew, through Democracy Works and Election Information Services, provides engineering support to states to automate and centralize the publication of this information at the state-level. Pew also creates open source tools that leverage the API and allow states, campaigns, and civic organizations to use low-cost tools. Pew works with Engage and Lewis PR to broaden the project’s reach to potential organizations that may be interested in leveraging the data or the tools.

Pew has a great working relationship with Google. They offer an understanding of elections coupled with technical infrastructure and engineering that few others could match at scale. Additionally, they created the Voter Information Tool, which provides a single source of election information to voters and is one of the most visible artifacts of the project. Anthea Watson Strong, my counterpart at Google, has extensive experience with the project and campaigns, making her uniquely suited to manage Google’s role with this initiative.

OE: Can you describe how VIP impacts an individual voter, and how it eases their participation in the elections process?

JM: Though there are numerous tools, at its core, VIP allows a voter to enter their address and find their polling location and ballot information for every major election without ever providing any personally identifiable data. At Pew, we try to cover a number of different access points beyond Google’s Voter Information Tool. We’re working with Azavea to develop a white-label, accessible iOS application and a companion Android application that allows users to find election information. In the interest of bridging the digital divide, we’re also developing an SMS-based service to look up polling location information and registration status. Because the Civic Information API is accessible to the general public, civic organizations and individual developers can use the data in ways that we may not cover through our own open-source applications.

VIP also publishes all of the raw data, which tech collaborators use in various ways. One of the most fun examples was when Foursquare used the geographical polling location data in their application. A voter who checked in to their polling location on Election Day received a virtual “I Voted” badge.
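The address lookup JM describes is exposed to developers through the Google Civic Information API. Here is a minimal sketch of building a voterinfo query; the v2 endpoint and parameter names reflect the public API, and an API key is required (the key and address below are placeholders):

```python
from urllib.parse import urlencode

CIVIC_INFO = "https://www.googleapis.com/civicinfo/v2/voterinfo"

def voter_info_url(address, api_key, election_id=None):
    """Build a voterinfo query URL for a voter's street address.
    Nothing beyond the address itself is sent as identifying data."""
    params = {"address": address, "key": api_key}
    if election_id:
        params["electionId"] = election_id
    return CIVIC_INFO + "?" + urlencode(params)
```

Fetching that URL (with a valid key, during an election the API covers) returns JSON describing the voter's polling location and ballot contests.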

OE: What other Election initiatives are underway at Pew, and how do they all interrelate?

JM: Our core mission in election initiatives is to make elections more accessible, accurate, and cost-efficient. In addition to VIP, we have two other projects that work towards our goals.

The Upgrading Voter Registration (UVR) project partners with election officials, policy makers, technology experts, and other stakeholders to help states move towards more integrated, modern, and secure voter registration systems. This goal is accomplished through a number of initiatives, one of which is the Electronic Registration Information Center (ERIC), an independent non-profit whose membership is made up of representatives of the states that work to improve the quality of voter registration lists through a sophisticated data matching system.

Pew’s ethos is all about constant evaluation through data analysis. In keeping with the culture, the Elections Performance Index (EPI) is our measurement of elections administration based on 17 objective indicators (e.g. data completeness, turnout, voter registration rate, et al). Along with a massive amount of fascinating data and state fact sheets (e.g. Wisconsin [PDF]), the “crown jewel” of this project is the interactive. This year is also the first time that we’ve had the data to compare two presidential elections: 2008 and 2012.

OE: In light of the recent presidential report highlighting that current voting systems are at the end of their viable lifespan, are you aware of any new solutions underway?

JM: Innovation in voting technology is complicated by outdated certification requirements. Since the last time the federal standards were updated, smartphones became ubiquitous, and Apple, with the advent of the iPhone in 2007 and the iPad in 2010, changed the way we think about the capabilities of “mobile users.” Most states have state-specific certification standards, too, many of which are based closely on the federal standards. The result is an expensive and lengthy process to certify new voting technology that prevents entrepreneurs from developing new systems and limits the products available on the market. Vendors are unwilling to invest in innovative technology when there is no guarantee that there will be a market for their technology once it is certified.

Election officials are left treading water with outdated and insecure technology while waiting for new technology to be offered, knowing that the current system prevents innovation. While we are starting to think about creative solutions to the problems in the marketplace, two county-based projects are approaching this problem from their perspectives. The Travis County, Texas Elections Office is working with a number of academics to build STAR-Vote, a completely new election system. A similar initiative is also taking place in Los Angeles County called the Voting Systems Assessment Project (VSAP). VSAP is guided by a set of principles defined by the Advisory Committee, and the county is working with IDEO to create early prototypes (NB: in the interest of full disclosure, I serve on the VSAP Technical Advisory Committee).

OE: Ideally, what kinds of organizations and systems would come together to make a robust, transparent and cost-effective elections infrastructure?

JM: VSAP is a solid start. Academics, civic organizations, the private sector, and the public all take part in the process in meaningful ways. With IDEO, they take a “human-centered” approach to the problems, which I believe makes this project transformative. Ideally, elections should be about what works for each individual voter, though this philosophy does introduce a number of unique challenges. Time will tell if initiatives like VSAP and STAR-Vote will change the elections technology landscape, but I’m optimistic.


Jared Marcotte is an officer for Pew’s election initiatives, which supports states’ efforts to improve military and overseas voting; assess election performance through better data; use technology to provide information to voters; and upgrade voter registration systems.

Marcotte primarily oversees work on the Voting Information Project, a partnership with Google that improves the availability of election information for voters and civic developers while easing administrative burdens on local election officials. He also serves as an advisor on other Election Initiatives projects where technical strategy or software engineering is a component of the work.

Previously, as a senior engineer at the New Organizing Institute, Marcotte worked on the Voting Information Project, a collaboration with state and local officials, Google, and Pew to develop a nationwide dataset of election-related information. Marcotte previously worked at Six Apart and IBM and as an interface and interaction designer on the Election Protection Coalition’s Our Vote Live, and various enterprise-grade sites. He currently serves on the technical advisory committee for the Voting Systems Assessment Project for Los Angeles County, California.

He holds a bachelor’s degree in computer science from the University of Vermont.