
By Derek Willis

Opening election data isn’t just an American thing. Across Africa, organizations are at work gathering election results and voter data to make better tools and systems that help inform citizens about the political process.

I was a participant in a workshop organized by the Global Network of Domestic Election Monitors, which provides training and support for groups that monitor elections around the world. The three-day workshop, held in Johannesburg, South Africa, in September brought together more than a dozen representatives of organizations from across Africa as well as officials from the Electoral Commission of South Africa.

Governments in Africa publish their official results in a variety of formats, but most provide either electronic PDFs or CSV files. During the workshop, we discussed what else defined election data – in many parts of Africa, that includes not only voting locations but also details about observers, the security situation and the integrity of the voter roll. What I heard from participants like James Mwirima of Citizens’ Watch-IT in Uganda, Tidiani Togola of Mali and Chukwudera Bridget Okeke of TMG Nigeria was that election data was about so much more than the results.

Using OpenElections as an example, we talked about dealing with difficult-to-parse data, and even showed off the powers of Tabula for converting PDF tables into CSV files using Zimbabwean election results from 2013. Under the guidance of organizers Meghan Fenzel and Sunila Chilukuri of the National Democratic Institute, we worked on summarizing and visualizing voter registration data using Google Fusion Tables and Excel.
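A table pulled out of a PDF usually needs a little cleanup before it can be analyzed. Here is a minimal Python sketch of that step, assuming the export contains blank spacer rows and vote counts written with thousands separators. The sample rows, column names and the `clean_rows` helper are invented for illustration; they are not the Zimbabwean results or anything Tabula itself produces.

```python
import csv
import io

# Invented sample of what a PDF-to-CSV export might look like; not real results.
RAW_EXPORT = """Constituency,Candidate,Votes
Ward 1,Candidate A,"1,204"
Ward 1,Candidate B,987
,,
Ward 2,Candidate A,"2,310"
Ward 2,Candidate B," 1,045 "
"""


def clean_rows(text):
    """Parse exported CSV text, skip empty rows, and turn vote strings into integers."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        # Rows with no content often appear where the PDF had spacing or page breaks.
        if not any((value or "").strip() for value in row.values()):
            continue
        votes = row["Votes"].replace(",", "").strip()
        cleaned.append({
            "constituency": row["Constituency"].strip(),
            "candidate": row["Candidate"].strip(),
            "votes": int(votes),
        })
    return cleaned


rows = clean_rows(RAW_EXPORT)
for row in rows:
    print(row)
```

Real exports tend to have more problems than this (merged cells, headers repeated on every page), so a script like this is a starting point rather than a complete cleaner.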

Since most African countries have a single national election authority, results are often collected and published in a single location. South Africa, for example, publishes detailed results data in several formats and breakdowns, including by voting district. Some of the United States may want to take note: there’s a CSV download as well.

What I found at the workshop were election monitoring organizations that wanted to use modern tools to assess elections in their countries quickly and accurately. Nigeria already has a robust effort preparing for elections next year.

A few times I was asked about the possibility of extending OpenElections outside the United States. While we’ve got our hands full with the variety of formats and results data that 50 state systems produce, there’s nothing I’d like to see more than our work being used in other places. That’s why I stressed the importance of publishing your code and data, not only so others can build upon them but so that people can see your work and evaluate its accuracy and integrity. Our elections – wherever they are – demand no less.

At ONA14 in Chicago in late September we unveiled the new OpenElections data download interface. We presented at the Knight Foundation’s Knight Village during their office hours for featured News Challenge projects, as well as during a lightning talk. OpenElections’ Geoff Hing and Sara Schnadt showed off their handiwork based on in-depth discussions and feedback from many data journos. The crowd at ONA was receptive, and the people we talked to were keen to start having access to the long-awaited data from the first few states.

Screen Shot 2014-10-06 at 2.47.55 PM

As you can see from the data map view above, there are only three states that have data available so far. These are Maryland, West Virginia and Wyoming, for which you can download ‘raw’ data. For our purposes, this means that you can get official data at the most common results reporting levels, with the most frequently used fields identified but without any further standardization. We will have ‘raw’ data on all the states in the next few months, and will work on having fully cleaned and standardized data on all the states after this initial process is complete.

Screen Shot 2014-10-06 at 2.48.12 PM

As things progress, you will see updates to both the map view and the detailed data view where you can see the different reporting levels that have data ready for download so far.

Screen Shot 2014-10-06 at 4.30.19 PM

A pink download icon indicates available data, and a grey icon indicates that data exists for a particular race at a particular reporting level, but that we don’t yet have it online.

Screen Shot 2014-10-06 at 4.28.56 PM
The race selection tool at the top of the page includes a visualization that gives an overview of all the races in our timespan, and a slider for selecting a date range to review races in the download table. For states like Maryland (shown in the full page-view above), there are only two races every two years so this slider isn’t so crucial, but for states like Florida (directly above), this slider can be useful.

We encourage you to take the interface for a spin, and tell us what you think! And, if you would like to help us get more data into this interface faster, and you are fairly canny with Python, we would love to hear from you. You can learn more about what this would entail here.

An Interview with The Pew Charitable Trusts’ Jared Marcotte

OE: You come to civic infrastructure work via previous experience in corporate technology. Can you give me a little background on the Voting Information Project, how you became involved with it, and why this work is so interesting to you?

JM: Voting Information Project (VIP) is a partnership between The Pew Charitable Trusts and Google that started in 2008. Both organizations realized that voters were having difficulty finding the answers to common elections-related questions, such as “where do I vote,” “what’s on my ballot,” and “how do I navigate the elections process”. The project encourages states to publish public information pertaining to elections–geopolitical boundaries, polling locations and early voting sites, local and state election official contact information, ballot information, and other related data–in a standardized format allowing Google to make the data available through the Google Civic Information API. Our goal is to lower the barrier of access to this information, making it easier for elections officials to concentrate on running the elections.

I’ve worked at some great companies over the years, but my work, though challenging and interesting, felt a bit disconnected. I wanted to do more to potentially solve societal problems, which led me to VIP. I’d always found elections cerebral but daunting, since it was difficult to find the information I needed to cast an informed vote. Voting is one of the most important activities in civic life, so this project fulfilled my desire to “improve the world,” so to speak. Early last year, David Becker, Director of Elections Initiatives, offered me the opportunity to manage VIP at Pew. Considering how much I loved the project, it was easy to say yes.

OE: VIP is a collaboration between The Pew Charitable Trusts and Google’s Civic Innovation project. How does this work, and what resources do each entity bring to the table?

JM: Since providing election information is a distributed data problem–meaning the data we require is held in different databases across departments and, sometimes, jurisdictions–Pew, through Democracy Works and Election Information Services, provides engineering support to states to automate and centralize the publication of this information at the state-level. Pew also creates open source tools that leverage the API and allow states, campaigns, and civic organizations to use low-cost tools. Pew works with Engage and Lewis PR to broaden the project’s reach to potential organizations that may be interested in leveraging the data or the tools.

Pew has a great working relationship with Google. They offer an understanding of elections coupled with technical infrastructure and engineering that few others could match at scale. Additionally, they created the Voter Information Tool, which provides a single source of election information to voters and is one of the most visible artifacts of the project. Anthea Watson Strong, my counterpart at Google, has extensive experience with the project and campaigns, making her uniquely suited to manage Google’s role with this initiative.

OE: Can you describe how VIP impacts an individual voter, and how it eases their participation in the elections process?

JM: Though there are numerous tools, at its core, VIP allows a voter to enter their address and find their polling location and ballot information for every major election without ever providing any personally identifiable data. At Pew, we try to cover a number of different access points beyond Google’s Voter Information Tool. We’re working with Azavea to develop a white-label, accessible iOS application and a companion Android application that allows users to find election information. In the interest of bridging the digital divide, we’re also developing an SMS-based service to look up polling location information and registration status. Because the Civic Information API is accessible to the general public, civic organizations and individual developers can use the data in ways that we may not cover through our own open-source applications.

VIP also publishes all of the raw data, which tech collaborators use in various ways. One of the most fun examples was when Foursquare used the geographical polling location data in their application. A voter who checked in at their polling location on Election Day received a virtual “I Voted” badge.
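A lookup like the one Marcotte describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not VIP's or Google's reference code: the endpoint path and the `address` and `key` parameters follow the Civic Information API as commonly documented (the version in the path may differ), the `pollingLocations` response shape is assumed, and the sample response is hand-written with an invented address. Nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

# Endpoint as commonly documented for the Google Civic Information API; treat as an assumption.
VOTERINFO_URL = "https://www.googleapis.com/civicinfo/v2/voterinfo"


def build_voterinfo_url(address, api_key):
    """Return a lookup URL for one address. No request is sent here."""
    return VOTERINFO_URL + "?" + urlencode({"address": address, "key": api_key})


def polling_place_names(response_text):
    """Pull polling place names out of a JSON response shaped like SAMPLE_RESPONSE."""
    data = json.loads(response_text)
    return [
        place.get("address", {}).get("locationName", "")
        for place in data.get("pollingLocations", [])
    ]


# Hand-written stand-in for a response; the address and location are invented.
SAMPLE_RESPONSE = json.dumps({
    "pollingLocations": [
        {"address": {"locationName": "Example Elementary School",
                     "line1": "1 Example St", "city": "Anytown",
                     "state": "VT", "zip": "00000"}}
    ]
})

url = build_voterinfo_url("1 Example St, Anytown, VT", "YOUR_API_KEY")
names = polling_place_names(SAMPLE_RESPONSE)
print(url)
print(names)
```

Note that the only input is an address, which matches the point above that a voter never has to provide personally identifiable data beyond where they live.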

OE: What other Election initiatives are underway at Pew, and how do they all interrelate?

JM: Our core mission in election initiatives is to make elections more accessible, accurate, and cost-efficient. In addition to VIP, we have two other projects that work towards our goals.

The Upgrading Voter Registration (UVR) project partners with election officials, policy makers, technology experts, and other stakeholders to help states move towards more integrated, modern, and secure voter registration systems. This goal is accomplished through a number of initiatives, one of which is the Electronic Registration Information Center (ERIC), an independent non-profit whose membership is made up of representatives of the states that work to improve the quality of voter registration lists through a sophisticated data matching system.

Pew’s ethos is all about constant evaluation through data analysis. In keeping with the culture, the Elections Performance Index (EPI) is our measurement of elections administration based on 17 objective indicators (e.g., data completeness, turnout, and voter registration rate). Along with a massive amount of fascinating data and state fact sheets (e.g. Wisconsin [PDF]), the “crown jewel” of this project is the interactive. This year is also the first time that we’ve had the data to compare two presidential elections: 2008 and 2012.

OE: In light of the recent presidential report highlighting that current voting systems are at the end of their viable lifespan, are you aware of any new solutions underway?

JM: Innovation in voting technology is complicated by outdated certification requirements. Since the last time the federal standards were updated, smartphones became ubiquitous, and Apple, with the advent of the iPhone in 2007 and the iPad in 2010, changed the way we think about the capabilities of “mobile users.” Most states have state-specific certification standards, too, many of which are based closely on the federal standards. The result is an expensive and lengthy process to certify new voting technology that prevents entrepreneurs from developing new systems and limits the products available on the market. Vendors are unwilling to invest in innovative technology when there is no guarantee that there will be a market for their technology once it is certified.

Election officials are left treading water with outdated and insecure technology while waiting for new technology to be offered, knowing that the current system prevents innovation. While we are starting to think about creative solutions to the problems in the marketplace, two county-based projects are approaching this problem from their perspectives. The Travis County, Texas Elections Office is working with a number of academics to build STAR-Vote, a completely new election system. A similar initiative is also taking place in Los Angeles County called the Voting Systems Assessment Project (VSAP). VSAP is guided by a set of principles defined by the Advisory Committee, and the county is working with IDEO to create early prototypes (NB: In the interest of full disclosure, I serve on the VSAP Technical Advisory Committee).

OE: Ideally, what kinds of organizations and systems would come together to make a robust, transparent and cost-effective elections infrastructure?

JM: VSAP is a solid start. Academics, civic organizations, the private sector, and the public all take part in the process in meaningful ways. With IDEO, they take a “human-centered” approach to the problems, which I believe makes this project transformative. Ideally, elections should be about what works for each individual voter, though this philosophy does introduce a number of unique challenges. Time will tell if initiatives like VSAP and STAR-Vote will change the elections technology landscape, but I’m optimistic.

***

Jared Marcotte is an officer for Pew’s election initiatives, which supports states’ efforts to improve military and overseas voting; assess election performance through better data; use technology to provide information to voters; and upgrade voter registration systems.

Marcotte primarily oversees work on the Voting Information Project, a partnership with Google that improves the availability of election information for voters and civic developers while easing administrative burdens on local election officials. He also serves as an advisor on other Election Initiatives projects where technical strategy or software engineering is a component of the work.

Previously, as a senior engineer at the New Organizing Institute, Marcotte worked on the Voting Information Project, a collaboration with state and local officials, Google, and Pew to develop a nationwide dataset of election-related information. He has also worked at Six Apart and IBM and as an interface and interaction designer on the Election Protection Coalition’s Our Vote Live, KCET.org, and various enterprise-grade sites. He currently serves on the technical advisory committee for the Voting Systems Assessment Project in Los Angeles County, California.

He holds a bachelor’s degree in computer science from the University of Vermont.

Eating Our Dog Food

July 15, 2014

By Derek Willis

When Serdar and I first talked about building a national collection of certified election results, we had a very specific audience in mind: the two of us. It seemed like every two years (or more frequently), one or both of us would spend time gathering election results data as part of our jobs (me at The New York Times, Serdar then at The Washington Post). We wanted to create a project that both of us could use, and we knew that if we found it useful, others might, too.

Precinct comparison

The New York Times

In the world of software development, using your own work is called eating your dog food, and we’ve done just that. While we’re nowhere near finished, I am happy to report that OpenElections data has proven useful to at least half of the original intended audience. Both last week and this week, The Upshot, a new politics and policy site at The Times that I work on, used results data from Mississippi collected by OpenElections to dig into the Republican primary and runoff elections for U.S. Senate. The analyses that Nate Cohn did on voting in African-American precincts would not have been possible using the PDF files posted by the Mississippi Secretary of State. We needed data, and we (and you) now have data.

We’ve completed data entry of precinct-level results for the 2012 general election and the 2014 Republican primary runoff elections, plus special elections from 2013, and we’re working on converting more files into data (we just got our first contributions from volunteers, too!). These are just the raw results as the state publishes them; we haven’t yet published them using our own results format (but that’s coming soon for Maryland and a few other states). We provide the raw results for states that have files requiring some pre-processing – usually image PDFs or other formats that can’t be pulled directly into our processing pipeline.

The Mississippi example is exactly the kind of problem that we hoped OpenElections would help solve, and it’s only the beginning for how election results data could be used. Once we begin publishing results data, we’d love to hear how you use it, too. In the meantime, if you have some time, there’s more Mississippi data to unlock!

Home Screenshot

As we get the first few states’ data processed and ready to release, we are building an interface to deliver it to you, and to show our progress as we go. The live site (above) now shows metadata work to date, and the volunteers involved. Soon you will be able to toggle between this view and a map of the current condition of results data (below). Clicking on each state will show you details on the most cleaned version of available data.

Data Map

The color coding on this new map will change as we get more states online with ‘raw’ results, and fully cleaned data (what you see here is hypothetical and just to illustrate how the map will work). When we say ‘raw’, we really mean results that reflect the data provided by state elections officials. These results are only available at the reporting levels provided by the states and fields like party and candidate names have not been standardized.  These results do have standardized field names, so you will be able to more easily load and analyze the data across states. We will get as many states fully cleaned as we can, but our baseline goal for this year is to wire up and make available most of the data in a ‘raw’ state.
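To make the distinction concrete, here is a small Python sketch of what standardized field names buy you and what unstandardized values still cost you. The column names (`state`, `office`, `party`, `candidate`, `votes`), the two sample files and the helper functions are hypothetical stand-ins, not the actual OpenElections format or real results.

```python
import csv
import io
from collections import defaultdict

# Invented sample files; the column names are hypothetical stand-ins for standardized fields.
STATE_A_CSV = """state,office,party,candidate,votes
AA,President,DEM,Jane Q. Example,120
AA,President,REP,John Sample,100
"""

STATE_B_CSV = """state,office,party,candidate,votes
BB,President,Democratic,"Example, Jane",80
BB,President,Republican,"Sample, John",95
"""


def load_results(*csv_texts):
    """Read any number of CSV texts that share the same header into one list of rows."""
    rows = []
    for text in csv_texts:
        rows.extend(csv.DictReader(io.StringIO(text)))
    return rows


def votes_by(rows, field):
    """Sum the votes column, grouped by the given field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[field]] += int(row["votes"])
    return dict(totals)


results = load_results(STATE_A_CSV, STATE_B_CSV)
state_totals = votes_by(results, "state")
candidate_totals = votes_by(results, "candidate")
print(state_totals)      # shared field names let one loader handle both files
print(candidate_totals)  # unstandardized names keep one candidate under two keys
```

Because the headers match, one loader handles both files. Because the candidate and party values do not, any national total by candidate still needs the cleaning step that turns ‘raw’ data into fully standardized data.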

As we build the data interface, we would love to know what you think. Is the terminology we are using clear to you? Is the interaction clear? Is there anything else you would like to see here?

Download Page

If you click on the ‘Detailed Data’ link on the data map page, you will get to this download page showing all the races for the state you have chosen. You can download results at a variety of reporting levels, depending on what is available for this state. We will include rows in this view for all processed data (clean and raw) as well as any races we haven’t processed yet, just so that you know they exist.

Above the download table there is a slider that gives you both an overview of all the races available for a state and a way to select a specific date range for which to browse detailed results. You can filter results by race type, such as President, Governor, State Legislative races, etc. If there are any other ways that you need to access the data, or if anything about this interface could be clearer, please let us know!


We will be building out a preliminary version of this interface in the next couple of weeks, and will revise it further based on what we hear from you.  

To tell us what you think, comment on interface elements here, or email us at openelections@gmail.com.

An Interview with IEEE Voting Systems Standards Committee’s John Wack and Sarah Whitt

OE: Can you describe your current work with elections standards, who the collaborators are, and how this fits into the larger context of what the Institute of Electrical and Electronics Engineers (IEEE) does?

JW: In the Voting Systems Standards Committee (VSSC), we are currently working to produce several standards and guidelines, including for election results reporting, for election management system export, for event log export, and hopefully soon for voter registration database export.  The collaborators include various election officials, voting system vendors, some people in industry, and others in academia. This fits quite naturally into IEEE’s framework.

SW: I joined the IEEE VSSC as an election official interested in the elections technology standards that IEEE and the National Institute of Standards and Technology (NIST) were working on. The VSSC is developing several standards related to elections: a standard for blank ballot distribution to military voters was published before I joined the team, and we are finishing up an Election Results Reporting standard, for which I am the working group chair. The VSSC includes a wide range of participants, including election officials like myself, voting system vendors, academics, folks from NIST and the Election Assistance Commission, election activists, media such as the AP, technologists, interested citizens, etc. This is the first IEEE project I have been involved with, but it seems like a natural fit given the other technology standards that IEEE issues.

I have also been an active participant in the Pew-sponsored Voting Information Project (VIP), which works with states to provide election data such as polling places and sample ballots in a common data format for consumers like Google and Microsoft to use in their search engines and other tools to assist voters.

OE: How does this work relate to recent efforts to improve voting systems nationally?

JW: The IEEE was engaged in producing voting system standards prior to the passage of the Help America Vote Act (HAVA) in 2002. The Election Assistance Commission (EAC) and NIST then began producing voluntary voting system guidelines. In recent years, though, the EAC has become somewhat inactive because of the absence of commissioners, and thus new voting system standards have not been approved. NIST then began working with the IEEE as a pathway to developing needed voting system standards that can be adopted voluntarily by states.

SW: I have not been involved in national work related to voting systems; however, Wisconsin and other states were involved in the voting system testing and approval process run by the National Association of State Election Directors prior to the Help America Vote Act of 2002.

OE: How did you both come to be doing this work?

JW: I had been managing some of the voting system standards development and wanted to work on common data format-related standards because I personally felt that it was important to build this sort of capability into voting systems and into voting system operations.  Transparency of data is very important for a number of reasons, including for testing, for security, and for public access of election data.  My role at NIST gave me the opportunity and freedom to work with IEEE and focus on this material.  It has been very gratifying in that I have become acquainted with a number of election officials and voting system vendors who are of the highest caliber and have contributed greatly to this overall project.

SW: I heard about the IEEE standards work through several elections IT colleagues I worked with on the VIP Project. I help manage Wisconsin’s Statewide Voter Registration System, so the mission of the VSSC to create interoperability between elections IT systems was very attractive to me. Election officials today use multiple IT systems for various purposes that don’t necessarily communicate with each other, so the ability to exchange data more easily between systems saves time and money, increases accuracy of the data in all systems, and allows synergy between systems for better analysis of data and ultimately better decision making.

OE: Can you describe your working group’s digital voting standards initiative, and how it will potentially impact voters, election organizers, and the people who design election systems?

JW: Having election data in a common format means that the format of the data is documented publicly and thus it is available, open to anyone.  The format is not proprietary, which frees small developers and election staff themselves to use commonly available tools such as web browsers to read the data.  When elections staff are designing new systems or attempting to cause systems to interoperate with each other, having the data in a publicly-documented format is a huge advantage.

SW: The VSSC is not working on digital voting standards per se, but we are working on standards for IT systems used in the Elections arena.  As I stated above, the first standard to come out of this group was for distributing blank ballots to military voters.  This standard will help states implement systems that deliver ballots to military and overseas voters electronically instead of on paper.

This reduces transit time and helps enfranchise America’s military and overseas civilians. The election results reporting standard we are finishing up will help the media and other groups that use election results.  Having results reported in a consistent way across states and jurisdictions will allow for easier aggregation of election results, which provides faster results reporting on election night as well as better analysis of election results after the election.

The standard also encourages reporting more data than is reported today.  Better analysis of election results data, and a more complete dataset, will help drive better policy.  There are many groups out there developing election administration tools that use common data formats (Open Source Election Technology, TurboVote, OpenElections, etc).  Today the only common data formats out there are really VIP and EML.  The VSSC is trying to create standards that fill the gaps where common data formats have not been available in the past.  These standards will also meet the needs of the diverse stakeholders involved in elections, which is truly the benefit we reap from having such diverse constituencies on the team.

OE: You have slightly different positions on the value and role of consistent and standardized  national voting administration systems vs a diverse and interoperable ecosystem of tools. Can you both talk about this, and why you hold the positions you do?

JW: I believe that it makes sense to have a national testing and certification program and a relatively high degree of uniformity among the states in the basic information technology.  This doesn’t mean that every state has to do it the same way, but it does mean that, regardless of equipment manufacturer and state, the data is in a publicly-documented format and can be tested as such across all states and territories.

The common data format standards can be likened somewhat to electrical code.  Having a uniform electrical code doesn’t necessarily mean that buildings need to look or feel the same – it just means that the electrical outlets and so forth are consistently uniform and licensed electricians can freely work on the electrical systems without having to understand proprietary information.  This is exactly the same with a common data format – voting systems can be different, they can be used differently across different states, they can be interconnected in different ways, they can have different user interfaces – but the underlying data formats are all publicly documented and uniform.  Anyone can write tools to operate on the data.

SW: America proudly carries on its tradition of a federated election system, which includes a national framework of laws to guide election administration across the country but also provides for states to set election policies that best meet the needs of their individual states.  A state’s ability to set its own laws for election administration is one of the great strengths of American democracy, but it does add complexity to the system.  The Help America Vote Act brought consistency and technology across states through mandatory statewide voter registration systems and accessible voting equipment.

In the post-HAVA world, state and local election offices have various technology systems to register voters, manage elections, facilitate voting, tabulate, report and certify election results, and track campaign finance information.  These systems come from various vendors or are home-grown.

Having some national baseline standards, and having common data formats for easy interchange allows for states and localities to purchase or build whatever systems best meet their needs, and allows those systems to interoperate.  If systems can interoperate regardless of who builds them, this allows for innovation in the marketplace, and can help reduce costs of individual systems.  But if standards are too burdensome or too literal, innovation is stifled and the vendor pool shrinks, offering states fewer choices at higher prices.  So there is really a sweet spot for standards where some integrity is assured, but election administrators have choices and systems are reasonably priced.

OE: What are the pros and cons of digital voting, and when is it more or less relevant and useful?

JW: I think in a perfect world, election day for both primaries and general elections would be national holidays – we would all vote on paper, election officials would have adequate time to make the instructions very clear, and election officials would have adequate time to carefully count all the paper, perform audits, and issue the results.  I could take this further, but of course we don’t live in that perfect world.

Digital voting makes lots of sense for many reasons, including that computerized voting interfaces can help voters to vote more accurately and prevent them from making common mistakes.  Computerized voting makes it much easier for election officials to administer elections – and to administer them accurately.  Yes, there are issues when voting electronically and there is no paper audit trail – but this has to be balanced against other factors, such as those I’ve mentioned.

SW: As a person working in an election administration office, I don’t really have a comment on digital voting. That is really an issue for legislatures to determine. If digital voting is made law at the state or federal level, we will administer the law to the best of our ability.

OE: What is your time-frame for finalizing a new national election results reporting standard, and how will it improve on the way things function now?

JW: The timeframe I am working towards includes getting the election results reporting standard out for public review by end of June, 2014, and having the final standard ready for IEEE approval in late fall/early winter. I expect that we will receive a good number of comments from various states – and this will be good – but at the same time it will require a fair amount of work to respond to the comments and we will no doubt need time to make some changes and improvements in the standard and the XML schema.

SW: We are hoping to finish up the election results reporting standard yet this year.  Once we are finished with the draft standard, it goes through the IEEE balloting process before it is officially released, which takes some time, so we are trying to get the draft ready for balloting as soon as possible so we can get it released in 2014.

This standard will improve things in several critical ways. 1. It provides a common format for reporting so that results can be more easily aggregated across jurisdictions. This allows for faster results reporting, it allows more groups to be able to report results and not just the media, and it allows for better analysis of data. 2. It provides additional data elements that are not always reported with election results, which results in better analysis of the data, and easier auditing of results. 3. It supports three use cases: pre-election (i.e. election set-up information), election night reporting, and post-election reporting (i.e. certified results or results for performing audits).

Supporting all three use cases allows for interoperability between election management systems, voting systems, results reporting systems and canvassing systems, which saves time and money in elections offices, as well as improving data accuracy.  So within an elections office, we can save time and money.  For people who consume election results, more groups will be able to consume results, they can get results faster, and do better analysis, which results in better policy. For voters, they find out winners sooner, and enjoy the benefits of better elections policy.

Standards may seem dry to some, but I think this is really exciting.  These are real, tangible benefits that will come out of this work.  I just feel grateful to be a part of it.

OE: What has the process of developing this standard been like? Who have been the stakeholders and has this been a new kind of collaboration in this space?

JW: Developing this particular standard was at first difficult.  We initially worked with a relatively large group of people, roughly 20 in size, and progress was very slow.  In particular, some people who had good intentions nonetheless impeded progress by focusing more on the process than on the need to get something done in a reasonable timeframe.  I convened a smaller group composed of election officials, vendors, and data modeling experts, and pushed for Sarah to be chair of the working group.

I feel very strongly that the work in IEEE should be managed by election officials, who above all understand elections and need the equipment to work well for them as well as for voters. At the same time, vendors, as well as organizations such as the Associated Press, have a broad understanding of how elections are run across the United States. Working with a smaller group made things work much more smoothly and has resulted in a standard that, I believe, is much more applicable across all states.

SW: When I joined the VSSC (at that time it was just Project 1622) I was invited by colleagues in other election offices because they noticed there weren’t a lot of folks who actually work in elections administration on the team.  I since invited other election officials to join as well to help balance the group out.  The team we have right now working on Election Results reporting is really kind of a dream team — we have both major voting system vendors (Dominion and ES&S), the Associated Press, folks from state elections offices (WI, OH, WV), industry experts like Kim Brace and the folks at NIST, and interested parties in academics and the audit communities.  The kind of expertise that this broad stakeholder base has brought really improved the standard.  We applied a use-case approach to the standard so we could walk through real world scenarios for how this data is produced — what systems it comes from, what government level, at what time in the process, etc.  I think that’s how we were able to have it be so comprehensive.

We looked at the total election results reporting picture from the angles of the elections office producing the files, the vendors of the systems they will be using, and the consumers who will be using the data.  I think this represents a different type of collaboration than I have seen in the past.  We also used an inclusive approach to membership instead of exclusive — if you are interested in this standard or have opinions on how you think it should be done, join the team!

Anyone can sit at the table if they want to, and everyone at the table gets a voice.  We have a leadership structure through the working group chair and the standards editor to help filter through the comments, and we put a lot of things up for vote. So it’s a very democratic system.  This prevents the group from ignoring interested constituencies, and helps balance views from very different communities.

OE: What do you think of the recent presidential commission on elections administration? Will it affect your work in any way?

JW: I can’t comment much on the presidential commission. I do think that their report is imperfect – I wish they had gone into much greater detail and provided more specific recommendations in a number of areas. However, they had a lot of work to do and there were many stakeholders besides me. All in all, I have the belief that they worked hard and tried to do the right thing and mostly produced a good report that should be paid attention to – I was particularly gratified that they didn’t find much evidence of voter fraud to warrant voter ID laws that will result in needless litigation and state taxes spent on lawsuits. They did end up validating our work.

SW: I am very excited about the Presidential Commission’s report. The bipartisan nature of the commission and the report takes a lot of the controversy out of the recommendations and gives election officials solid choices for how to improve their processes. The commission report likely won’t impact our common data formats work very much, but as an IT person in an elections office, it was great for me to see the focus on improving elections technology in the report. Some recommendations may require legislative changes in some states, and not all recommendations are a good fit for all localities, but the scope of the recommendations is broad enough that I feel like there is something in there for everyone.

I personally think these kinds of bipartisan efforts that focus on research and provide options are a great way for the federal government to drive policy in a way that is not as heavy handed.  The report itself appeared to be very well researched, well written, and overall of excellent quality.  As a taxpayer, I appreciate the quality of work this team did.

OE: What do you think is the best way forward to continue to innovate in this space? What kinds of relationships and models? And do you see any particularly pressing needs currently?

JW: The election results reporting standard was produced by first creating a UML model.  The advantages to producing a model include that one can focus on the data definitions and relationships as opposed to the format, e.g., XML.  Now, I believe that the data model should be abstracted upwards so that a higher level model can be created of election data in general.  This would help to provide a foundation for producing common data formats for various applications.  While the applications could be quite different, the format could be consistent and, as a result, systems will still interoperate.

Some of the more important areas to work in are, I believe, tablets and pure electronic devices.  While I am personally not a fan of Internet voting for the general public, I do believe that Internet voting for overseas military and individuals with disabilities is acceptable and that common data formats for ballot data should be developed to make systems more transparent and auditable.

SW: I completely concur with John on the need for an overall model of election systems.  While states run elections differently, we all have common sets of data that flow between common IT systems.  This type of high level modeling is critical to move towards better interoperability between elections IT systems.  Having a common understanding of election systems and data makes it easier for vendors (and homegrown state systems) to build interoperability into the next generation of systems — whether voting systems, statewide voter registration systems, voter information portals, online ballot delivery systems, e-poll books, election results reporting systems, campaign finance systems, the list goes on and on.  Ultimately with a common data model, we can also move towards more common formats for reporting data to the public.

***

John P. Wack is a researcher at the National Institute of Standards and Technology in the area of elections standards. He chairs several standards groups within IEEE and is managing the standardization of a common data format for election systems, working in conjunction with election officials, manufacturers, and others in the community. He is also an assessor for the National Voluntary Laboratory Accreditation Program and visits voting system test laboratories regularly to check compliance with requirements and standards. With the EAC’s TGDC, he has managed the development of the 2007 VVSG Recommendations to the EAC and the 2005 VVSG. Prior to working in elections, he authored and managed a variety of IT and network security guidance and assistance activities for NIST. His goals in the elections area are to make voting systems easier to manage by election officials, easier to use accurately by voters, and more transparent to test by election officials and testing labs.

Sarah Whitt is an IT professional with the Wisconsin Government Accountability Board, the state’s chief election agency. She joined the agency in 2003 to help establish Wisconsin’s first statewide voter registration system, and is currently overseeing the modernization of that system. She is chair of the IEEE Voting System Standards Committee’s Election Results Reporting working group, which is working on a common data format for publishing election results and election definition information. Through her experiences with elections and IT, she has learned that technology is of no use unless it is harnessed for good public policy. She serves as a bridge between IT staff and policy makers to help ensure the public’s work is done effectively and efficiently.

OpenElections was represented at this year’s Transparency Camp, a national conference for civic hackers who work to make political process and government data more, well, transparent. This is a growing and very dynamic un-conference, and the session topics ranged from ‘Why the internet hasn’t changed politics’ to ‘Interoperable Civic Data — for user-centric technology’. There were many journalists in attendance, as well as political scientists, policy makers, and technologists working within and in support of government. The atmosphere was palpably optimistic, as the general ethos of the crowd was that ‘we are all here to effect positive change’.

There were many international civic tech folks and journalists in attendance too, who were especially interested to observe how the US deals with the issue of advancing its government transparency, since the impact of this is felt all over the world. TCamp is becoming more international each year.

The conference was also very technical. OpenElections team members Derek Willis and Sara Schnadt spoke to a room full of hackers particularly attuned to the nuances of elections processes and aware of existing results infrastructures and their limitations. Derek walked through the process of acquiring a data source for a state and writing a scraper, and made the case for joining our effort. There were many thoughtful questions and a lively broader discussion about how best to create technologies to facilitate democratic process. The discussion continued and got even more down to the nitty gritty in a later session bringing together representatives from OpenElections, Voting Information Project, Google Civic Innovation, the Sunlight Foundation, and others, to tease out the problem of defining open data identifiers in an open and non-hierarchical ecosystem of technology projects.

That weekend, as you heard from us leading up to it, was also National Day of Civic Hacking, and TCamp was one of over 100 events taking place around the country. We camped out and hacked in the main room at the conference a good bit (as did our teammates in Chicago and the Bay Area), ramping up new developer volunteers who were joining in from TCamp and from events in other parts of the country. A big thank you to everyone who joined us over the weekend, and great to meet all of you who came on board at TCamp!

As part of National Day of Civic Hacking, we are organizing an OpenElections challenge for the hacking events at locations all over the country – Sat May 31 and Sun June 1st.

If you are attending one of these events near you, and would like to join in on our effort to write scrapers for elections results, let us know!

Write Scrapers for us…
Help us extend our core scraper architecture to create a series of custom scrapers that account for the idiosyncrasies in how each state structures data, stores it, and makes it available.

**Our docs for this process are now up on our site. Look here to see what would be involved with joining in**

Your time and expertise would be most appreciated either day. Also, feel free to join in from home.

If you would like to help out, either email sschnadt.projects@gmail.com or tweet at us @OpenElex, before the event or on the day. Our team will be online and available to get you set up.

Thank you!

The OpenElections Team

Interview with TurboVote Co-Founder Kathryn Peters

In this series of interviews, OpenElections has conversations with the leadership of other initiatives that are improving data transparency, easing the voting process and applying new technologies to elections.

For our first piece we talk to Kathryn Peters, co-founder of TurboVote, our sister Knight News Challenge: Data project. TurboVote is a service that aims to make it as easy to vote, and keep track of all the elections you can participate in, as it is to do all the other things we now do online.

***

OE: How did the TurboVote project and Democracy Works Inc. come about, and what were your motivations for starting them?

KP: Seth [Flaxman, TurboVote co-founder] spent a summer in college registering voters in Philadelphia with a sandwich board and a stack of paper forms, and recognized that there had to be a better way to reach would-be voters than standing on street corners. When he finished his first semester of grad school and realized he’d missed a local election back home, that same realization struck him again – voting should fit the way we live. We live online, on our phones, with services and applications that help organize our lives and simplify daily tasks.

Seth asked my advice in building an election-reminder service. My first response was incredulity. I’m from Columbia, MO, where the county clerk Wendy Noren builds her own voter engagement tools and has sent email reminders about upcoming elections for a decade already. I just assumed that these were normal voter services. Once Seth convinced me that Wendy’s online voter services were rare, it made perfect sense to try and make them available to every voter. So we started prototyping.

OE: What background(s) do you bring to this work?

KP: Seth and I met in a graduate policy program, so we’re both deeply committed to innovating with and for government–in this case, local election administrators–which sets us apart from most of the tech startups we know. Seth’s previous work had been as a researcher (at the Council on Foreign Relations), and he approached graduate school with a big research question: why does the Internet seem to be passing government by? I had worked in both political organizing and information management, but was studying international affairs and thinking about how we promote and support democratic processes abroad. Those two concerns came together in a really fantastic way, even if it means I’m in Brooklyn instead of, say, Cairo right now.

OE: Can you describe how TurboVote impacts an individual voter?

KP: It depends a lot on the voter, where they are and what they need. But let’s imagine a college freshman, who arrives on campus and is offered the opportunity to register to vote during orientation, and decides to register at her parents’ home in another state. As she signs up, we’ll also get her on-campus address, and ask if she’ll need to vote by mail in elections back home. So after she joins TurboVote, we’ll send her a voter registration form filled out with her information with an addressed, stamped envelope so she can return it to her election administrator. And then as an election comes up, we’ll send her an email reminder and mail her an absentee ballot request form, again with a stamped envelope so all she has to do is sign it and send it in. And then we’ll send her reminders about the deadline to submit those forms so she gets everything in the mail on time. And election after election, she’ll hear from us and have whatever forms and information she needs to take part, even in local elections she might not hear about living on a college campus the next state over, for example.

We designed a simplified flow chart to illustrate the many ways we serve different voters.

[Figure: TurboVote process flow chart]

OE: TurboVote is one of three projects currently in your roster. How has your work expanded and further defined itself this year?

KP: TurboVote’s growth in 2012 demonstrated how much demand there is for voting information and services, but the only way to do this sustainably is if government eventually adopts it and takes on these new tools for voter outreach. To that end, we spent 2013 researching local election administrations across the country, spending six weeks shadowing offices across six states and learning about their work, their staff, the tech they’re using, their needs and motivations. We found dedicated innovators making incremental improvements at every election in pursuit of better elections for their voters. And we found dozens of ideas worth building or popularizing that could help them run elections better, more simply.

From that research, we started building Ballot Scout, which makes it easy to add Intelligent Mail barcodes to absentee ballot envelopes and trace them through the postal system. Right now, most election officials send out their absentee ballots, get some of them back, and have no way of knowing if the others went undelivered, or weren’t cast, or are delayed in a postal processing facility and will arrive three days after the election. Barcode tracking gives officials better insight into what happens to those ballots as they leave the election office, and the ability to intervene if anything goes wrong. We’re working with seven counties from Oregon to Florida to test Ballot Scout this fall (and we’re still looking for three more counties to join the beta).

And last summer, the Pew Charitable Trusts asked us if we’d consider taking on data and technology support for the Voting Information Project. It’s the biggest election dataset in the country, providing tens of millions of Americans with polling place information each cycle, and we were eager to help build out its permanent infrastructure for data collection and processing. It’s also connected us to state election officials and let us get to know their work and needs, as well as those of the counties we’d been working with previously.

OE: What is your business model, and how does it inform your effectiveness?

KP: We’re a 501(c)(3) nonprofit, currently funded through grants from the Knight Foundation, Democracy Fund, and Google, among many others. TurboVote operates on a partnership/fee model, where each of our partner organizations contributes a small amount toward our operating costs, and we’re developing a pricing model for Ballot Scout that will do the same for that service. As we continue to grow and add new partners, these revenues should bring us to fiscal sustainability by 2017, ensuring that we can continue our work without major donations.

OE: How does Democracy Works fit within the ecosystem of voting infrastructure projects going on now? Are there other best practices you are aware of?

KP: Great question. The ecosystem is somewhat ad hoc, but we’ve used research by Dana Chisnell and Whitney Quesenbery at Civic Design for information on what voters are looking for and how they interact with election data online, and we’re currently collaborating with ELECTricity on a project to offer free website templates to local election offices that takes the Civic Design best practices and implements them by default. We pool our election research with Long Distance Voter, whose forms we use in states that don’t otherwise provide a ballot request form, for example, and we compare deadlines, election administrator addresses, and other data where we can help check and support each other’s work.

We’re also participating in the third-annual National Voter Registration Day, which brings together civic organizations like the League of Women Voters, the Bus Federation, and Voto Latino to celebrate voting and engage new voters across the country.

I’m also keeping an eye on projects in both Los Angeles County, CA and Travis County, TX, where election administrators have recruited designers, computer scientists, academics and citizens to reimagine voting machines. Both are designing their projects to be open-source and available to other jurisdictions, and I think it’s a fantastic model for the kind of collaboration I’d like to see become even more popular in this space.

OE: What do you think of the recent Presidential Commission on Elections Administration and its findings? Will it affect how your work is rolled out?

KP: I’m a big fan of the report! The Presidential Commission on Election Administration issued a practical list of recommendations–and accompanying tools–that can help election officials run better elections. They think postal ballot-tracking is a great idea, too, so I may be a little bit biased.

OE: Ideally, what kinds of organizations and systems would come together to make a robust, transparent and cost-effective elections infrastructure?

KP: I think the collaborations in Travis and Los Angeles counties have the right mix – administrators, technologists, designers, and ordinary voters – and that it’s mostly a question of how we scale that and build communications among election innovators so good ideas can really take root and spread nationally.

Kathryn Peters is a co-founder of TurboVote. Her belief in better democracy has taken her from campaign organizing in rural Missouri to a Master’s in Public Policy at the Kennedy School of Government to political rights monitoring in Afghanistan. Katy has also worked for the information management team for the United Nations Department of Safety and Security and the National Democratic Institute’s Information and Communications Technology staff. In 2011, she was honored as one of Forbes magazine’s “30 Under 30” in the field of law and policy.

 

When we embarked on this quest to bring sanity to election data in the U.S., we knew we were in for a heavy lift.

A myriad of data formats awaited us, along with variations in data quality across states and within them over time.  In the past few months, the OpenElections team and volunteers have crafted a system to tame this wild landscape. This post takes a closer look at how we applied this system to Maryland, the first state that we took on to define the data workflow process end to end. Hopefully it helps shine some light on our process and generates ideas on how we can improve things.

The Data Source

Maryland offers relatively clean, precinct-level results on the web. In fact, it provides so many result CSVs (over 700!) that we abstracted the process for generating links to the files, rather than scraping them off the site.
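To give a feel for what generating links (rather than scraping them) can look like, here is a minimal sketch. The base URL, the file-naming pattern, and the short county list are all hypothetical placeholders chosen for illustration; they are not Maryland's actual site layout or the project's real code.

```python
from itertools import product

# Hypothetical values for illustration only; the real site layout differs.
BASE_URL = "https://elections.example.gov/results"
YEARS = [2010, 2012]
ELECTION_TYPES = ["primary", "general"]
COUNTIES = ["Allegany", "Anne Arundel", "Baltimore City"]


def build_result_urls(years, election_types, counties):
    """Return one CSV URL per (year, election type, county) combination."""
    urls = []
    for year, etype, county in product(years, election_types, counties):
        # Turn "Anne Arundel" into "anne_arundel" for use in a file name.
        slug = county.lower().replace(" ", "_")
        urls.append(f"{BASE_URL}/{year}/{etype}/{slug}_precincts.csv")
    return urls


urls = build_result_urls(YEARS, ELECTION_TYPES, COUNTIES)
```

When file names follow a predictable pattern, a generator like this is usually less brittle than parsing HTML, since a cosmetic change to the results page does not break it.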

Other states provide harder-to-manage formats such as database dumps and image PDFs that must be massaged into tabular data. We’ve devised a pre-processing workflow to handle these hard cases, and started to apply it in states such as Washington and West Virginia.

The common denominator across all states is the Datasource. It can be a significant effort to wire up code-wise, but once complete, it allows us to easily feed raw results into the data processing pipeline.  Our goal in coming months is to tackle this problem for as many states as possible, freeing contributors to work on more interesting problems such as data loading and standardization.

Raw Results

When the datasource was in place, we were ready to load Maryland’s data as RawResult records in Mongo, our backend datastore. The goal was to minimize the friction of initial data loading. While we retained all available data points, the focus in this critical first step was populating a common core of fields that are available across all states.

In Maryland, this meant writing a series of data loaders to handle variations in data formats across time. Once these raw result loaders were written, we turned our attention to cleanups that make the data more useful to end users.
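The idea of loaders that absorb format changes over time can be sketched roughly as follows. The column names, the two layouts, and the field names are assumptions made up for this example, and plain dictionaries stand in for the RawResult records stored in Mongo.

```python
import csv
import io

# Hypothetical column layouts for two election years; real files differ.
LAYOUTS = {
    2010: {"county": "County", "precinct": "Precinct",
           "candidate": "Candidate Name", "votes": "Votes"},
    2012: {"county": "County Name", "precinct": "Election Precinct",
           "candidate": "Candidate", "votes": "Total Votes"},
}


def load_raw_results(csv_text, year):
    """Map each CSV row onto a common core of fields, keeping the original row."""
    layout = LAYOUTS[year]
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {field: row[column].strip() for field, column in layout.items()}
        # Vote counts sometimes carry thousands separators, e.g. "1,204".
        record["votes"] = int(record["votes"].replace(",", ""))
        record["year"] = year
        record["raw"] = dict(row)  # retain all available data points
        results.append(record)
    return results


sample = 'County,Precinct,Candidate Name,Votes\nAllegany,01-001,Jane Doe,"1,204"\n'
records = load_raw_results(sample, 2010)
```

Keeping the untouched row alongside the common fields means later cleanup steps can always go back to what the state actually published.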

Transforms

Loading raw results into a common set of fields is a big win, but we’ve set our sights much higher. Election data becomes much more useful after standardizing candidate names, parties, offices, and other common data points.

The types of data transforms we implement will vary by state, and in many cases, one set of cleanups must precede others. Normalizing data into unique contests and candidates is a transform common to all states, usually one that should be performed early in the process.

Transforms let us correct, clean or disambiguate results data in a discrete, easy-to-document, and replicable way.  This helps keep the data loading code simple and clear, especially when dealing with varying data layouts or formats between elections.
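One way to picture small, ordered, self-documenting transforms is the sketch below. The registry, the decorator, the field names, and the party-code table are hypothetical and exist only for illustration; the name cleanup is deliberately naive.

```python
import re

TRANSFORMS = []  # hypothetical registry; transforms run in registration order


def transform(func):
    """Register a function as a transform."""
    TRANSFORMS.append(func)
    return func


@transform
def normalize_candidate_name(result):
    # Collapse repeated whitespace and convert "LAST, FIRST" to "First Last".
    name = re.sub(r"\s+", " ", result["candidate"]).strip()
    if "," in name:
        last, first = [part.strip() for part in name.split(",", 1)]
        name = f"{first} {last}"
    # Naive capitalization; names such as "McDonald" would need special handling.
    result["candidate"] = name.title()
    return result


PARTY_CODES = {"DEM": "Democratic", "REP": "Republican"}  # illustrative subset


@transform
def normalize_party(result):
    # Expand known abbreviations; leave unknown values as published.
    code = result["party"].strip().upper()
    result["party"] = PARTY_CODES.get(code, result["party"].strip())
    return result


def run_transforms(results):
    """Apply every registered transform, in order, to copies of the results."""
    cleaned = []
    for result in results:
        result = dict(result)
        for func in TRANSFORMS:
            result = func(result)
        cleaned.append(result)
    return cleaned


cleaned = run_transforms([{"candidate": "DOE,  JANE", "party": "dem"}])
```

Because each cleanup lives in its own function, the order in which cleanups run is explicit, and the loaders never need to know about them.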

In Maryland, we used the core framework to create unique Contest and Candidate records for precinct results. These transforms included:

This opened the door to generating totals at the contest-wide level for each candidate.

Validations

At this point, you might be getting nervous about all this processing.  How do we ensure accuracy with all this data wrangling? Enter data validations, which provide a way to link data integrity checks with a particular transformation, or more broadly check data loading and transformation.  In Maryland, for example, we implemented a validation and bound it to a transform that normalizes the format of precinct names.  In this case, the validation acts like a unit test for the transform.  We also cross-check the loaded and transformed result data in validations that aren’t bound to specific transforms to confirm that we’ve loaded the expected number of results for a particular election or ensure that the sum of a candidate’s sub-racewide vote totals matches up with published racewide totals.
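A cross-check of the kind described above, comparing summed precinct votes with published racewide totals, might look roughly like this. The function name, field names, and sample numbers are invented for the example.

```python
from collections import defaultdict


def validate_candidate_totals(precinct_results, published_totals):
    """Raise ValueError if summed precinct votes differ from published totals."""
    summed = defaultdict(int)
    for result in precinct_results:
        summed[result["candidate"]] += result["votes"]
    # Collect (computed, expected) pairs for every candidate that is off.
    mismatches = {
        candidate: (summed[candidate], expected)
        for candidate, expected in published_totals.items()
        if summed[candidate] != expected
    }
    if mismatches:
        raise ValueError(f"Vote totals do not match: {mismatches}")
    return True


# Made-up sample data: two precincts, two candidates.
precincts = [
    {"precinct": "01-001", "candidate": "Jane Doe", "votes": 10},
    {"precinct": "01-002", "candidate": "Jane Doe", "votes": 20},
    {"precinct": "01-001", "candidate": "John Roe", "votes": 12},
]
ok = validate_candidate_totals(precincts, {"Jane Doe": 30, "John Roe": 12})
```

A failure here does not always mean a loading bug. It can also point to a quirk in the source data itself, which is why the error reports both the computed and the expected numbers.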

Implementing and running validations has helped us uncover data quirks, such as precinct-level data reflecting only election day vote totals, while result data for other reporting levels includes absentee and other types of votes. Validations have also exposed discrepancies between vote counts published on the State Board of Elections website and ones provided in CSV format.  We’ve circled back to Maryland officials with our findings, prompting them to fix their data at the source.

Summary

Maryland has been a guinea pig of sorts for the OpenElections project (thank you Maryland!). It’s helped us flesh out a data processing framework and conventions that we hope to apply across the country. Of course, challenges remain: standardizing party names across states, mapping precincts to counties, and sundry other issues we didn’t cover here.

As we tackle more states, we hope to refine our framework and conventions to address the inevitable quirks in U.S. election data. In the meantime, we hope this provides a window into our process and gives you all some footing to make it easier to contribute.