By Siri Anderson

3 Guidelines for Publishing Your First Open Data Sets

Posted May 17th, 2013

You’ve decided now is the time for your agency to get started with an open data program. But how do you decide which data sets to publish first?  

There is no standard formula that applies to all organizations. Depending on your program goals and priorities, there can be any number of good places to start. However, in working with a variety of organizations, we have identified a few smart starting points that have worked well. We hope they will help you decide the best route to take as you kick off your open data initiative.

1. Publish data sets that help fulfill your organization’s mission and goals

Drawing clear connections between your data program and your organization’s mission will help win support, as well as give your program purpose, discipline, and measurability.

Start by looking over your organization’s strategic plan or the governor’s action plan. Then identify the data sets that would do the most to advance the goals it sets out. You may also want to ask your co-workers what data would help them do their jobs better or more easily.

As an example, the Oregon Marine Board used open data to help achieve its strategic goal of informing citizens about boating safety, education, and access. It used its open data platform to create interactive maps that show citizens location-specific waterway regulations, where to register a boat, where to purchase various permits, where to find boat ramps, and more. This ended up saving the Marine Board time, materials costs, and labor costs, all while using data to achieve a mission-based goal. Read more about the Oregon Marine Board’s open data experience.

2. Publish data that is already (or easily) available and organized

Open data may be new, but publishing data online is not. More than likely, your agency already has some data available, often as PDFs, Excel spreadsheets, offline database extracts, shapefiles, or KML files. Because these data sets are already organized, and perhaps already online, they are great places to start your open data project.

Starting here means that you’ll begin by republishing these sets in all the required formats, so that they are easily accessible and machine-readable. (Learn more about data formats and standards.)  
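To make "machine-readable" concrete, here is a minimal sketch in Python of one possible republishing step: turning a spreadsheet exported as CSV into JSON. The function name and the sample rows are invented for illustration, and your open data platform may well handle this conversion for you.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [dict(row) for row in reader]  # one object per spreadsheet row
    return json.dumps(rows, indent=2)


# Hypothetical sample data, made up for this example only.
sample = "ramp_name,county\nExample Ramp A,Example County\n"
print(csv_to_json(sample))
```

Every value comes out as text in this sketch; a real pipeline would probably also convert numbers and dates to proper types before publishing.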

Another way to approach this starting point, especially if you’re currently publishing very little data online, is to start with data that is easily made available. Usually, this will be financial transparency and personnel data about your organization. That’s because it is typically well-organized and can be easily extracted from your back-end financial systems as clean spreadsheets. Once it’s loaded on your platform, you can use it to build helpful visualizations and guided browsing filters to help citizens interact with and understand the data with little effort.

One good example of this is the Ethics section of Data.gov. There, citizens can search, download, and visualize extensive campaign funding and other data. Before publication on the open data portal, this data would have been available only in spreadsheets or by submitting a public information request. Now, the data can be easily filtered and viewed through a variety of customizable charts, tables, and sorting features, delivering on the promise of transparency and easy access for citizens.

3. Publish the information citizens and developers are asking for

This is one of the best places to start if your agency wants to increase citizen satisfaction and reduce repetitive handling of frequent information requests. There are many ways to find out what information citizens are searching for or requesting most, such as:

  • Analyze your website data, using programs such as Google Analytics
  • Review your Freedom of Information Act requests
  • Survey your internal public information officers, clerks, and disclosure teams
  • Directly request feedback from constituents and developers

Using feedback as your starting point is an organic process. After you gather suggestions, you will need to prioritize them, often based on your strategic goals or on how much work it will take to provide each requested data set on your portal.
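If it helps to make that prioritization explicit, one rough way to do it is with a simple score. The Python sketch below is only an illustration: the scoring formula, the field names, and the example figures are all assumptions rather than a recommended method, and you would tune them to your own goals.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    requests: int         # how many times the data set was asked for
    supports_goal: bool   # does it tie to a strategic goal?
    effort_days: int      # estimated work needed to publish it


def priority(c: Candidate) -> float:
    """Assumed scoring rule: demand, doubled for goal fit, divided by effort."""
    goal_bonus = 2.0 if c.supports_goal else 1.0
    return c.requests * goal_bonus / max(c.effort_days, 1)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest priority."""
    return sorted(candidates, key=priority, reverse=True)


# Hypothetical figures, made up for this example only.
suggestions = [
    Candidate("health inspections", requests=40, supports_goal=True, effort_days=10),
    Candidate("recycling schedules", requests=30, supports_goal=False, effort_days=2),
    Candidate("budget summary", requests=10, supports_goal=True, effort_days=1),
]
for c in rank(suggestions):
    print(c.name, round(priority(c), 1))
```

A score like this is a conversation starter, not a verdict. It simply puts demand, goal fit, and effort side by side so your team can discuss the trade-offs.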

What is exciting about starting here is that it kicks off your initiative with widespread engagement. It will also help you measure the interest level of your community on a variety of community concerns. These may surprise you! From recycling data to health inspection data, take a look at what other open data leaders have learned about which data sets the public finds most useful.