Last weekend, volunteers in more than 20 cities across the country contributed to the U.S. City Open Data Census to find out where their city stands on open data.
Developed collaboratively by Code for America (CfA), the Sunlight Foundation, and the Open Knowledge Foundation, the U.S. City Open Data Census is an ongoing effort to assess how cities are doing on releasing key datasets in open, machine-readable formats. The purpose of the Census is to evaluate not only what civic data is available, but also how accessible and usable it is, a foundational step for ensuring data can go “beyond transparency” to be truly actionable.
What We’re Tracking
“Open data is data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.”
- Code Enforcement Violations: Locations of reported housing code violations.
- Crime Reports: Reported crimes and their location.
- Expenditures: A complete list of city expenditures by transaction (including: tax breaks, loans, contracts, grants, and operational spending).
- GIS – Zoning: The mapped zone (GIS) shapefiles of designated permitted land use.
- Transit Routes: Public transit schedules, route maps and real-time location information.
For each participating city, contributors evaluate whether these datasets meet criteria such as: Does the data exist? Is it online? Is it accessible free of charge? Is it machine-readable and available in bulk? Is it up to date? Based on this information, the Census generates a score from 0 to 1700 to provide a snapshot of that city’s overall open data completeness.
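As a rough illustration of how a score like this could be assembled from yes/no answers, here is a minimal Python sketch. The criterion names and the equal weights in `CRITERIA_WEIGHTS` are hypothetical, as is the assumption that each dataset is worth up to 100 points; this post does not describe the Census's actual weighting.

```python
# Hypothetical weights per criterion (an assumption, not the Census's real
# weighting). They are chosen so that one dataset is worth at most 100 points.
CRITERIA_WEIGHTS = {
    "exists": 20,
    "online": 20,
    "free": 20,
    "machine_readable_bulk": 20,
    "up_to_date": 20,
}


def dataset_score(answers):
    """Sum the weights of every criterion answered True for one dataset."""
    return sum(
        weight
        for name, weight in CRITERIA_WEIGHTS.items()
        if answers.get(name, False)
    )


def city_score(datasets):
    """Add up the per-dataset scores for one city."""
    return sum(dataset_score(answers) for answers in datasets.values())


# Illustrative answers for a made-up city (not real Census data).
example_city = {
    "Crime Reports": {name: True for name in CRITERIA_WEIGHTS},
    "Expenditures": {"exists": True, "online": True, "free": True},
}

print(city_score(example_city))  # 160
```

Under these assumed weights, a dataset that meets every criterion earns 100 points, and one that exists online free of charge but is neither machine-readable in bulk nor up to date earns 60.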
The Results, So Far
While the Census is far from complete, information has been submitted for 21 cities so far and several interesting trends have emerged. So, what do we currently know about the state of open data in cities around the United States?
- Crime Reports (reported crimes and their location) is reported to be the most commonly available dataset, and it also has one of the highest average total openness scores. Seventeen cities currently report that their crime data is publicly and freely available online. Thirteen of those are reported to be in a machine-readable format.
- Annual Proposed Budget (city budget by unit of appropriation with programmatic descriptions) is reported to be the second most commonly available dataset, with sixteen cities reporting it to be available to the public online. However, only five of those cities reported that this data was machine-readable.
- Public Buildings (locations of city-owned buildings) is the least commonly available dataset, with only three cities reporting that it is available.
- Of the 186 datasets reported to be available, just 55 percent are available in bulk — meaning the entire dataset can be downloaded or accessed easily. Making more data available for bulk download will be key in ensuring that developers and researchers can easily use it to build new applications and generate deeper insights.
- Of the cities reporting so far, San Francisco, Sacramento, and Salt Lake City are in the lead with scores of 1500, 1165, and 1025, respectively. Still, there’s plenty of room for improvement: not a single city has yet achieved the highest possible overall score.
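The proportions behind these results can be worked out in a few lines of Python. The counts below are taken from the list above; the variable names and the `pct` helper are only illustrative, and because the 55 percent figure is itself rounded, the bulk-download count is an estimate.

```python
# Counts reported in the results above.
cities_reporting = 21
crime_online = 17
crime_machine_readable = 13
budget_online = 16
budget_machine_readable = 5
datasets_available = 186
bulk_share = 0.55          # "55 percent", already rounded in the source
top_score, max_score = 1500, 1700


def pct(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


crime_online_pct = pct(crime_online, cities_reporting)
crime_mr_pct = pct(crime_machine_readable, crime_online)
budget_mr_pct = pct(budget_machine_readable, budget_online)
bulk_estimate = round(bulk_share * datasets_available)
top_score_pct = pct(top_score, max_score)

print(crime_online_pct, crime_mr_pct, budget_mr_pct, bulk_estimate, top_score_pct)
```

Put another way, roughly four in five reporting cities publish crime data online, while fewer than a third of the cities with an online budget offer it in a machine-readable format.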
With this snapshot of open data completeness, community open data advocates and city officials can make more informed decisions about what data to release and where to focus energy on making improvements. By making it easy to see best practices from other cities around the country, we hope open data leaders will examine their own practices and work to fill any gaps — leading to more abundant and actionable data about these key civic functions.
How does your city measure up?
There’s still work to be done to complete this survey of the open data landscape. All the submissions are community contributed, which means that you can add information about your city’s open data by following the steps here.
Questions? Comments? Hit us up @codeforamerica.