Blogging for America

Learning About A Community: Data Analysis vs. Listening to People

As one of three Code for America Fellows working with the City of Santa Cruz, I was fortunate to spend five weeks in this beautiful city learning about the people, businesses, organizations, and institutions that make it such a great place to live and work. Of course, the beautiful weather and jaw-dropping scenery contribute as well — how’s that snow shoveling working out for you, Team Chicago? ;-)

As Fellows, our primary method of learning during our residency month is simply meeting people and listening to them. We’re listening to the participants in our focus groups share their stories, we’re listening to the conversations between local business owners and dedicated civil servants in the City’s Planning Department, and we’re listening to the people of Santa Cruz tell us why they love this city and how we might be able to help make it even better. We’re also listening carefully as folks we meet tell us the best locations for burritos, bicycling, and beach-going.

Our focus this year is to help the City of Santa Cruz improve the experience of local businesses as those businesses obtain permits and licenses. We’re helping Santa Cruz businesses “get down to business,” so to speak: every hour that an entrepreneur doesn’t spend waiting in line for a form at City Hall is an hour he or she can spend innovating, building his or her business, and hiring employees!

But as Fellows, we also want to make decisions informed by data and we like to geek out sometimes. So when I heard about the City of Santa Cruz Business License Database, a city-published data file containing a list of every individual and company conducting business in the city, I knew I had to spend some time taking a look and seeing what else I could learn about the community.

I used Google Refine, a free data analysis tool provided by Google, to take a closer look at the 5,399 business licenses held in the City as of February 2012 (the City updates this file each month). Google Refine offers several easy methods for sorting and filtering large sets of data, letting me analyze the business licenses by location, type, founding date, and many other aspects.

Using Google Refine to take a closer look at the data makes it easy to get quick insights and interesting perspectives. With just a few clicks, I was able to see a list of the city’s businesses sorted by number of employees.

Screenshot of Google Refine interface showing sort
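
For the geeks who would rather script this than click, here is a minimal Python sketch of the same kind of sort. It is not what Google Refine does under the hood, and the column names (business_name, employees) and the sample rows are made up for illustration; the headers in the real file may well differ.

```python
import csv
import io

# Invented sample rows. The real file's column names are an assumption here.
SAMPLE = """business_name,employees
Example Surf Shop,3
Example Manufacturer,450
Example Consultant,1
"""

def largest_employers(csv_text, top_n=10):
    """Return (name, employee count) pairs, largest employers first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Counts are read as text, so convert to int before sorting.
    rows.sort(key=lambda r: int(r["employees"]), reverse=True)
    return [(r["business_name"], int(r["employees"])) for r in rows[:top_n]]

print(largest_employers(SAMPLE, top_n=2))
```

With the real file you would read the text from disk instead of the SAMPLE string; everything else should carry over as long as the employee column holds whole numbers.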

Hmmm, looks like the big employers include Plantronics, Costco, the Boardwalk... aha! We’ve got a meeting scheduled later this month to interview the management of the Boardwalk, one of the city’s largest employers and biggest tourist attractions, about their thoughts on the permitting process. (We’ll also probably play Laser Tag and Skee Ball just to make sure we understand the Boardwalk as fully as possible.)

Really, though, there are only a couple of dozen businesses in the city that employ more than 100 people. When we re-sort the data, it’s easy to see that the vast majority of businesses in Santa Cruz employ just one or two people. These are the professionals, contractors, moonlighters, and people who have invested in a rental property or side business to supplement their income. These are the people who make up the small business community; these are the people who collectively spend thousands of hours every year filling out business permit applications and renewal forms; these are the people we’ve been seeking out so we can listen to their stories and learn from their experiences. These are the people we want to help when we begin building the city’s application with our partners in March.

Small businesses make up the majority of Santa Cruz businesses
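
If you want a number to go with that picture, one rough way is to drop each business into a size bucket and count the buckets. This is only a sketch: the bucket boundaries are my own choice (loosely following the "one or two people" and "more than 100" groups mentioned above), and the employee counts in the example are invented, not taken from the City's file.

```python
from collections import Counter

def size_bucket(employees):
    """Label a business by headcount. The cutoffs are illustrative choices."""
    if employees <= 2:
        return "2 or fewer"
    if employees <= 100:
        return "3 to 100"
    return "over 100"

def bucket_counts(employee_counts):
    """Count how many businesses fall into each size bucket."""
    return Counter(size_bucket(n) for n in employee_counts)

# Invented headcounts, just to show the shape of the result.
counts = bucket_counts([1, 1, 2, 5, 40, 450])
for label, total in counts.most_common():
    print(label, total)
```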

Let’s see how many new businesses have started in Santa Cruz so far this year (since our data was released the first week of February, we’ll basically be looking for January business license issuances). With Google Refine, we use the “Timeline Facet” and sort by the date the business was started. It’s easy to see the dozens of businesses that have launched this year: veterinarians, cycle shops, surf shops, jewelry stores, building contractors. Of course it’s great to see that activity happening, even if those entrepreneurs missed out on the chance to use the great permitting application we’ll be building this year. Hopefully they’ll enjoy renewing next year just as much.

Screenshot of Date Filter in Google Refine
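
The Timeline Facet is a point-and-click filter, but the idea behind it is simple enough to sketch in a few lines of Python. I'm assuming a start_date column in YYYY-MM-DD format, which is a guess on my part (the real file may store dates differently), and the sample rows are invented.

```python
from datetime import date

# Invented sample rows; the "start_date" name and ISO format are assumptions.
SAMPLE_ROWS = [
    {"business_name": "Example Cycle Shop", "start_date": "2012-01-09"},
    {"business_name": "Example Deli", "start_date": "1959-06-01"},
    {"business_name": "Example Jeweler", "start_date": "2012-01-23"},
]

def started_on_or_after(rows, cutoff):
    """Return the rows whose start date falls on or after the cutoff date."""
    return [r for r in rows if date.fromisoformat(r["start_date"]) >= cutoff]

new_this_year = started_on_or_after(SAMPLE_ROWS, date(2012, 1, 1))
for row in new_this_year:
    print(row["business_name"], row["start_date"])
```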

It’s hard to tell with a data set like this if a “Business Start Date” of 1900 actually means the business is 112 years old or if it’s just a data entry or program error, but the Timeline shows that there are quite a few businesses with legitimate histories going back into the mid-20th century. Hmmm, Zoccoli’s Deli downtown, with a business license history dating back to 1959. That sounds familiar: I just ran into Chief of Police Vogel there on Thursday, where I listened to his advice on the best sandwich to get (I went with the chicken parm). By the way, SCPD is one of the most technologically advanced police forces in the US, so of course they release their own datasets for public analysis, but that’s a topic for another blog post.
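
Back to those 1900 start dates for a second: one cautious way to handle them is to set them aside for a human to check rather than deleting them or trusting them. Here is a small sketch of that idea. Treating the year 1900 as a possible placeholder is my assumption, not something the City has confirmed, and the dates in the example are invented.

```python
from datetime import date

# Assumption: a start year of 1900 may be a default value rather than a real date.
PLACEHOLDER_YEAR = 1900

def split_suspect_dates(start_dates):
    """Separate dates in the placeholder year from the rest, keeping both lists."""
    suspect = [d for d in start_dates if d.year == PLACEHOLDER_YEAR]
    plausible = [d for d in start_dates if d.year != PLACEHOLDER_YEAR]
    return suspect, plausible

# Invented dates for illustration only.
suspect, plausible = split_suspect_dates(
    [date(1900, 1, 1), date(1959, 6, 1), date(2012, 1, 9)]
)
print(len(suspect), "to double-check;", len(plausible), "look plausible")
```

Nothing is thrown away here: a business that really did open in 1900 stays in the suspect list until somebody confirms it.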

I’ve just scratched the surface of Google Refine this week, but it’s obviously a great tool to help analyze and visualize large datasets. Of course, even the best data and the niftiest tools are no substitute for sitting down and actively listening to your future users and the community that you’re working to serve. For instance, even though you can tie Google Refine to a GIS and run a filter for alcohol licenses, there is no app that can beat just asking the local geeks where to find the closest bar to your house that has a great beer list and decent Wi-Fi (it was the Seabright Brewery in our case :-)). Thankfully the geek community in Santa Cruz has welcomed us with open arms and answered those and many other questions for us already!

Code for America Labs, Inc is a non-partisan, non-political 501(c)(3) organization. Content is licensed through Creative Commons.