From Civic Data to Civic Insight

Journalists can help the public make sense of the growing amount of government data now being made available.

Earlier this year the water began to recede on government data in the United States with President Obama’s announcement of an unprecedented push toward further transparency in the federal government. But with the rush of new data comes the challenge of making sense of it all, a practice that is admittedly still in its formative stages.

By June of 2009 the nation’s Chief Information Officer, Vivek Kundra, had overseen the launch of data.gov with the goal of increasing public access to machine-readable datasets produced by the federal government. Other government sites such as usaspending.gov and recovery.gov provide even more focused data on how the U.S. spends its taxpayers’ dollars.

A promise of increased participation

The promise of data.gov and of many of these other civic data collections is in allowing citizens to participate in the scrutiny of their government and of society at large, opening vast stores of data for examination by anyone with the interest and patience to do so. Open source civic data analysis, if you will.

The array of data available on data.gov is still sparse in some areas but has steadily grown to include everything from residential energy consumption to patent applications to national water quality data.

And the data transparency movement isn’t just federal anymore: state and local governments such as California and New York City are following suit with pledges to make more civic data available.

The benefits of the open data movement are also starting to be recognized throughout Europe. The U.K. has called on Sir Tim Berners-Lee, the inventor of the World Wide Web, to lead a similar government data transparency effort there, which should soon result in a data.gov analogue. And a similar movement is beginning to stir in Germany.

Beyond Data Scraping

In the U.S. various government data resources have been available online in some form or another for years now. Programmers could scrape these online data sources by writing custom parsers to scan webpages and create their own databases. And many journalist-programmers working in today’s newsrooms still do. But it’s messy, it doesn’t scale or extend well, it’s brittle, and ultimately the data that results may not interoperate well with other data.
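To make the brittleness concrete, here is a minimal sketch of the kind of custom parser a journalist-programmer might write. It uses only Python's standard library; the page fragment, the agency names, and the amounts are invented for illustration and do not come from any real government site.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # completed rows
        self._row = None        # cells of the row currently being read
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
            self._row = None


def scrape(html):
    """Return the table cells found in an HTML string as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical page fragment; real agency pages will look different.
PAGE = """
<table>
  <tr><td>Agency A</td><td>1200</td></tr>
  <tr><td>Agency B</td><td>850</td></tr>
</table>
"""

rows = scrape(PAGE)
# The scraper silently assumes column 0 is a name and column 1 is an amount.
spending = {name: int(amount) for name, amount in rows}
```

The last line is where the trouble lies: if the site adds a column or reorders the existing ones, the unpacking fails or, worse, quietly assigns the wrong values, and nothing in the page itself says what each column means.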

Having government buy-in to the publication of organized and structured data substantially lowers the barriers for developers and others to get involved with analyzing that data. It also means that structured formats, such as those that conform to Semantic Web standards, can interoperate more easily and be used to build ever more complex applications on top of the data.

Data.gov ≠ Insight.gov

So now that the U.S. government is publishing all kinds of data online, society will be better, right? Well, maybe. Let’s not forget that data has a long way to go before it becomes the information and knowledge that can ultimately feed back into policy.

Some non-governmental organizations are pushing data to become information by running contests with big prizes. For instance, the Apps for America 2 contest, coordinated by Sunlight Labs, awarded a total of $25,000 to the top application submissions that made data.gov data more transparent and accessible for citizens.

These efforts at coordinating developers and stimulating application development around government data are vital, no doubt. The resulting applications typically involve polished interfaces and visuals that make it much easier for people to search, browse, and mash up the data.

Take for example the Apps for America 2 winner, DataMasher, which lets users create national heat maps by crossing two datasets (either adding, subtracting, dividing, or multiplying values). These operations, however, can’t show correlation, and at best they can only show outliers. As one anonymous commenter put it:

I don’t get it. It shows violent crime times poverty. So these are either poor, or violent, or both? I don’t think multiplying the two factors is very enlightening.

The result is that many of the possible combinations of datasets produce downright pointless maps that add little, if any, information to the discourse about those datasets.
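The commenter's confusion can be illustrated with a small sketch. It is not DataMasher's actual code; it only mimics the "multiply" operation described above on invented numbers for four hypothetical regions, and compares the result with a Pearson correlation coefficient computed by hand.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def crossed(xs, ys):
    """Per-region product, the kind of value a 'multiply' heat map would shade."""
    return [x * y for x, y in zip(xs, ys)]


# Invented numbers for four hypothetical regions; these are not real statistics.
poverty = [10.0, 20.0, 30.0, 40.0]
crime_rising = [2.0, 4.0, 6.0, 8.0]    # crime rises with poverty
crime_falling = [8.0, 6.0, 4.0, 2.0]   # crime falls as poverty rises

map_rising = crossed(poverty, crime_rising)     # [20.0, 80.0, 180.0, 320.0]
map_falling = crossed(poverty, crime_falling)   # [80.0, 120.0, 120.0, 80.0]

r_rising = pearson(poverty, crime_rising)       # perfectly positive relationship
r_falling = pearson(poverty, crime_falling)     # perfectly negative relationship
```

A region with poverty 20 and crime 4 gets the same shade (80) as a region with poverty 10 and crime 8, so the map cannot say whether a place is poor, violent, or both. The correlation coefficients, by contrast, separate the two scenarios cleanly, coming out at 1 and -1 for these toy numbers.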

Data.gov, and indeed many of the applications built around it, fall short when it comes to helping people share and build on the insights of others in order to produce information. It’s not simply that we need interfaces to data; we also need ways to collaboratively make sense of that data.

The Minnesota Employment Explorer was an early foray into helping people collaboratively make sense of government data. It not only visualizes employment information but also allows people to ask questions and build on the insights of others looking at the visuals in order to make sense of the data. In the long run it’s these kinds of sensemaking tools that will really unlock the potential of the datasets published by the government.

What’s Next?

With its long tradition of making sense of the complex, the institution of journalism has a unique opportunity to play a leadership role here. Journalists can leverage their experience and expertise with storytelling to provide structured and comprehensive explorations of datasets as well as context for the interpretation of data via these applications. Moreover, journalists can focus the efforts and attention of interested citizens to channel the sensemaking process.

I’ll suggest four explicit ways forward here:

(1) that data-based applications be built with the aim of promoting information and insight rather than simply serving as database widgets,
(2) that journalists be leaders (while still collaborating with the public) in this sensemaking enterprise,
(3) that these applications incorporate the ability to aggregate insights around whatever visual interface is being presented, and
(4) that data.gov and other governmental data portals collect and show trackback links to all applications pulling from their various datasets.

And finally, once we all figure out how to make sense of all this great new data, there remains the question of whether government is even “listening” to these applications. Is the federal government prepared to adopt the insights from its constituents’ data analysis into policy?

Comments

  1. […] Comment! The growing amounts of public data being made available offer great opportunities both for increased user involvement and for new ways of doing journalism, writes Nicholas Diakopoulos in an article for the online magazine Vox Publica. […]

  2. […] tools to help journalists uncover new stories in vast corpora of data. With the recent push toward civic data transparency by the US Government, computational accountability tools will be essential to uncovering […]

  3. […] individuals, it’s hard to think that their approach is really sustainable. I’ve argued before that there’s little impetus for building on and connecting the dots when there’s lots […]

  4. […] At our university department, we plan to continue our project with a different approach – building applications or services, hopefully in cooperation with Norwegian media. This way, we want to demonstrate how government data can be re-used in ways that stimulate public debate. We also have an ambition to strengthen the development of computational journalism. If we succeed in this, we can give a small contribution to what must be a long term goal for the open data community – moving from raw data to real insight. […]

