From Civic Data to Civic Insight

Journalists can help the public make sense of the growing amount of government data now being made available.

Earlier this year the water began to recede on government data in the United States with President Obama’s announcement of an unprecedented push toward further transparency in the federal government. But with the rush of new data comes the challenge of making sense of it all, an effort that is admittedly still in its formative stages.

By June 2009 the nation’s Chief Information Officer, Vivek Kundra, had overseen the launch of data.gov with the goal of increasing public access to machine-readable datasets produced by the federal government. Other government sites such as usaspending.gov and recovery.gov have since been launched to provide even more focused data on how the U.S. spends its taxpayers’ dollars.

A Promise of Increased Participation

The promise of data.gov and of many of these other civic data collections lies in allowing citizens to participate in the scrutiny of their government and of society at large, opening vast stores of data for examination by anyone with the interest and patience to dig in. Open source civic data analysis, if you will.

The array of data available on data.gov is still sparse in some areas but has steadily grown to include everything from residential energy consumption to patent applications to national water quality data.

And the data transparency movement isn’t just federal anymore: state and local governments such as California and New York City are following suit with pledges to make more civic data available.

The benefits of the open data movement are also starting to be recognized throughout Europe. The U.K. has called on Sir Tim Berners-Lee, the inventor of the World Wide Web, to lead a similar government data transparency effort there, which should soon result in a data.gov analogue. And a similar movement is beginning to stir in Germany.

Beyond Data Scraping

In the U.S., various government data resources have been available online in some form or another for years. Programmers could scrape these sources by writing custom parsers to scan webpages and create their own databases, and many journalist-programmers working in today’s newsrooms still do. But scraping is messy and brittle, it doesn’t scale or extend well, and ultimately the resulting data may not interoperate well with other data.
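To make the brittleness concrete, here is a minimal sketch of the kind of custom parser described above, using only Python’s standard library. The HTML fragment, the agency names, the amounts, and the function name `scrape_spending` are all hypothetical, invented for illustration; real government pages are laid out differently.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real agency pages differ and change without notice.
SAMPLE_HTML = """
<table id="spending">
  <tr><th>Agency</th><th>Amount</th></tr>
  <tr><td>Agency A</td><td>$1,200</td></tr>
  <tr><td>Agency B</td><td>$950</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())


def scrape_spending(html):
    """Return {agency: amount} from a two-column table of <td> cells."""
    parser = TableScraper()
    parser.feed(html)
    # The cleaning rules below are tied to this exact layout and number format.
    return {
        name: int(amount.replace("$", "").replace(",", ""))
        for name, amount in parser.rows
    }


print(scrape_spending(SAMPLE_HTML))
```

If the page’s markup changes even slightly, for instance if the cells stop being `<td>` elements, this scraper silently returns an empty result rather than failing loudly, which is one reason scraped pipelines need constant maintenance.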

Having government buy-in to the publication of organized and structured data substantially lowers the barriers for developers and others to get involved in analyzing that data. It also means that data in structured formats, such as those that conform to Semantic Web standards, can interoperate more easily and be used to build ever more complex applications on top of the data.

Data.gov ≠ Insight.gov

So now that the U.S. government is publishing all kinds of data online, society will be better, right? Well, maybe. Let’s not forget that data has a long way to go before it becomes the information and knowledge that can ultimately feed back into policy.

Some non-governmental organizations are pushing data to become information by sponsoring contests with big prizes. For instance, the Apps for America 2 contest, coordinated by Sunlight Labs, awarded a total of $25,000 to the top application submissions that made data.gov data more transparent and accessible for citizens.

These efforts at coordinating developers and stimulating application development around government data are vital, no doubt. The applications that result typically involve polished interfaces and visuals that make it much easier for people to search, browse, and mash up the data.

Take for example the Apps for America 2 winner, DataMasher, which lets users create national heat maps by crossing two datasets (adding, subtracting, dividing, or multiplying their values). These operations, however, can’t show correlation; at best they can only show outliers. As one anonymous commenter put it:

I don’t get it. It shows violent crime times poverty. So these are either poor, or violent, or both? I don’t think multiplying the two factors is very enlightening.

The upshot is that many of the possible combinations of datasets lead to downright pointless maps that add little, if any, information to a discourse about those datasets.
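A small sketch can illustrate why a per-region product says little about how two datasets relate. The numbers below are made up for five hypothetical regions, and the functions `product_map` and `pearson` are illustrative names, not DataMasher’s actual implementation.

```python
from math import sqrt

# Made-up values for five hypothetical regions (not real statistics).
poverty_rate = {"A": 10.0, "B": 20.0, "C": 30.0, "D": 40.0, "E": 50.0}
crime_rate = {"A": 5.0, "B": 4.0, "C": 3.0, "D": 2.0, "E": 1.0}


def product_map(x, y):
    """Per-region product of two indicators, as a 'multiply' mashup would show."""
    return {region: x[region] * y[region] for region in x}


def pearson(x, y):
    """Pearson correlation coefficient between two indicators across regions."""
    regions = list(x)
    n = len(regions)
    mean_x = sum(x[r] for r in regions) / n
    mean_y = sum(y[r] for r in regions) / n
    cov = sum((x[r] - mean_x) * (y[r] - mean_y) for r in regions)
    var_x = sum((x[r] - mean_x) ** 2 for r in regions)
    var_y = sum((y[r] - mean_y) ** 2 for r in regions)
    return cov / sqrt(var_x * var_y)


print(product_map(poverty_rate, crime_rate))
print(pearson(poverty_rate, crime_rate))
```

In this toy data the two indicators move in exactly opposite directions, so the correlation coefficient is -1, yet the product map simply highlights the middle region C. The heat map gives no hint of the relationship a reader would actually want to know about.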

Data.gov, and indeed many of the applications built around it, fall short of the mark in terms of helping people share and build on the insights of others in order to produce information. It’s not simply that we need interfaces to data; we also need ways to collaboratively make sense of that data.

The Minnesota Employment Explorer was an early foray into helping people collaboratively make sense of government data. It not only visualizes employment information but also allows people to ask questions and build on the insights of others looking at the visuals in order to make sense of the data. In the long run it’s these kinds of sensemaking tools that will really unlock the potential of the datasets published by the government.

What’s Next?

With a long tradition of making sense of the complex, there’s a unique opportunity for the institution of journalism to play a leadership role here. Journalists can leverage their experience and expertise with storytelling to provide structured and comprehensive explorations of datasets as well as context for the interpretation of data via these applications. Moreover, journalists can focus the efforts and attention of interested citizens to channel the sensemaking process.

I’ll suggest four explicit ways forward here:

(1) that data-based applications be built with the aim of promoting information and insight rather than simply serving as database widgets,
(2) that journalists should be leaders (but still collaborators with the public) in this sensemaking enterprise,
(3) that these applications incorporate the ability to aggregate insights around whatever visual interface is being presented, and
(4) that data.gov or other governmental data portals should collect and show trackback links to all applications pulling from their various datasets.

And finally, after we all figure out how to make sense of all this great new data, there remains the question of whether government is even “listening” to these applications. Is the federal government prepared to accept or adopt the insight of its constituents’ data analysis into policy?

Comments

  1. […] Comment! The growing amounts of public data being made available offer great opportunities both for increased user involvement and for new ways of doing journalism, writes Nicholas Diakopoulos in an article for the online magazine Vox Publica. […]

  2. […] tools to help journalists uncover new stories in vast corpora of data. With the recent push toward civic data transparency by the US Government, computational accountability tools will be essential to uncovering […]

  3. […] individuals, it’s hard to think that their approach is really sustainable. I’ve argued before that there’s little impetus for building on and connecting the dots when there’s lots […]

  4. […] At our university department, we plan to continue our project with a different approach – building applications or services, hopefully in cooperation with Norwegian media. This way, we want to demonstrate how government data can be re-used in ways that stimulate public debate. We also have an ambition to strengthen the development of computational journalism. If we succeed in this, we can give a small contribution to what must be a long term goal for the open data community – moving from raw data to real insight. […]
