From Civic Data to Civic Insight

Journalists can help the public make sense of the growing amount of government data now being made available.

Earlier this year the floodgates began to open on government data in the United States, with President Obama’s announcement of an unprecedented push toward further transparency in the federal government. But with the rush of new data comes the challenge of making sense of it all, an effort that is admittedly still in its formative stages.

By June of 2009 the nation’s Chief Information Officer, Vivek Kundra, had overseen the launch of data.gov with the goal of increasing public access to machine-readable datasets produced by the federal government. Other government sites such as usaspending.gov and recovery.gov provide even more focused data on how the U.S. spends its taxpayers’ dollars.

A Promise of Increased Participation

The promise of data.gov and of many of these other civic data collections lies in allowing citizens to participate in the scrutiny of their government and of society at large, opening vast stores of data for examination by anyone with the interest and patience to do so. Open source civic data analysis, if you will.

The array of data available on data.gov is still sparse in some areas but has steadily grown to include, among other things, residential energy consumption, patent applications, and national water quality data.

And the data transparency movement isn’t just federal anymore: state and local governments such as California and New York City are following suit with pledges to make more civic data available.

The benefits of the open data movement are also starting to be recognized throughout Europe. The U.K. has called on Sir Tim Berners-Lee, the inventor of the World Wide Web, to lead a similar government data transparency effort there, which should soon result in a data.gov analogue. And a comparable movement is beginning to stir in Germany.

Beyond Data Scraping

In the U.S., various government data resources have been available online in some form or another for years. Programmers could scrape these sources by writing custom parsers to scan webpages and build their own databases, and many journalist-programmers working in today’s newsrooms still do. But scraping is messy and brittle, it doesn’t scale or extend well, and ultimately the data that results may not interoperate well with other data.
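To make the brittleness concrete, here is a minimal sketch of the kind of custom parser described above, written against Python’s standard library. The page fragment, the agency names, and the figures are all invented for illustration; a real scraper would fetch a live page whose markup it does not control.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the markup and figures are invented for illustration.
SAMPLE_PAGE = """
<table id="spending">
  <tr><th>Agency</th><th>Amount</th></tr>
  <tr><td>Agency A</td><td>$1,200</td></tr>
  <tr><td>Agency B</td><td>$950</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collects the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_spending(html):
    """Turn the table into {agency: amount}, relying on the page layout."""
    parser = TableScraper()
    parser.feed(html)
    header, *body = parser.rows
    # Assumes exactly two columns and "$1,200"-style formatting, forever.
    return {
        name: int(amount.replace("$", "").replace(",", ""))
        for name, amount in body
    }


spending = scrape_spending(SAMPLE_PAGE)
```

Every assumption baked into `scrape_spending` (two columns, a dollar sign, comma separators) is a point of failure: if the page gains a third column in a redesign, the unpacking raises an error and the scraper has to be rewritten.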

Having government buy-in to the publication of organized and structured data substantially lowers the barriers for developers and others to get involved in analyzing that data. It also means that data in structured formats, such as those that conform to Semantic Web standards, can interoperate more easily and be used to build ever more complex applications on top of it.
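As a rough illustration of why structure lowers the barrier, the sketch below joins two small datasets that share a common key. It assumes the data arrives as CSV with a shared `state` column; the column names and values are hypothetical and do not reflect the actual layout of any data.gov file.

```python
import csv
import io

# Hypothetical CSV extracts; the column names and values are invented for illustration.
ENERGY_CSV = """state,energy_use
MN,310
NY,250
"""
WATER_CSV = """state,water_quality_index
MN,82
NY,77
"""


def load_by_state(text):
    """Index the rows of a CSV document by its 'state' column."""
    return {row["state"]: row for row in csv.DictReader(io.StringIO(text))}


def join_on_state(left, right):
    """Merge rows from two datasets for every state present in both."""
    return {
        state: {**left[state], **right[state]}
        for state in left.keys() & right.keys()
    }


combined = join_on_state(load_by_state(ENERGY_CSV), load_by_state(WATER_CSV))
```

No parsing of page layout is involved: because both publishers agreed on a key, combining the datasets takes a few lines, and the same code keeps working when new rows or columns are added.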

Data.gov ≠ Insight.gov

So now that the U.S. government is publishing all kinds of data online, society will be better, right? Well, maybe. Let’s not forget that data has a long way to go before it becomes the information and knowledge that can ultimately feed back into policy.

Some non-governmental organizations are pushing data to become information by sponsoring contests with big prizes. For instance, the Apps for America 2 contest, coordinated by Sunlight Labs, awarded a total of $25,000 to the top application submissions that made data.gov data more transparent and accessible for citizens.

These efforts at coordinating developers and stimulating application development around government data are vital, no doubt. The applications that result typically involve polished interfaces and visuals that make it much easier for people to search, browse, and mash up the data.

Take for example the Apps for America 2 winner, DataMasher, which lets users create national heat maps by crossing two datasets (adding, subtracting, dividing, or multiplying their values). These operations, however, can’t show correlation; at best they can only show outliers. As one anonymous commenter put it:

“I don’t get it. It shows violent crime times poverty. So these are either poor, or violent, or both? I don’t think multiplying the two factors is very enlightening.”

The result is that many of the possible combinations of datasets produce downright pointless maps that add little if any information to the discourse about those datasets.
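The commenter’s complaint can be checked with a few lines of arithmetic. The sketch below uses hypothetical figures for four made-up states, chosen so that the two measures move in exactly opposite directions. It compares a mashup-style product with a Pearson correlation coefficient, which is my choice of summary statistic here and not something DataMasher offers.

```python
from math import sqrt

# Hypothetical per-state figures, invented purely for illustration.
violent_crime = {"A": 2.0, "B": 4.0, "C": 6.0, "D": 8.0}
poverty_rate = {"A": 8.0, "B": 6.0, "C": 4.0, "D": 2.0}


def mash_multiply(x, y):
    """The mashup-style operation: one product per state."""
    return {k: x[k] * y[k] for k in x}


def pearson(x, y):
    """Pearson correlation coefficient across the states in x."""
    keys = sorted(x)
    xs = [x[k] for k in keys]
    ys = [y[k] for k in keys]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)


products = mash_multiply(violent_crime, poverty_rate)
r = pearson(violent_crime, poverty_rate)
```

With these invented numbers the products come out as 16, 24, 24, and 16, a nearly flat map, while the correlation is -1: the relationship between the two datasets is as strong as it can be, and the multiplied map hides it completely.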

Data.gov, and indeed many of the applications built around it, fall short of the mark in helping people share and build on the insights of others, that is, in producing information. It’s not simply that we need interfaces to data; we also need ways to collaboratively make sense of that data.

The Minnesota Employment Explorer was an early foray into helping people collaboratively make sense of government data. It not only visualizes employment information but also allows people to ask questions and build on the insights of others looking at the visuals in order to make sense of the data. In the long run it’s these kinds of sensemaking tools that will really unlock the potential of the datasets published by the government.

What’s Next?

With its long tradition of making sense of the complex, the institution of journalism has a unique opportunity to play a leadership role here. Journalists can leverage their experience and expertise with storytelling to provide structured and comprehensive explorations of datasets, as well as context for interpreting the data through these applications. Moreover, journalists can focus the efforts and attention of interested citizens to channel the sensemaking process.

I’ll suggest four explicit ways forward here:

(1) that data-based applications be built to promote information and insight rather than simply serve as database widgets,
(2) that journalists be leaders (but still collaborators with the public) in this sensemaking enterprise,
(3) that these applications incorporate the ability to aggregate insights around whatever visual interface is being presented, and
(4) that data.gov and other governmental data portals collect and show trackback links to all applications pulling from their various datasets.

And finally, once we all figure out how to make sense of all this great new data, there remains the question of whether government is even “listening” to these applications. Is the federal government prepared to adopt the insights of its constituents’ data analysis into policy?
