Finding the keys to government data: seminar report

Many approaches, same interest: The Bergen seminar on open government data brought together journalists, academics, civil servants and business innovators.

The semi­nar with the opti­mi­s­tic head­li­ne «Give us our data» was orga­nised by the Info­me­dia depart­ment at the Uni­ver­sity of Ber­gen. The depart­ment has ini­tia­ted and fun­ded a fact-fin­ding pro­ject on Nor­we­gi­an govern­ment data this autumn, hoping that the pro­ject report and the semi­nar can help move the topic hig­her up on the poli­ti­cal and busi­ness agen­das.

A whole cata­lo­gue of inter­e­s­ting facts and opi­nions about govern­ment data — that’s what I’m taking away from the semi­nar (of cour­se, as I hel­ped orga­ni­se it this is a com­plete­ly sub­jec­ti­ve and biased view!).

Open data as a topic is unusual in that it brings together people with very different roles and backgrounds, from computer scientists via public sector specialists to journalists, business entrepreneurs and innovative civil servants. The presentations and debates at the seminar kept returning to the same questions, but from different angles: Why should more government data be made public? What obstacles are in the way and how can they be overcome? What can we do with the data?

These are my notes from the semi­nar pre­sen­ta­tions, sup­ple­men­ted with sli­des from speak­ers. See also other reports and remarks (in Nor­we­gi­an): Ben­te Kals­nes’ post on Origo­blog­gen has spar­ked a live­ly deba­te, and a blog com­ment from Anders Waage Nil­sen sum­ma­rizes the day very effi­ci­ent­ly.

Denmark: Demand-driven approach

Cathrine Lippert from Denmark’s National IT and Telecom Agency reported on the agency’s initiatives to improve access to government data. They include a project competition for innovative services (winners to be announced at a conference on February 4), an innovation programme directed towards the private sector, and a data source catalogue on a social platform. Also planned is an open data desk that can provide assistance, define guidelines and highlight good practices.

Down­load pre­sen­ta­tion (pdf).

The agency tries to advance its agenda by appealing to and bringing together interested groups in both the private and public sectors. Lippert said the agency believes it can accomplish more with this demand-driven approach, which mobilises the grassroots. A top-down approach is hard, as open data does not currently carry the same political weight in Denmark as it does in Britain and the US.

Britain: Data and innovative journalism at The Guardian

The Bri­tish new­spa­per is at the van­guard of using data in jour­na­lism. Simon Rogers, edi­tor of the Data­b­log, explai­ned how The Guar­di­an works toward the «mutua­li­sa­tion of data». Data is shared with users by pub­lish­ing the data mate­ri­al behind sto­ries on the Goog­le Docu­ment plat­form — simp­le and user-fri­end­ly. A Flickr group has been set up to col­lect users’ own visu­aliza­tions of data.

Increas­ing­ly, the role of jour­na­lists will be to guide the pub­lic through the vast forest of data; to be cura­tors of infor­ma­tion, Rogers said.

The Guardian’s «crowdsourced» review of the expenses files of members of parliament is already famous. More than 23,000 users took part in reviewing the files. The editors learned from the experiment that when you ask users for help, you need to define manageable tasks, and you should give the users something back for their efforts. When a new batch of data was released, the editors gave more specific tasks and the job was done in one and a half days.

Last week, The Guar­di­an laun­ched its own gatew­ay to pub­lic data por­tals. In the futu­re, they also want to give peop­le visu­aliza­tion tools, Rogers explai­ned.

Hidden data and how to find them

Web develo­per Harald Gro­ven at the Nor­we­gi­an Cent­re for ICT in Edu­ca­tion focu­sed his pre­sen­ta­tion on how vast amounts of highly inter­e­s­ting pub­lic sec­tor data are kept under lock and key. In the ana­lo­gue era pub­lish­ing medi­um or low level aggre­ga­tes of data was prac­ti­cal­ly impos­sib­le — the­re was­n’t enough paper. This is no lon­ger rele­vant, but the same prac­tices remain, Gro­ven said. Legal con­stra­ints are part of the rea­son why Sta­ti­s­tics Nor­way and other insti­tu­tions do not release more fine-grai­ned data.

A Nor­we­gi­an govern­ment data por­tal should con­cen­t­rate on making avai­lab­le ano­ny­mized low level aggre­gated sta­ti­s­tics, data sources that are large­ly unk­nown today, Gro­ven recom­men­ded. He illust­rated the pro­po­sition with examp­les from his own work devel­o­ping ser­vices aimed at giving young peop­le a bet­ter basis for making deci­sions about what to study. A type of data nee­ded for one of the ser­vices, sala­ry levels in dif­fe­rent occu­pa­tions, was dif­fi­cult to get access to at a suf­fi­ci­ent­ly detai­led level.
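To make the idea of anonymized low-level aggregates concrete, here is a minimal sketch of how salary figures could be published per occupation and region while small groups are withheld. It is an illustration only, not something presented at the seminar: the function name, the record layout and the minimum group size of five are all assumptions.

```python
from collections import defaultdict
from statistics import mean

# Assumed disclosure threshold; real rules depend on the publishing agency.
MIN_GROUP_SIZE = 5


def aggregate_salaries(records, min_group_size=MIN_GROUP_SIZE):
    """Average salary per (occupation, region), withholding small groups.

    `records` is an iterable of (occupation, region, salary) tuples.
    Groups with fewer than `min_group_size` people are reported as None,
    so that no individual salary can be read out of the published table.
    """
    groups = defaultdict(list)
    for occupation, region, salary in records:
        groups[(occupation, region)].append(salary)

    published = {}
    for key, salaries in groups.items():
        if len(salaries) >= min_group_size:
            published[key] = round(mean(salaries))
        else:
            published[key] = None  # suppressed: too few people in the group
    return published
```

The point of the threshold is that detail and privacy pull in opposite directions: the finer the breakdown, the more cells fall below the limit and have to be left blank.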

A news journalist’s perspective: TV 2

Jour­na­lists often expe­ri­en­ce that pub­lic sec­tor agen­cies want to con­trol the pre­sen­ta­tion of data, Gau­te Tjems­land of Nor­we­gi­an TV 2’s news web­si­te said in his pre­sen­ta­tion. When TV 2 wan­ted the data from natio­nal school tests, the mini­s­try respon­ded by sen­ding pdf docu­ments, before final­ly caving in and releas­ing the spre­ads­he­ets that they had had all along. The rea­son was expli­cit­ly that they did­n’t want the media to pro­du­ce school ran­kings — i.e. pre­sent the data in their own way.

For jour­na­lists, the ideal situa­tion is to get struc­tu­red data, as detai­led as pos­sib­le, and as fast as pos­sib­le, Tjems­land com­men­ted. He had encounte­red three main obsta­c­les. Pub­lic sec­tor agen­cies want to retain con­trol over infor­ma­tion; they are afraid of los­ing reve­nue; or in many cases they are not awa­re that their data can be valuab­le to others. The last obsta­cle is pro­bab­ly the most impor­tant, Tjems­land said.

The TV 2 edi­tor pro­po­sed bench­mar­king the open­ness of govern­ment insti­tu­tions. By defi­ning vari­ab­les to measure trans­pa­rency, more pres­sure can be applied to have govern­ment data released. The media need to do their part by deman­ding infor­ma­tion and should take a lead­ing role in the deba­te about open data.
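One way to picture such a benchmark is as a weighted checklist. The sketch below is purely hypothetical: the criteria, their weights and the function names are assumptions made for illustration, not variables proposed by Tjemsland or used in any existing index.

```python
# Hypothetical transparency criteria and weights (assumptions, not from the seminar).
CRITERIA = {
    "publishes_dataset_catalogue": 3,
    "machine_readable_formats": 3,
    "free_of_charge": 2,
    "clear_reuse_terms": 2,
}


def openness_score(agency):
    """Return a 0-100 score from a dict mapping criterion name to True/False."""
    total = sum(CRITERIA.values())
    earned = sum(weight for name, weight in CRITERIA.items() if agency.get(name))
    return round(100 * earned / total)


def rank_agencies(agencies):
    """Return agency names sorted from most to least open."""
    return sorted(agencies, key=lambda name: openness_score(agencies[name]), reverse=True)
```

Even a crude score like this makes agencies comparable, which is what would give the media and the public something to apply pressure with.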

The need for a

In my own pre­sen­ta­tion, I emp­ha­sized four main fin­dings from the pro­ject at the Info­me­dia depart­ment — based on a sur­vey among sta­te agen­cies, an eva­lua­tion of sta­te agency web­si­tes and inter­views with civil ser­vants at the local and regio­nal level.

First, we found that there is a scarcity of information about what data sources actually exist. Very few agencies provide substantial information about their own datasets. Second, a central datastore doesn’t exist; therefore we created a simple «store» of our own using a Google spreadsheet. With the help of a small community, around 130 data sources have been registered there so far. Third, our survey and interviews convinced us of the great potential that exists in making more data available. Among other results, six out of ten agencies said they plan to make more data available during the next year. Finally, I highlighted how knowledge of open data issues varies widely across sectors and agencies. This probably reflects the low profile that the topic still has politically and in the public sphere.

In our pro­ject report we make ten pro­po­sals for making more data avai­lab­le in Nor­way. In the pre­sen­ta­tion I emp­ha­sized four of them: Crea­te data­sto­res at the sta­te, regio­nal and local levels; define prin­cip­les and guid­e­li­n­es; give spec­i­al atten­tion to pri­vacy issues; and define and fund pilot pro­jects to kick-start the process.







