In April 2012, representatives from 55 governments and hundreds of delegates from civil society gathered in Brasília for the second annual meeting of the Open Government Partnership.
The Open Government Partnership is a new multilateral initiative that aims to secure concrete commitments from governments to promote transparency, empower citizens, fight corruption, and harness new technologies to strengthen governance. In the spirit of multi-stakeholder collaboration, OGP is overseen by a steering committee of governments and civil society organisations.
Each signatory country presents a National Action Plan, with a list of commitments that it is expected to fulfil in the following year. This is the list of the top ten commitments presented by countries in Brasília.
At the 2012 annual meeting, the UK took on the co-chairmanship of the OGP for the next year. ORG has joined with other civil society organisations to form a coalition that will engage with the government to ensure that it fulfils its obligations and delivers on its commitments.
Our initial civil society analysis of the UK National Action Plan can be found here. It was produced with contributions from Article 19, Campaign for Freedom of Information, Christian Aid, Global Witness, ONE, Open Rights Group, Publish What You Fund, Tiri and Transparency International UK.
These are the main issues we found:
The UK government must improve its engagement with civil society, including wider consultation and clear mechanisms for collaborative design and progress monitoring of the national action plan.
The current national commitments are too focused on open data, information technology and public services and should be expanded to cover a comprehensive model of open governance.
Internationally, the UK is a global leader on aid transparency. However, the UK must now address the transparency of natural resource revenues and international corporate transparency more broadly.
The paper can be directly downloaded here:
We have helped create a common web space for UK organisations at OpenGovernment.org.uk and we will be posting much of our work there as well as on our own site.
While the advances since 2006 are undeniable, the comment above shows there is a long way to go. The new Public Data Group that will amalgamate OS, Land Registry and some other data providers will perpetuate the monopoly model while giving away minor data concessions. The issue here is the basic core public data infrastructure (mapping, stats, etc.) required for every other service and open data project. This is the "too difficult" box that could hamper innovation beyond some college project apps.
This brings another critical issue with the current government's Open Data agenda. There is an unhealthy conflation of transparency, data on public services and personal data, all of which converge towards the "Open for Business" principle.
Transparency of government does not advance simply because you can now get data instead of paper printouts. In our area of digital issues and copyright, policy is still shaped by the same lobbyists, who hide behind commercial confidentiality in order to refuse Freedom of Information requests.
Public scrutiny of data on public services, such as hospitals and schools, is very welcome and can indeed save lives. But in the government's Open Public Services agenda this data would mostly enable an open market of qualified providers, without concrete commitments from public bodies and clear mechanisms for improving outcomes. Recent NHS debates show that this model is very controversial to say the least.
Personal data was not part of the original Free Our Data campaign but it is becoming a central pillar of the policy. This ranges from sharing medical records with pharmaceutical companies to opening up data on welfare and benefits. While some of these initiatives are not strictly open data, as they will have restrictions on access, they are being thrown in the same policy bag. Despite assurances that personal data will be anonymised, there is almost complete consensus in the tech community that this is not possible in an open environment.
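The difficulty of anonymising personal data can be illustrated with a toy re-identification, in the spirit of well-known linkage attacks: an "anonymised" release is matched back to named individuals through quasi-identifiers such as postcode, birth year and sex. All names, records and field names below are invented for illustration.

```python
# Toy sketch of a linkage attack: "anonymised" records are re-identified
# by joining on quasi-identifiers against a second, public dataset.
# All data here is invented for illustration.

anonymised_records = [
    {"postcode": "SW1A 1AA", "birth_year": 1970, "sex": "F", "condition": "diabetes"},
    {"postcode": "EC1V 2NX", "birth_year": 1985, "sex": "M", "condition": "asthma"},
]

# A hypothetical public register (e.g. an electoral roll) with names attached.
public_register = [
    {"name": "Jane Example", "postcode": "SW1A 1AA", "birth_year": 1970, "sex": "F"},
    {"name": "John Sample", "postcode": "EC1V 2NX", "birth_year": 1985, "sex": "M"},
]

def reidentify(anon, register):
    """Link anonymised records to names when the quasi-identifiers match uniquely."""
    hits = []
    for record in anon:
        matches = [p for p in register
                   if (p["postcode"], p["birth_year"], p["sex"]) ==
                      (record["postcode"], record["birth_year"], record["sex"])]
        if len(matches) == 1:  # a unique match re-identifies the person
            hits.append((matches[0]["name"], record["condition"]))
    return hits

print(reidentify(anonymised_records, public_register))
```

With only three quasi-identifiers, both records resolve to a single named person, which is why stripping names alone is widely considered insufficient once data is released openly.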
Beyond individual privacy, we could question the legitimacy of seeing Public Big Data - composed of millions of individual digital breadcrumbs - as simply an economic asset to be shared with the likes of Experian. Instead we should see it as a common treasure trove that should be democratically governed towards the public good.
The Open Rights Group will be discussing some of these issues at our forthcoming conference ORGCON 2012, come along!
Access to the Agreement between Google Books and the British Library
The Google Books project has been the subject of protracted legal battles, generating a huge debate as to whether it will help authors distribute their work or turn them into low paid employees of the corporation. Most of these debates have focused on books under intellectual property restrictions, with less debate covering the inclusion of out-of-copyright works.
The British Library recently announced to much fanfare a deal with Google to make available online a quarter of a million books no longer restricted by copyright, thus in the public domain.
The deal is presented as a win-win situation, where Google pays for the costs of scanning the books, which will be available on both Google's and the BL's websites. This sounds very philanthropic of Google; however, the catch is in the detail:
“Once digitised, these unique items will be available for full text search, download and reading through Google Books, as well as being searchable through the Library’s website and stored in perpetuity within the Library’s digital archive.”
In order to find out what this really means we asked the British Library for a copy of the agreement with Google, which was not uploaded to their transparency website with other similar contracts, as it didn't involve monetary exchange. This may be a loophole transparency activists want to look at. After some to-ing and fro-ing under the Freedom of Information Act we got a copy, which can be downloaded here:
Notice: Google has kindly agreed to the publication of the agreement, while asserting their copyright over it and wishing to restrict further re-distribution.
The document seems to follow similar agreements with US libraries, but please let us know in the comments or by email what you think. Our preliminary views are below.
The agreement contains clauses that, in a nutshell, mean that only Google will be able to do whatever it wants with the scanned books, while the BL will face restrictions on what it can and cannot do with its digital copy of the scans. The BL will be able to display the books on its website, but must prevent commercial use (e.g. print on demand), redistribution of the copies and automated downloads. Google will primarily index the books, but will also be able to license or sell copies and make them available for printing.
This is understandable: despite its laudable "don't be evil" motto, Google is not a charity but a very successful business that is investing hard cash in scanning books in order to make a profit elsewhere. It must restrict competitors' access to the books. But, however natural it may be, is this a satisfactory state of affairs for the public interest and the protection of the public domain?
Free as in beer
Google has already digitised and made freely available over 15 million books in the public domain. This is a good thing in principle, but is it wise to base national policy for the digitisation of literary works on the good will of a corporation? There is a clause (4.3.1) in the Agreement that would lift restrictions on the Library if Google fails to provide free online access to the public domain works for a certain period of time.
While this provides some safeguards, public institutions should look for a mixed model that avoids relying excessively on a single partner. Other initiatives promoting open access, such as the Internet Archive, should be given consideration.
Copyright Year Zero
An issue with public-private partnerships for digitisation is the creation of new intellectual property. This is not generally a problem in the USA, but in the UK digital copies may attract a new copyright, although this is unclear. When combined with restrictions on access to the original works, this could create a de facto "copyright reset" on materials that have long since entered the public domain, placing restrictions on redistribution and reuse of the digitised books and making derivative works very difficult and expensive.
The agreement clearly claims all rights over Google's digital copy in clause 4.2. Fortunately, in Google's case this seems less of an issue in practice, as its business model is not based on selling access to the public domain. But there is an intractable conflict between open access and placing restrictions on public domain works (mass downloading, text mining, redistribution, etc.) via digitisation contracts instead of copyright, which remains an issue here and elsewhere.
Concerns have been raised with this concentration of digital works under one company, although in theory anyone else can step forward and scan those books again as the Agreement is not exclusive.
This sounds reasonable until you look at the wider picture of the digitisation of culture in Europe. Nick Poole, from the Collections Trust, estimates the cost of digitising the contents of Europe's museums, archives and libraries, including the audiovisual material they hold, at €100 billion over ten years, with another €10-25 billion over the following ten years to maintain it and make it available.
With such a need for investment, it would be reasonable to expect that works be digitised only once, under a common strategy for ensuring the eventual incorporation of all works into an unrestricted public domain digital network, run in the public interest in the same way as physical national libraries.
Length of restrictions
If Google and other companies are to invest in digitisation they will expect a profit, which will come from some restrictions. The Agreement establishes such restrictions for a period of 15 years.
A recent EU report on the digitisation of culture called The New Renaissance defines optimal arrangements for public private partnerships, and sets a maximum of seven years for preferential terms. This is perceived to strike a balance between the interests of businesses and public institutions.
Most digitisation agreements of this kind by The National Archives and the British Library are set to last ten years. We believe that transparent cost recovery should inform the length of restrictions, with a cap of seven years as recommended at European level.
Open Data and the Strategy for Growth
Google does not make any money selling public domain books, but it uses them for text mining, for its search engine and translation software, which is seen as the main business objective of the whole operation, including the digitisation of in-copyright books.
The Agreement contains provisions for non-commercial access to the material by non-profit institutions for academic and research purposes, although the latter will have to sign a separate contract with Google. There is also a welcome clause explicitly allowing for metadata to be included in the Europeana database (4.9).
The Hargreaves Review of copyright in the UK proposed a "wide non-commercial research exception covering text and data mining" because this area is perceived as critically important. Separately, the recent consultation paper on Open Data envisions that public data will be one of the engines of innovation to overcome the economic crisis, deserving a section in the forthcoming Strategy for Growth.
If we look at the Agreement from this perspective we see that allowing non-commercial research is laudable, but many opportunities for innovation will require commercial input and it will be up to Google to determine what counts as commercial in the research access contracts. Economic growth will be lost if start-up companies are denied the chance to innovate by incumbent businesses such as Google.
If the government wants to stimulate growth through open data it needs to put its money where its mouth is and provide adequate funding. Moreover, the digitisation of cultural and archival materials into datasets, media, texts and metadata should be a natural extension of the mission of public institutions. However, in our conversations with the British Library, the response is always the same: there is simply no money being provided for digital activities.
In addition, this process distorts the values of cultural institutions, which increasingly perceive digital activities as a source of revenue similar to the ubiquitous gift shop. Thus, libraries and museums attempt to claim and enforce copyright over digital copies of public domain works themselves.
The British Library already has a digital collection of public domain works which is not open and freely accessible, in part due to the perceived loss of potential revenue. We would like to see a commitment from the British Library to make public domain books fully available once they are free from contractual restrictions. However, we understand this entails some funding, which is not generally available.
A recent report on Funding of the arts and heritage from the House of Commons Culture, Media and Sport Committee contains one single passing reference to supporting the “challenging transition to the digital age”, and some praise for the efforts of the Arts Council Collection to digitize and put their works online. There is no vision for the internet age.
Without a national strategy for the digitisation of culture, supported by an adequate mix of government and private funding, public institutions will be at the mercy of a handful of businesses, which is not beneficial for the public interest. This should be seen as money well invested in the future.
The crime mapping initiative at Police.uk has generated an unprecedented debate on the merits of online crime mapping. Issues have been raised about the accuracy of the maps, the privacy implications and the labelling of communities as criminals.
However, there has been very little debate on the actual effectiveness of the website from the perspective of law enforcement, whether by helping people collaborate with the police or through any other mechanism that has been proven elsewhere. Initiatives of this kind in the UK tend to take their cue from the US, where such websites contain much more detailed information, including registered sex offenders with photographs.
It is easy to see how mapping and use of data is useful for the police, but what is less clear is the value of opening the data for public websites to help reduce crime or catch those responsible for the incidents reported on those websites.
This is definitely an issue worthy of debate, and it sounds like you're having a better discussion in the UK than we have had in the US. Perhaps some of our police agencies offer online crime mapping tools that are more "advanced" than the ones on Police.uk, but they still struggle with the same fundamental issues.
Ultimately, any information-sharing initiative like this should have, at its core, a set of outcomes that the initiative hopes to achieve. What do we want people to do with this information? Then, with these outcomes in mind, we must ask whether the initiative provides enough information, or the right type of information, to achieve them.
For instance, advocates of these measures usually say that they want the public to be able to see where the hot spots are and take steps to prevent their own victimization. So what information would a typical citizen need to prevent victimization?
Let's use burglary as an example. A citizen logs into a mapping web site and finds that her neighborhood is a hot spot for burglary. She wants to try to prevent this crime from happening at her house. What should she do? This is where online crime mapping technologies generally fall short.
To truly inform prevention, the data would have to tell her things like:
- how the burglars are typically entering the residences;
- at what times of day; and
- what they are stealing.
"Robbery" is another good example. With no distinction made even between commercial and individual robbery, let alone situational factors, types of weapons, victim profiles, and so on, it's hard to know whether a resident should or should not be concerned about a large dot at the end of his street.
Perhaps the goal is to inform citizen watch groups, so that they can look out for suspicious activity or potential offenders. But again the same issues apply. What do citizen watch groups need to do these things? Situational variables, offender descriptions, and so on. In technical terms, these systems rarely provide enough attribute data along with the visual representation.
"Violent crime" might be the most stark example from the Police.uk site. A citizen's reaction to this data would and should vary enormously depending on whether it represents domestic violence, gang violence, or random street violence, but the site does not provide any contextual information. Nor do many in the U.S.
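The gap between a bare map dot and usable attribute data can be sketched in a few lines. The records and field names below are hypothetical, invented purely to illustrate the kind of situational detail (entry method, time band, items taken) the text argues a resident would need:

```python
# Hypothetical sketch: a crime record as a bare dot vs. one carrying the
# situational attributes a resident would need for prevention.
# All field names and values here are invented for illustration.

sparse_record = {"category": "burglary", "location": (51.5074, -0.1278)}

rich_record = {
    "category": "burglary",
    "location": (51.5074, -0.1278),
    "entry_method": "rear window",            # how burglars are entering
    "time_band": "14:00-18:00",               # at what times of day
    "items_taken": ["laptops", "jewellery"],  # what they are stealing
}

def prevention_tips(record):
    """Turn attribute data into simple, actionable advice; a bare dot yields nothing."""
    tips = []
    if record.get("entry_method"):
        tips.append(f"Secure likely entry points: {record['entry_method']}")
    if record.get("time_band"):
        tips.append(f"Be extra vigilant during {record['time_band']}")
    return tips

print(prevention_tips(sparse_record))  # the bare dot produces no usable advice
print(prevention_tips(rich_record))
```

The sparse record, which is roughly what the current sites publish, yields no actionable advice at all; only the record with situational attributes supports the prevention outcomes these initiatives claim to pursue.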
In this sense, I believe there is some validity to concerns over cost vs. benefit. There are negative side-effects to this type of data sharing, including cost in public pounds or dollars, loss of privacy, increase in citizen fear, and trouble for property dealers. To me, these downsides are acceptable if they are balanced by positive outcomes. I just question whether those positive outcomes are achievable with the depth of data these sites typically provide.
Actually, we should question whether the outcomes are fully achievable no matter how much data is provided. By putting raw data on the web, police agencies essentially ask citizens to be their own crime analysts: to discern meaningful patterns and trends within the dots.
Naturally, the average citizen can probably identify the most obvious patterns and such, but police forces employ analysts full time to scan, analyze, and interpret this raw data. Yet, rarely do online crime mapping sites attempt to synthesize the information produced by their full-time analysts. A good online crime mapping initiative would not only provide the raw data with a full set of variables; it would also find a way to integrate public versions of the internal reports on patterns, trends, hot spots, and problems that the agency's analysts are no doubt producing every day.
The ultimate problem with these implementations in the US is that they tend to be technology-driven rather than outcome-driven. Police administrators are razzle-dazzled by a salesman who shows them pretty colours and cool little symbols. But rarely does the police agency bother to develop a real strategic plan for this data-sharing. A strategic plan would ask the right questions about outcomes and then find the specific technologies needed to achieve them.
I don't mean to sound bleak, though. Initiatives like this are generally a good start. With the infrastructure in place, the police can offer more data, better data, and additional analysis if they so choose. For instance, clicking on one of the dots brings up a little pop-up window that shows the counts of crime in each category. There is no reason (except by current policy) that this pop-up window couldn't offer more detail as to the characteristics of those crimes. There is no technological reason that the maps could not be accompanied by relevant links to alerts and bulletins that provide more detailed information about patterns and trends. People just have to demand these things.
Finally, to my knowledge there has been no scholarly evaluation of the effects of these web sites on either public perception of crime or crime prevention. This is something that both our governments should probably invest in.
Conferences, such as the US National Institute of Justice Crime Mapping conference (the next one is in Miami in April), and crime analysis conferences, should also consider this topic. These conferences are generally for academics and analysts, who use geographic information systems (GIS) to create more advanced thematic maps and run complex spatial statistics to identify hot spots, predict future crimes, inform resource deployment, and so on. You rarely see discussions of online citizen-based crime mapping at these conferences, simply because it's too basic; there's no real analysis involved. It is something we should probably discuss more because the implications are complicated even if the use of data is simple.
The recent announcement by the Cabinet Office of the creation of a new Public Data Corporation has quickly generated a large amount of turmoil among open data activists.
The vagueness of the press release has left many bewildered, wondering what the government's intentions are and what it means for existing open data initiatives. Tom Steinberg from MySociety sits on the government's Transparency Board, but seems as much in the dark as everyone else about the coalition's true intentions here. His view is that we may have to fight on this one, as different interests within government pull in their own directions with unpredictable results. You can follow the discussion here.
Other experienced activists, such as Chris Taggart, are worried about the involvement of the Department for Business, Innovation and Skills (BIS), and what "value for the taxpayer" means, a view reflected in several specialised media outlets.
The Register sounds the alarm at the stated "opportunities for private investment in the corporation", which does not sound reassuring from the perspective of open government and transparency.
Open Source advocate and journalist Glyn Moody goes further and calls for public consultations, a view we share.
Simon Rogers, from The Guardian newspaper which has long been at the forefront of open data access, has a good summary of the issues and discussions, and joins in the call for more information while promising to chase this story with a much needed sense of urgency: "Because it sounds like we don't have much time".
The plan does not shed much more light on the government's intentions:
Drive release of high value datasets
i. Work with BIS and HMT to create a Public Data Corporation
ii. Work with the Shareholder Executive to drive the release of core reference data for free re-use from the Public Data Corporation
So far we have not heard from the Treasury but we can -- maybe somewhat unfairly -- assume they are not in the least interested in Open Data and will press for revenue.
The Shareholder Executive is the “body within the British Government responsible for managing the government's financial interest in a range of public companies”.
The Shareholder Executive co-manages several of the trading funds responsible for high value data, such as Ordnance Survey, the Met Office and the UK Hydrographic Office. It is also directly responsible for overseeing Royal Mail, nuclear energy, and the much criticised Export Credits Guarantee Department.
Its stated remit is “to pursue the commercialisation, efficiency and examination of alternative ownership structures for core assets within the ShEx portfolio and advise on the possible sale of non-core assets”.
It is not clear how this sits with driving the release of data for free reuse, but they have some experience from OS OpenData™. If the PDC follows this model it could become a two-tier system, with some free data available while other valuable material is sold as premium services.
The involvement of private investors in basic data provision rather than refined services developed in partnership could be very problematic, as it could weaken public ownership. Indeed, partnerships at any level could become an issue from the point of view of competition law and Public Sector Information Reuse Regulations, as they could be perceived as exclusivity contracts.
On the positive side, this move may finally generate an asset register for public information. If the focus widens from large corporate business it could provide a collaboration environment for two-way improvement of data, and drive innovation by linking smaller start-ups with larger companies that can scale up the ideas.
In general it seems that this announcement has caught many off guard after so much positive news around open data in the past two years: the Transparency Board, data.gov.uk, Ordnance Survey OpenData™, the Open Government Licence, the Right to Data, etc.
This coincidentally happens in the same week as the French Agence du Patrimoine Immatériel de l'État (APIE) releases a study proposing a dual pricing model that includes charging for commercial use of data, raising concerns about the end of the honeymoon for Open Government Data.
We have written to Francis Maude asking for more information and a public consultation, given the interest in this issue and the spirit of transparency. We will keep you posted.