Proposed Definition for “Open Data”

Open Data is a philosophy and practice that makes data easily available in order to enable re-use of the data in new and unforeseen ways. Open Data relies on (1) a liberal licensing model that encourages re-use, (2) data discoverability and (3) data accessibility.

  1. Liberal licensing – enables third parties to re-use data with minimal or no legal or policy constraints. This may range from an open license under which copyright is maintained (e.g. the license encourages use, but copyright is retained by the Government of Canada) to the copyleft approach of the Creative Commons initiative.
  2. Data discoverability – Given that data files can be numerous and may not be easily opened and viewed, it is important that data files are catalogued. Hence Open Data relies extensively on some form of metadata to catalogue the data.
  3. Data accessibility – The value proposition of open data is that the more data is used, the more valuable it becomes. Data accessibility options range from simply putting unenhanced “raw” data on the web to offering it in a wide variety of formats to meet the requirements of diverse audiences. Data needs to be in a format that enables re-use by programmers who develop new applications. Typically this is a structured XML format or equivalent. While web pages that summarize data in a tabular format make data accessible to the human eye, they are less useful for machine-to-machine application development and are not considered to meet the criteria of “Open Data”.
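To make the discoverability and accessibility points above a little more concrete, here is a minimal sketch of a metadata catalogue. Everything in it is an assumption made for illustration: the DatasetRecord fields, the sample entries and the list of formats treated as machine-readable are hypothetical and do not come from any real catalogue standard.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    title: str
    keywords: tuple
    data_format: str  # e.g. "xml", "csv", "html"
    license: str


# Assumption: formats counted as machine-readable for this sketch.
# HTML summary tables are deliberately left out.
MACHINE_READABLE = {"xml", "csv", "json"}


def find_open_datasets(catalogue, keyword):
    """Return titles of records that match the keyword and are machine-readable."""
    keyword = keyword.lower()
    return [
        record.title
        for record in catalogue
        if keyword in (k.lower() for k in record.keywords)
        and record.data_format.lower() in MACHINE_READABLE
    ]


# Made-up catalogue entries.
catalogue = [
    DatasetRecord("River gauge readings", ("water", "hydrology"), "xml", "open"),
    DatasetRecord("Water quality summary page", ("water",), "html", "open"),
    DatasetRecord("Road network", ("transport",), "csv", "open"),
]

print(find_open_datasets(catalogue, "water"))
```

The HTML summary page matches the keyword but is filtered out, which mirrors the argument that tabular web pages serve the human eye rather than application developers.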

In terms of Web 2.0, Open Data is the “Intel inside” that drives mashups and applications. Source: Tim O’Reilly, 2007.

The Four Panton Principles for Open Data in Science are worth consideration.

What is Gov 2.0? What could be Gov 5.0?

Gov 2.0 = Web 2.2 + Service-Oriented Architecture + Standards + OpenData + OpenLicenses

I find there are many perspectives as to what Web 2.0 is, let alone what Gov 2.0 can be. In particular, Social Media and Web 2.0 seem to be used interchangeably. For the Gov 2.0 vision to be a success we need to look beyond Social Media to meet our collective potential. This blog post attempts to concisely outline some commonly accepted definitions, propose others and offer some personal opinions that I hope will spur debate. While they are not in the literature, I have arbitrarily added Web 2.1 and Web 2.2 to further classify some key concepts:

Web 1.0 – “Read web”, institutions publish information that people consume.

Web 2.0 – A read-write web, collaborative web or web as a platform. An institution can choose whatever balance of publishing to and consuming from Web 2.0 meets its business drivers, providing added value to the organization and its clients or stakeholders. The Web 2.0 “sound bite” has provided a lightning rod for read/write processes and benefits, while others argue that it is simply an evolution of Web 1.0. Web 2.0 criteria are diverse and include: harnessing collective intelligence, data is the Intel inside, rich user experience, many devices (e.g. mobile devices), leveraging the long tail, and innovation in assembly (web as a platform, API, mashups, etc.). Source: “Web 2.0 Principles and Best Practices”, John Musser, O’Reilly Press ISBN 0-596-52769-1.

Web 2.1 – Web 2.0 powered by Structured Data. Much information in Web 1.0 (websites) and Web 2.0 is unstructured (Facebook, Chat) or loosely structured based on Folksonomies (Blogs, Photos, LinkedIn, etc.). Content is not categorized into known fields, making computer-to-computer data or information exchange difficult or impossible. With content encoded into an XML document according to a known schema (e.g. title or date), content is separated from presentation. People are free to build their own applications, harvesting data and information from numerous sources. A common implementation is the RSS news feed (a structured XML file), which is then read by one’s preferred news reader or any other application that can read XML. More advanced integration can be done by connecting a variety of Application Programming Interfaces or Widgets programmatically or through web-based Rapid Application Development tools including Yahoo! Pipes.
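As a rough illustration of why a known schema matters, the sketch below reads the title and date fields out of an RSS 2.0 feed using Python’s standard library. The feed text is made up for this example, and read_items is a hypothetical helper name; a real reader would fetch the feed over the web and handle malformed input.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 feed, for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example agency news</title>
    <item>
      <title>New dataset released</title>
      <pubDate>Mon, 06 Jul 2009 12:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Service outage resolved</title>
      <pubDate>Tue, 07 Jul 2009 09:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(rss_text):
    """Extract the known fields (title, pubDate) from every item in the feed."""
    root = ET.fromstring(rss_text)
    return [
        {"title": item.findtext("title"), "date": item.findtext("pubDate")}
        for item in root.iter("item")
    ]


for entry in read_items(SAMPLE_RSS):
    print(entry["date"], "-", entry["title"])
```

Because the fields are named, the same few lines work against any feed that follows the schema, regardless of how the publisher chooses to present the content on its own site.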

Web 2.2 – Geographic Web 2.0. The issue with the World Wide Web is, well, that it is world wide. People intuitively think and act spatially, often locally. Web 2.2 has made more explicit use of geography through map-based mashups (Google Maps, Google Earth, Microsoft Virtual Earth, etc.), geographically encoded news feeds (GeoRSS), location-based searches, cell phone GPS, geocoded Flickr photos and Twitter tweets. Within the Web 2.x parlance there is literature on the GeoWeb stack, primarily by Andrew Turner. This is the tip of the iceberg. The geographic information community is well established outside the Web 2.0 world with huge volumes of data, associated web services and standards. “The future web 2.0 internet operating system…will also provide access to data subsystems. The GeoWeb is perhaps the best developed and one of those most worthy studying by anyone concerned with the future of the internet platform. The GeoWeb is multiplayer and multilayer, a rich melange of data and services, full of opportunity.” Source: Tim O’Reilly, O’Reilly Radar bulletin 2.0.10, Oct. 2008
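To show what a geographically encoded news feed makes possible, here is a minimal sketch that filters feed items by a bounding box using the GeoRSS-Simple point element (latitude, then longitude). The feed content, the coordinates and the items_in_box helper are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by GeoRSS-Simple elements such as <georss:point>.
GEORSS_NS = {"georss": "http://www.georss.org/georss"}

# Made-up feed, for illustration only.
SAMPLE_FEED = """<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
  <channel>
    <title>Example local notices</title>
    <item>
      <title>Road closure downtown</title>
      <georss:point>45.42 -75.69</georss:point>
    </item>
    <item>
      <title>Harbour festival</title>
      <georss:point>49.28 -123.12</georss:point>
    </item>
  </channel>
</rss>"""


def items_in_box(feed_text, south, west, north, east):
    """Return titles of items whose point falls inside the bounding box."""
    root = ET.fromstring(feed_text)
    hits = []
    for item in root.iter("item"):
        point = item.findtext("georss:point", namespaces=GEORSS_NS)
        if point is None:
            continue  # item carries no location
        lat, lon = (float(value) for value in point.split())
        if south <= lat <= north and west <= lon <= east:
            hits.append(item.findtext("title"))
    return hits


# Hypothetical bounding box roughly around Ottawa.
print(items_in_box(SAMPLE_FEED, 45.0, -76.5, 46.0, -75.0))
```

A location-based search is then just a matter of choosing the box, which is the kind of “think locally” filter that plain news feeds cannot offer.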

Web 3.0 – Semantic Web. In a semantic web, words have definitions (aka ontologies) that further classify the unworkable volume of information on the web. For example, if I want to repair the windows in my house, a semantic web will filter out the millions of search results relating to Microsoft Windows. Semantic classification is a method for a large information provider, e.g. a government, to classify data, information and knowledge, which will enable consumers to combine relevant, yet heterogeneous, sources into their stories.

Social Media is a component of Web 2.0 that provides methods for individuals and organizations to easily “Write” to the Web. Vehicles to share information and knowledge include Wikis, Blogs, Forums, Twitter, etc. Web 2.0 includes Social Media. However, Social Media does not encompass the diversity of Web 2.0 concepts, business processes and technologies. This is an important point, as the Social Media community, including professional Social Media Marketing firms and individuals, is made up of active writers, bloggers and Twitter users; the sheer volume of that commentary risks clouding the full potential of Web 2.0 and Gov 2.0 for decision makers.

We are seeing other powerful initiatives that share information and knowledge. In particular, I can’t help but think that professionally produced, well-researched initiatives such as TED (Technology, Entertainment, Design) are the start of a new, more powerful breed of science-based social advocacy within Social Media: “The application of the scientific method for social concern”.

Gov 2.0 – Government as a Platform. Gov 2.0 builds on the Web 2.0 “Internet as a Platform” principles noted above, with a couple of important additions. For governments, the added value of Gov 2.0 lies in leveraging a society’s collective intelligence to solve problems and to grow, by providing access to government data via mechanisms that enable data integration and exploration. To power citizen-defined applications, government data needs to be readily accessible, with open permissions, in usable formats. For example, in Canada, copyright in government data is held by the Queen in Right of Canada. Regardless of copyright, permissive licensing can be employed. Open Data is in keeping with an Organisation for Economic Co-operation and Development (OECD) resolution [C(2008)36]: “Maximising the availability of public sector information for use and re-use based upon presumption of openness as the default rule to facilitate access and re-use.”

In parallel with Web 2.0 APIs, large institutions typically use a Service-Oriented Architecture that publishes data to standardized services, which then power a wide range of applications. In other words, data and web services are application neutral. I suggest governments shift focus from “web-site application development” to publishing to web services, which will enable applications inside and outside government. In this way the data’s value can be increased by its re-use in new and unforeseen ways.
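A toy sketch of what “application neutral” means in practice: one service returns data with no presentation attached, and two unrelated applications consume the same payload. The data_service function stands in for a real web service, and the park records and both consumer functions are hypothetical, invented for this example.

```python
import json

# Hypothetical records standing in for a government dataset.
_RECORDS = [
    {"park": "Riverside", "city": "Ottawa", "hectares": 12},
    {"park": "Lakeview", "city": "Toronto", "hectares": 30},
    {"park": "Hilltop", "city": "Ottawa", "hectares": 8},
]


def data_service():
    """Stand-in for a web service: returns data only, no presentation."""
    return json.dumps(_RECORDS)


def hectares_by_city(payload):
    """Application 1: a summary report built from the service payload."""
    totals = {}
    for row in json.loads(payload):
        totals[row["city"]] = totals.get(row["city"], 0) + row["hectares"]
    return totals


def park_directory(payload):
    """Application 2: an alphabetical directory built from the same payload."""
    return sorted(row["park"] for row in json.loads(payload))


payload = data_service()
print(hectares_by_city(payload))
print(park_directory(payload))
```

Neither application needs to know about the other, and a third one could be written outside government without any change to the service.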


Gov 2.0 = Web 2.2 + Service-Oriented Architecture + Standards + OpenData + OpenLicenses

And to crystal ball….

Gov 3.0 = Web 3.0 + Gov 2.0 + CopyLeft

Gov 4.0 = Gov 3.0 + democratization of decisions + fragmentation of government services

Gov 5.0 = Migration from governments based on the artificial boundaries of nation states to city-states or areas of similar cultures/values. Transition from services provided by a specific government jurisdiction to “government service clouds”, a multitude of service providers at all scales, potentially with world wide reach.

Consider the United States health care debate. The debate is on the agenda because of the election of a new President. Yet why are personal health needs subject to the vagaries of a complex, polarized political dynamic? One could literally die before a solution is offered by that form of implementation. So, for every $100 of taxes I pay toward health, why can I not choose a different economic, ideological or linguistic model to govern where my portion of health tax goes? My preference could be a Canadian-based public/private hybrid economic health delivery model delivered in Spanish. I choose models, and money flows from the international “government service cloud”, which in turn funds the professionals and the bricks and mortar of my local hospital. I will explore this further in future articles.

Cameron Wilson, Ottawa, Ontario, Canada