Friday, 12 January 2007

The Tragedy of the Enclosed Lands

I want to talk a little bit about the use of mapping data, something which periodically piques my interest at work.

First though, I want to quote parts of an email I received a while back (in 2005), when enquiring how much it would cost to license some electronic maps for internal use at my organisation:
This email is to confirm that you have successfully saved estimate reference 6610 on the OS MasterMap Data Selector, as detailed below.

This estimate excludes VAT and is valid until the date shown below, after which it will expire; it will remain on the Data Selector for six months from the date saved.

Price: £16,966.02

Area selected: Pre-defined polygons - Single London Borough

Licensing information
Licence period (years): 2
Number of terminals: 1


So, just to clarify: a single-terminal licence to use the mapping data for one individual London Borough for two years would cost almost seventeen thousand pounds.

I mention this because as I say, I'm interested in GIS technologies. I was actually going to apply for a job at the Housing Corporation a year ago to work in their GIS team, but I decided against it. More generally though, it's obvious how cool applications of mapping data can be.

In the field I work in, housing (asset management), the use of maps would be particularly beneficial. On top of this, I also help manage some patch-related data for our housing management team. Being able to link all this into some automapping system would make many tasks I perform so much easier. In addition, visual tools are staggeringly useful when persuading people or explaining things to a new audience.

Being able to show a map which outlined that 80% of our Decent Homes failures are in one half of our stock would be a lot more powerful when persuading our board to release extra funds, for instance. The human mind responds strongly to visual stimuli, and maps in particular plug straight into a particular part of the brain.

Of course, to do all this, we need a certain amount of geographical data. There are the maps themselves and then there's geocoding our properties. Which leads to enquiries like the above. No doubt we could afford that, but in good conscience can we really spend £16k on what is effectively a glorified A to Z? £16k, after all, could supply brand new kitchens to four families, or ensure eight homes have brand new central heating systems which will cut fuel bills and keep people warm should the weather change. We have to consider the opportunity costs.

Don't get me wrong, the price listed above is probably way above what is required, and there are undoubtedly other options which would be more cost-effective. But cost is not the only problem. Last year I attended a demonstration by some companies looking to supply us with a GIS solution. I did not get to hear any costs at this point, but what maddened me somewhat was the level of restrictions the data suppliers wanted to put on any information they gave us.

These included:

- Insisting that if we put map data on our intranet we'd have to buy a licence for every potential user, i.e. every person who has access to our intranet. Considering this is over a thousand people now (and growing) this is fairly ridiculous.

- Advising us that we would only be able to print out maps (to include in publications to customers) if we got additional licences for this.

- If we decided not to renew our licence for the data, we'd have to destroy all maps produced/printed as well as the more obvious step of deleting all data we'd produced and uninstalling the software.

Now, it is probably the case that in any pre-sale negotiation we could get out of some of these clauses (and they might not even be enforceable legally speaking) - but the fact people selling these sorts of services believe they're reasonable in the first place speaks volumes.

Free Lunch?

It's well accepted that in most cases, there's no such thing as a free lunch. In this particular case, there is a certain cost incurred by those producing / collecting information, organising it and making sure it's accurate and so on. The price we are willing to pay will depend (to an extent) on the expected use we will make of such data weighed against an estimate of how much it would cost to produce the data ourselves (or from another source).

So, what is a fair price? To be honest, I've no idea. £16k could be a reasonable price, but you must remember that it is payable every two years. This type of "subscription model" (for want of a better term) makes sense with some types of data. Stock prices, for instance, change so often that one must keep up-to-date (sometimes up to the minute) with the latest change. So if I'm interested in that sort of information, I'll consider a subscription on an ongoing basis.

Not so with mapping data, or at least mapping data of this kind. Our properties (by and large) do not change locations, and where something major did occur (e.g. one of our blocks being demolished, or a new road cutting through one of our estates) we would certainly be aware of it - probably before any mapping agency. 80% of our stock has stayed the same for the last five years and probably will for the next five years too. So why would we want to be in a position of paying every two years for such information? Well, we wouldn't.
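To put a rough number on that, here is a small sketch of what the subscription adds up to over time. It assumes the 2005 quote of £16,966.02 per two-year licence never changes and ignores VAT; both are assumptions of mine, not anything the OS told me, and the function name is made up for illustration.

```python
from decimal import Decimal

# Figures taken from the 2005 estimate quoted above (price excludes VAT).
QUOTED_PRICE = Decimal("16966.02")
LICENCE_PERIOD_YEARS = 2


def subscription_cost(years, price=QUOTED_PRICE, period=LICENCE_PERIOD_YEARS):
    """Total paid over `years`, assuming a flat price (an assumption)."""
    # Ceiling division: a part-used period still needs a whole licence.
    renewals = -(-years // period)
    return price * renewals


for years in (2, 5, 10):
    print(f"{years} years: £{subscription_cost(years)}")
```

On those assumptions, ten years of access to data that barely changes comes to five licence periods, or £84,830.10, for a single terminal.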

Beyond this, if we did decide to make a considerable outlay, would being able to use the data on one machine be adequate? Well, no doubt we could engineer our processes in such a fashion that this worked, but by and large for the information to be useful we'd want everyone to have access to it all the time - and this would include publications sent out to customers. This is fairly self-evident.

DIY?

And so what will we do? As is often the case, my answer is DIY: we'll collect the data ourselves. As ridiculous as it sounds, because of the restrictions on the data we'll be much better off simply collecting the information ourselves and using any one of a number of open source applications to generate the maps. Or so is my intention.

GPS equipment is now within the reach of the average citizen and a procedure for geocoding our properties could easily be included within our stock condition procedure or even included in with caretaker duties or a void routine.
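As a minimal sketch of what such a procedure might record, assuming each visit yields a property reference plus a latitude and longitude in decimal degrees from the GPS unit: the `PropertyFix` record, the column names and the sample reference and coordinates below are all hypothetical, invented for illustration rather than taken from any system we actually run.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class PropertyFix:
    property_ref: str  # hypothetical internal property reference
    latitude: float    # decimal degrees, as read off the GPS unit
    longitude: float   # decimal degrees, as read off the GPS unit


def validate(fix):
    """Reject readings that cannot be real coordinates."""
    if not -90.0 <= fix.latitude <= 90.0:
        raise ValueError(f"latitude out of range for {fix.property_ref}")
    if not -180.0 <= fix.longitude <= 180.0:
        raise ValueError(f"longitude out of range for {fix.property_ref}")
    return fix


def write_fixes(fixes, handle):
    """Write validated fixes as plain CSV to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["property_ref", "latitude", "longitude"])
    for fix in fixes:
        validate(fix)
        writer.writerow(
            [fix.property_ref, f"{fix.latitude:.6f}", f"{fix.longitude:.6f}"]
        )


# Made-up example: one reading written to an in-memory file.
buffer = io.StringIO()
write_fixes([PropertyFix("BLOCK-A/12", 51.5074, -0.1278)], buffer)
print(buffer.getvalue())
```

A plain CSV like this is deliberately boring: it carries no licence terms and can be read by more or less any mapping tool.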

But... isn't this a bit ridiculous? Aren't we going against a sensible division of labour? Instead of information being collected by experts en masse, we're going to be taking a piecemeal amateur approach. Admittedly with modern GPS equipment it shouldn't be too taxing, but inevitably the quality of the end result will be significantly lower than a "professional" approach. But even given this, the DIY approach will still prove superior because we won't have to worry about any sort of legal nonsense when we have the dataset.

Let's imagine we're not the only organisation doing this for our area. A local business might feel the same and even the local authority might want to avoid paying the OS fees as well. So, in such a hypothetical situation there would be four parties (the OS, ourselves, the LA, a local business) collecting the same data. Totally unnecessary duplication of effort.

But in some instances, duplication is not really duplication at all. There are numerous companies making shoes, but each one does things slightly differently (either in terms of design of the shoe, or the production technique, or whatever). Such diversity is desirable because it increases the chance of innovation and means that "better" designs might predominate.

This can hardly be said to be the case with mapping data. If we're all using the same standards (which we are) and everyone does things perfectly, then theoretically we would all end up with identical end "products", i.e. exactly the same data. We are measuring objectively real conditions and as such there is no artistic or creative flair involved. Duplication here is not desirable; it is merely waste.

Which leads me to my point. In the United States, the government is restricted from holding certain kinds of copyright and so its mapping data is largely in the public domain (not counting classified military data). Not so in this country. Maps created by government employees are withheld from citizens unless they spend money. As seen in my quote above, in some instances these are not insubstantial sums of money.

In my second example, the cost was not an issue (since I never found out what it was) but the restrictions seemed utterly absurd to me from an operational standpoint. Even if each individual licence was priced at £50 per user (which, with over a thousand intranet users, already comes to more than £50,000), there would still be an additional administrative burden in checking we were compliant, plus the threat that if we ever cancelled our licence we'd have to undertake a significant audit of internal data to delete everything we once generated.

I believe these "quirks" are not the result of bad management decisions in the companies involved but rather are an inevitable outcome of the "private data" business model. The reason why companies have to be careful about licensing is that if they're not, some wiseguy will simply buy one copy of their software and then put it on the web for everyone to use for all time. To maintain their viability they have to undertake measures which are deliberately annoying to end-users, and which often include technical restrictions.

To use an example from a related field - take the Royal Mail postal database. You might think that postcode data would (as a list of facts) be public domain information. Infuriatingly, you wouldn't be entirely correct. And so, if you want to check your postcode data is accurate you have to hand over a thousand quid (for one user) to the Royal Mail. In some senses this is worse than the mapping data since the Royal Mail are a monopoly with regard to the issuing of postcodes (and you have to pay to get a new road postcoded too!).

In any sensible system, there would simply be a gigantic .CSV or XML or whatever file of all UK addresses hosted in a number of locations free for all to download so people could do interesting or innovative things with the data. Instead, we have slightly rubbish software which deliberately makes plain text exports difficult to do.
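To show how little it would take to use such a file, here is a rough sketch of checking postcodes against a plain CSV. No such free file exists as far as I know, so the layout (an `address` column and a `postcode` column), the function names and the sample rows are all assumptions made up for illustration.

```python
import csv
import io


def normalise(postcode):
    """Strip all whitespace and upper-case, so 'sw1a 1aa' matches 'SW1A1AA'."""
    return "".join(postcode.split()).upper()


def load_postcodes(handle, column="postcode"):
    """Return the set of normalised postcodes in an open CSV file."""
    reader = csv.DictReader(handle)
    return {normalise(row[column]) for row in reader}


def is_known(postcode, known):
    """True if the postcode appears in the loaded set."""
    return normalise(postcode) in known


# Made-up sample rows standing in for the hypothetical national file.
SAMPLE = """address,postcode
1 Example Street,SW1A 1AA
2 Example Street,EC1A 1BB
"""

known = load_postcodes(io.StringIO(SAMPLE))
print(is_known("sw1a1aa", known))
print(is_known("ZZ99 9ZZ", known))
```

With a real file you would pass an opened file in place of the `io.StringIO` wrapper; everything else would stay the same.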

Such things are prime examples of a sort of anti-"tragedy of the commons". If this data was owned collectively (that is to say, was not owned at all) and such basic factual documents were not seen as money-making opportunities we would have so many advantages. Instead, we have a situation where hundreds of hours are being wasted simply because of outdated business models sadly adopted by our government. On top of this, such restrictions are stifling innovation. Google Maps may be able to afford to license the OS data but the average bedroom developer cannot and so there is a less than optimal level of development in this area.

I actually believe that mapping data will be de facto public domain within the next decade. Until then though, we have alternatives. I intend to submit all of the data we collect to the Open Street Map project (http://wiki.openstreetmap.org/index.php/Main_Page), which is an excellent attempt to bypass some of the legal nonsense surrounding the copyrighted datasets. Collectively, we can tear down the enclosures. We can rebuild a commons which can help organisations of all sizes innovate with GIS technologies (surely something which can only increase with better mobile devices?)

I'll let you know how things go.