Data is the next Intel Inside

I admit to being somewhat surprised by the extent to which data has gained value, even surpassing that of software in some ways. Yet data is an increasingly precious commodity and there are many examples of Web 2.0 applications capitalising on this, one of which is MyDataNest (a data storage company offering users a space to store their data for free or, if they want advanced features included in the deal, for a fee). MyDataNest also offers editing tools which customers can use on their stored data, but it should be noted that these ‘applications’ are of lesser importance than the data storage/access, and are generally centred around that storage/access anyway.

This type of business can be classified as a Data Infrastructure Strategy as it provides “infrastructure for storing and accessing others data”. Rather than spending large amounts of money to create specialised, hard-to-obtain data of their own or finding marketable ways of accessing such valuable data, this company simply creates space for customers to stash their data for safe-keeping – a business model seizing the opportunities at the bottom layer of the internet stack.

It is difficult to tell what level of ‘lock-in’ exists for those who register with MyDataNest, since it does not seem to present its policy on users migrating their data to its competitors. This brings up a key issue of the data-centric industry – to what extent do companies prevent their users from shifting to competitors? Chris Messina appears to hold up Google as an example of a company that seeks to monopolise the market by drawing users into its own proprietary platform and then shutting out third-party services. Tim O’Reilly suggests that this is not, in fact, best practice, and that users should have control over their data and be able to move it wherever they want. A term for this is ‘open data’, and Tim Bray says that “any online service can call itself “Open” if it makes, and lives up to, this commitment: Any data that you give us, we’ll let you take away again, without withholding anything, or encoding it in a proprietary format, or claiming any intellectual-property rights whatsoever” (link). The data-centric industry is relatively new, so the terms to describe it aren’t fully set yet, but I think O’Reilly’s definition is quite perceptive and usable.

Another point to note about MyDataNest is that while it specifies what sort of data is collected from its users and how it is used, it does not say who owns that data. This touches upon another key issue relevant to data-centric businesses – the ownership of data. In examining this issue, it should be noted that there are at least two types of data in question – the data users upload themselves (be it files or personal registration data such as names and addresses) and the aggregate data that companies can glean from the data users provide and from their behaviour. Sometimes the question of ownership is made easy by a signup contract or similar agreement that explicitly states who owns what, but not all situations are governed by such contracts, and it’s then that things get complicated.

To be honest, I don’t know the answer to the question of ownership – I can see legitimate positions on both sides. One might argue that since the data is derived from users, users have a right to at least access it, if not have some say in what is done with it. On the other hand, the companies who retrieved, combined and analysed the data could claim that they are just like other researchers and that the results of their research are theirs. Tim Bray addresses this issue as well, saying “a service could also say: We acknowledge your interest in any value-added information we distill from what you give us, and will share it back with you to the extent we can do so while preserving the privacy of others.” (link) This seems to me to be an idealistic rather than legalistic response and, depending on your opinions on the entire open data concept, it may seem more or less desirable and appropriate.


