Civilisational Data Mining

It’s a new expression I haven’t heard before. ‘Civilisational data mining.’

Let me start by putting it in some context. Every character you or I have typed into the Google search engine or Facebook over the last decade means something to someone, or perhaps to ‘something,’ if it’s an algorithm.

In May 2014, journalists revealed that the United States National Security Agency, the NSA, was recording and archiving every single cell-phone conversation that took place in the Bahamas. In the process, the agency managed to transform a significant proportion of a society’s day-to-day interactions into unstructured data: valuable information which can, of course, be analysed, correlated and transformed for whatever purpose the intelligence agency deems fit.

And today, I read that a GOP-hired data company in the United States has ‘leaked’ the personal information, preferences and voting intentions of… wait for it… 198 million US citizens.

Within another decade or so, the cost of sequencing a human genome will have come down to about $1.00. By then, I would expect everyone to start being sequenced; I already have been. Just imagine, in a Gattaca-like world, the implications for us all as genetic science leaps forward even faster than Moore’s Law.

The ability and opportunity to store unimaginable volumes of data, and then to map, analyse and search it with complex algorithms at ever-reducing cost, mean it becomes increasingly attractive for governments, finance and business to do just that.

Companies like Cambridge Analytica, Facebook, OkCupid and Tinder illustrate only too well that once you can run a predictive data set on a million people or more, human behaviour can be anticipated, measured and influenced.

The problem we face is one of a modern ‘Faustian bargain.’ We rather like it when our smartphones and PCs recommend items, interests and even vacations that a collection of clever algorithms believes we will like to an accuracy of 98%. In fact, given that each one of us is a data point framed by a predictable and complex set of values, it’s far more likely that an algorithm is a better judge of what’s best for us than we are ourselves.

Fast forward ten or even twenty years, and not only has each adult left an exhaust trail of digital information and preferences, but it’s likely that both the government and the private sector have access to all or part of it. If you happen to be a fan of the TV series ‘Black Mirror,’ this may start to sound a little predictable.

A problem we face today is that government appears determined to control the internet, data and encryption in its struggle against terrorism. But it’s all our data too, and to quote a friend from the intelligence service: “It’s not until all this stuff starts to join up that you need to be worried.” And that’s kind of where we find ourselves now.

Meanwhile, too little attention is being given to the sheer volume of data gathering and harvesting that is taking place. Yes, the GDPR (General Data Protection Regulation) arrives in Europe in 340 days, but one might argue that it’s arriving rather too late: much of the data has already escaped into the wild, and rather more will follow, with or without regulation from Brussels, given the chronic insecurity which defines the internet in 2017.

I’m attempting to imagine a future where every keystroke, every search, every Tweet, every blog post and email and, of course, every indiscretion is wrapped around the brief existence called me.

Somewhere, machines are humming and algorithms are running on all that data being sucked into a growing black hole of information storage. One day, there’ll be an AI overseeing it all, making decisions, running pattern recognition across our lives that none of us will be smart enough to understand.
