There is a lot of talk right now about business models that give people what they want based on the data accumulated about them. For example, take this article about the rise of ‘faster than real-time’ predictive business models. But is there an obvious flaw? Many people dislike the idea of huge amounts of personal data being collected just so that a business can forecast their purchasing habits. Might some users wreck the system by deliberately pumping garbage into the analysis? Cory Doctorow, the author and internet activist, thinks that privacy concerns will provoke some customers to rebel; see here. One example is CyanogenMod, an aftermarket firmware distribution for Android phones that aims to enhance privacy: it has an experimental feature that supplies randomly generated data whenever an app asks for personal data (such as the current location). So here is a very relevant question for any business intending to make money from user data: how much of that data does it predict will be deliberate garbage?
Predicting Garbage In, Garbage Out
