Big Data and predictive analytics are debated and discussed almost endlessly on the interwebs. One of the threads that runs through these discussions is how much maths and statistics one needs to know (although sometimes the question seems to be more like ‘how little can I get away with?’) to practice data science, predictive analytics, etc.

Actual maths and stats people come down on the side of ‘a little knowledge is a dangerous thing’, and say people should try to know as much as possible. See here:

http://mathbabe.org/2013/04/04/k-nearest-neighbors-dangerously-simple/

But knowing enough statistics to be called a statistician could lead to being seen as out of touch with Big Data:

http://normaldeviate.wordpress.com/2013/04/13/data-science-the-end-of-statistics/

Particularly if contemporary, highly computer literate statisticians who are widely admired in their field admit in public they don’t know anything about Hadoop:

http://andrewgelman.com/2013/11/01/data-science/

Maybe this guy has the answer – ignore statistical theory and training, learn the least amount of programming needed to start hacking, and just teach yourself with whatever data comes to hand:

http://www.datasciencecentral.com/profiles/blogs/proposal-for-an-apprenticeship-in-data-science

Well, not exactly, but *statistics* is still kind of relegated to being something you ‘learn basics about’. Can posts 1, 2 and 3 possibly be talking about the same discipline?

From my point of view, as someone who still pinches themselves that they get to do predictive modelling as a for-real job, with only a Master’s degree in statistics, experience in business from before I did stats, and a really poor command of VB6 as my only qualifications (although I learned a lot of SQL *very* quickly when I started this job ’cos otherwise I had nothing to analyse), I can only say that with respect to maths and statistics I wish I knew more; with respect to machine learning, I wish I knew more; and with respect to hacking, I wish I knew more.

How much is enough? All of it isn’t enough.