Archive | February, 2014

Why is math research important?

14 Feb

I’ve been trying to post a comment on this article from MathBabe, with zero success. The comments seem to just disappear, so I am trying this as an alternative way to say my piece. This is what I wanted to say:

Why not start by establishing the value of research in general? Others have gone down this path, for example:


From there the argument is over how important maths is to the health of the whole research community. The second paper lists benefits of a healthy research community, including ‘increasing stock of useful knowledge’ and ‘forming networks’. Arguably a healthy maths research community is vital for these outcomes to occur across all research communities.

Another way of putting it is that the research community is a community of communities, and all the member communities suffer if one of their number is lessened in some way; maths is a place where many of the member communities meet, so if the maths research community is lessened, the effect will be especially great.


As I’ve already described, I’m worried about the oncoming MOOC revolution and its effect on math research. To say it plainly, I think there will be major cuts in professional math jobs starting very soon, and I’ve even started to discourage young people from their plans to become math professors.

I’d like to start up a conversation – with the public, but starting in the mathematical community – about mathematics research funding and why it’s important.

I’d like to argue for math research as a public good which deserves to be publicly funded. But although I’m sure that we need to make that case, the more I think about it the less sure I am how to make that case. I’d like your help.

So remember, we’re making the case that continuing math research is a good idea for our society, and we should put up some money towards it…


Data Mining/ Predictive Modeling Resources

6 Feb

A short list of some of the more interesting free DM/PM resources I have found on the net, posted at least in part so that I know where they are myself for future reference.

First, and perhaps most obviously, there are Trevor Hastie’s publications, where you can find both the comprehensive Elements of Statistical Learning and the newer Introduction to Statistical Learning available for download, along with descriptions of Hastie’s other books.

I’ve mentioned Cosma Shalizi before on this blog, because he seems to talk good sense on a number of issues. His forthcoming book, which began as class notes, is available as a downloadable pdf.

Meanwhile, at Columbia University, Ian Langmore and Daniel Krasner teach a Data Science course with a much greater programming bent, as a kind of antidote to too much maths and statistics training. The course site also includes the lecture notes.

Another book, covering material closer to the first few but including some additional topics, is by Zaki, and has a website here.

Some original papers are also available, e.g. Breiman’s Random Forests paper, which I have not yet read, but want to.

New Year’s Plans Continued – Maths

2 Feb

Yes, I know it’s nearly February; I just write slowly (or, more to the point, disjointedly).

In my earlier post I discussed my ambitions for learning some computer science, in order to be a more effective data scientist and statistician. In particular, my aim is to follow Cosma Shalizi’s advice that statisticians should at least be aware of how to program like a computer programmer.

Maths is also an important element in becoming a better data scientist/statistician. The maths that I think I am most lacking is probably algebra, both linear algebra and abstract algebra. From what I can see, most algorithms for data start in this area, also making use of probability theory. Whilst my knowledge of probability is also in need of renovation, my knowledge of algebra is much more dilapidated. Professor Shalizi has an area of his personal site devoted to maths he ought to learn; assuredly, if I had such a website, the corresponding area would be much larger.

Fortunately the internet is here to help.

With respect to linear algebra, we can start at Saylor’s open university:

Note that this features the winner of Saylor’s open textbook competition, so it seems safe to assume this is one of the best of Saylor’s offerings.

Saylor also have Abstract Algebra I and Abstract Algebra II courses. It is in the Abstract Algebra II course that I found the following great video, which discusses the links between group theory and data mining, especially with respect to classification problems. From this video I discovered the existence of Persi Diaconis and his area of research in probability on groups, which unfortunately I am nowhere near understanding due to deficiencies in almost all of the prerequisites, from both the group theory perspective and the probability perspective.

A final course I am trying to follow, although the timing is not quite right, is Coursera’s Functional Analysis course. I have enjoyed the videos so far, and seem to mostly understand them. This area is also important for understanding probability on groups, so hopefully I will be able to find the time to keep following along.