Archive | Statistics RSS feed for this section

Lattice R Users Group Talk

14 Aug

Last night I gave a talk on the Lattice package of R, held together by the idea that Lattice is an expression of Bill Cleveland’s overall philosophy of visualisation. I don’t know that I put my argument very clearly, but I think the fact of having an argument made the talk a tiny bit less incoherent!

After the talk there was some discussion of a few things – the use of colour schemes for one, but also two questions I didn’t have answers to, although I made a couple of totally wrong guesses!

Question 1: Can you include numeric scale on Lattice trivariate functions (and how)?

Answer – Yes, there is a scale argument, which must be included as a list.

Hence, given you have your surface dataframe pre-prepared:

wireframe(z~x*y, data=your.surface.data, scales=list(z=list(arrows=FALSE, distance =1)))

Question 2: Can you use the print() function to arrange graphics objects produced from multiple R graphics packages?

So far as I can tell, the answer is a qualified ‘yes’, where the qualification is that you need to be working with a graphics package which produces a storeable graphics object – lattice obviously does, and it looks like ggplot2 does also. Another package I selected at random, vcd, does not, however.

 

 

Lattice Trellis and Other Pretty Patterns

16 Mar

My local R users group is sufficiently in need of speakers that it was suggested that I give a talk, which I am quite willing to do. I have started to prepare a talk about the Lattice package, its ‘parent package’ Trellis from S-Plus, and some of the research of Trellis’s creator, Bill Cleveland, and his continuing influence.

Admittedly, I’m not an expert on visualisation, so I will cite Fell In Love With Data, to back up my assertion Cleveland is a big name in this area. FILWD listed their selection of the greatest papers of all time , with one of Cleveland’s early visualisation papers at the top.

That paper explains Cleveland’s feeling that there was a gap in the market for intellectual rigour (for want of a better phrase) in the application of statistical graphics. Cleveland’s contribution was to assess the cognitive difficulty of interpreting some of the key genres of data display, confirm the findings experimentally, and apply those findings to common trends in data display (pie graphs?)

Overall, Cleveland’s work history is too large to fit in one web page – he has two. One is his Bell Labs page, which remains preserved although he has since moved on. Cleveland’s new home is Purdue University, and his web page there has another selection of papers. Of course, of greatest interest to me in getting deeper in lattice was the trellis home page at Bell Labs, which includes the original Trellis manual, which can be used without changes with lattice in R to produce graphics.

A handy extra resource from Paul Murrell, the statistician who developed grid, the R package which underlies lattice is some sample chapters of his book ‘R Graphics’ that he has made available. In particular, he has made his chapter on lattice available.

Discovering Michael Friendly’s history of data visualisation has been a pleasantly surprising side benefit of looking into the lattice package, and related developments in data visualisation. Although there isn’t any detailed discussion of any of the specific topics I am researching, the timeline developed by Friendly showing peaks and troughs in data visualisation development (figure 1) gives some interesting context to Cleveland’s complaint of insufficient rigour in data visualisation. Ironically, Friendly posits the success of statistics via calculation as the leading cause of the lack of interest in visualisation throughout the first half of the twentieth century. Interestingly, Cleveland appears to be just after the rebirth of visualisation, (1950-1975) rather than part of the rebirth.