But today, I want to show how to perform a simple correlation between two columns of data. These two columns could be the height of giraffes and the length of their legs. Or maybe even the number of people at a beach and the temperature recorded on that same day.
The command I will be using is 'cor'. See here for details about this command and its output (sorry, but a lot of the technical stuff is technical to me too).
NB: your data needs to be arranged like this (one column per variable, one row per observation) for the correlation to work using this method:
   Darling Gwydir
1        5      1
2       24     59
3        0      0
4        0      0
5        6     52
6      336      8
7      314     29
8        0      0
9       36     50
10      85    200
11    5291    406
12       0      0
13      57    231
14       0      8
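For what it's worth, here is a rough sketch of what the CSV file behind that table might look like. The exact file layout is my assumption (a header row, then the row number in the first column), and 'csv_text' is just a name I made up so the example runs without a file on disk. 'read.csv' accepts a 'text' argument in place of a file path:

```r
# Assumed CSV layout: a header row, then one row per observation,
# with the row number in the first column (hence row.names = 1).
csv_text <- ",Darling,Gwydir
1,5,1
2,24,59
3,0,0
4,0,0
5,6,52
6,336,8
7,314,29
8,0,0
9,36,50
10,85,200
11,5291,406
12,0,0
13,57,231
14,0,8"

# Same call as for a real file, but reading from the string instead
data <- read.csv(text = csv_text, header = TRUE, row.names = 1)

# Quick check that both columns came in as numbers
str(data)
```

If 'str' reports a column as character or factor rather than int or num, 'cor' will complain, so it is worth a look before going further.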
Once your data has been imported to 'data',
> data <- read.csv('/path/to/file.csv', header=TRUE, row.names=1)
you can use 'cor' to do pairwise comparisons of all the columns. If you happen to have more than two columns, the method doesn't change; you will just get a larger matrix than the one below.
> cor(data)
           Darling    Gwydir
Darling 1.00000000 0.7878988
Gwydir  0.78789880 1.0000000
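If you don't have a CSV handy, the following is a minimal sketch that types the same two columns straight into a data frame and then pulls a single correlation out of the matrix. The variable names 'r_matrix' and 'r_pair' are only mine, for illustration:

```r
# The two columns from the table above, entered by hand
data <- data.frame(
  Darling = c(5, 24, 0, 0, 6, 336, 314, 0, 36, 85, 5291, 0, 57, 0),
  Gwydir  = c(1, 59, 0, 0, 52, 8, 29, 0, 50, 200, 406, 0, 231, 8)
)

# Full pairwise correlation matrix (Pearson is the default method)
r_matrix <- cor(data)
print(r_matrix)

# Just the one number: either index the matrix by name...
print(r_matrix["Darling", "Gwydir"])

# ...or pass the two columns to cor directly
r_pair <- cor(data$Darling, data$Gwydir)
print(r_pair)
```

Both routes should give the same value; indexing the matrix is handier once you have more than two columns.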
Another useful command is 'symnum'. Its output is a table of symbols indicating the level of correlation. Neat! (The output below comes from a larger data set with six columns rather than the two shown above.)
> symnum(cor(data))
D G N Mc L Mr
Darling 1
Gwydir , 1
Namoi B + 1
Macquarie . 1
Lachlan B + B 1
Murrumbidgee . B 1
attr(,"legend")
[1] 0 ‘ ’ 0.3 ‘.’ 0.6 ‘,’ 0.8 ‘+’ 0.9 ‘*’ 0.95 ‘B’ 1
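To see how the symbols map onto the numbers, here is a small sketch that runs 'symnum' on just the two-column correlation matrix from earlier. I have typed the matrix in by hand (rounded to the value 'cor' printed above), so treat 'r_matrix' and 's' as illustrative names rather than anything official:

```r
# Correlation matrix for Darling and Gwydir, entered by hand
r_matrix <- matrix(
  c(1, 0.7878988,
    0.7878988, 1),
  nrow = 2,
  dimnames = list(c("Darling", "Gwydir"), c("Darling", "Gwydir"))
)

# Symbolic version: each correlation is replaced by a symbol
# according to which cutpoint interval it falls in
s <- symnum(r_matrix)
print(s)

# The legend is stored as an attribute, so it can be pulled out on its own
print(attr(s, "legend"))
```

A correlation of about 0.79 falls between the 0.6 and 0.8 cutpoints in the legend, which is why the Darling/Gwydir cell shows a comma.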
Enjoy! Till next time!