Wednesday, August 31, 2011

Count the number of observations for a specified variable

Data that is not numerical can be a challenge to use in R - I have been finding this out over the last few days. The example data is for a venue which records the days on which they hold a gig. Example:
Gig#     Day
1       Mon
2       Mon
3       Wed
4       Sat
5       Fri

At the moment, all I want to do is be able to count the number of gigs (of course, the answer is 5 here, but I need R to be able to count that correctly too).

Solution: use table(). Assume the data has been imported into a variable of your choice, here 'data'. Calling table() on the whole data frame cross-tabulates all of its columns; to count a single variable, select it with $ followed by the column name, e.g. data$Day.
> table(data)
> table(data$Day)

The output of table(data) should look something like this - it comes from a dataset with Year and Day columns that I am working on at the moment.
      Day
Year   Fri Mon Sat Sun Thu Tue Wed
  2010  11   1  16   8   1   1   5
  2011  10   1  11   5   1   1   2
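
As a minimal sketch, here is the five-gig example from the top of this post rebuilt by hand and counted with table(). The data frame name 'gigs' and the column name 'Gig' are my own placeholders, not something from the real dataset.

```r
# Rebuild the five-gig example by hand (names 'gigs' and 'Gig' are assumed)
gigs <- data.frame(
  Gig = 1:5,
  Day = c("Mon", "Mon", "Wed", "Sat", "Fri")
)

# Number of gigs on each day of the week
day_counts <- table(gigs$Day)
print(day_counts)

# Total number of gigs: the number of rows, or the sum of the per-day counts
total_gigs <- nrow(gigs)
print(total_gigs)
print(sum(day_counts))
```

table() gives the count per day, while nrow() (or summing the table) gives the overall total.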

To be continued...

Sunday, August 14, 2011

Import .csv with header

First rule, always make the .csv file with a header. The header is used later on to refer to specific columns of the data.

Second, for a univariate dataset (100 observations, 1 variable), use \n as the delimiter. In other words, put the 100 observations on 100 lines, one per line, below the header.

Example for football player height:
col1
1.9
2.1
2.0
1.9
1.9

Now to import the data into R. Open R and use the read.csv command, making sure to put the filename in quotes (single or double quotes both work in R). Without the quotes, R tries to parse the path as code instead of a string and gives an error.
> data <- read.csv('/path/to/file.csv', header = TRUE)

To check the size of the dataset, use the str() command:
> str(data)
'data.frame':	100 obs. of  1 variable:
 $ col1: num  -0.1128 -0.4808 -0.0156 -0.2525 0.0834 ...
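
To try the import without a real file, here is a small sketch that writes the five example heights to a temporary file and reads them back. The temporary path and the name 'heights' are just stand-ins for your own file and variable.

```r
# Write the example heights to a temporary .csv (stand-in for a real file)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("col1", "1.9", "2.1", "2.0", "1.9", "1.9"), csv_path)

# header = TRUE makes the first line ("col1") the column name
heights <- read.csv(csv_path, header = TRUE)

# Check the size and type of what was imported
str(heights)
print(nrow(heights))
print(is.numeric(heights$col1))
```

If str() reports 'chr' or 'Factor' instead of 'num', the column was not read as numbers, which usually points to a stray non-numeric entry in the file.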

As you can see, the 'num' that appears after '$ col1:' indicates that the data has been imported as numerical data. Now perform a hist() on your data using the name of the dataset (data) and the variable you wish to test (col1), i.e. hist(data$col1).
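
A rough sketch of the histogram step, using the five example heights in place of the imported file. The plot title and axis label are my own choices.

```r
# Stand-in for the imported data frame (values from the height example)
data <- data.frame(col1 = c(1.9, 2.1, 2.0, 1.9, 1.9))

# Select the column with $ and plot it; hist() also returns the bins it used
h <- hist(data$col1, main = "Football player height", xlab = "Height")

# Counts per bin, in case you want the numbers behind the plot
print(h$counts)
```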

Ciao

Link to first R...

Just the link to the other R post that I made - I think I need to expand my knowledge here.

http://dirtyhabanero.blogspot.com/2011/05/rkward-cli-useful-commands.html