getting started

In case you’ve never used a command-line stats tool before, I thought I should post some pointers. R has a pretty steep learning curve, but after the initial effort you’ll find it pays dividends. You’ve already downloaded and installed R, am I right? Great! When you click on the icon you get the console window. It has a prompt, like this:

>

Now you’re ready to go! Type some arithmetic at the prompt and R prints the answer. What is the probability of rolling snake eyes on two dice? Each die has six faces but only one way of showing a one, so just one of the 6 * 6 = 36 possible rolls is snake eyes.

> # We type commands at the prompt (>)
> # Text after a hash (#) is a comment
> # This text won't be interpreted by R
> 1 / (6 * 6)
[1] 0.02777778

The probability of snake eyes on two dice is about 0.028, or 2.8%. Okay, so far so straightforward. A great thing about R is that we can save information as an object, like this:

> p_snakeeyes <- 1 / (6 * 6)

The assignment operator “<-” creates the object p_snakeeyes, which I can then use for other things. For example, to work out the probability of rolling anything but snake eyes on at least one of two rolls, I subtract the probability of snake eyes on both rolls from one.

> # use a previously created object
> 1 - (p_snakeeyes^2)
[1] 0.9992284
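
To see where that expression comes from, here is a small sketch that spells out each step. The object names p_both_snakeeyes and p_at_least_one_ok are my own, chosen for illustration, and the two rolls are assumed to be independent.

```r
# probability of snake eyes on a single roll of two dice
p_snakeeyes <- 1 / (6 * 6)

# probability of snake eyes on both of two independent rolls
p_both_snakeeyes <- p_snakeeyes * p_snakeeyes

# probability that at least one of the two rolls is not snake eyes
p_at_least_one_ok <- 1 - p_both_snakeeyes
p_at_least_one_ok
```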

So far, so reasonable. But working out one answer at a time is not enough. We can do arithmetic on lots of numbers at once, by combining them into a single object with the c function.

> vals <- c(1, 2, 3)
> vals
[1] 1 2 3
> vals + 1
[1] 2 3 4
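
The same element-by-element behaviour applies when both sides are vectors of the same length. A minimal sketch, where other is a made-up second vector used only for illustration:

```r
vals <- c(1, 2, 3)
other <- c(10, 20, 30)  # hypothetical second vector

# arithmetic pairs up the elements position by position
vals + other   # 11 22 33
vals * other   # 10 40 90
```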

This object, vals, is a vector. We are not changing the contents when we perform operations with it. We’re just viewing the output. So if we want to collect the output of an operation, we assign it to a new object.

> out <- vals * p_snakeeyes
> out
[1] 0.02777778 0.05555556 0.08333333

The new object, out, contains three elements. We can refer to these with square brackets describing the element we want to see, set or use.

> out[1]
[1] 0.02777778
> out[2]
[1] 0.05555556
> # choose more than one number
> out[c(2, 3)]
[1] 0.05555556 0.08333333
> # select these numbers in any order, any number of times
> out[c(3, 3)] + p_snakeeyes
[1] 0.1111111 0.1111111
> # assigning to a position in an object is super useful
> out[2] <- vals[3]^2 - p_snakeeyes
> out
[1] 0.02777778 8.97222222 0.08333333
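
Two more indexing patterns use the same square brackets: a negative index leaves an element out, and assigning to several positions at once replaces them all. This is only a sketch, and it works on a separate vector (demo, a name I made up) so that out is left alone.

```r
p_snakeeyes <- 1 / (6 * 6)
demo <- c(1, 2, 3) * p_snakeeyes  # hypothetical vector for this example

# length() reports how many elements the vector holds
length(demo)   # 3

# a negative index returns everything except that position
demo[-1]

# assign to several positions in one step
demo[c(1, 3)] <- c(0, 1)
demo
```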

There are different types of data objects beyond vectors, but I won’t cover them here.

Another type of object is a function. A function returns the output of some operation or transformation. To call one, type its name followed by round brackets; the arguments inside the brackets define what the function works on and how it behaves.

> mean(out)
[1] 3.027778
> median(out)
[1] 0.08333333
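
Many functions take more than one argument, and arguments can be passed by name. As a sketch, round from base R takes the values to round plus a digits argument. The vector below repeats the values of out so the example stands alone.

```r
out <- c(0.02777778, 8.97222222, 0.08333333)

# first argument: the values; digits: number of decimal places to keep
round(out, digits = 2)   # 0.03 8.97 0.08

# leaving digits out falls back to its default of 0
round(out)               # 0 9 0
```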

What’s really fantastic is that it’s possible to create your own functions. These can do anything! Pretty much. The keyword function defines a function! For example, I know that there’s a function called sample (type ?sample for more information). I want to use it to simulate a die roll to choose a random column from one to six. To do that, assign to an object a function with zero arguments, and put the code between curly brackets.

> colmn <- function() { sample(seq_len(6), size = 1) }
> colmn()
[1] 2
> colmn()
[1] 4
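
Functions can also take arguments of their own. The sketch below is my own extension rather than part of the example above: roll_dice is a hypothetical helper that sums n_dice six-sided dice, and replicate repeats it many times to estimate the snake eyes probability by simulation. set.seed makes the random draws repeatable, and the seed value is arbitrary.

```r
# hypothetical helper: roll n_dice six-sided dice and add them up
roll_dice <- function(n_dice = 2) {
  sum(sample(seq_len(6), size = n_dice, replace = TRUE))
}

set.seed(2015)  # arbitrary seed so the simulation can be repeated

# roll two dice 10,000 times
rolls <- replicate(10000, roll_dice())

# share of rolls that came up snake eyes (a total of 2)
mean(rolls == 2)
```

The estimate should land close to 1 / 36, and it tends to get closer as the number of rolls grows.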

From these building blocks, we can build up rather complex analyses. There’s a lot to learn, but it’s all good stuff!
