I used to hate color-coding plots. 'Twas a big pain. Let's say we're trying to plot the relationship between awesomeness and R know-how. First, let's read in the R know-how and awesomeness dataset.

Let's peek under the “head”, shall we?

require(fifer)
d = read.csv("Awesomeness_Rknowhow.csv")
head(d)
##   Number.of.Friends R.Know.how           Club
## 1           0.46841    -0.7202 Mat Black Labs
## 2           0.01932     0.3678 Mat Black Labs
## 3           0.68488    -0.9006 Mat Black Labs
## 4          -1.15628     1.8706 Mat Black Labs
## 5          -0.76576    -0.1406 Mat Black Labs
## 6           0.15211    -1.4046 Mat Black Labs

Nice! And let's peek under the “tail.” (Okay, bad joke).

tail(d)
##     Number.of.Friends R.Know.how    Club
## 95            -0.6766    -1.4407 Rs-R-Us
## 96            -0.3809    -1.0437 Rs-R-Us
## 97             1.5243     2.4358 Rs-R-Us
## 98             0.3377    -0.3047 Rs-R-Us
## 99            -1.3506    -1.0900 Rs-R-Us
## 100           -1.4359    -1.6769 Rs-R-Us

Let's say we're at an uber geek convention where SAS, R, and Matlab users alike meet to…er…mingle and speak of common interests. Being the research-minded student you are, you decide to measure three traits of the convention participants: how many friends they have (okay, I know you can't have negative friends and you can't have a fraction of a friend. Stop being so critical!), how much they know about R, and which club they belong to–the Mat Black Labs or the Rs-R-Us(es). You then plot the relationship betwixt the two quantitative traits:

# column 1 (Number.of.Friends) lands on the x-axis, column 2 (R.Know.how) on the y-axis
plot(d[,1:2], xlab="Number of Friends",
ylab="R Know-How", xaxt="n", yaxt="n")

What a jumbled mess! Then you remember that you also recorded which club each person belongs to…but how do you show that on the plot? Why, let's color-code the points!

This is where the string.to.colors function comes in. It takes a vector of strings as input (and an optional vector of colors–one for each unique grouping value) and outputs a vector of colors the same length as the input. Let's take a look:

#### let's look at that vector of strings (or factors)
d$Club
## [1] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [5] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [9] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [13] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [17] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [21] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [25] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [29] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [33] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [37] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [41] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [45] Mat Black Labs Mat Black Labs Mat Black Labs Mat Black Labs
## [49] Mat Black Labs Mat Black Labs Rs-R-Us Rs-R-Us
## [53] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [57] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [61] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [65] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [69] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [73] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [77] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [81] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [85] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [89] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [93] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## [97] Rs-R-Us Rs-R-Us Rs-R-Us Rs-R-Us
## Levels: Mat Black Labs Rs-R-Us
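A side note if your own printout lacks that last “Levels” line: since R 4.0.0, read.csv keeps text columns as plain character vectors by default instead of turning them into factors. Here's a minimal sketch of converting by hand, using a tiny made-up stand-in for the Club column (the names club and club_factor are mine, not from the dataset):

```r
# Toy stand-in for d$Club (not the real data).
club <- rep(c("Mat Black Labs", "Rs-R-Us"), each = 3)

# Under R >= 4.0, text read by read.csv() stays character by default,
# so convert explicitly to get a factor with levels.
club_factor <- factor(club)

levels(club_factor)  # the unique group names, sorted alphabetically
table(club_factor)   # how many rows fall in each group
```

Either form holds the same group labels; the factor just makes the set of unique groups explicit.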

And now let's see what string.to.colors does:

string.to.colors(d$Club, col=c("red", "blue"))
## colors colors colors colors colors colors colors colors colors colors
## "red" "red" "red" "red" "red" "red" "red" "red" "red" "red"
## colors colors colors colors colors colors colors colors colors colors
## "red" "red" "red" "red" "red" "red" "red" "red" "red" "red"
## colors colors colors colors colors colors colors colors colors colors
## "red" "red" "red" "red" "red" "red" "red" "red" "red" "red"
## colors colors colors colors colors colors colors colors colors colors
## "red" "red" "red" "red" "red" "red" "red" "red" "red" "red"
## colors colors colors colors colors colors colors colors colors colors
## "red" "red" "red" "red" "red" "red" "red" "red" "red" "red"
## colors colors colors colors colors colors colors colors colors colors
## "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"
## colors colors colors colors colors colors colors colors colors colors
## "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"
## colors colors colors colors colors colors colors colors colors colors
## "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"
## colors colors colors colors colors colors colors colors colors colors
## "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"
## colors colors colors colors colors colors colors colors colors colors
## "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue" "blue"

So all it does is replace all the values of “Rs-R-Us” with “blue” and all the values of “Mat Black Labs” with “red.”
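If you're curious what that kind of replacement looks like under the hood, here's a rough base-R sketch. To be clear, this is not fifer's actual source: strings_to_colors_sketch is a hypothetical helper I made up for illustration, and I'm assuming colors get matched to groups in the sorted order of the factor levels.

```r
# Hypothetical helper (not part of fifer): give each unique value of
# `strings` one entry of `colors`, matched in sorted factor-level order.
strings_to_colors_sketch <- function(strings, colors) {
  groups <- factor(strings)
  if (length(colors) != nlevels(groups)) {
    stop("need exactly one color per unique value")
  }
  # as.integer() turns each group label into its level index (1, 2, ...),
  # which then picks the matching color.
  colors[as.integer(groups)]
}

club <- c("Mat Black Labs", "Mat Black Labs", "Rs-R-Us", "Rs-R-Us")
strings_to_colors_sketch(club, c("red", "blue"))
# gives "red", "red", "blue", "blue"
```

The whole trick is the last line of the function: a factor is secretly a vector of integers, and those integers work as an index into the color vector.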

Now we can put that into the plot to tell R how we wanna display it:

plot(d[,1:2], xlab="Number of Friends", ylab="R Know-How",
xaxt="n", yaxt="n",
col = string.to.colors(d$Club, col=c("orange", "purple")))
legend("topleft", legend=c("Mat Black Labs", "Rs-R-Us"),
text.col=c("orange", "purple"), bty="n")
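One thing that can bite you here is typing the group names and colors twice (once for the points, once for the legend) and getting them out of sync. Below is a sketch of one way around that in base R, without fifer. The data are simulated stand-ins with the same column names, not the real CSV, and sim, palette_by_club and point_cols are names I invented for the example.

```r
# Simulated stand-in data with the same column names (not the real CSV).
set.seed(1)
sim <- data.frame(
  Number.of.Friends = rnorm(100),
  R.Know.how        = rnorm(100),
  Club              = factor(rep(c("Mat Black Labs", "Rs-R-Us"), each = 50))
)

# One named color per club, so the points and the legend share one lookup.
palette_by_club <- c("Mat Black Labs" = "orange", "Rs-R-Us" = "purple")
point_cols <- unname(palette_by_club[as.character(sim$Club)])

plot(sim$Number.of.Friends, sim$R.Know.how,
     xlab = "Number of Friends", ylab = "R Know-How",
     xaxt = "n", yaxt = "n", col = point_cols)
legend("topleft", legend = names(palette_by_club),
       text.col = palette_by_club, bty = "n")
```

Because both the col argument and the legend read from palette_by_club, changing a color or adding a club only needs an edit in one place.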

We can also “cheat” and use the string.to.colors function to use different symbols!

plot(d[,1:2], xlab="Number of Friends",
xlab="R Know-How"[0], ylab="R Know-How", xaxt="n", yaxt="n",
pch = as.numeric(string.to.colors(d$Club, col=c(11, 16))))
legend("topleft", legend=c("Mat Black Labs", "Rs-R-Us"),
pch=c(11, 16), bty="n")
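The as.numeric() wrapper is there presumably because string.to.colors hands back text, so the symbol codes come out as "11" and "16" and need converting back to numbers. If you'd rather skip the round trip, here's a small base-R sketch that indexes the symbol codes directly. The names club, symbols_by_level and point_pch are made up for the example, and the three-row vector is a toy stand-in for d$Club.

```r
# Toy stand-in for d$Club (not the real data).
club <- factor(c("Mat Black Labs", "Rs-R-Us", "Mat Black Labs"))

# One plotting symbol per factor level, in level order.
symbols_by_level <- c(11, 16)

# Index by the factor's integer codes; the result stays numeric,
# so it can be passed straight to plot(..., pch = point_pch).
point_pch <- symbols_by_level[as.integer(club)]
point_pch
# gives 11, 16, 11
```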

Neato, eh?