Let’s get logical
Quite frequently, we want our functions to perform an action if and only if some condition is met. For example, we might want to compute the square root of a number only if that number is not negative, or divide by a number only if it is different from 0. Or, after sampling the population of a target species at different locations, we might want to compute some statistics if the species is present and, if it is not, simply return a message saying that the species was not detected at those sites. The first thing we need to know is how to test a condition. R provides a set of comparison operators for this.
If you want to test if 2 values are equal, you will use the double equal sign. If your values are equal, R will return TRUE, and FALSE otherwise.
Is 2 equal to 1?
2==1
[1] FALSE
Of course not.
Is 3*2 equal to 6?
(3*2)==6
[1] TRUE
Of course yes.
And of course it works with objects.
a="Wow"
b="Wow"
a=="Wow"
[1] TRUE
a==b
[1] TRUE
If you use vectors, matrices or data frames, R will compare the 1st element of the 1st object with the 1st element of the second object. Then the 2nd element of the 1st object with the 2nd element of the 2nd object, and so on. If needed (i.e. if one of the 2 objects is shorter), R will recycle the elements of the shortest object:
a=c(1,2)
b=c(1,2,3,6,1,2)
c=c(1,2,3,1,2,6)
a==1 # will return a vector with the results for a[1]==1 and a[2]==1
[1] TRUE FALSE
a==b # will return a vector with the results for a[1]==b[1], a[2]==b[2], a[1]==b[3], a[2]==b[4], a[1]==b[5], a[2]==b[6]
[1] TRUE TRUE FALSE FALSE TRUE TRUE
a==c
[1] TRUE TRUE FALSE FALSE FALSE FALSE
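Recycling has one catch worth seeing in action: when the length of the longer object is not a multiple of the length of the shorter one, R still recycles, but it also emits a warning. A minimal sketch, using two made-up vectors `short` and `long` that are not part of the examples above:

```r
short = c(1, 2)
long = c(1, 2, 3)

# 'short' is recycled to 1, 2, 1 so that it lines up with 'long'.
# Because 3 is not a multiple of 2, R also prints a warning along the lines of
# "longer object length is not a multiple of shorter object length".
partial = long == short
partial
# [1]  TRUE  TRUE FALSE
```

A warning like this is usually a sign that the two objects were not meant to be compared element by element.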
As your acute sense of sight will have detected, comparisons are made element by element. If you’d rather know if 2 objects are exactly the same, you will need to use the function “identical()”:
d=c(1,2)
a==d
[1] TRUE TRUE
identical(a,d)
[1] TRUE
identical(a,b)
[1] FALSE
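Both “==” and “identical()” are strict, which can be surprising with decimal numbers and with number types. The sketch below uses the base R function `all.equal()`, which compares numbers up to a small tolerance; the object names `x`, `doubles` and `integers` are only illustrative.

```r
x = 0.1 + 0.2

# Decimal numbers are stored with a tiny rounding error, so a strict test fails
x == 0.3
# [1] FALSE

# all.equal() tolerates that error; isTRUE() turns its answer into a clean TRUE/FALSE
isTRUE(all.equal(x, 0.3))
# [1] TRUE

# identical() also checks the type: c(1, 2) holds doubles, c(1L, 2L) holds integers
doubles = c(1, 2)
integers = c(1L, 2L)
doubles == integers
# [1] TRUE TRUE
identical(doubles, integers)
# [1] FALSE
```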
Assuming you want to know if values are different instead, you use the operator “!=”
2!=1 # Is 2 different from 1. Result: yes it is.
[1] TRUE
2!=2 # Is 2 different from 2. Result: no it is not.
[1] FALSE
The operator “!=” will return a mirrored version of the results obtained with the operator “==”. As a matter of fact, you can invert the results of any test this way, simply by preceding the object containing said test with an exclamation point.
test= c==2
test
[1] FALSE TRUE FALSE FALSE TRUE FALSE
!test
[1] TRUE FALSE TRUE TRUE FALSE TRUE
This becomes a fantastic asset when you are trying to pick some data in a dataset! You just need to perform a test on the discriminating variable, and then use the result of this test to subset the original dataset. For example, there is this really cool James Bond marathon on TV right now. Unfortunately, I have to write this course (well, you know what I mean), and therefore I will have time to watch only a couple of them. I have to pick carefully… Here are the Rotten Tomatoes ratings for each movie:
james=data.frame(
movie=c("DoctorNo","Thunderball","Skyfall","OnHerMajestysSecretService",
"TheSpyWhoLovedMe","TheLivingDaylights","NeverSayNeverAgain",
"DieAnotherDay","Octopussy","LicenceToKill","YouOnlyLiveTwice",
"DiamondsAreForever","CasinoRoyale(2006)","FromRussiaWithLove",
"AViewToAKill","LiveAndLetDie","GoldenEye","TomorrowNeverDies",
"QuantumOfSolace","ForYourEyesOnly","Moonraker","TheWorldIsNotEnough",
"Goldfinger","TheManWithTheGoldenGun","CasinoRoyale(1967)") ,
rating=c(98,85,92,81,78,75,59,57,43,74,71,65,95,
96,36,65,82,57,64,73,62,51,96,46,27),
actor=c("S.Connery","S.Connery","D.Craig","G.Lazenby","R.Moore",
"T.Dalton","S.Connery","P.Brosnan","R.Moore","T.Dalton",
"S.Connery","S.Connery","D.Craig","S.Connery","R.Moore",
"R.Moore","P.Brosnan","P.Brosnan","D.Craig","R.Moore",
"R.Moore","P.Brosnan","S.Connery","R.Moore","D.Niven")
)
View(james)
Quick! I want to know which movies have a rating higher than 90%! With my new skills, I can decide easily! First, let’s test which ratings are higher than 90%, the operator is simply ‘>’
james$rating>90
[1] TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
[25] FALSE
Let’s put this test in an object we can use:
test= james$rating>90
Now, we can use it to select the rows of the object “james” that have their value in the column “rating” that matches our condition:
bond.to.watch= james[test,]
R will pick the rows where “test” is TRUE, and leave the ones where “test” is FALSE.
bond.to.watch
movie rating actor
1 DoctorNo 98 S.Connery
3 Skyfall 92 D.Craig
13 CasinoRoyale(2006) 95 D.Craig
14 FromRussiaWithLove 96 S.Connery
23 Goldfinger 96 S.Connery
And I can as easily make a list of the movies to not watch by inverting the results of my test:
bond.not.to.watch= james[!test,]
bond.not.to.watch
movie rating actor
2 Thunderball 85 S.Connery
4 OnHerMajestysSecretService 81 G.Lazenby
5 TheSpyWhoLovedMe 78 R.Moore
6 TheLivingDaylights 75 T.Dalton
7 NeverSayNeverAgain 59 S.Connery
8 DieAnotherDay 57 P.Brosnan
9 Octopussy 43 R.Moore
10 LicenceToKill 74 T.Dalton
11 YouOnlyLiveTwice 71 S.Connery
12 DiamondsAreForever 65 S.Connery
15 AViewToAKill 36 R.Moore
16 LiveAndLetDie 65 R.Moore
17 GoldenEye 82 P.Brosnan
18 TomorrowNeverDies 57 P.Brosnan
19 QuantumOfSolace 64 D.Craig
20 ForYourEyesOnly 73 R.Moore
21 Moonraker 62 R.Moore
22 TheWorldIsNotEnough 51 P.Brosnan
24 TheManWithTheGoldenGun 46 R.Moore
25 CasinoRoyale(1967) 27 D.Niven
Quick, get me a martini! Shaken not stirred.
* Movie: ON
James: “Q, make sure to install R in my computer-shoe! It could be useful if I ever need to sort things out!”
Q: “…”
… Funny… I didn’t remember this part… Anyway…
Here is a list of the available operators and their meaning:
- a == b : Are elements of ‘a’ equal to elements of ‘b’?
- a != b : Are elements of ‘a’ different from elements of ‘b’?
- a < b : Are elements of ‘a’ less than elements of ‘b’?
- a > b : Are elements of ‘a’ greater than elements of ‘b’?
- a <= b : Are elements of ‘a’ less than or equal to elements of ‘b’?
- a >= b : Are elements of ‘a’ greater than or equal to elements of ‘b’?
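To see all six operators side by side, here is a small sketch on two made-up vectors, `x` and `y` (illustrative names, chosen so they don't overwrite the objects used earlier):

```r
x = c(1, 5, 3)
y = c(2, 5, 1)

x == y   # [1] FALSE  TRUE FALSE
x != y   # [1]  TRUE FALSE  TRUE
x < y    # [1]  TRUE FALSE FALSE
x > y    # [1] FALSE FALSE  TRUE
x <= y   # [1]  TRUE  TRUE FALSE
x >= y   # [1] FALSE  TRUE  TRUE
```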
Additionally, it is possible to test if elements of a vector or matrix are missing with the function “is.na()”.
vec.with.NAs=c(3,6,1,NA,4)
is.na(vec.with.NAs)
[1] FALSE FALSE FALSE TRUE FALSE
And if we only want the vector without NAs, we can pick only the cells that DON’T match this test:
vec.without.NAs=vec.with.NAs[!is.na(vec.with.NAs)]
vec.without.NAs
[1] 3 6 1 4
Of course, there is the function “na.omit()” that works too, but it’s not the point of this section.
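Why a dedicated function rather than “== NA”? Because any comparison involving NA returns NA, not TRUE or FALSE. The sketch below uses a made-up vector `counts` and the base R function `which()`, which returns the positions where a test is TRUE and silently skips the NAs.

```r
counts = c(3, 6, 1, NA, 4)

counts == NA     # comparing with NA never answers TRUE or FALSE
# [1] NA NA NA NA NA

counts > 2       # the NA propagates into the test
# [1]  TRUE  TRUE FALSE    NA  TRUE

counts[counts > 2]          # ...and then into the subset
# [1]  3  6 NA  4

counts[which(counts > 2)]   # which() keeps only the positions that are TRUE
# [1] 3 6 4
```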
Combining logical tests
What if we want to combine conditions? We might want to select sites in a data frame that, for example, have an iron concentration lower than a certain percentage AND belong to only one state. Or we could want to select James Bond movies that have a rating higher than 90%, AND only with Sean Connery as James Bond. Or pick movies with a really good rating OR with a really bad rating. Or movies with neither Timothy Dalton NOR George Lazenby (for those wondering, George Lazenby was the James Bond that got married…).
To make a test satisfying conditions A AND B (or more conditions), we use the operator ‘&’. For example:
1==1 & 2==2 # Both tests are true, so the final result is TRUE
[1] TRUE
4==1 & 3==2 # Both tests are false, so the final result is FALSE
[1] FALSE
1=="blabla" & 2>1 # One test is false, the other is true. At least one of them is false, therefore, the final result is FALSE
[1] FALSE
An even more explicit representation:
T & T
[1] TRUE
F & F
[1] FALSE
T & F
[1] FALSE
This operator will compare every element in test A with the corresponding element in test B, and return an object with the same format as the input tests (vector or matrix).
a=c(2,1)
b=c(2,2)
c=c(2,2)
a==b # First element of 'a' is equal to the 1st element of 'b', 2nd element of 'a' is different from the 2nd element of 'b' --> T F
[1] TRUE FALSE
b==c # First element of 'b' is equal to the 1st element of 'c', so is the 2nd element --> T T
[1] TRUE TRUE
a==b & b==c # First element of our response corresponds to T&T and the second to F&T --> T F
[1] TRUE FALSE
For a test meeting condition A OR condition B (or more conditions), we use the operator ‘|’. (If you’re looking for it, just look above your ‘enter’ key on most laptops.)
1==1 | 2==2 # Both tests are true, so the final result is TRUE
[1] TRUE
4==1 | 3==2 # Both tests are false, so the final result is FALSE
[1] FALSE
1=="blabla" | 2>1 # One test is false, the other is true. At least one of them is true, therefore, the final result is TRUE
[1] TRUE
As with ‘&’, this operator will compare each element in test A with the corresponding element in B.
a==b # First element of 'a' is equal to the 1st element of 'b', 2nd element of 'a' is different from the 2nd element of 'b' --> T F
[1] TRUE FALSE
b==c # First element of 'b' is equal to the 1st element of 'c', so is the 2nd element --> T T
[1] TRUE TRUE
a==b | b==c # First element of our response corresponds to T|T and the second to F|T --> T T
[1] TRUE TRUE
An even more explicit representation
T | T
[1] TRUE
F | F
[1] FALSE
T | F
[1] TRUE
So, for example, let’s see our previous requests.
James Bond movies with rating higher than 90%, AND only with Sean Connery as James Bond.
testsup90= james$rating>90 # Picking movies with a rating higher than 90
testConnery= james$actor=="S.Connery" # Picking movies with Sean Connery
combinedtest= testsup90 & testConnery # Combining the 2 tests
goodConnery=james[combinedtest,] # Subsetting the original dataset by picking only the rows that match our tests
goodConnery
movie rating actor
1 DoctorNo 98 S.Connery
14 FromRussiaWithLove 96 S.Connery
23 Goldfinger 96 S.Connery
Movies with a really good rating (>90%) OR with a really bad rating (<30%).
testsup90= james$rating>90 # Picking movies with a rating higher than 90
testinf30= james$rating<30 # Picking movies with a rating lower than 30
combinedtest= testsup90 | testinf30 # Combining the 2 tests
goodorbad=james[combinedtest,] # Subsetting the original dataset by picking only the rows that match our tests
goodorbad
movie rating actor
1 DoctorNo 98 S.Connery
3 Skyfall 92 D.Craig
13 CasinoRoyale(2006) 95 D.Craig
14 FromRussiaWithLove 96 S.Connery
23 Goldfinger 96 S.Connery
25 CasinoRoyale(1967) 27 D.Niven
Movies with neither Timothy Dalton NOR George Lazenby.
testDalton= james$actor!="T.Dalton" # Picking movies without T.Dalton
testLazenby= james$actor!="G.Lazenby" # Picking movies without G.Lazenby
combinedtest= testDalton & testLazenby # Combining the 2 tests (movies with no Dalton AND no Lazenby)
noDaltonLazenby=james[combinedtest,] # Subsetting the original dataset by picking only the rows that match our tests
noDaltonLazenby
movie rating actor
1 DoctorNo 98 S.Connery
2 Thunderball 85 S.Connery
3 Skyfall 92 D.Craig
5 TheSpyWhoLovedMe 78 R.Moore
7 NeverSayNeverAgain 59 S.Connery
8 DieAnotherDay 57 P.Brosnan
9 Octopussy 43 R.Moore
11 YouOnlyLiveTwice 71 S.Connery
12 DiamondsAreForever 65 S.Connery
13 CasinoRoyale(2006) 95 D.Craig
14 FromRussiaWithLove 96 S.Connery
15 AViewToAKill 36 R.Moore
16 LiveAndLetDie 65 R.Moore
17 GoldenEye 82 P.Brosnan
18 TomorrowNeverDies 57 P.Brosnan
19 QuantumOfSolace 64 D.Craig
20 ForYourEyesOnly 73 R.Moore
21 Moonraker 62 R.Moore
22 TheWorldIsNotEnough 51 P.Brosnan
23 Goldfinger 96 S.Connery
24 TheManWithTheGoldenGun 46 R.Moore
25 CasinoRoyale(1967) 27 D.Niven
Or, equivalently:
testDalton= james$actor=="T.Dalton" # Picking movies with T.Dalton
testLazenby= james$actor=="G.Lazenby" # Picking movies with G.Lazenby
combinedtest= testDalton | testLazenby # Combining the 2 tests (movies with either Dalton OR Lazenby)
noDaltonLazenby=james[!combinedtest,] # Subsetting the original dataset by inverting the test, which turns it into "all movies with neither Dalton nor Lazenby"
noDaltonLazenby
[Same table as above, duh!]
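The two approaches give the same table because “not (A or B)” is always the same as “(not A) and (not B)”. Here is a minimal check on a made-up vector `actors`, along with a third spelling that uses the base R operator `%in%` (it tests whether each element appears in a set of values). All object names are illustrative.

```r
actors = c("S.Connery", "T.Dalton", "G.Lazenby", "R.Moore")

is.dalton = actors == "T.Dalton"
is.lazenby = actors == "G.Lazenby"

neither.and = !is.dalton & !is.lazenby      # no Dalton AND no Lazenby
neither.or = !(is.dalton | is.lazenby)      # NOT (Dalton OR Lazenby)
neither.in = !(actors %in% c("T.Dalton", "G.Lazenby"))

identical(neither.and, neither.or)
# [1] TRUE
identical(neither.and, neither.in)
# [1] TRUE

actors[neither.and]
# [1] "S.Connery" "R.Moore"
```

The `%in%` form tends to scale better when the list of values to exclude grows beyond two.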
It is of course possible to skip writing 4 lines every time and combine everything:
goodConnery=james[james$rating>90 & james$actor=="S.Connery",]
goodorbad=james[james$rating>90 | james$rating<30,]
noDaltonLazenby=james[james$actor!="T.Dalton" & james$actor!="G.Lazenby",]
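When ‘&’ and ‘|’ appear in the same one-liner, keep in mind that R evaluates ‘&’ before ‘|’, much like multiplication comes before addition. A short sketch with made-up vectors `rating` and `actor` (not the full “james” table) shows how parentheses change the answer:

```r
rating = c(98, 27, 59)
actor = c("D.Craig", "D.Niven", "S.Connery")

# Read by R as: rating>90 | (rating<30 & actor=="S.Connery")
without.parentheses = rating > 90 | rating < 30 & actor == "S.Connery"
without.parentheses
# [1]  TRUE FALSE FALSE

# Explicit grouping: (very good OR very bad) AND Sean Connery
with.parentheses = (rating > 90 | rating < 30) & actor == "S.Connery"
with.parentheses
# [1] FALSE FALSE FALSE
```

Adding parentheses whenever the two operators are mixed costs nothing and removes the ambiguity for the reader.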
There is an alternative way to combine logical tests. ‘&&’ and ‘||’ perform the equivalent of ‘&’ and ‘|’, but on single TRUE/FALSE values only. Older versions of R silently used only the first element of longer inputs; since R 4.3.0, giving ‘&&’ or ‘||’ an input with more than one element is an error, so pick the element you want explicitly:
a=c(T,F,F,F,F,F)
b=c(T,T,T,F,T,F)
c=c(F,T,T,F,T,F)
a & b # Returns a vector of T and F values
[1] TRUE FALSE FALSE FALSE FALSE FALSE
a[1] && b[1] # Compares only the first element of 'a' (here T) and the first element of 'b' (here T) --> result: T&T -> T
[1] TRUE
a & c
[1] FALSE FALSE FALSE FALSE FALSE FALSE
a[1] && c[1] # Compares only the first element of 'a' (here T) and the first element of 'c' (here F) --> result: T&F -> F
[1] FALSE
a[1] || b[1] # Compares only the first element of 'a' (here T) and the first element of 'b' (here T) --> result: T|T -> T
[1] TRUE
a[1] || c[1] # Compares only the first element of 'a' (here T) and the first element of 'c' (here F) --> result: T|F -> T
[1] TRUE
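One practical property of ‘&&’ and ‘||’ is that they stop as soon as the answer is known: the right-hand side is not evaluated at all when the left-hand side already decides the result. The sketch below illustrates this with `stop()`, which would raise an error if it were ever reached; the object names are made up for the example.

```r
# The left side is FALSE, so the result is FALSE and stop() is never called
left.decides.and = FALSE && stop("never evaluated")
left.decides.and
# [1] FALSE

# The left side is TRUE, so the result is TRUE and stop() is never called
left.decides.or = TRUE || stop("never evaluated")
left.decides.or
# [1] TRUE

# Typical use: guard a test that would fail on the wrong kind of input.
# sqrt("ten") would raise an error, but it is skipped because x is not numeric.
x = "ten"
safe.test = is.numeric(x) && sqrt(x) > 2
safe.test
# [1] FALSE
```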
It is possible to summarize the results of logical tests. We can for example ask if “all()” elements in a test object are TRUE:
test= c(1,2,3,4)>3
test
[1] FALSE FALSE FALSE TRUE
all(test) # Not all of the values in 'test' are TRUE, therefore, the function returns "F"
[1] FALSE
Or if “any()” of the elements in the test object are TRUE:
any(test) # We have at least one value that meets the condition '>3', therefore the function returns "T"
[1] TRUE
any(c(2,1)==c(1,1))
[1] TRUE
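A few other base R functions summarize a test in the same spirit. Since TRUE counts as 1 and FALSE as 0, `sum()` gives the number of TRUE values and `mean()` their proportion, while `which()` gives their positions. A small sketch on a made-up vector `ratings` (the names `high` and `with.na` are illustrative as well):

```r
ratings = c(98, 85, 92, 81, 78)
high = ratings > 90

sum(high)     # how many ratings are above 90
# [1] 2
mean(high)    # which proportion of ratings is above 90
# [1] 0.4
which(high)   # at which positions
# [1] 1 3

# With missing values, all() cannot decide unless told to ignore them
with.na = c(TRUE, NA)
any(with.na)
# [1] TRUE
all(with.na)
# [1] NA
all(with.na, na.rm = TRUE)
# [1] TRUE
```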
Using logical tests
Wow. That was a big piece, wasn’t it? If at this point you haven’t abandoned the ship, you’ll feel comfortable in this next section, otherwise… We’ll make it work! Now that we are able to express conditions to R based on tests, we can use that to tell it to do things if a test returns TRUE. It is literally possible to tell R: “If this condition is met, do this.”
The instruction “if()” will run the code in the curly brackets following it if and only if the test provided as its input is TRUE.
if(1==2) { print("Wow, I would have never thought that possible!") }
Since 1 is not equal to 2, nothing is printed.
if(1<2) { print("That makes sense!") }
[1] "That makes sense!"
Since 1 is lower than 2, R runs the piece of code between the brackets, and prints the sentence.
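Note that “if()” needs exactly one TRUE or FALSE. A test on a whole vector returns several values, and recent versions of R (4.2.0 and later) stop with an error in that case, so the test has to be collapsed first, for example with “any()” or “all()”. A minimal sketch, with a made-up vector `ratings` and a made-up object `verdict`:

```r
ratings = c(98, 85, 92)
verdict = "nothing worth watching"

# ratings > 90 holds three values, so if (ratings > 90) would fail in recent R.
# any() collapses them into the single value that if() expects.
if (any(ratings > 90)) { verdict = "at least one great movie" }
verdict
# [1] "at least one great movie"

# all() is TRUE only if every rating passes the test, which is not the case here
if (all(ratings > 90)) { verdict = "only great movies" }
verdict
# [1] "at least one great movie"
```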
Even better, we can tell R: “If this condition is met, do this. Else, do that!” In this case, the bracket closing the block of code that describes what to do if the condition is met is directly followed by the instruction “else”. This “else” is in turn followed by a block of code describing what to do when the “if” condition is not met. Let’s create a function that computes the square root of a number if it is positive or zero, and prints an error message otherwise.
mysquare= function(x){
  if (x>=0){
    res = sqrt(x) # If x is positive or zero, the result is the square root of x
    return(res) # We are telling the function to return the object 'res' as output. This line is not necessary, but it's a good habit to have.
  } else {
    cat("Nope! Not going to do it! \n") # If x is negative, we print a "useful" error message
  }
}
mysquare(9)
[1] 3
mysquare(-10)
Nope! Not going to do it!
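The same pattern covers the sampling example from the beginning of this section: compute some statistics if the species is present, and print a message otherwise. This is only a sketch under simple assumptions; the function name `site.summary()`, its argument `counts` (one count per site) and the choice of statistics are all made up for illustration.

```r
site.summary = function(counts){
  if (any(counts > 0)){
    # The species was seen at least once: return a couple of statistics
    res = c(mean = mean(counts), max = max(counts))
    return(res)
  } else {
    # No individual was counted at any site
    cat("Species not detected at those sites. \n")
  }
}

site.summary(c(0, 4, 2, 0))
# mean  max
#  1.5  4.0

site.summary(c(0, 0, 0))
# Species not detected at those sites.
```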
Exercise 5.2
Create a function “inverse()” that computes the inverse of a number.
- The function has to check if the number is different from 0.
- If the number is different from 0, the function has to return the inverse of the number.
- If the number is equal to 0, the function has to return an error message saying that the function will not end the world just for your pretty eyes.
Answer 5.2
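One possible answer, written as a sketch that follows the structure of “mysquare()” above; it is not necessarily the course's own solution, and the wording of the message is free.

```r
inverse = function(x){
  if (x != 0){
    res = 1 / x   # x is different from 0, so the inverse exists
    return(res)
  } else {
    # Dividing by 0 is refused, with an appropriately dramatic message
    cat("I will not end the world just for your pretty eyes! \n")
  }
}

inverse(4)
# [1] 0.25

inverse(0)
# I will not end the world just for your pretty eyes!
```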