FUNCTION, THE GENERAL STRUCTURE

To write a function, at least 3 things need to be done:

1. Put this function in an object!
2. Let R know that you are going to write a function (and possibly tell it about the elements that will be passed to the function, a.k.a. ‘arguments’).
3. And finally, specify the instructions/operations that the function has to perform.

For example, let’s write a function that will simply print some text on the screen:

simple_function = function()   { print("Hello!")}

The first step here is accomplished by creating and defining the object “simple_function” with the instruction “simple_function=”. The second step is done with “function()”, which lets R know that what follows are the operations performed by the function “simple_function”. The third step is done by giving those operations between curly braces.

Now, let’s try to run this function and see the result:

simple_function
function()   { print("Hello!")}

What? This just printed what was in the object “simple_function”, what a crook! Oh, wait a minute… Yes, as with every object, if you simply call it by its name, the content of the object is displayed on screen. To make sure that we actually run the function, it has to be followed by parentheses (as we have always done so far when using functions).

simple_function()
[1] "Hello!"

Hi!

Nice… Your first function!

By default, the value returned by the function you created is the value of the last expression evaluated in your code. When that last expression is an assignment to an internal object, this value is not displayed on the screen unless specified otherwise (implicitly or explicitly).

“Minute platypus! Internal object???”

Each function works in its own microcosm. It’s a “What happens in the function stays in the function” kind of deal. It means that if you create an object inside your function named “amypond”, this doesn’t create an object in your current workspace, nor will it overwrite any existing object that would share this name. Some say that functions are bigger on the inside…

Example:

# creating a function "tardis()"
# that internally adds 5 to 6, 
# and then stores this value in an internal object called "amypond"
tardis = function()   { amypond=5+6}

We run the function:

tardis()

Nothing happens… Yep, of course, as said above, we didn’t tell the function to print the value of the object, so there is no reason to display it. But it did the computation, I swear! Let’s check that it didn’t create an object “amypond” in our workspace:

ls()  # Listing the objects present in the workspace
[1] "simple_function" "tardis"

Nope, no “amypond”

Now, if you want to actually save the result of the function, you can simply store it in an object:

riversong=tardis()
riversong
[1] 11

Tadaa, it did compute our addition, and stored it. Now, let’s run an experiment to check that internal objects created in the function (and therefore existing only there) do not overwrite objects that exist in our workspace.

amypond="rory"   # Creating a character vector with one element: "rory"
amypond          # "amypond" is equal to "rory"
[1] "rory"
riversong=tardis()
riversong        # "riversong" is equal to 11
[1] 11
amypond          # "amypond" is still equal to "rory"
[1] "rory"

And finally, another experiment, to actually show you that everything that happens in the function actually exists in the function microcosm, and not outside of it. Let’s ask the function to print the list of the objects that exist in it:

rm("amypond")   # removing the object "amypond" currently in our workspace
# function that creates an internal object "amypond" equal to 5+6,
# then displays the list of the objects
# present in the function's own workspace/microcosm
# and finally returns the result of the computation
tardis2 = function() { amypond=5+6 ; print(ls()) ; amypond}  
tardis2()    # Running the function.
[1] "amypond"  # print(ls()) displays the internal workspace composition
[1] 11         # Actual value returned by the function

We see that the function had an internal object named “amypond”.

ls()  # Last check to see that this internal object was not
      # simultaneously created in our general workspace!
[1] "riversong"    "simple_function" "tardis"    "tardis2"
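
The same check can also be done in code rather than by reading the output of “ls()”. Here is a minimal sketch, assuming a hypothetical function “dalek()” and a hypothetical internal object “davros” that are not part of the examples above. The base R function “exists()”, used with “inherits = FALSE”, only looks in the current workspace.

```r
# Hypothetical function: creates an internal object "davros" and returns it
dalek = function() { davros = 5 + 6 ; davros }

result = dalek()   # the value comes out of the function...
result
# [1] 11
exists("davros", inherits = FALSE)   # ...but the internal object does not
# [1] FALSE
```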

Bottom line, the general structure to create a function: myfunction <- function() { }

Functions like arguments

All of this is already pretty cool (yes, it is!), but let’s face it, what we really want is to be able to input something in a function for it to be modified in some way. This is where arguments come into play. We have been constantly using arguments so far. Arguments are simply a way to transfer information to our function. It can be data, or instructions on how the function should carry out its operations. For example, remember the “plot()” function?

x=c(1,10)
y=c(1,10)
plot(x,y,type="l")

Here, “x”, “y” and “type” are arguments. The first two are data, and the third one is there to let the function know that you want it to do something particular.

To specify the use of arguments in our function, we need to include them when we are letting R know that we are writing a function. All the elements that are supposed to be inputted have to appear between the parentheses:

div = function(a,b) { c=a/b }   # Making a function "div()" that creates 
                                # an internal object (that will be outputted) equal to
                                # the ratio of two inputted numbers "a" and "b"

Let’s try it:

div(3,4)

Damn! Once again, I forgot to tell R to save the output… Remember, if you perform an operation without assigning it to an object, R will simply display it, and immediately forget about it:

3/4
[1] 0.75

but if you put it in an object, R will simply put it in a box, without telling you about it.

tmp=3/4

In our function, we put the result in an internal object, therefore nothing is immediately displayed. We need to save the output.

res=div(3,4)
res
[1] 0.75

Did it work? Of course it did!
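
If you are curious why nothing was displayed, here is a small sketch using “withVisible()”, a base R function that reports whether a returned value would be printed. The function names “div_quiet()” and “div_loud()” are made up for this illustration.

```r
div_quiet = function(a, b) { c = a / b }   # last expression is an assignment
div_loud  = function(a, b) { a / b }       # last expression is a plain computation

withVisible(div_quiet(3, 4))$visible   # value is returned, but not displayed
# [1] FALSE
withVisible(div_loud(3, 4))$visible    # value is returned and displayed
# [1] TRUE
(div_quiet(3, 4))                      # wrapping the call in parentheses forces display
# [1] 0.75
```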

We now have several options to input data in our function. We can specify each argument by calling it by its name (because your mother raised y’all right!):

div = function(a,b) { a/b }  # Redefining the function so we don't have to
                             # tell R to display the result every time
div(a=3,b=4)
[1] 0.75
div(b=4,a=3)
[1] 0.75

Or we can specify arguments without using their names, as long as we follow the order in which they were listed when the function was created (or the order indicated in the help file):

div(3,4)      # R will automatically match the 1st inputted argument
              # to the 1st argument of the function,
              # here it will assign the value '3' to argument "a"
[1] 0.75
div(4,3)      # and here the value '4' to argument "a"
[1] 1.333333
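
Named and unnamed arguments can also be mixed in one call. As a sketch of what happens then (with “div()” redefined here so the block runs on its own): R first matches the named arguments, then fills the remaining ones in order.

```r
div = function(a, b) { a / b }

div(3, b = 4)   # "a" gets 3 by position, "b" gets 4 by name
# [1] 0.75
div(b = 4, 3)   # "b" is matched by name first, so 3 goes to "a"
# [1] 0.75
```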

It is possible to set an argument without calling it by its full name, as long as the sequence of letters used to call the argument is unambiguous.

div = function(the_numerator,the_denominator) {
                           the_numerator/the_denominator }  
div(the_numerator=3,the_denominator=2)
[1] 1.5
div(the_=3,the_=2)      # Returns an error message indicating that R is not
                        # able to tell which entry corresponds to which argument
Error in div(the_ = 3, the_ = 2) :
  formal argument "the_numerator" matched by multiple actual arguments
div(the_n=3,the_d=2)    # Now, no problem!
[1] 1.5

You can set a default value for an argument when defining the function. This value is used if the argument is not specified during the function call (and is replaced by the inputted value otherwise).

div = function(a,b=2) { a/b }     # Setting the denominator to 2 by default
div(a=3)   # Using the default value to perform 3/2  (and print the result)
[1] 1.5
div(a=3,b=10)    # Replacing the default value
[1] 0.3
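
To check which defaults a function carries without reading its code, one option is the base R function “formals()”, which returns the list of arguments with their default values. A small sketch, with “div()” redefined so the block is self-contained:

```r
div = function(a, b = 2) { a / b }

formals(div)$b        # the default value attached to "b"
# [1] 2
names(formals(div))   # the argument names, in the order used for positional matching
# [1] "a" "b"
```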

Through the function “return()”, it is possible to specify, inside the function we are writing, what object(s) should be returned as an output.

div = function(a,b) {
                    c=a/b
                    return(c)
                    }
div(3,4)
[1] 0.75

This is especially convenient if we want to output several objects, that we can group into a list!

div = function(a,b) {
                    c=a/b
                    return( list(num=a, denom=b, res=c) )
                    }
div(3,4)
$num
[1] 3
$denom
[1] 4
$res
[1] 0.75
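
Once the list is returned, each element can be pulled out with “$” and its name. A minimal sketch, storing the output in an object called “out” (a name chosen here for illustration):

```r
div = function(a, b) {
  c = a / b
  return(list(num = a, denom = b, res = c))
}

out = div(3, 4)   # store the whole list
out$res           # extract the result only
# [1] 0.75
out$num / out$denom == out$res   # the pieces are consistent with each other
# [1] TRUE
```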

On a side note, the function “cat()” allows you to print a line with whatever is specified as its argument. This can come in handy if you want your function to display, for example, error messages or instructions.

cat("Bonjour")   
Bonjour>

Oops, sorry, if you are currently typing this as you read, something strange happened: your prompt is now directly after “Bonjour”. I forgot to tell you that the function “cat()” doesn’t move to a new line unless specifically told to. As previously said in the chapter about graphs, the combination of characters “\n” indicates to go to the next line.

cat("Bonjour  \n")
Bonjour

Better. We can now use that in our function.

div = function(a,b) {
                    c=a/b
                    cat(a,"divided by",b,"is equal to",c,"\n")
                    }
div(5,6)
5 divided by 6 is equal to 0.8333333
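
As a sketch of the error-message use mentioned above, the hypothetical function “safe_div()” below prints a message with “cat()” and returns “NA” when the denominator is zero. This is only one way to handle the case; the function name and the choice of “NA” are assumptions made for the example.

```r
# Hypothetical helper, not part of the original examples
safe_div = function(a, b) {
  if (b == 0) {
    cat("Cannot divide", a, "by zero! \n")   # message for the user
    return(NA)                               # leave the function early
  }
  a / b                                      # normal case
}

safe_div(5, 0)   # prints the message, then returns NA
safe_div(5, 2)   # returns 2.5
```

Note that “cat()” itself returns “NULL”, so a function whose last instruction is “cat()” displays the text but does not return the computed value.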

A quick side comment

Writing a function (or R code for that matter) can quickly become confusing. As lines of code pile up, it becomes more and more difficult to catch at a glance what’s happening. It is therefore not only useful but also essential to add comments to your code to document what each step (or at least each main step) is doing. In order to do so, R uses the sign ‘#’. Everything that follows a pound sign on a line is not interpreted by R as instructions. As soon as a new line starts, text is interpreted as instructions again. For example:

div = function(a,b=2) {         # Yeah! I'm creating a function!
                      a/b       # It's computing stuff (without creating an internal object)
                      }         # End of function
div(40,10)
[1] 4

Now you understand, if it wasn’t the case already, why I have a bunch of lines with pound signs! It’s not simply because I’m a lunatic, but to make it possible to directly copy and paste all of this text into R to run those commands!

Any comment?

Exercise 5.1

– Create a function “standard.error()” that computes the standard error of a vector.

Reminder: the standard error of a set of values is equal to the standard deviation of the set divided by the square root of the number of values in the dataset (i.e. the sample size).

Answer 5.1

standard.error=function(x){
                           st.err=sqrt(  var(x)/length(x)  )
                           st.err
                           }
standard.error(c(2,365,2,7,5,2,7,4,828,7,4,78,62))
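
One way to check the answer is to compare it with the formula from the reminder written literally, “sd(x)/sqrt(length(x))”. A short sketch (the object names “x” and “direct” are just for this check):

```r
standard.error = function(x) {
  st.err = sqrt(var(x) / length(x))
  st.err
}

x = c(2, 365, 2, 7, 5, 2, 7, 4, 828, 7, 4, 78, 62)
direct = sd(x) / sqrt(length(x))       # standard deviation / sqrt(sample size)
all.equal(standard.error(x), direct)   # TRUE if both give the same value
```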
