The apply function in R
As discussed in this post, I will be investigating the different members of the 'apply function family' in R. This post starts with the most basic one, apply().
The R manual states the following:
apply(X, MARGIN, FUN, ...)
With the following arguments:
X
    an array, including a matrix.
MARGIN
    a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where X has named dimnames, it can be a character vector selecting dimension names.
FUN
    the function to be applied: see 'Details'. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.
So what does this mean in practice?
Basically it means that the user can apply a standard function (e.g. mean, sum) or a user-written function to each row and/or column of the array X. FUN receives a whole row or column at a time, and the MARGIN argument decides which. For a matrix, MARGIN is:
 1 if you want to calculate FUN across all elements of each row
 2 if you want to calculate FUN across all elements of each column
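To make the effect of MARGIN concrete, here is a minimal sketch on a small made-up matrix. The matrix m and the variable names are only for illustration and are not part of the example further down.

```r
# Hypothetical 2 x 3 matrix, used only for illustration
m <- matrix(1:6, nrow = 2)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

row_sums <- apply(m, 1, sum)   # one value per row:    9 12
col_sums <- apply(m, 2, sum)   # one value per column: 3 7 11

# MARGIN = c(1, 2) calls FUN on every single cell,
# so the result keeps the 2 x 3 shape of m
times_ten <- apply(m, c(1, 2), function(x) x * 10)

# Operators must be quoted or backquoted; extra arguments (here 100)
# are passed on to FUN through '...'
plus_100 <- apply(m, 2, "+", 100)
```

With MARGIN = 1 or 2 and a function that returns a single number, the result is a plain vector with one entry per row or column.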
Example
To illustrate the different applications of the apply() function I will make use of the USPersonalExpenditure dataset. So first I am going to load this data by using the data() function.
data(USPersonalExpenditure)
This data set consists of United States personal expenditures (in billions of dollars) in the categories food and tobacco, household operation, medical and health, personal care, and private education, for the years 1940, 1945, 1950, 1955 and 1960.
And it looks like this:
                      1940   1945  1950 1955  1960
Food and Tobacco    22.200 44.500 59.60 73.2 86.80
Household Operation 10.500 15.500 29.00 36.5 46.20
Medical and Health   3.530  5.760  9.71 14.0 21.10
Personal Care        1.040  1.980  2.45  3.4  5.40
Private Education    0.341  0.974  1.80  2.6  3.64

Now let's assume we are interested in the total expenditure per year. I can sum the values for a single column by doing
sum(USPersonalExpenditure[, 1])
However, this only covers the first column (1940) and I want the total for every year, so here we can start using apply(). Since we want to apply the sum function across all values in a column, for each column,
apply(USPersonalExpenditure, 2, sum)
is all we need to do. An equivalent for loop would look something like:
# Collect the column totals one at a time
a <- NULL
for (i in 1:dim(USPersonalExpenditure)[2]) {
  a[i] <- sum(USPersonalExpenditure[, i])
}
# Label each total with its year
names(a) <- colnames(USPersonalExpenditure)
As you can see, it takes many more lines to get the same result.
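As a quick check that the two approaches really agree, the sketch below runs both and compares them. The names totals_loop and totals_apply are my own, and the loop is written with a preallocated vector and seq_len(), which is one common way to write it rather than the only one.

```r
data(USPersonalExpenditure)

# Loop version: preallocate a numeric vector, then fill it column by column
totals_loop <- numeric(ncol(USPersonalExpenditure))
for (i in seq_len(ncol(USPersonalExpenditure))) {
  totals_loop[i] <- sum(USPersonalExpenditure[, i])
}
names(totals_loop) <- colnames(USPersonalExpenditure)

# apply() version: the year names are carried over automatically
totals_apply <- apply(USPersonalExpenditure, 2, sum)
# 1940: 37.611, 1945: 68.714, 1950: 102.56, 1955: 129.7, 1960: 163.14

# Both approaches give the same named vector
same_result <- isTRUE(all.equal(totals_loop, totals_apply))
```

Note that apply() keeps the column names for us, which is the step the loop has to do by hand at the end.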
If we want the average spend per category across the 5 years in the matrix, we get it through
apply(USPersonalExpenditure, 1, mean)
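The same call works with a user-written function. As a small sketch, the anonymous function below (my own example, not from the R manual) computes for each category the difference between its highest and lowest yearly expenditure:

```r
data(USPersonalExpenditure)

# Average spend per category, as above
category_means <- apply(USPersonalExpenditure, 1, mean)

# User-written function: spread between the largest and smallest
# yearly value in each row (category)
category_spread <- apply(USPersonalExpenditure, 1, function(x) max(x) - min(x))

# Results are named by category, so they can be looked up by name
category_means["Food and Tobacco"]    # 57.26
category_spread["Food and Tobacco"]   # 86.8 - 22.2 = 64.6
```

Because MARGIN is 1, the function receives one row at a time, so x is the vector of five yearly values for a single category.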
This ends my first tutorial. For questions, remarks, etc., please feel free to leave comments below or contact me through @geoffrey_stoel on Twitter or on Google+.