Geoffs hangout on the interwebs about stuff I like and do…


The apply function in R

So as discussed in this post I will be investigating the different members of the 'apply function family' in R. This post starts with the most basic one, called apply().

The R manual states the following

apply(X, MARGIN, FUN, ...)

With the following arguments

X an array, including a matrix.
MARGIN a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where X has named dimnames, it can be a character vector selecting dimension names.
FUN the function to be applied: see ‘Details’. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.
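To make the less obvious parts of these argument descriptions concrete, here is a small sketch. The matrix m and its dimnames ("row", "col", "a", "b", "x", "y", "z") are made up for this illustration and are not part of the manual text.

```r
# A made-up 2 x 3 matrix with named dimnames
m <- matrix(1:6, nrow = 2,
            dimnames = list(row = c("a", "b"), col = c("x", "y", "z")))

# An operator passed as FUN has to be quoted or backquoted;
# extra arguments (here 2 and 10) are handed on to FUN through "..."
doubled <- apply(m, 2, "*", 2)    # multiply every column by 2
shifted <- apply(m, 1, `+`, 10)   # add 10 to every row

# Because X has named dimnames, MARGIN can be a dimension name
col_totals <- apply(m, "col", sum)

# MARGIN = c(1, 2) calls FUN once for every single cell
squared <- apply(m, c(1, 2), function(v) v^2)

print(doubled)
print(shifted)   # note: per-row vector results come back as columns (3 x 2)
print(col_totals)
print(squared)
```

One thing to watch for: when FUN returns a vector for each row, apply() stores those vectors as columns, so shifted is the transpose of what you might expect.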

So what does this mean in practice? 

Basically, it means that you can apply a standard function (e.g. mean, sum) or a user-written function to each row and/or column of the array X, as set by the MARGIN argument. FUN receives all elements of that row or column at once. For a matrix, MARGIN is:

  • 1 if you want to calculate the FUN across all elements for each row
  • 2 if you want to calculate the FUN across all elements for each column
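A minimal sketch of the two MARGIN values on a toy matrix. The matrix m and the helper spread() are made up for this illustration.

```r
# A made-up 2 x 3 matrix:
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, byrow = TRUE)

row_sums <- apply(m, 1, sum)   # MARGIN = 1: one result per row    -> 6 15
col_sums <- apply(m, 2, sum)   # MARGIN = 2: one result per column -> 5 7 9

# A user-written function works the same way: it gets a whole row (or column)
spread <- function(v) max(v) - min(v)
row_spread <- apply(m, 1, spread)   # -> 2 2

print(row_sums)
print(col_sums)
print(row_spread)
```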


To illustrate the different applications of the apply() function I will make use of the USPersonalExpenditure dataset. So first I am going to load this data by using the data() function.
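Loading the data could look like the sketch below. It assumes the standard datasets package that ships with R is available, which is the default in a normal R session.

```r
# Load the built-in dataset into the workspace
data(USPersonalExpenditure)

# Quick checks on what we got
print(class(USPersonalExpenditure))     # the object is a matrix
print(dim(USPersonalExpenditure))       # number of rows and columns
print(dimnames(USPersonalExpenditure))  # category names and years
```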


This data set consists of United States personal expenditures (in billions of dollars) in the categories food and tobacco, household operation, medical and health, personal care, and private education for the years 1940, 1945, 1950, 1955 and 1960.

It is a 5 x 5 matrix with the years as columns and these categories as rows:

  • Food and Tobacco
  • Household Operation
  • Medical and Health
  • Personal Care
  • Private Education


Now let's assume we are interested in the total expenditure per year. I can sum the values of a single column by selecting that column, for example USPersonalExpenditure[, 1], and passing it to sum().
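Summing a single column might look like the sketch below. Selecting the column by its name ("1940") is an extra I added for illustration; it relies on the column names of the dataset being the years.

```r
data(USPersonalExpenditure)

# Total expenditure in the first column, selected by position
total_first <- sum(USPersonalExpenditure[, 1])

# The same column selected by its name
total_1940 <- sum(USPersonalExpenditure[, "1940"])

print(total_first)
print(total_1940)
```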


However, this is only for the first column (1940) and I want it for all years, so here we can start using apply(). Because we want to apply the sum function across all values in a column, for each column, calling apply() with MARGIN set to 2 and FUN set to sum is all we need to do. Producing the same result with a for loop would look something like:

for (i in 1:dim(USPersonalExpenditure)[2]) {

As you can see, it takes many more lines to get the same result.
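Here is a sketch that puts both versions side by side. The loop in the post is cut off after its first line, so the loop body and the names totals_apply and totals_loop are my own assumptions about what a complete version could look like.

```r
data(USPersonalExpenditure)

# The apply() version: MARGIN = 2 means "once per column"
totals_apply <- apply(USPersonalExpenditure, 2, sum)

# A possible for-loop equivalent (loop body assumed, not taken from the post)
totals_loop <- numeric(ncol(USPersonalExpenditure))          # pre-allocate the result
names(totals_loop) <- colnames(USPersonalExpenditure)        # keep the year labels
for (i in 1:dim(USPersonalExpenditure)[2]) {
  totals_loop[i] <- sum(USPersonalExpenditure[, i])
}

print(totals_apply)
print(all.equal(totals_apply, totals_loop))   # both approaches should agree
```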

If we want to calculate the average spend per category across the 5 years in the matrix, we apply mean over the rows instead, by setting MARGIN to 1.
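A minimal sketch of the per-category average. The name avg_per_category is made up for this illustration, and the comparison with rowMeans() is only there as a sanity check.

```r
data(USPersonalExpenditure)

# MARGIN = 1: apply mean() to each row, i.e. to each category
avg_per_category <- apply(USPersonalExpenditure, 1, mean)

print(avg_per_category)

# Base R's rowMeans() should give the same numbers
print(all.equal(avg_per_category, rowMeans(USPersonalExpenditure)))
```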


This ends my first tutorial. For questions, remarks, etc., please feel free to leave comments below or contact me through @geoffrey_stoel on Twitter or on Google+.


Moving up in the ranks: from an R-Rookie to an R-Pro

I have been playing with R for a little over a year now. Not very intensively, but once in a while I start up R Studio and do some coding and analysis. I am still far, far away from becoming an R-Pro. If you talk to the more seasoned R users or read some of their posts, it seems that one of the major steps an R-Rookie can make is using the 'apply' family of functions instead of for-loops. It seems to be more efficient and faster. I have been trying out some of these apply functions, with a lot of struggles. And sometimes I jumped back to the for-loop, because I could not use them in the right manner.

Ever since, it has been on my 'someday/maybe' list to develop a better understanding of these functions and document it for myself in such a manner that I understand them and can apply them in the future. So that is the quest that I am on for the next couple of weeks. During this quest I will be posting updates on this blog to share my steps and basically build a set of tutorials about the apply functions.

I will start with the normal apply() function and then move on to lapply(), sapply(), etc. from the base R package (I still have to think about the right order though). Afterwards I will have a look at the plyr package by Hadley Wickham.

I posted a question asking for input on this subject on Google+, Twitter and LinkedIn, and I received interesting and relevant feedback (and confirmation that I am not the only one struggling with understanding the apply functions). See you soon in my first post about the apply() function.


Today is the day I found out there is no real alternative…..


Yesterday I was stunned by the fact that Google decided to shut down Google Reader as part of their Spring cleanup. It was something I, at least, had never seen coming. Neither had a lot of my fellow Google Reader users, who are disappointed by this news.

So as of the 1st of July we need to find an alternative..... But is there one? I checked for the last 24 hours... installed some clients, but unfortunately... I was not able to find a real alternative. Yes I looked at feedly, yes I looked at The old reader and yes I looked at newsblur.... These seem to be the 3 most mentioned alternatives.... But they all lack something I really liked about Google Reader. So developers out there.... please rebuild Google Reader from the ground up again with all the great features that I like so much about it.

  1. It's fast.... loading time for the web and on my Android devices was always short (both newsblur and feedly really lack this speed - however this might have to do with the increase of new users after the Google announcement)
  2. It has multiplatform clients (Web, Android, iOS) that sync - I really hate seeing my feeds unread in my browser when I read an article 5 minutes ago on my mobile phone. I access RSS feeds a lot on my mobile phone and tablets.. Honestly I think Google Reader is the most used app on my Android device; I access it more than 15 times a day to see if something interesting has popped up.
  3. I like to be able to easily share it to other services (Pocket, Evernote, eMail) - interesting articles are not there for myself only. I share it with friends and colleagues... or when on the go and not having the time to extensively read it, post it to Pocket
  4. It has to have a 'clean browsing' option - I like all the fancy interfaces that feedly and newsblur offer, with all the colors, tiles, fancy HTML 5 magic, etc... but most of all I would like to navigate my feeds quickly and efficiently and read a lot of subject lines in one glance.

Rants and raves over.... Developers, please start making something beautiful (that meets the above requirements)..... I can't believe RSS reading isn't a very nice mechanism to offer targeted ads - so even some money can be made here....

Are you aware of new development activities going on? Please let me know in the comments.... Looking forward to wasting or uhhh .... spending time reading my RSS feeds in a worthy alternative before July 1st...