Geoff's hangout on the interwebs about stuff I like and do…

28Mar/13

The apply function in R

As discussed in an earlier post, I will be investigating the different members of the 'apply function family' in R. This post starts with the most basic one, apply().

The R manual states the following:

apply(X, MARGIN, FUN, ...)

With the following arguments:

X: an array, including a matrix.
MARGIN: a vector giving the subscripts which the function will be applied over. E.g., for a matrix 1 indicates rows, 2 indicates columns, c(1, 2) indicates rows and columns. Where X has named dimnames, it can be a character vector selecting dimension names.
FUN: the function to be applied: see ‘Details’. In the case of functions like +, %*%, etc., the function name must be backquoted or quoted.
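Two of the less obvious points in that description can be tried out on a tiny matrix. This is only a sketch: the matrix m and its dimension names "row" and "col" are made up for illustration and are not part of the manual.

```r
# Hypothetical 2 x 3 matrix with *named* dimnames ("row" and "col")
m <- matrix(1:6, nrow = 2,
            dimnames = list(row = c("a", "b"), col = c("x", "y", "z")))

# MARGIN as a dimension name instead of a number;
# "row" is the first dimension here, so this equals apply(m, 1, sum)
by_row <- apply(m, "row", sum)

# An operator used as FUN has to be quoted (or backquoted);
# the extra argument 10 is handed to "+" through `...`
plus_ten <- apply(m, 1, "+", 10)
```

Note that plus_ten comes back with the rows of m as its columns, because apply() stores the result of each call as one column.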

So what does this mean in practice? 

Basically, it means that you can apply a standard function (e.g. mean, sum) or a function you wrote yourself to the elements of each row and/or each column of the array X, as set by the MARGIN argument. For a matrix, MARGIN is:

  • 1 if you want to calculate the FUN across all elements for each row
  • 2 if you want to calculate the FUN across all elements for each column
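To see the difference between the two settings, here is a minimal sketch on a small matrix. The matrix m is a made-up example and not part of the dataset used further on.

```r
# Hypothetical 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

row_totals <- apply(m, 1, sum)   # one value per row:    9 12
col_totals <- apply(m, 2, sum)   # one value per column: 3 7 11

# MARGIN = c(1, 2) calls FUN once per cell, so the shape of m is kept
doubled <- apply(m, c(1, 2), function(x) x * 2)
```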

Example

To illustrate the different applications of the apply() function I will make use of the USPersonalExpenditure dataset. So first I am going to load this data by using the data() function.

data(USPersonalExpenditure)

This data set consists of United States personal expenditures (in billions of dollars) in the following categories: food and tobacco, household operation, medical and health, personal care, and private education, for the years 1940, 1945, 1950, 1955 and 1960.

And it looks like this:

                      1940   1945  1950 1955  1960
Food and Tobacco    22.200 44.500 59.60 73.2 86.80
Household Operation 10.500 15.500 29.00 36.5 46.20
Medical and Health   3.530  5.760  9.71 14.0 21.10
Personal Care        1.040  1.980  2.45  3.4  5.40
Private Education    0.341  0.974  1.80  2.6  3.64
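Before applying anything, it helps to confirm which dimension holds what, since that decides the MARGIN value later on. A quick sketch using only base R inspection functions:

```r
data(USPersonalExpenditure)

# apply() expects an array; a matrix is a 2-dimensional array
is_mat <- is.matrix(USPersonalExpenditure)

# First dimension (rows) = categories, second dimension (columns) = years
shape      <- dim(USPersonalExpenditure)
categories <- rownames(USPersonalExpenditure)
years      <- colnames(USPersonalExpenditure)
```

So MARGIN = 1 will walk over the categories and MARGIN = 2 over the years.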

 

Now let's assume we are interested in the total expenditure per year. I can sum the values for a column by doing

sum(USPersonalExpenditure[, 1])

However, this only covers the first column (1940) and I want the total for every year, so this is where apply() comes in. We want to apply the sum function across all values in a column, for each column, so MARGIN is 2:

apply(USPersonalExpenditure, 2, sum)

That is all we need to do. To get the same result with a for loop, we would need something like:

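# Preallocate one slot per column (year)
a <- numeric(ncol(USPersonalExpenditure))
for (i in seq_len(ncol(USPersonalExpenditure))) {
  # Sum all categories for year i
  a[i] <- sum(USPersonalExpenditure[, i])
}
# Carry over the year labels so the result matches apply()
names(a) <- colnames(USPersonalExpenditure)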

As you can see, it takes many more lines to get the same result.
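As a sanity check on the column totals, the apply() result can be compared with colSums(), the dedicated base R function for this particular case. This is a sketch to verify the numbers, not a claim about which approach is better.

```r
data(USPersonalExpenditure)

# Total expenditure per year, once with apply() and once with colSums()
via_apply   <- apply(USPersonalExpenditure, 2, sum)
via_colsums <- colSums(USPersonalExpenditure)

# all.equal() compares values and names, allowing for tiny rounding differences
same <- isTRUE(all.equal(via_apply, via_colsums))
```

Both results are named numeric vectors with one entry per year.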

If we want the average spend per category across the 5 years in the matrix, we set MARGIN to 1 and use mean:

apply(USPersonalExpenditure, 1, mean)
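FUN does not have to be a built-in function. As a sketch of the 'user written function' case mentioned earlier, the example below computes a growth factor per category; the growth calculation is my own illustration and not something that comes with the dataset.

```r
data(USPersonalExpenditure)

# Average spend per category; rowMeans() is the dedicated base R equivalent
avg      <- apply(USPersonalExpenditure, 1, mean)
same_avg <- isTRUE(all.equal(avg, rowMeans(USPersonalExpenditure)))

# User-written function: each row arrives as a named vector (names = years),
# so the 1960 value can be divided by the 1940 value by name
growth <- apply(USPersonalExpenditure, 1,
                function(x) x[["1960"]] / x[["1940"]])
```

The result growth is a named vector with one growth factor per category.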

This ends my first tutorial. For questions, remarks, etc., please feel free to leave comments below or contact me through @geoffrey_stoel on Twitter or on Google+.


28Mar/13

Moving up in the ranks: from an R-Rookie to an R-Pro

I have been playing with R for a little over a year now. Not very intensively, but once in a while I start up RStudio and do some coding and analysis. I am still far, far away from becoming an R-Pro, though. If you talk to the more seasoned R users or read some of their posts, it seems that one of the major steps an R-Rookie can make is using the 'apply' family of functions instead of for-loops. It seems to be more efficient and faster. I have been trying out some of these apply functions, with a lot of struggling. And sometimes I jumped back to the for-loop, because I could not get them to work the right way.

Ever since, it has been on my 'someday/maybe' list to develop a better understanding of these functions and to document them in such a way that I understand them and can apply them in the future. So that is the quest that I am on for the next couple of weeks. During this quest I will be posting updates on this blog to share my steps and basically build a set of tutorials about the apply functions.

I will start with the normal apply() function and then move on to lapply(), sapply(), etc. from the base R package (I still have to think about the right order though). Afterwards I will have a look at the plyr package by Hadley Wickham.

I posted a question asking for input on this subject on Google+, Twitter and LinkedIn, and I received interesting and relevant feedback (and confirmation that I am not the only one struggling to understand the apply functions). See you soon in my first post about the apply() function.


15Mar/13

Today is the day I found out there is no real alternative…..


Yesterday I was stunned by the news that Google has decided to shut down Google Reader as part of their spring cleanup. It was something I, at least, had never seen coming. Neither had a lot of my fellow Google Reader users, who are just as disappointed.

So as of the 1st of July we need to find an alternative... But is there one? I have been checking for the last 24 hours and installed some clients, but unfortunately I was not able to find a real alternative. Yes, I looked at Feedly, yes, I looked at The Old Reader and yes, I looked at NewsBlur... These seem to be the 3 most mentioned alternatives, but they all lack something I really liked about Google Reader. So, developers out there: please rebuild Google Reader from the ground up, with all the great features that I like so much about it.

  1. It's fast... Loading time on the web and on my Android devices was always short (both NewsBlur and Feedly really lack this speed; however, this might have to do with the influx of new users after the Google announcement)
  2. It has multiplatform clients (Web, Android, iOS) that sync - I really hate seeing feeds marked unread in my browser when I read the article 5 minutes ago on my mobile phone. I access RSS feeds a lot on my mobile phone and tablets. Honestly, I think Google Reader is the most used app on my Android device; I open it more than 15 times a day to see if something interesting has popped up.
  3. I like to be able to easily share to other services (Pocket, Evernote, email) - interesting articles are not there for me alone. I share them with friends and colleagues, or when I am on the go and don't have time to read something properly, I send it to Pocket.
  4. It has to have a 'clean browsing' option - I like all the fancy interfaces that Feedly and NewsBlur offer, with all the colors, tiles, fancy HTML5 magic, etc., but most of all I want to navigate my feeds quickly and efficiently and read a lot of subject lines at one glance.

Rant over... Developers, please start making something beautiful (that meets the above requirements)... I can't believe RSS reading isn't a very nice mechanism for offering targeted ads, so there is even some money to be made here...

Are you aware of new development activities going on? Please let me know in the comments... Looking forward to wasting, or uhhh... spending time reading my RSS feeds in a worthy alternative before July 1st...


11Jun/10

Day 5: Operator visits and return trip

Today is already the last day in Uganda, and it is a 'normal' working day again: turning the workshop notes into a solid workshop report that can also be used for the grant application, and visiting a few more operators.


Filed under: News
11Jun/10

Day 4: Gulu (Cruyff Court and Echo Bravo)

Cruyff Court
Today is Heroes' Day in Uganda, a national holiday marking the government's first day in office. Jennifer picks us up at the hotel, and after a short stop at the War Child office we continue to the Cruyff Court. This is a sports field on the edge of Gulu that opened in February this year as the result of a partnership between War Child and the Johan Cruyff Foundation. The guarded Cruyff Court in Gulu is close to six schools with around 7,000 children in total. They benefit from the field together with other children from the area; it will be used, among other things, for the Interparish Football and Netball Charity League. This sports competition, co-organized by War Child, links sporting activities with social ones. A committee of representatives from all the schools involved, the association for people with disabilities and the local government oversees the management and maintenance of the Cruyff Court, in cooperation with War Child.

Filed under: News