Clip raster with vector and calculate mean of clipped area, revisited

Two months ago, I posted  “Clip raster with vector and calculate mean of clipped area” blog post. I mentioned in this post that extraction and cellstats were taking time. I posted this blog on as well. One of the users (REORS) responded with few suggestions (see below).

Neat, I’ve got a similar tool in my own code (though my handling of shapefiles is really basic, not got to grips with them in R yet – had other things on my to-do list). Just a few ideas:

  • The extract() function might be closer to what you’re after it bypasses the need to mask out cells, and actually has some fairly advanced settings – this is kind of another case where the raser package has a really useful tool, but keeps somewhat quiet about it.
  • You might consider adding a crop() function infront of the mask(), where the polygon is a lot smaller than the raster this reduces the number of cells R has to mask out (though my notes on the speed of this are fairly subjective).
  • The drawPoly() function is a neat little tool that lets you define a shapefile by clicking on the plot window – though that’s kind of different to what you’re doing here.

and about speed, he sad

If your computer’s as old as the one I use – then speed is really a concern…
My version’s more aimed as a spectral profiler (along the same lines as can be found in ERDAS Imagine), but the broad idea is similar. If you’re interested the code’s here. Would you mind if I borrowed your idea of adding the results to the shapefile (if I can work out how to make it work)?

You can read full comments on this link.  I applied one of the suggestion (the use of crop() function), following is the new code (with couple of changes).

# install.packages("raster") #into your R console 


# (1) I assume that raster and vector are in one EPSG + vector is overlaping raster (polygons)
# (2) Vector and raster are in in project directory

#read shape file
shp = shapefile("Africa_boundaries_4326.shp")

# load reaster
img = raster("monthlyPCP1998_1.tif");

# filter by polygon ( this will go into loop)
country <- c("Algeria","Angola","Benin","Botswana","Burkina Faso","Burundi","Cameroon","Cape Verde","Central African Republic","Chad","Comoros","Congo","Djibouti","Egypt","Equatorial Guinea","Eritrea","Ethiopia","Gabon","Gambia, The","Gaza Strip","Ghana","Guinea","Guinea-Bissau","Ivory Coast","Kenya","Lebanon","Lesotho","Liberia","Libya","Madagascar","Malawi","Mali","Mauritania","Mauritius","Morocco","Mozambique","Namibia","Niger","Nigeria","Reunion","Rwanda","Sao Tome and Principe","Senegal","Seychelles","Sierra Leone","Somalia","South Africa","Sudan","Swaziland","Tanzania, United Republic of","Togo","Tunisia","Uganda","Western Sahara","Zaire","Zambia","Zimbabwe");

output <- matrix(ncol=3, nrow=length(country))

for (i in 1:length(country))

  # now filter
  filtered = shp[match(toupper(country[i]),toupper(shp$name)),]

  #now use the mask function
  # crop() has reduced the area for mask() thus very fast.
  rr <- mask(crop(img, filtered),filtered)

  country_sum <-cellStats(rr, mean, na.rm=TRUE);

  output[i,1] <- i;
  output[i,2] <- country[i];
  output[i,3] <- country_sum;


write.csv(output, file ="data.csv")


Thanks to REORS, Crop() function is reducing the areas to be masked (clipped), thus script became super fast.

Open Source rocks 🙂