A couple years ago, Benoit Simon-Bouhet ended up sharing an office with Eric Pante, then a post-doc fellow in his former lab. The two quickly realized they were in a lab in which few people had the expertise or taste for coding. Thus, on a daily basis, they were both approached by colleagues and students to take a look at their data analysis and graphics. Meanwhile, they had their own not-so-easy tasks of creating publication-quality maps for themselves as well as their colleagues.
They both had bits and pieces of R scripts scattered around their hard drives to (i) import bathymetric data previously downloaded locally from public databases, (ii) reshape these data in a form suitable for plotting in R and (iii) plot the bathymetry together with other data such as sampling sites or other locations of interest. The process was tedious, convoluted (especially for the manually download online bathymetric data) and required a good knowledge of the R scripting langage.
In order to ease this process, the two embarked on creating marmap (short for marine maps) …
Benoit and Eric quickly realized that, for most ecologists, the existing workflow to produce publication-quality maps was to either spend a lot of money by using proprietary software or to spend a lot of time by using free alternatives with a steep learning curve. Since more and more ecologists perform their data analyses within R, they thought it might be of interest to a lot of people to be able to easily import bathymetric data from public databases without having to leave the R environment and to plot these data without having to write dozens of lines of code.
So gathering their scripts, they translated them into functions and wrote help files. Most of the code is still rough around the edges, but today, marmap users are able to produce great-looking bathymetric maps in minutes with only a few lines of code as compared to 50-80 lines of code without marmap.
# Load package library(marmap)
# Fetch data on NOAA servers and write on disk bat <- getNOAA.bathy(-20, 0, 40, 60, res = 4, keep=TRUE)
# Create nice looking color palettes blues <- c("lightsteelblue4", "lightsteelblue3", "lightsteelblue2", "lightsteelblue1") greys <- c(grey(0.6), grey(0.93), grey(0.99))
# Plot plot(bat, image = TRUE, land = TRUE, lwd = 0.1, bpal = list(c(0, max(bat), greys), c(min(bat), 0, blues))) plot(bat, lwd = 0.8, deep = 0, shallow = 0, step = 0, add = TRUE) # highlight coastline
When they created marmap, the original goal was to make the typical marine ecologist’s life easier. However, these functions do more than just plot bathymetric maps. They can be used to produce elevation maps as well, making marmap a useful package for marine and terrestrial ecologists alike. Indeed, most public databases (e.g. ETOPO1 hosted by NOAA or the gridded bathymetry data hosted by GEBCO) do not differentiate between bathymetric and hypsometric data.
Spatial analysis features
Besides easy plotting, Eric and Benoit wanted to go further. Once again, the starting point was to make an ecologist’s life easier!
As molecular ecologists, the two need to evaluate the degree of connectivity of the populations they study. They frequently use landscape genetic methods and classic isolation by distance approaches. For both methods, the computation of realistic geographic distances between pairs of populations/individuals is critical. This is a challenge for marine organisms since the movement of individuals is often restricted to specific slices of water. Moreover, they can almost never cross land (peninsulas, islands, etc). The use of euclidian or great circle distances are inadequate, even though they are still widely used in the literature.
One alternative is to compute least cost paths (which is not perfect either, but it is a step in the right direction, see for instance McRae & Beier 2007). As for mapping bathymetric data, computing least cost paths could be done with proprietary software (often requirering specific and costly additional modules) or with free alternatives that required the user to type dozens of lines of code. Here, they used the existing R package gdistance (van Etten 2011) to develop an easy-to-use function with sensible defaults for marine ecologists.
As an illustration, Viricel (2012) used least cost paths to infer the connectivity of spotted dolphins (Stenella frontalis) in the Gulf of Mexico. This species almost exclusively uses habitats located above the continental shelf and are virtually never found in waters deeper than 300m. Thus, these animals cannot cross the Gulf of Mexico from West to East along a straight line. To estimate potential gene flow (or lack thereof) between populations, computing accurate geographic distances between individuals must take into account the fact that individuals are restricted to the continental shelf when moving/migrating. The script used in the paper was similar to the following:
# Import bathymetry bat <- getNOAA.bathy(-100, -80, 22, 31, res = 1, keep = TRUE) # Load location of individuals (these are NOT from Viricel 2012) loc <- data.frame( x = c(-96.92707, -96.60861, -96.86875, -96.14351, -92.82518, -90.86053, -90.14208, -84.64081, -83.81274, -81.13277, -80.33498, -88.52732, -94.46049), y = c(25.38657, 25.90644, 26.57339, 27.63348, 29.03572, 28.16380, 28.21235, 26.71302, 25.12554, 24.50031, 24.89052, 30.16034, 29.34550) ) # Compute least cost paths between -5m and -300m. # Beware! Computation takes time with high resolution bathymetries! tr <- trans.mat(bat, min.depth = -5, max.depth = -300) cost <- lc.dist(tr, loc, res="path") # Plot map with isobaths every 1000m plot(bat, image = TRUE, land = TRUE, deep=-4000, shallow=-1000, step=1000, drawlabels = FALSE, bpal = list(c(min(bat,na.rm=TRUE), 0, blues), c(0, max(bat, na.rm=TRUE), greys)), lwd = 0.1) # Add -300m isobath plot(bat, deep = -300, shallow = -300, step = 0, lwd = 0.5, add = TRUE, drawlabels = TRUE) # Add coastline plot(bat, deep = 0, shallow = 0, step = 0, lwd = 1, add = TRUE) # Add least cost paths and the position of individuals dummy <- lapply(cost, lines, col = col2alpha("orange", 0.5), lwd = 0.8, lty = 1) points(loc, bg = "orange", cex = 0.8, pch = 21)
They also decided to include a set of spatial analysis features to simplify the computation of various quantitative variables related to bathymetry/hypsometry, geographic distances and areas. As for other aspects of marmap, they endeavored to keep things as simple and user-friendly as possible. Marmap functions take advantage of other packages (e.g. geosphere, raster, sp, ncdf… see here for an exhaustive list) to make the following functionalities as easy to use as possible:
- compute (and plot) great circle distances between points
- compute (and plot) the shortest great circle distance between each point in a set and an isobath
- compute (and plot) projected surfaces within a depth/altitude range
- compute least cost paths within a depth/altitude range
- retrieve depth/altitude of arbitrary points, including points gathered by gps tracking/biologgers, etc.
- retrieve depth/altitude following a transect or a belt transect and plot in 2D or 3D
- create (and plot) circular buffer areas around any point
- use projections
Examples of most of these methods are provided as a package vignette called marmap-DataAnalysis.pdf. Two other vignettes describing (i) plotting functionalities (including maps in the antimeridian region and projections) and (ii) data import and export strategies, are available on the marmap cran page or using the
vignette() function in R. They have also published a paper introducing the package and its functionalities.
What marmap users taught them
Marmap and its accompanying PLoS One paper have never not yet become viral. However, marmap is today among the top 15% of the most downloaded R packages (data from Rstudio mirror). They’ve crossed the 10000 views checkmark of the PONE article last march after only 18 months. The article is in the top 5% of the +3.5 million articles tracked by altmetric. This demonstrates the real need in the ecology community for free and easy-to-use graphing and spatial analysis tools.
They’ve received great feedback from marmap users either by e-mail or through questions and bug reports on stack overflow and encourage the community to try marmap and to send feature requests or suggestions.
In the past year this has led to the initiation of multiple collaborations. For instance, Jean-Olivier Irisson is now listed as co-author of the marmap package since he contributed code to produce
ggplot2 bathymetric maps:
library(ggplot2) # Load North West Atlantic dataset data(nw.atlantic) atl <- as.bathy(nw.atlantic) # Plot with ggplot2 autoplot(atl, geom=c("raster", "contour"), colour="white", size=0.1) + scale_fill_gradient2(low="dodgerblue4", mid="gainsboro", high="darkgreen")
This came in handy for at least one user on stackoverflow!
Another example is the implementation of buffer areas often used by ornithologists or marine mammologists studying central place foragers. For such species, individuals rest and breed on a colony on land, but they travel at sea to feed. In a conservation context, exploring habitat use by protected species is critical to delineate pertinent Marine Protected Areas. On the request of Louise Soanes, research fellow at the university of Liverpool, they included functions allowing the computation of buffer zones and projected areas within slices of user-defined depths/altitudes:
# load and plot a bathymetry data(florida) plot(florida, lwd = 0.2) plot(florida, n = 1, lwd = 0.7, add = TRUE) # Create a point and a buffer around this point loc <- data.frame(-80, 26) buf <- create.buffer(florida, loc, radius=1.8) # Get the surface within the buffer for several depth slices surf1 <- get.area(buf, level.inf=-200, level.sup=-1) surf2 <- get.area(buf, level.inf=-800, level.sup=-200) surf3 <- get.area(buf, level.inf=-3000, level.sup=-800) s1 <- round(surf1$Square.Km) s2 <- round(surf2$Square.Km) s3 <- round(surf3$Square.Km) # Add buffer elements on the plot col.surf1 <- rgb(0.7, 0.7, 0.3, 0.3) col.surf2 <- rgb(0, 0.7, 0.3, 0.3) col.surf3 <- rgb(0.7, 0, 0, 0.3) plotArea(surf1, col = col.surf1) plotArea(surf2, col = col.surf2) plotArea(surf3, col = col.surf3) plot(outline.buffer(buf), add = TRUE, lwd = 0.7) points(loc, pch = 19, col = "red") ## Add legend legend("topleft", fill = c(col.surf1, col.surf2, col.surf3), legend = c(paste("]-200 ; -1] -",s1,"km2"), paste("]-800 ; -200] -",s2,"km2"), paste("]-3000 ; -800] -",s3,"km2")))
Finally, on several occasions, colleagues asked them to implement features enabling an easy way to plot data from gps tracking devices or biologgers deployed on various species. Here is an example of what can be done in just a few lines of code if there are two gps track records stored in data.frames called
## lon lat time ## 1 174.7257 -35.85750 1 ## 2 174.6547 -35.85723 30 ## 3 174.6352 -35.88704 30 ## 4 174.6379 -35.90232 48 ## 5 174.6611 -35.89979 79 ## 6 174.6538 -35.90193 79
## lon lat time ## 1 174.5459 -35.50225 5 ## 2 174.5510 -35.51112 59 ## 3 174.5547 -35.56361 62 ## 4 174.5588 -35.57187 103 ## 5 174.5606 -35.58852 130 ## 6 174.5726 -35.59403 160
# Download bathymetric data and save on disk bat <- getNOAA.bathy(174, 176, -37, -35, res = 1, keep = TRUE) # Get depth profile along both tracks and remove positive depths (since random fake values can be on land) # path.profile() gets depth value for all cells of the bathymetric grid below the gps tracks path1 <- path.profile(track1[,-3], bat) ; path1 <- path1[-path1[,4]>0,] path2 <- path.profile(track2[,-3], bat) ; path2 <- path2[-path2[,4]>0,] # Get depth values for each gps tracking point # get.depth() retrieve depth values only for gps tracking points depth1 <- get.depth(bat, track1$lon, track1$lat, locator = FALSE, distance = TRUE) ; depth2 <- get.depth(bat, track2$lon, track2$lat, locator = FALSE, distance = TRUE) ; # Add depth values to tracks 1 and 2 and remove positive depths track1$depth <- depth1$depth ; track1 <- track1[-track1$depth > 0,] track2$depth <- depth2$depth ; track2 <- track2[-track2$depth > 0,] # Plot layout(matrix(c(1, 2, 1, 3, 4, 4), ncol = 2, byrow = TRUE), height = c(1, 1, 1)) ## Bathymetric map with gps tracks plot(bat, land = TRUE, image = TRUE, lwd = 0.2, bpal = list(c(min(bat,na.rm=TRUE), 0, blues), c(0, max(bat, na.rm=TRUE), greys)), ylim=c(-36.5,-35.25)) plot(bat, deep = 0, shallow = 0, step = 0, lwd = 0.8, add = TRUE) lines(track1, col = "brown3") lines(track2, col = "darkorange1") legend("topright", legend = c("Track 1", "Track 2"), lwd = 1, col = c("brown3", "darkorange1"), pch = 1, pt.cex = 0.5, bg="white") ## Depths profiles along gps tracks plotProfile(path1, main = "Track 1") plotProfile(path2, main = "Track 2") ## Depth as a function of time since the deployment of gps tags par(mar=c(5,4,1,2)) plot(track1$time, track1$depth, xlab = "Time (arbitrary unit)", ylab = "Depth (m)", type = "o", cex = 0.5, col = "brown3", xlim = range(c(track1$time, track2$time)), ylim = range(c(track1$depth, track2$depth))) lines(track2$time, track2$depth, type = "o", cex = 0.5, col = "darkorange1") legend("bottomright", legend = c("Track 1", "Track 2"), lwd = 1, col = c("brown3", "darkorange1"), pch = 1, pt.cex = 0.5)
Benoit and Eric are still actively working on the development of marmap, based on their own evolving needs and on requests from users. For instance, they are currently working on adding EEZ and plate boundary data. Also, they are welcoming new developers, notably to help implement efficient methods to load and analyze custom bathymetric data that are irregularly spaced (i.e. data that do not fit in a grid). This is one of the most requested features and they do not yet have an easy way of implementing it. Krigging methods seems to be the way to go but their knowledge in the matter is more than limited.
So they told me to publicize the call to contribute,provide suggestions for improving marmap, or if you want them to add new functionalities, drop them a line!
Thanks Benoit (for writing the majority of this post) and Eric!