Phylogenetic trees in R using ggtree

ggtree, an R package I like to use for visualizing phylogenetic trees, was recently published (Yu et al. 2017). As you might guess from the name, it is built on the popular ggplot2 package. With ggtree, plotting trees in R is really simple, and I would encourage even R beginners to give it a try! Once you’ve gotten the hang of it, you can modify and annotate your trees in endless ways to suit your needs.

ggtree supports the two common tree formats Newick and Nexus. It also reads outputs from a range of tree-building software such as BEAST, EPA, HYPHY, PAML, PHYLDOG, pplacer, r8s, RAxML and RevBayes.
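If you have never looked inside a Newick file, here is a minimal sketch of what the format encodes, assuming the ape package is installed. The tree string and the tip names A to D are made up for illustration; they are not part of the ggtree sample data.

```r
library(ape)

# A made-up four-taxon tree in Newick format: nested parentheses give the
# topology, and the numbers after the colons are branch lengths.
newick <- "((A:1,B:2):1,(C:3,D:1):2);"

# read.tree() can parse a string directly via the 'text' argument
toy_tree <- read.tree(text = newick)

Ntip(toy_tree)        # number of tips
toy_tree$tip.label    # tip names in the order they appear in the string
```

The resulting object is an ordinary "phylo" object, so it can be passed to ggtree() in the same way as the sample tree loaded below.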

library("ape")
library("Biostrings")
library("ggplot2")
library("ggtree")
nwk <- system.file("extdata", "sample.nwk", package="ggtree")
tree <- read.tree(nwk)
tree

After you’ve loaded your tree in R, visualization is really simple. The ggtree function plots a tree directly and supports several layouts, such as rectangular, circular, slanted, cladogram and time-scaled.

ggtree(tree)

Add a tree scale.

ggtree(tree) + geom_treescale()

You can easily turn your tree into a cladogram with the branch.length="none" parameter, and switch to another layout, such as circular, with the layout parameter.

ggtree(tree, branch.length="none")
ggtree(tree, layout="circular") + ggtitle("(Phylogram) circular layout")
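To compare layouts side by side, one option is to build one plot per layout in a loop. This is only a sketch: it uses a random tree from ape's rtree() rather than the sample tree, and the object names (toy_tree, layouts, plots) are my own. The three layout names are the ones mentioned above.

```r
library(ape)
library(ggplot2)
library(ggtree)

set.seed(1)
toy_tree <- rtree(10)   # random 10-tip tree, for illustration only

# Layout names taken from the list of supported layouts mentioned above
layouts <- c("rectangular", "slanted", "circular")

# Build one ggtree plot per layout and title each plot with its layout name
plots <- lapply(layouts, function(l) {
  ggtree(toy_tree, layout = l) + ggtitle(l)
})
names(plots) <- layouts

# Print a single plot, e.g. the slanted one
plots[["slanted"]]
```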

You can visualize a time-scaled tree by specifying the parameter mrsd (most recent sampling date).

beast_file <- system.file("examples/MCC_FluA_H3.tree", package="ggtree")
beast_tree <- read.beast(beast_file)
ggtree(beast_tree, mrsd='2013-01-01') + theme_tree2() +
ggtitle("Divergence time")

With the groupClade and groupOTU methods you can group clades or related OTUs and, for example, assign them different colors.

tree <- groupClade(tree, node=c(21, 17))
ggtree(tree, aes(color=group, linetype=group)) + geom_tiplab(aes(subset=(group==2)))
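The node numbers passed to groupClade (21 and 17 above) are internal node numbers of the tree. If you don't know them for your own tree, one way to look them up is ape's getMRCA(), which returns the node number of the most recent common ancestor of a set of tips. The sketch below uses a made-up four-taxon tree, not the sample tree.

```r
library(ape)

# Made-up tree for illustration; tip names A to D are hypothetical
toy_tree <- read.tree(text = "((A:1,B:2):1,(C:3,D:1):2);")

# Internal node that is the most recent common ancestor of tips A and B
node_ab <- getMRCA(toy_tree, tip = c("A", "B"))

# The ancestor of A and C is the root, which ape numbers as Ntip + 1
node_root <- getMRCA(toy_tree, tip = c("A", "C"))

node_ab
node_root
```

A number found this way can then be used as the node argument in groupClade. Plotting the tree with ggtree(tree) + geom_text(aes(label=node)) is another way to see all node numbers at once.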

Here’s an example of how to display taxa classifications with groupOTU.

data(chiroptera, package="ape")
groupInfo <- split(chiroptera$tip.label, gsub("_\\w+", "", chiroptera$tip.label))
chiroptera <- groupOTU(chiroptera, groupInfo)
ggtree(chiroptera, aes(color=group), layout='circular') + geom_tiplab(size=1, aes(angle=angle))
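To see what the split/gsub line produces, here is the same idea on a small vector in base R. The three labels are made-up examples in the Genus_species style used by the chiroptera tip labels.

```r
# Hypothetical tip labels in "Genus_species" form
labels <- c("Pteropus_vampyrus", "Pteropus_alecto", "Myotis_lucifugus")

# Strip the underscore and everything after it, leaving only the genus
genus <- gsub("_\\w+", "", labels)

# split() returns a named list with one vector of labels per genus
groups <- split(labels, genus)

names(groups)   # "Myotis" "Pteropus"
groups$Pteropus # "Pteropus_vampyrus" "Pteropus_alecto"
```

groupOTU takes a named list of this shape and assigns each tip to the group it is listed under.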


Annotate selected clades in trees.

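# Note: the seed is the arithmetic expression 2015 - 12 - 21, i.e. 1982
set.seed(2015-12-21)
tree2 <- rtree(30)
# Widen the x-axis so the clade labels are not cut off
p <- ggtree(tree2) + xlim(NA, 6)
p + geom_cladelabel(node=45, label="test label", align=TRUE, color='red') +
    geom_cladelabel(node=34, label="another clade", align=TRUE, color='blue')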



Or perhaps you want to annotate the tree tips with silhouettes of your organisms from the PhyloPic database?

ggtree(tree, branch.length="none") %>%
  phylopic("9baeb207-5a37-4e43-9d9c-4cfb9038c0cc", color="darkgreen", alpha=.8, node=4) %>%
  phylopic("2ff4c7f3-d403-407d-a430-e0e2bc54fab0", color="darkcyan", alpha=.8, node=2) %>%
  phylopic("a63a929b-1b92-4e27-93c6-29f65184017e", color="steelblue", alpha=.8, node=3)

For more functions and help, be sure to check out the ggtree package page on Bioconductor, where most of these examples come from. Guangchuang Yu and Tommy Tsan-Yuk Lam have written a great tutorial highlighting some of the things you can do with this package. If you encounter any problems, be sure to direct your questions to the authors.

If ggtree isn’t your thing, there are of course other ways and packages to plot trees in R. Ethan Linck has previously written a post here on TME about “Quick and dirty tree building in R”.

References

Yu, G., Smith, D. K., Zhu, H., Guan, Y. and Lam, T. T.-Y. (2017) ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol Evol, 8: 28–36. doi: 10.1111/2041-210X.12628
