A population genetic R-evolution

Uphill, both ways, in the snow, without shoes … quite apt when thinking of the dark days, in the not too distant past, in which a separate input file was needed for each popgen analysis in order to use a handful of separate programs (often for idiosyncratic reasons).
Add a complex life cycle into the mix, such as an alternation between haploid and diploid free-living phases, and you can multiply the number of input files by two. You would then also have to maintain a list of the programs that were compatible only with diploids and of those that would take both diploids and haploids, but separately.
The reality is that not all organisms fit nicely into a diploid-only or even haploid-only box.
When I began my PhD and my first foray into the population dynamics of haploid-diploid seaweeds, GenAlEx (Peakall and Smouse 2006, 2012) was a revelation in terms of ease of use. There was one input sheet (though still per ploidy) and it could be stored with all output sheets in the same, albeit massive, Excel file, reducing the seemingly endless array of individual input and output files.
As 2015 dawns, the brave new world of population genetic analyses in R may make the multiple popgen input files of yesteryear a relict, not unlike floppy disks or Beta-decks.
For population geneticists, and especially those with a penchant for organisms that don’t conform, R is a limitless palette with a much larger popgen repertoire than before.
Continue reading

Posted in howto, methods, population genetics, R, software, Uncategorized | 3 Comments

Whip it. Population structure and cross-species transmission of Whipworms

Whipworm (photo from WikiMedia commons)


This may be my second worm-related post, but it comes from the PLoS journal that is first in my heart: PLoS Neglected Tropical Diseases. And, as the journal name suggests, it is about a neglected tropical disease: the Whipworm (Trichuris sp.).
Continue reading

Posted in Uncategorized | Leave a comment

Linking gene expression and phenotype in an emerging model organism

Female Tigriopus californicus with egg sack. Photo by Morgan Kelly



Last week in his post “Transcriptomics in the wild (populations),” TME contributor Noah Snyder-Mackler focused on a recent paper by Alvarez et al. that reviews the last decade of transcriptomic research, including the goal of linking gene expression and phenotype. Researchers today routinely collect transcriptomic data for non-model organisms, but without robust genomic resources (for example, a well-annotated genome) and/or the ability to perform genomic manipulations (for example, knockout organisms), it is often difficult (and sometimes controversial) to assign function to candidate genes.
The tide pool copepod Tigriopus californicus (pictured above) is an up-and-coming model system for a wide range of research areas including physiology, neurobiology, ecology, speciation, hybridization, and local adaptation. The Burton, Edmands, Kelly, and Willett labs (among others) continue to generate genomic and transcriptomic data for Tigriopus, and a new method published recently in Molecular Ecology Resources by Barreto, Schoville and Burton is an important contribution to the Tigriopus genomic toolbox.
Continue reading

Posted in genomics, howto, methods, Uncategorized | Leave a comment

Species and sensibility

Ciona intestinalis is a species complex composed of 4 species. © SA Krueger-Hadfield 2012


Pante et al. (2014) performed a literature review of marine population connectivity in order to illustrate the biased estimates of connectivity that can result from the failure to recognize an evolutionarily relevant unit, such as a species.
When exploring the connectivity of a set of populations, it may be necessary to revise and reassess taxonomic status.  This is particularly true in the marine environment, which is vastly under-sampled as compared to terrestrial habitats.
Poor species delimitation doesn’t just affect an individual connectivity study, but can affect meta-analyses and reviews investigating evolutionary and ecological trends.  It can also affect studies of speciation, phylogenetic studies, invasion biology and biodiversity inventories.
The authors review relevant examples of over- and under-estimation of connectivity due to poor species delimitation.  They also provide a primer on delimiting species and treating them as scientific hypotheses.
But, it’s important to note that the results from careful connectivity studies can provide evidence about divergence between different lineages.  However, in order to carefully explore connectivity, we need to keep in mind:

(1) the state of knowledge on the biology of the studied organisms, (2) the state of taxonomic treatments of the studied organisms, (3) the spatial and temporal scales of sampling, (4) the characters used to infer connectivity patterns and (5) how to synthetize information in multimarker studies

In other words, we need to take into account life history and ecological traits.  If the above knowledge is limited or nonexistent, the authors propose incorporating this uncertainty into the sampling design.  It could also be possible to include closely related taxa for groups in which the phylogeny is poorly understood, for example, deep-sea organisms.
The authors also stress the importance of including life-history traits, such as clonal reproduction, and their spatio-temporal variability in the design of sampling effort.
Finally, they advocate the use of multiple and diverse markers, while pointing out the importance of moving away from the sole use of mitochondrial genes.
Pante E, Puillandre N, Viricel A, Arnaud-Haond S, Aurelle D, et al. (accepted) Species are hypotheses: avoid connectivity assessments based on pillars of sand. DOI: 10.1111/mec.13048

Posted in adaptation, community ecology, conservation, DNA barcoding, natural history, next generation sequencing, phylogenetics, population genetics, speciation, theory | 6 Comments

Recent Ancestry of the USA and the 100k Genome Project

Holiday presents for pop-gen enthusiasts come in the form of data – boatloads of it! The past two weeks saw the announcements of two neat studies that spell monumental steps toward our understanding of the genetics of mixed populations.

With a relatively recent migratory history, much of North America has been a mixture of peoples. While a lot of the ancestry analysis of North America has been anecdotal, a large-scale study of the genetic make-up of the USA has yet to be conducted. In a recent study, Bryc et al., as a culmination of large-scale genotyping from stocking-stuffers by 23andMe, fill in some of these blanks.

Mean European/Native American/Latino ancestry among 23andMe customers across North America. Image courtesy: http://www.cell.com/ajhg/ppt/S0002-9297(14)00476-5.ppt

Important conclusions from the study include a) greater variation in African ancestry among self-identified African-Americans, primarily Iberian ancestry among self-identified Latino-Americans, and localized (by state) variation in European ancestry across the USA, b) sex bias in ancestral composition, indicative of social contributors to genomic admixture, and c) larger correlation between self-identified ancestry and genomic ancestry than detected by previous studies.

The pipeline utilized in the study (termed “Ancestry Composition”) has been detailed in another study by Durand et al. In brief, the steps involved are (1) phasing high-density SNP chip genotype data, (2) identifying IBD (Identical By Descent, here used to represent phased genomic regions, with most SNPs in the region being directly derived from the common ancestor) tracts, and (3) assigning local ancestry to these IBD tracts using an SVM-based classifier.

Perhaps most importantly, however, our results reveal the impact of centuries of admixture in the US, thereby undermining the use of cultural labels that group individuals into discrete non-overlapping bins in biomedical contexts “which cannot be adequately represented by arbitrary ‘race/color’ categories.”

In other news, the NHS just announced plans to sequence 100,000 human genomes to quantify the dynamics of 110 hereditary disorders, including leukemia, breast, bowel, ovarian, and lung cancers. More data! 2015 definitely has a very promising outlook towards the applications of genomics in personalized medicine.

References:

Bryc, Katarzyna, et al. “The Genetic Ancestry of African Americans, Latinos, and European Americans across the United States.” The American Journal of Human Genetics (2014). http://dx.doi.org/10.1016/j.ajhg.2014.11.010

Durand, Eric Y., et al. “Ancestry Composition: A Novel, Efficient Pipeline for Ancestry Deconvolution.” bioRxiv (2014): 010512. http://dx.doi.org/10.1101/010512

Posted in genomics, population genetics | 1 Comment

Totally RAD, Part 2


Edit (8/20/15): I used the wrong web address for Kimberly Andrews! Go check out her work here. Sorry Kim!
Restriction site-associated DNA sequencing (RADseq) is quickly becoming the go-to methodology for collecting population genetic data, and the methodological difficulties of a technique that is exploding in popularity are coming along with it.
Last month, Stacy pointed you towards a review of RADseq protocols that detailed some methodological differences, but of course, there is always more detail out there. In the most recent issue of Molecular Ecology, Kimberly Andrews and colleagues provided a reply to the Puritz et al. paper, adding some additional clarity to the nuances that separate the different RADseq protocols.
Specifically, Andrews and colleagues go into more depth considering the consequences of PCR duplicates, the product of amplification biases during PCR.

The impact of PCR duplicates on population genomics analyses has not been quantified in the literature, but high frequencies of duplicates are expected to impact analyses by falsely increasing homozygosity and by making PCR errors appear to be true alleles (false alleles, Pompanon et al. 2005).
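To see why duplicates can inflate homozygosity, here is a back-of-the-envelope sketch in Python. It is a deliberate simplification and not a calculation from Andrews et al.: it assumes a true heterozygote, that each independent starting fragment carries either allele with probability 0.5, that PCR duplicates copy their template exactly, and that a genotype is called from whichever alleles appear in the reads. The function name is made up for illustration.

```python
def prob_false_homozygote(independent_fragments: int) -> float:
    """Chance that a true heterozygote shows only one allele in its reads.

    Assumption (simplified model): only independent starting fragments
    add information; PCR duplicates just repeat the allele of their template.
    """
    if independent_fragments < 1:
        raise ValueError("need at least one independent fragment")
    # All fragments carry allele A, or all carry allele B:
    # 2 * 0.5**n, which simplifies to 0.5**(n - 1).
    return 0.5 ** (independent_fragments - 1)


# Eight reads that all come from different starting fragments
all_independent = prob_false_homozygote(8)

# Eight reads that are copies of only two starting fragments
mostly_duplicates = prob_false_homozygote(2)
```

Under these assumptions, the same read depth of eight gives a very different risk of a false homozygote call depending on how many of those reads are duplicates. Real genotype callers and sequencing error will change the numbers, but not the direction of the effect.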

The simplest way to deal with this problem, as well as avoiding other issues of fragment size bias, is to make fragments different sizes from the beginning.

…the most straight-forward method currently developed for identifying RADseq PCR duplicates can only be used for data generated using methods that have a random-shearing step and also generate paired-end sequences (PE-RADseq). For these methods, PCR duplicates can be identified as fragments that are identical in insert length and sequence composition, because random shearing ensures that fragments at a given locus are unlikely to be of equal length unless they are duplicates
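The duplicate criterion in the quote above (same locus, same insert length, same sequence) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `ReadPair` record layout and the function names are hypothetical, and real data would first need read pairs aligned or assembled so that locus and insert length are known.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    """One paired-end RADseq fragment (hypothetical record layout)."""
    locus: str        # RAD locus the forward read belongs to
    insert_len: int   # distance from the cut site to the randomly sheared end
    sequence: str     # forward and reverse read bases, concatenated


def remove_pcr_duplicates(pairs):
    """Keep the first read pair for each (locus, insert length, sequence)."""
    seen = set()
    unique = []
    for pair in pairs:
        key = (pair.locus, pair.insert_len, pair.sequence)
        if key not in seen:
            seen.add(key)
            unique.append(pair)
    return unique


def duplicate_rate(pairs):
    """Fraction of read pairs flagged as PCR duplicates."""
    if not pairs:
        return 0.0
    return 1 - len(remove_pcr_duplicates(pairs)) / len(pairs)


reads = [
    ReadPair("locus_1", 310, "ACGTTGCA"),
    ReadPair("locus_1", 310, "ACGTTGCA"),  # same length and sequence: duplicate
    ReadPair("locus_1", 287, "ACGTTGCA"),  # different shear point: kept
    ReadPair("locus_2", 310, "ACGTTGCA"),  # different locus: kept
]
filtered = remove_pcr_duplicates(reads)
```

The key point is that insert length is only informative because of random shearing. Without that step, every fragment at a locus has the same length, and this filter would discard genuine independent reads along with the duplicates.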

Unfortunately for RADseq protocols without a random-shearing step (which is most), there is currently no well-supported way to correct this issue. However, you can bet that there are a number of approaches in the works.
Lastly, the authors reiterate an important consideration for anyone who is considering RAD-seq data as an option for answering the scientific question of their choice: think hard about costs and technical complexity. Depending on whether you have the option to pool samples or not, resources devoted to a project can vary widely.
Welcome to the RAD fad. Better get a big cup of coffee, because you’ve got a lot of reading to do.
 
Andrews K.R., Michael R. Miller, Brian Hand, James E. Seeb & Gordon Luikart (2014). Trade-offs and utility of alternative RADseq methods, Molecular Ecology, n/a-n/a. DOI: http://dx.doi.org/10.1111/mec.12964
Pompanon F., Eva Bellemain & Pierre Taberlet (2005). Genotyping errors: causes, consequences and solutions, Nature Reviews Genetics, 6 (11) 847-859. DOI: http://dx.doi.org/10.1038/nrg1707

Posted in genomics, Molecular Ecology views, next generation sequencing, population genetics | 1 Comment

Transcriptomics in the wild (populations)

modified book cover from "Wild" by Cheryl Strayed
The genomics revolution has already come. The past decade has seen countless advances in genomic techniques – many of which are now commonly found in any molecular ecologist’s toolbox. For example, instead of measuring gene expression in one or a few genes using RT qPCR, we can now measure genome-wide transcriptional activity using microarrays and RNA-sequencing (‘RNA-seq’). The amount of data being generated using these techniques has been growing exponentially over the past few years. So, Mariano Alvarez and colleagues decided that it was as good a time as any to take stock of the past decade of transcriptomics studies in the wild.
Continue reading

Posted in genomics, next generation sequencing | 1 Comment

Hybrid speciation is for the birds (and plants, reptiles, fish, and insects)


The Italian sparrow, Passer italiae, a hybrid species whose parentals are the house sparrow, Passer domesticus, and the Spanish sparrow, Passer hispaniolensis. Photo courtesy of Alessandro Landi


R. A. Fisher once called hybridization “the grossest blunder in sexual preference which we can conceive of an animal making.” While there may be negative fitness consequences for an individual who mates across species boundaries, the evolutionary significance of hybridization in speciation, introgression, and adaptive radiation is a fascinating question gaining research attention, particularly given the relative ease with which we can now collect genomic data.
Hybridization can lead to a reduction in biodiversity through “despeciation.” If we consider species to be distinct, relatively stable, genotypic clusters, it is easy to imagine that ecological or geographical change may facilitate gene flow sufficient to homogenize both species into one cluster if reproductive barriers are weak. Examples of species fusion include Darwin’s finches and cichlid fish.
In some cases, hybridization can lead to establishment of a new, third species, hence increasing biodiversity. Keeping with our definition of species as genotypic clusters, the hybrid species would be a third cluster of genotypes that remains distinct even when in contact with the parental species.
Continue reading

Posted in population genetics, speciation | 2 Comments

Not everyone likes it hot … winter or not

On this Boxing Day, many of us may be bracing against winter storms.  For those of us in the Northern Hemisphere, we might all be dreaming of summer weather (including those of us who think a Southern Californian version of winter downright chilly). Blissful summer months of fieldwork which are seemingly filled with ample time to dust off that old dataset or manuscript …
Yet, as we see in a new paper, not everybody likes it hot.
Mota et al. (2014) investigated the relationship between the thermal environment and the in situ molecular heat shock response (HSR) at microhabitat scales.  Usually, we study environmental change over large temporal and/or geographical scales.  Yet, microhabitat thermal conditions may be just as important.
Four distinct patches are described in the intertidal fucoid Fucus vesiculosus from a southern range edge population: canopy surface, patch edge, subcanopy and submerged channels. These four patches, or microhabitats, had distinct thermal and water stress profiles during low tide emersion.  And, in fact, the range edge population studied has become extinct.

An intertidal site with patches of fucoids, or rockweeds, at low tide in Brittany, France. © SA Krueger-Hadfield, 2010



Perhaps surprisingly, the top of the canopy, which is the hottest and driest patch, was the most benign for this species of fucoid. Rapid desiccation may result in a metabolically inactive state in which fronds can’t respond to thermal stress and are thereby protected.
HSR data, accompanied by meteorological and microenvironmental thermal data, indicated that the maximum HSR is met or exceeded at low tides over much of the year, even during daytime immersion in summer. This is critical, as it will prevent fucoids from recovering from thermal stress due to continual HSR even at high tide.
The other important result from this study indicated that microhabitat scales in intertidal zones might best explain the impact of climatic changes.
Mota CF, Engelen AH, Serrão EA, Pearson GA (2014) Some don’t like it hot: microhabitat-dependent thermal and water stresses in a trailing edge population.  Functional Ecology DOI: 10.1111/1365-2435.12373

Posted in adaptation, natural history | Leave a comment

The best of TME (for the last two months)

I’ll admit that I’m a sucker for year-end lists. Ten biggest science discoveries. Fifty best albums of 2014. They make fantastic procrastination fodder, and I’ll comb through each one that crosses my desktop before the New Year.
In the same spirit, I wanted to take a quick reflective look at the past couple months here at The Molecular Ecologist. We are already a third of the way through the tenure of our new influx of contributors (me included), and it might be informative to look at trends in what this diverse group is talking about and think about where the readers of TME would like to see us go in the immediate future.
Since the beginning of November, the new contributors (Arun, Karen, Melissa, Noah, Rob, and Stacy) have authored 36 posts (19,536 words!). Each post averages around 500 words, and is shared on Facebook 37.2 times and gets Tweeted 12.4 times. Speaking of social media, the social media impact awards go to:
Continue reading

Posted in blogging, community | 1 Comment