Getting started with Ultra Conserved Elements

Cross posted on ngcrawford.com

From Brant Faircloth


If you attended Evolution 2013, you probably heard quite a lot of chatter about ultra conserved elements. Essentially, ultra conserved elements (UCEs) are parts of the genome that are highly conserved between different species. Although UCEs themselves carry little phylogenetic information, they are surrounded by increasingly variable flanking sequence (see figure). When combined with their flanking sequence, these ‘UCE loci’ make ideal markers for studying evolutionary relationships across variable time scales. For example, we have used UCEs identified in birds and reptiles to identify homologous UCE loci in amphibians, birds and reptiles. We have also identified these same UCEs in many published mammal genomes.
Continue reading

Posted in genomics, methods, phylogenetics | 1 Comment

People behind the Science: Loren Rieseberg

The first in a series of monthly interviews on the Molecular Ecologist was a logical choice: Dr. Loren Rieseberg, the Chief Editor of our parent journal Molecular Ecology. Dr. Rieseberg is both a Professor in the Department of Botany at the University of British Columbia and a Distinguished Professor at Indiana University. He is best known for his work on the role of hybridization in evolution and speciation, particularly in sunflowers. He has won numerous awards including a MacArthur Fellowship. Below, we ask Dr. Rieseberg about his background, his thoughts on the field of molecular ecology, and how he does everything he does:
1) How did you come to work on sunflowers?  

Loren Rieseberg with his favorite plant

When I arrived at Washington State University (WSU) in the fall of 1984 to begin my PhD, my advisor, Doug Soltis, handed me a copy of Verne Grant’s Plant Speciation and told me to find a problem. I was especially intrigued by Grant’s discussion of the potential role of hybridization in adaptation and speciation. Sunflowers were one of four classic examples of this process discussed by Grant, and were especially attractive to me because the sunflower genus also included two domesticated plants and several weedy species. Thus, it was an easy decision. I wrote and defended my proposal within three months after my arrival at WSU and also worked in a collecting trip to California, where I had the opportunity to take a short excursion with Ledyard Stebbins to observe a sunflower hybrid zone that he had been studying near Davis since the 1940s.
Continue reading

Posted in interview | 2 Comments

What we're reading: GWAS hits lost in translation, the mutational load of range expansions, and killing the comments section to save science

In the journals
Carlson, C. S., Matise, T. C., North, K. E., Haiman, C. A., Fesinmeyer, M. D., Buyske, S., … Kooperberg, C. L. (2013). Generalization and dilution of association results from European GWAS in populations of non-European ancestry: The PAGE study. PLoS Biology, 11(9):e1001661. doi: 10.1371/journal.pbio.1001661.

… 25% of tagSNPs identified in EA [European ancestry] GWAS have significantly different effect sizes in at least one non-EA population, and these differential effects were most frequent in African Americans where all differential effects were diluted toward the null.

Peischl, S., Dupanloup, I., Kirkpatrick, M., & Excoffier, L. (2013). On the accumulation of deleterious mutations during range expansions. Molecular Ecology. doi: 10.1111/mec.12524.

We find that deleterious mutations accumulate steadily on the wave front during range expansions, thus creating an expansion load. Reduced fitness due to the expansion load is not restricted to the wave front but occurs over a large proportion of newly colonized habitats.

In the news
Popular Science closes its comments section, citing evidence that they’re bad for science communication.
More advice during academic job-hunting season: the one way to guarantee you don’t get a position is, don’t apply to it.
A new “task view” for R focuses on packages necessary to interact with online resources and websites.

Posted in linkfest | Leave a comment

Scientific computing doesn't have to hurt

Amy Brown handles communication and scheduling for Software Carpentry. The post title alludes to the goals of Software Carpentry, a volunteer organization whose members teach basic software skills to researchers in science, engineering, and medicine. It’s a great organization, and we’re really excited to have Amy tell our readers more about it in this week’s guest post.
Regular readers of The Molecular Ecologist will know that it’s a great idea to harness the power of the command line, master a programming language, and share your code. But if you’re new to the shell and to version control systems like Git, it’s hard to know where to start. Even scientists who have been programming for a while often have not received formal training in their tools.
Software Carpentry was created to fill in this knowledge gap. Continue reading

Posted in community, howto, methods, software | 1 Comment

What we're reading: The tiger genome, pooled sequencing for population genomics, and more fretting about academic careers

In the journals
Cho YS et al. 2013. The tiger genome and comparative analysis with lion and snow leopard genomes. Nature Communications 4:2433. doi: 10.1038/ncomms3433.

Through comparative genetic analyses of these genomes, we find genetic signatures that may reflect molecular adaptations consistent with the big cats’ hypercarnivorous diet and muscle strength. We report a snow leopard-specific genetic determinant in EGLN1 (Met39>Lys39), which is likely to be associated with adaptation to high altitude.

Ferretti L, SE Ramos-Onsins, and M Pérez-Enciso. 2013. Population genomics from pool sequencing. Molecular Ecology doi: 10.1111/mec.12522.

Next generation sequencing of pooled samples is an effective approach for studies of variability and differentiation in populations. In this paper we provide a comprehensive set of estimators of the most common statistics in population genetics based on the frequency spectrum, namely the Watterson estimator θW , nucleotide pairwise diversity Π, Tajima’s D, Fu and Li’s D and F , Fay and Wu’s H, McDonald-Kreitman and HKA tests and Fst, corrected for sequencing errors and ascertainment bias.
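For readers meeting these statistics for the first time, here is a minimal Python sketch of the two simplest ones in that list, the Watterson estimator and nucleotide pairwise diversity. These are the classical versions for individually sequenced, aligned haplotypes, not the pool-corrected estimators the paper derives, and the function names and toy alignment are made up for illustration.

```python
from itertools import combinations


def watterson_theta(sequences):
    """Classical Watterson estimator: S / a_n, per locus (not per site)."""
    n = len(sequences)
    length = len(sequences[0])
    # A site is segregating if more than one allele is observed at it.
    segregating = sum(
        1 for i in range(length) if len({seq[i] for seq in sequences}) > 1
    )
    # a_n is the harmonic number sum_{i=1}^{n-1} 1/i.
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n


def pairwise_diversity(sequences):
    """Mean number of differences over all pairs of sequences, per locus."""
    pairs = list(combinations(sequences, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


# Hypothetical toy alignment: four haplotypes, eight sites.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]

if __name__ == "__main__":
    print(watterson_theta(toy))
    print(pairwise_diversity(toy))
```

Both values are per locus; divide by the alignment length for per-site estimates. Tajima's D is built from the difference between these two quantities, scaled by its expected variance.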

In the news
Identifying functional variation across the genome using transcriptome sequencing.
Concerning the infamous two-body problem.
An insider account of how faculty hiring committees work.
No, PhDs finding non-academic careers is not a sign that we should make more PhDs.
“She was a professor?” Yeah, but she was an adjunct.

Posted in linkfest | Leave a comment

For viruses, ecology shapes the speed of evolutionary change

Particles of polio virus, an RNA virus. Image from Flickr/Sanofi Pasteur


Molecular ecologists are interested in understanding what patterns in genetic variation across and among populations can tell us about the ecology of the living things we study. But a paper published in the latest issue of The American Naturalist demonstrates that this relationship between ecology and genetic variation can operate on a deeper level than we may usually think about—it suggests that the rate at which some viruses evolve may be determined not by natural selection imposed by their hosts, but by simple population dynamics.
Continue reading

Posted in population genetics, theory | 1 Comment

What we're reading: Next-generation admixture estimates, mutation rates shaped by epidemiology, and whatever happened to that data?

In the journals
Skotte L, TS Korneliussen, and A Albrechtsen. 2013. Estimating individual admixture proportions from next generation sequencing data. Genetics doi: 10.1534/genetics.113.154138.

This paper presents a new method for inferring individual’s ancestry that takes the uncertainty introduced in next generation sequencing data into account. This is achieved by working directly with genotype likelihoods which contains all relevant information of the unobserved genotypes.

Scholle SO, RJF Ypma, AL Lloyd, and K Koelle. 2013. Viral substitution rate variation can arise from the interplay between within-host and epidemiological dynamics. The American Naturalist 182(4):494-513. doi: 10.1086/672000.

This work shows that even in neutrally evolving viral populations, epidemiological dynamics can alter substitution rates via the interplay between within-host replication dynamics and population-level disease dynamics.

In the news
Charles Goodnight on quantitative genetics in metapopulations.
A list of (rather more than) ten things to keep in mind when you sit down to work with that nice new dataset.
And some personal logrolling
Tim presented some worrying results from a study of the accessibility of old (and not-even-that-old) datasets at the International Congress on Peer Review and Biomedical Publication last week.
Jeremy posted the first preliminary analysis from a project on LGBTQ experiences in scientific careers.

Posted in linkfest | Leave a comment

Analytical software management for your Mac? Homebrew to the rescue!

Source: http://www.popsci.com/diy/article/2007-08/ultimate-all-one-beer-brewing-machine

Many of the big processing tasks in biological research remain the domain of clusters of computer nodes, whether local or Amazon EC2 instances, running various flavors of Linux. It is perhaps safe to say that this will remain true for at least the near future. There is momentum behind this approach for many reasons, resulting in versions of Linux tailored to particular tasks (e.g., see https://www.scientificlinux.org/), and package management systems (usually distribution- and version-specific) that simplify the installation and execution of thousands of programs. Yet Linux is not the dominant operating system on researchers’ personal computers.
Having a firm grasp of the command line (CL), secure shell (ssh), and associated protocols is very important, and will allow access to these larger computational resources. Yet there remain simpler CL computations that can be completed on one’s own desktop or laptop. Chances are, this machine is not running Linux, and so all the time-saving automation that comes with Linux is lacking. Instead, one is confronted with the task of downloading source code, compiling it (hoping there is a makefile and that your compiler is just the right version; see http://software-carpentry.org/ for information on developing better standards for software development), and so on. Having a package manager, an application that allows one to download and install software in an automated way, can be a huge time saver and prevent any number of the headaches that often accompany software developed for research purposes.
Continue reading

Posted in howto, software | 1 Comment

What we're reading: Compressed genomes, drafting genes, and the third post-publication peer reviewer

In the journals
Deorowicz, S., A. Danek, and S. Grabowski. 2013. Genome compression: A novel approach for large collections. Bioinformatics 1–7. doi: 10.1093/bioinformatics/btt460.

More precisely, our novel Ziv-Lempel-style compression algorithm squeezes a single human genome to ~400KB. The key to high compression is to look for similarities across the whole collection, not just against one reference sequence, what is typical for existing solutions.

Kosheleva, K., and M. M. Desai. 2013. The dynamics of genetic draft in rapidly adapting populations. Genetics 1–46. doi: 10.1534/genetics.113.156430.

In both asexually reproducing organisms and in regions of low recombination in sexual organisms, the chance congregation of beneficial mutations on competing genetic backgrounds skews evolutionary dynamics. Because of this “clonal interference” effect, the success of a mutation depends not only on its fitness effect, but also on the quality of the genetic background in which it occurs and the fortune of the mutant’s progeny in amassing more beneficial mutations …

In the news
At what point have we shared enough data?
A top-notch long-form piece on the vital importance of gene expression.
How post-publication peer review fails.
An exceptionally detailed guide to the R base graphics package.

Posted in linkfest | Leave a comment

Using R to run parallel analyses of population genetic data in STRUCTURE: ParallelStructure

The structure of human populations across Eurasia, as estimated by Rosenberg et al (2002)

In this guest post, Francois Besnier explains how to use ParallelStructure, his new R package for running STRUCTURE analyses in parallel computing environments.

To start with, thanks to The Molecular Ecologist blog team (Tim and Jeremy) for the invitation to write a post about the R package ParallelStructure. To briefly introduce myself: I am a post-doctoral researcher at the Institute of Marine Research in Bergen, Norway.

Among other things, our institute is in charge of monitoring the genetic structure of salmon populations in Norway, which often means working with fairly large data sets (thousands of individuals, dozens of populations) to assess population admixture and perform individual assignment tests in STRUCTURE. In such situations, we are often limited by computation time. Thus, to make the best of our computing resources, we wrote some scripts to run STRUCTURE analyses in parallel on multi-core computers. As this might be of use for a broader range of purposes, those scripts were encapsulated in an R package called ParallelStructure, which is available on R-Forge and described in Besnier & Glover 2013.
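The general idea of spreading independent runs across cores can be sketched in a few lines of Python. This is not ParallelStructure's code or interface, only an illustration of the pattern: every combination of K and replicate is an independent job, so a pool of workers can process the job list concurrently. The function names are hypothetical, and the command launched here is a harmless stand-in rather than a real STRUCTURE invocation.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_jobs(k_values, replicates):
    """One job per (K, replicate) pair; jobs do not depend on each other."""
    return [(k, rep) for k in k_values for rep in range(1, replicates + 1)]


def run_job(job):
    """Launch one external process and wait for it to finish."""
    k, rep = job
    # Stand-in command (assumption): a real setup would call the analysis
    # program here with its own input files and options.
    command = [sys.executable, "-c", f"print('K={k} replicate={rep} done')"]
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return job, result.stdout.strip()


def run_all(jobs, workers=4):
    """Run jobs concurrently; results come back in the order of `jobs`."""
    # Threads are enough here because each one only waits on a subprocess,
    # and the subprocesses themselves run on separate cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_job, jobs))


if __name__ == "__main__":
    for job, message in run_all(build_jobs([1, 2, 3], replicates=2)):
        print(job, message)
```

Setting `workers` to the number of available cores keeps the machine busy without oversubscribing it.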

Continue reading
Posted in howto, population genetics, R, software, STRUCTURE | 5 Comments