When you think about it, an awful lot of the things you can do with a genome sequence amount to lining it up next to another genome sequence, and then listing all the places where they differ. That’s more or less what Hans Ellegren and colleagues do in a study just released online at Nature, using genome-scale data from two European flycatcher species. Fortunately, Ellegren et al. treat that comparison not as the end of their analysis, but as the beginning.
The collared flycatcher (Ficedula albicollis) and its close relative, the pied flycatcher (F. hypoleuca), look pretty distinctive. But they’ve probably only been separate species for a couple of million years, and they haven’t really been that separate: they hybridize wherever their ranges intersect across Europe. In other words, they’re an ideal case study of speciation in progress. Given genome-scale data from the two species, it would be possible to identify genome regions where they differ, and, knowing those regions, start to answer the more interesting question of why they differ.
The team started by building a whole-genome sequence for the collared flycatcher. With reference to an existing linkage map for the flycatcher and the zebra finch genome, they were able to piece together a billion bases of DNA sequence, almost ninety percent of the flycatcher’s complete genome. One billion bases is pretty big, as genomes go; the human genome is over three billion bases long, but the chicken and zebra finch genomes are about the same size as the flycatcher’s.
Equipped with the core genome sequence, the team collected still more sequence data from ten male flycatchers of each species, and aligned these additional sequences to the genome sequence, identifying millions of sites that vary within the two species, and millions of sites where they share variants. They scanned through all these sites to identify points in the genome where differences between the two small samples of flycatchers were completely fixed — that is, sites where all the collared flycatcher sequences carried one variant, and all the pied flycatcher sequences carried a different variant. The frequency of these fixed differences varied considerably across the genome, but there are dozens of spots where they’re especially concentrated, forming peaks of differentiation.
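To make the idea of a "fixed difference" concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the function names, the toy data, and the 1,000-base window are all assumptions made for illustration. It flags sites where every sequence in one sample carries one variant and every sequence in the other sample carries a different one, then counts those sites in non-overlapping windows along the genome, which is one simple way to look for peaks.

```python
from collections import Counter


def is_fixed_difference(alleles_a, alleles_b):
    """True when each sample carries a single allele and the two alleles differ."""
    set_a, set_b = set(alleles_a), set(alleles_b)
    return len(set_a) == 1 and len(set_b) == 1 and set_a != set_b


def fixed_difference_density(sites, window_size):
    """Count fixed differences in non-overlapping windows.

    sites: iterable of (position, alleles_a, alleles_b) tuples, where each
    alleles_* value holds one base per sampled sequence (hypothetical format).
    Returns a dict mapping window start position -> number of fixed differences.
    """
    counts = Counter()
    for position, alleles_a, alleles_b in sites:
        if is_fixed_difference(alleles_a, alleles_b):
            window_start = (position // window_size) * window_size
            counts[window_start] += 1
    return dict(counts)


# Invented toy data: four sequences per species at five variable sites.
toy_sites = [
    (120, "AAAA", "GGGG"),   # fixed difference
    (450, "AAAT", "TTTT"),   # still variable within the first sample
    (980, "CCCC", "CCCC"),   # same allele in both samples
    (1500, "GGGG", "TTTT"),  # fixed difference
    (1730, "CCCC", "AAAA"),  # fixed difference
]

print(fixed_difference_density(toy_sites, window_size=1000))
# prints {0: 1, 1000: 2}
```

In this toy example the second window holds two of the three fixed differences, so it would be the candidate "peak"; a real analysis would need a statistical threshold for what counts as unusually concentrated.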
Within those peaks of differentiation, one or the other flycatcher species also showed reductions in diversity, an indication that those sites may have recently been experiencing natural selection. Using the genomewide data, the team estimated a rate of gene flow between the two species, which turned out to be low but non-zero — consistent with regular hybridization, and the fact that hybrid birds are apparently somewhat less fit than pure-bred birds of either species. That gene flow acts to homogenize the two species’ genomes; but selection is apparently acting to differentiate them at specific regions.
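A "reduction in diversity" can be illustrated with one common summary statistic, nucleotide diversity: the average proportion of sites at which two sequences drawn from the same sample differ. The sketch below assumes that measure (the post doesn't say exactly which statistic the authors used), and the function name and sequences are invented for illustration.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site among equal-length aligned sequences.

    Requires at least two sequences; assumes no gaps or missing data.
    """
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    total_differences = sum(
        sum(base_1 != base_2 for base_1, base_2 in zip(seq_1, seq_2))
        for seq_1, seq_2 in pairs
    )
    return total_differences / (len(pairs) * length)


# Invented toy alignments from one species, at two hypothetical regions.
background_region = ["ACGT", "ACGA", "ACTT"]  # some variation among sequences
peak_region = ["ACGT", "ACGT", "ACGT"]        # every sequence identical

print(nucleotide_diversity(background_region) > nucleotide_diversity(peak_region))
# prints True
```

A region where selection has recently favored one variant tends to look like the second alignment: the favored variant, and the sequence around it, has replaced most of the alternatives, so diversity there drops relative to the rest of the genome.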
What’s special about most of those regions isn’t very clear; the authors report that, out of more than 500 protein-coding genes identified within the differentiation peaks, no particular functional group dominates the list. But when they examined the expression of genes across the whole genome and compared expression levels in the two species, genes in divergence peaks were more likely to differ in expression as well as in sequence. This suggests that in differentiating the two species, natural selection has acted not just on protein-coding regions, but also on the non-coding regions that regulate when and how proteins are made.
The authors also found that sites on the Z chromosome, one of the two sex chromosomes in birds (males carry two copies of this chromosome, and females carry one copy plus a W chromosome), had uniformly higher degrees of differentiation between the two species, to the point that the whole chromosome was, on average, a divergence peak. That’s consistent with previous work, and a lot of speculation before that, suggesting that sex chromosomes may play a key role in the formation of new species, differentiating more rapidly and completely than the rest of the genome.
This paper is still a lot of raw description, lining up genome sequences and listing the differences. But by examining how the regions that differ stand out from the rest of the genome, Ellegren et al. are able to show with high confidence that natural selection is responsible for those differences, and begin to identify where and how that selection is acting.
Ellegren, H., L. Smeds, R. Burri, P. I. Olason, N. Backström, T. Kawakami, A. Künstner, H. Mäkinen, K. Nadachowska-Brzyska, A. Qvarnström, S. Uebbing and J. B. W. Wolf. 2012. The genomic landscape of species divergence in Ficedula flycatchers. Nature, DOI: 10.1038/nature11584.