Happy Thanksgiving: COVID-19 style

It was the Ides of March in 2020 when I moved from California to Europe. Thanksgiving marks March 271st. I was still a postdoc in Jonathan Eisen’s lab at UC Davis and my contract would have ended at the end of August 2020. In March 2020, my husband and I were in the process of booking a container to bring our belongings through the Panamá Canal to Europe. He was applying for jobs in Germany, I already had an offer, and we were looking at schools for our children. I was in the middle of analyzing data I had collected during one of the many field trips to Central America in the months before, when my mom called from Switzerland and told us we had to come now: Switzerland was closing its borders in the next few days and many European countries were already closed!

Graphical abstract for Thanksgiving methods. Idea by Grace Ho @Gracegusta. Sorry Canadians, that this idea comes a bit too late!

Posted in adaptation, career, postdoc

GEOME is putting genetic data in its place

Records for sequenced mollusk samples along the southern California coast (GEOME)

Infrastructure to make genetic data widely available for research beyond its initial publication has been a theme of the genomics revolution, from GenBank to the Sequence Read Archive. For molecular ecologists, though, genetic data is only half of our field — the other half is the ecological context in which that data is collected. This month, Molecular Ecology Resources highlights an initiative to bring that ecological context to genetic data archiving: the Genomic Observatories Metadatabase, or GEOME.

Led by Cynthia Riginos at the University of Queensland, Eric Crandall at Penn State, Libby Liggins at Massey University, and Michelle Gaither at the University of Central Florida, the GEOME collaborators present the case for creating yet another data deposition service: although there are a number of established databases for public deposition of genetic and ecological data, no one repository linked both types together. GEOME, which launched in 2017, offers a single metadata framework to link DNA sequence or marker data to sample locality and ecological measurements.

GEOME allows researchers to create records linked to sequence data they’ve already posted to a public repository or, now, to upload samples to the International Nucleotide Sequence Database Collaboration SRA alongside ecological data through a single unified portal. Datasets are then searchable through the GEOME website, which includes multiple levels of search control alongside a useful map visualization, or through a new R package that interacts with the GEOME API.

Posted in bioinformatics, community, data archiving

Fieldwork in the time of COVID

Life as we knew it came to a screeching halt back in March. Almost a year ago, how is that possible??? Yet, at the same time it feels like several lifetimes have passed …

At a recent editorial meeting, we were talking about TME posts, and in the past I’ve written about fieldwork. I always felt fortunate to be able to travel to far-flung places, but I don’t think I truly appreciated how much being out in the field really meant to me. In between bouts of existential dread and complete overwhelmed-ness over the last 9-odd months, I’ve realized how much I took for granted. Fieldwork was one thing that was simply part of the fabric of my life.

Our time in the field entails long days of driving hither and yon, sampling (often in hot, humid weather or the freezing rain – we like extremes I guess), and then processing late into the night. At some point Cher, Céline, or some other guilty-pleasure musician makes an appearance to get us through the slog – whether in the lab or on the road. Sometimes, we’re processing samples in a nice lab. Other times, we’re sitting backwards on a toilet in a Motel 6 using the tank lid as a makeshift bench. We eat too much McDonald’s, go back for more and regret it immediately. Yet, these are the times of the year I find myself anxiously awaiting, counting down the days on my calendar until we are on the road.

Posted in blogging, career, chat, ecology, evolution, fieldwork, haploid-diploid, just for fun, mating system, natural history, population genetics, postdoc, Science Communication

Join the Molecular Ecologist team in 2021!

Blue skies and white clouds mirrored in a broad bay
One of Kelle Freel’s fieldwork nostalgia photos, from Kāneʻohe Bay, Oʻahu, Hawaiʻi

The Molecular Ecologist is seeking two new regular contributors for 2021! Join us in blogging about “ecology, evolution, and everything in between.”

Ideal candidates should have expertise and experience in our core topic, the use of genetic data to understand the past and future of the living world. We’re particularly interested in senior graduate students, postdoctoral researchers, and other working scientists who can discuss basic science on a level that engages research biologists, as well as explaining fundamental molecular ecology concepts to the general public. The two contributors in the 2021 cohort will receive small stipends for their first year with the blog, in exchange for committing to posting on a monthly basis, helping to manage social media for TME — either our Twitter account or our presence on Facebook — and contributing to the Molecular Ecologist Podcast.

In addition to the direct compensation, blogging for The Molecular Ecologist can be an excellent way to hone familiarity with current molecular ecology research, establish connections within the scientific community, and build a portfolio of science writing for a broader audience. In light of this, we are particularly interested in applications from candidates whose racial, ethnic, sexual, or gender identities are underrepresented in science careers.

To apply, please e-mail Jeremy Yoder at jbyoder@gmail.com with a brief cover letter explaining (1) why you want to write for The Molecular Ecologist and (2) what topics you would write about for the site, along with (3) an appropriate sample of your writing. Applications should be received by the end of the day on 11 December, 2020 to ensure consideration.

Posted in blogging, community, housekeeping

Hosts select symbionts for greater mutual benefit, an evolutionary experiment shows

The roots of a barrel medick plant, with pink nodules housing rhizobia. (Wikimedia Commons: Ninjatacoshell)

Who’s in charge of a symbiotic mutualism? You might think the host organism, whose body is the venue for an exchange of nutrients or services with a microbial symbiont, is running the show, able to evict or punish symbionts that don’t play nice. However, there are many examples of hosts making do with symbionts that aren’t particularly good partners, and some evolutionary theory has suggested that competing symbionts can gain the upper hand. Results from an evolutionary experiment recently reported in the journal Science lend support to the host-in-the-driver-seat view, though — bacterial symbionts selected by five generations of hosts evolved to be better mutualists.

Posted in adaptation, Coevolution, evolution, microbiology

Take the Molecular Ecologist reader survey!

Following up on this being our tenth year of blogging operations, we thought it was past time to check in with you, our readers. To that end, we’ve put together a brief survey about how you read The Molecular Ecologist, what kinds of posts you follow us for and what you’d like to see more of, and who you are — in terms of career stage and scientific interests. There’s also an open-ended suggestion box, to tell us what we should have asked about but didn’t think to.

In total it should take less than ten minutes, and if you’ve got the time to spare, it’ll be very helpful. You can fill the survey form in right here on the blog, or follow this link to the Google Form. Thanks in advance!

Posted in community, housekeeping

Marmots, seasons, and climate change

I love when nostalgia for a project, place, or species intersects with a current interest, as happened this week for me with a paper by Cordes et al. 2020, about the contrasting effects of climate change on the seasonal survival of yellow-bellied marmots in the Colorado Rocky Mountains.

Posted in climate change, ecology, mammals

Simple rules for organizing data in a spreadsheet

Most scientists collect and organize at least some data in spreadsheets, usually Excel or Google Sheets, despite the potential pitfalls of using such products (there are even archives of spreadsheet horror stories). The most commonly bemoaned problem in biology, that of Excel converting some gene names to dates, even caused the HGNC (HUGO Gene Nomenclature Committee) to change the names of at least 27 genes this year to avoid this issue. No matter your feelings about spreadsheets, they are generally the first program students learn to use for creating a database of samples, recording data, or doing simple calculations. Furthermore, for people without extensive coding experience, spreadsheets are the default. Fortunately, by following some simple guidelines, we can avoid most of the hassles as well as countless hours re-formatting data tables for analysis and endless confusion trying to decipher color-codes from 10 years ago.

This paper by Broman & Woo is from 2018, but it came to my attention this week and I have now added it to my canon of “Must read” literature for future students.

Karl W. Broman, lead author

Many of these tips seem obvious, but I’m guessing if you think back, you will recall at least one instance where you (or a co-author) violated each of these tips and in retrospect knew you had erred. These days you are wiser but could probably use a refresher. This paper prevents the re-invention of the wheel during every PhD. I urge you to read the full paper, but here I’m providing the lightly edited (I combined some tips and re-arranged them a bit) CliffsNotes version. These guidelines, if implemented across the lab, also allow for easy hand-off and transfer of data between students and colleagues.

Tip 1 – Be consistent. Use the same categorical variable codes, missing value codes, variable names, subject identifiers, date formats, data layouts, and file names, both within and across spreadsheets. E.g., don’t use both “M” and “male”, and don’t list the day first in some files and the month first in others. This one hits home – I once inherited a database of samples from a former French student who sometimes used the European date format and sometimes the American, both on the sample labels and within the database (they also labeled all variable names in French, but that’s another story!).
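If you want to check an existing table for this kind of drift, a few lines of code can do it. Here is a minimal Python sketch, assuming a lab convention of “M”/“F” codes and year-month-day dates; the column names, the toy rows, and the helper functions are all made up for illustration and are not from the paper.

```python
from datetime import datetime

# Toy rows for illustration; in practice these might come from csv.DictReader.
rows = [
    {"sample_id": "S01", "sex": "M", "date": "2020-03-15"},
    {"sample_id": "S02", "sex": "male", "date": "15/03/2020"},
    {"sample_id": "S03", "sex": "F", "date": "2020-03-16"},
]

# Assumed lab conventions: one code per category and year-month-day dates.
ALLOWED_SEX = {"M", "F"}
DATE_FORMAT = "%Y-%m-%d"


def is_valid_date(text):
    """Return True if text parses as a year-month-day date."""
    try:
        datetime.strptime(text, DATE_FORMAT)
        return True
    except ValueError:
        return False


def find_inconsistencies(rows):
    """List (sample_id, column, value) for every entry that breaks a convention."""
    problems = []
    for row in rows:
        if row["sex"] not in ALLOWED_SEX:
            problems.append((row["sample_id"], "sex", row["sex"]))
        if not is_valid_date(row["date"]):
            problems.append((row["sample_id"], "date", row["date"]))
    return problems


for problem in find_inconsistencies(rows):
    print(problem)
```

In this toy table only sample S02 gets flagged, once for “male” and once for the day-first date.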

Tip 2 – Choose wisely. When choosing names or codes for variables, think about how your choice or a file format conversion will affect the analyses. E.g., don’t choose names with special characters and use underscores or hyphens instead of spaces. Think about how easy it will be to type out the variable name repeatedly in R code. It’s best to do this before you start collecting data. Also, choose wisely when it comes to how you represent any date variables.
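For headers that already exist, one option is to clean them up programmatically before analysis. This is a rough Python sketch of one possible naming rule (lowercase, underscores, no special characters); the function name and the example headers are my own inventions, not something prescribed by the paper.

```python
import re


def clean_name(raw):
    """Turn a free-form column header into a lowercase, underscore-separated name."""
    name = raw.strip().lower()
    # Any run of characters other than a-z and 0-9 becomes a single underscore.
    # Note that this also replaces accented letters, so check the result by eye.
    name = re.sub(r"[^a-z0-9]+", "_", name)
    return name.strip("_")


for header in ["Body Mass (g)", "Date Collected", " Sex/Gender "]:
    print(repr(header), "->", clean_name(header))
```

Better still is to pick names like these before data collection starts, so no conversion is ever needed.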

Tip 3 – No empties allowed. Have a code that indicates a value is missing rather than the cell being intentionally left blank. This is especially important if you are continuing to collect data and are leaving cells blank to fill them in later! It’s also important for sorting data later. If you’re really fancy, you may have one missing code for data that wasn’t collected and another for data that is yet to be collected!
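To find blank cells in a file you already have, something like the following Python sketch can help. It assumes “NA” as the missing-value code and uses a tiny made-up CSV; both are illustrative choices, so swap in whatever code your lab has agreed on.

```python
import csv
import io

MISSING = "NA"  # assumed missing-value code; use your own lab's convention

# Toy CSV content for illustration, with two blank cells.
raw = "sample_id,mass_g,sex\nS01,12.4,F\nS02,,M\nS03,11.9,\n"


def fill_blanks(text, missing=MISSING):
    """Replace empty cells with the missing code and report where they were."""
    reader = csv.DictReader(io.StringIO(text))
    filled, blanks = [], []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column in reader.fieldnames:
            value = row[column]
            if value is None or value.strip() == "":
                row[column] = missing
                blanks.append((line_no, column))
        filled.append(row)
    return filled, blanks


filled, blanks = fill_blanks(raw)
print(blanks)
```

The list of (line, column) pairs is worth reading before trusting the filled table, since a blank can mean “not collected” or “not collected yet”.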

Tip 4 – One cell = One item. Each cell should contain only one piece of data, no more. The example given in the paper is position on a 96-well plate (e.g., A11 or B02), but I’ve also run into trouble with coding an individual as “adult_male” or “juvenile_female”. My solution is to keep the column with the “group” designation so I can easily visualize each group, but to add two columns, one for age and one for sex, for ease of sorting. And put ‘extra’ information, like units, into the header, a Notes column, or your ReadMe file (see Tip 6).
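Splitting combined cells after the fact is straightforward when the codes are consistent. Here is a small Python sketch, assuming group codes of the form age_sex and plate positions of one row letter followed by a column number; the function names are hypothetical.

```python
def split_group(group):
    """Split a combined code like 'adult_male' into separate age and sex values."""
    age, sex = group.split("_", 1)
    # Keep the original group label alongside the two new columns.
    return {"group": group, "age": age, "sex": sex}


def split_well(well):
    """Split a plate position like 'B02' into a row letter and a column number."""
    return well[0], int(well[1:])


print(split_group("juvenile_female"))
print(split_well("A11"))
```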

Tip 5 – Rectangles with one header row are gold. This honestly is pretty self-explanatory. See the figures below from the paper and imagine trying to analyze them.

Figure 5 from Broman & Woo 2018. Examples of spreadsheets with nonrectangular layouts.

Additionally, if you have bits and pieces of data scattered around, put them in separate files for ease of analysis later on. I corrected this very mistake today for a project I was just starting.

Tip 6 – Create a Data Dictionary (And Data ReadMe – For more information about ReadMe files, see here and here). Have a separate document of metadata that explains the overarching goal of the project, the data being collected with brief notes about the methods, and an explanation of what each variable in the spreadsheet is. These notes should include the variable name in the spreadsheet, a longer explanation of what the variable means, the measurement units if any, potential categories, etc. The article suggests separating the ReadMe and the data dictionary, but I advocate for having the information about variables in both your data dictionary and your ReadMe file.
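A data dictionary is only useful if it keeps up with the spreadsheet, and that is easy to check automatically. The Python sketch below assumes the dictionary is stored as one entry per column name; the entries, the header, and the two helper functions are invented for illustration.

```python
# Hypothetical data dictionary: one entry per spreadsheet column.
data_dictionary = {
    "sample_id": {"description": "Unique sample identifier", "units": "none"},
    "mass_g": {"description": "Body mass at capture", "units": "grams"},
    "sex": {"description": "Sex of the individual (M or F)", "units": "none"},
}

# Hypothetical header row from the spreadsheet being checked.
header = ["sample_id", "mass_g", "sex", "tail_mm"]


def undocumented_columns(header, dictionary):
    """Columns that appear in the spreadsheet but not in the data dictionary."""
    return [column for column in header if column not in dictionary]


def unused_entries(header, dictionary):
    """Dictionary entries that no longer match any spreadsheet column."""
    return [name for name in dictionary if name not in header]


print("Undocumented:", undocumented_columns(header, data_dictionary))
print("Unused:", unused_entries(header, data_dictionary))
```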

Tip 7 – Keep a raw version and back up your data often. This tip feels obvious, but needs to be said. You should always keep a raw, protected version of your data that has no calculations included in the spreadsheet and contains all of the data. Save a copy and work within the copy. If you then exclude values or do calculations, you can save edited versions and even keep an explanation of the different versions in your ReadMe file, but always keep a ‘clean’ raw version that you don’t touch in case you need to go back. Similarly, save back-ups regularly and in different locations. If you don’t already do this one, stop reading and go do it, then come back.
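One way to make the “don’t touch the raw file” rule harder to break is to script it. The Python sketch below copies a raw file into a working folder, marks the original read-only, and records a checksum so you can later confirm the raw file has not changed. The file naming scheme and function names are my own assumptions, and the read-only flag behaves a little differently across operating systems.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def make_working_copy(raw_path, work_dir):
    """Copy the raw file into work_dir, then mark the raw file read-only."""
    raw_path = Path(raw_path)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    working = work_dir / f"{raw_path.stem}_working{raw_path.suffix}"
    # copyfile copies contents only, so the working copy stays writable.
    shutil.copyfile(raw_path, working)
    raw_path.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return working, sha256_of(raw_path)
```

Calling make_working_copy("samples_raw.csv", "work") would return the path to the working copy and a checksum worth pasting into the ReadMe; re-running sha256_of on the raw file later should give the same value.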

Tip 8 – Do not color-code. I made this mistake a lot early on. Don’t. You will not remember what these highlighted cells represent or why some of the values are blue versus black when you re-open this file a year from now. Also, you can’t sort colored text or highlighted cells and these visualizing aids will usually be lost if you save in a different format or import the data into a different program. Instead, add Notes or a new variable to convey the information.

Now, you are empowered to use (and not abuse) spreadsheets for data collection! Go forth and collect all the data!

Additional Resources

https://datacarpentry.org/spreadsheet-ecology-lesson/

A Guide to Data Management in Ecology & Evolution by the British Ecological Society Guides to Better Science

References

Karl W. Broman & Kara H. Woo (2018) Data Organization in Spreadsheets, The American Statistician, 72:1, 2-10, DOI: 10.1080/00031305.2017.1375989

Posted in data archiving, howto, methods

Molecular Ecology and Molecular Ecology Resources are recruiting new Associate Editors

Molecular Ecology and Molecular Ecology Resources are looking for new Editorial Board members to join the journals as Associate Editors in the key subject areas below:

  • Eco-immunology/emerging diseases/disease resistance
  • Proteomics/protein evolution
  • Computer programs/statistical approaches
  • Environmental DNA/metabarcoding

Experience with genome assemblies would also be advantageous.  

Nominations and personal applications are welcome, and whilst scientific qualifications are paramount, we would particularly appreciate nominations and applications from suitably qualified researchers in underrepresented groups, including women, ethnic minority scientists, and scientists with disabilities, among others. Please email nominations/applications by October 15th, 2020 to manager.molecol@wiley.com with the following items:

  • Cover letter stating the reasons for your nomination, or if applying for yourself, your interest in the role and familiarity with the journals,
  • Abbreviated CV (Education, Publications, Outreach) if you have it.
Posted in community, Molecular Ecology, the journal, science publishing

Genetic Rescue – Fitness and genomic consequences

As a PhD student studying the effects of genetic diversity overall, and immunogenetic diversity specifically, on survival and reproductive success in captive and wild populations of an endangered primate, I thought a lot about the potential effects of inbreeding and outbreeding depression. I read literally hundreds of papers on the topic. Inbreeding depression describes the negative fitness effects that can occur in small populations when relatives breed with each other for multiple generations, so that genetic diversity is lost through genetic drift and deleterious alleles are expressed. Outbreeding depression, by contrast, is the negative consequence of breeding two genetically distinct populations, leading to a loss of local adaptation. Concerns about outbreeding depression are one of the major theoretical limitations to re-introductions and attempts at ‘genetic rescue’ when small populations and/or endangered species might be suffering from inbreeding depression. Evidence of outbreeding depression, however, has mostly been limited to plants and captive or laboratory studies. Earlier this year, Dr. Sarah Fitzpatrick and her co-authors documented an extremely cool example of genetic rescue in populations of wild Trinidadian guppies, contradicting the hypothesis about the potential for maladaptive gene flow in population introductions (Fitzpatrick et al. 2020).

Dr. Sarah Fitzpatrick, lead author.
Photo Credit:
https://www.kbs.msu.edu/people/sarah-fitzpatrick/

Trinidadian guppies. Photo Credit:
https://phys.org/news/2020-03-guppies-brothers-sex.html

After repeatedly sampling two isolated guppy ‘recipient’ populations (Figure 1A, dark blue circles, N < 100 individuals per population) in the Caigual and Taylor rivers in Trinidad, the authors introduced populations of guppies upstream (dashed red circles) of these recipient populations, in previously guppy-free areas. These translocated guppies, from downstream populations (solid red circles), occasionally (or frequently!) migrated downstream into the recipient populations located either ~5 m or ~700 m from the introduction location. For ~8-10 guppy generations after the translocation, the recipient populations have been monitored with mark-recapture to assess population size as well as individual overall genetic diversity, hybrid ancestry, lifespan, and reproductive success. Following the onset of immigration and subsequent gene flow, both recipient populations experienced nearly a 10-fold increase in population size, from fewer than 100 individuals to an estimated 1,000 individuals each (Figure 1B). Based on the hybrid index, which ranges from 0 for purely native ancestry to 1 for purely immigrant ancestry, it’s clear that 10 generations after the first wave of immigration, the population consists almost entirely of admixed individuals (Figure 1C).

Figure 1 – Gene Flow Manipulation Experiments in Trinidad
(A) Map of the Guanapo River drainage. In 2009, guppies were translocated from a downstream high-predation locality (red) into two headwater sites (dashed red) that were upstream of native recipient populations in low-predation environments (dark blue). Unidirectional, downstream gene flow began shortly after the introductions, indicated by black arrows.
(B) Census sizes in Caigual (solid) and Taylor (dashed) following the onset of gene flow from the upstream introduction sites. Gray box indicates the time span in which all captured individuals were genotyped at 12 microsatellite loci.
(C) Temporal patterns of continuous hybrid index assignments throughout the first 17 months of the study (∼four to six guppy generations). Individuals from recipient populations prior to gene flow had a hybrid index = 0, and pure immigrant individuals had a hybrid index = 1. Hybrid indices were assigned using data from 12 microsatellite loci. Red arrows indicate the onset of gene flow.

Contradicting the predictions of outbreeding depression, individuals with intermediate to high (0.5-0.75) hybrid indices had the highest longevity and reproductive success in both locations and across sexes (Figure 2). Interestingly, although hybrids and pure immigrants had similar levels of genetic heterozygosity, hybrids had higher fitness, suggesting that increased genomic diversity alone does not explain the increased fitness and pointing towards a potential maintenance of locally adapted alleles.

Figure 2 – Relationships between Hybrid Index and Fitness
Fitness metrics (longevity and total lifetime reproductive success [LRS]) varied quadratically with hybrid index (0, pure recipient genotype; 1, pure immigrant genotype). Maxima of the quadratic functions are indicated by vertical dashed lines/diamonds; uncertainty in their positions is indicated by (horizontal) 95% confidence bars. Shading around regression lines displays approximate 95% confidence bands obtained through simulation.
(A and B) Longevity differed between males (red) and females (blue). Generally, females lived longer than males, and fish in (A) Caigual lived longer than those in (B) Taylor. In Taylor, male and female longevity had quadratic relationships with hybrid index that differed in magnitude but peaked at similar parameter estimates; this differed by sex in Caigual (A versus B).
(C and D) LRS varied quadratically with hybrid index, and this trend did not differ between males and females. Individuals from Taylor generally had lower LRS than Caigual (C versus D) and were more likely to not reproduce at all, especially those with recipient genotypes (hybrid indices near zero).

Pre-introduction, 95% and 96% of >12,000 genotyped SNPs were monomorphic in the Caigual and Taylor populations, respectively, and average nucleotide diversity was 0.01 in both populations (Figure 4B). Eight to ten generations later, only 22% and 24% of SNPs were monomorphic and nucleotide diversity had increased to 0.21 and 0.22. Genome-wide average Fst between source and recipient populations also decreased, from 0.29-0.31 to 0.01.
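For readers less familiar with these summary statistics, here is a toy Python sketch of how the fraction of monomorphic SNPs and an average per-SNP diversity can be computed from allele counts. The counts are invented, and the diversity measure used here (expected heterozygosity at biallelic SNPs with a small-sample correction) is an assumption for illustration and may not match the exact estimator used in the paper.

```python
def summarize_snps(alt_counts, n_chromosomes):
    """Return (fraction of monomorphic SNPs, mean per-SNP diversity).

    alt_counts holds, for each biallelic SNP, how many of the sampled
    chromosomes carry the alternate allele.
    """
    n = n_chromosomes
    # A SNP is monomorphic in the sample if only one allele was observed.
    monomorphic = sum(1 for c in alt_counts if c in (0, n))
    # Expected heterozygosity 2pq with the n / (n - 1) small-sample correction.
    diversity = [(n / (n - 1)) * 2 * (c / n) * (1 - c / n) for c in alt_counts]
    return monomorphic / len(alt_counts), sum(diversity) / len(diversity)


# Invented allele counts at 10 SNPs, 20 chromosomes sampled each time.
before = [0, 0, 0, 0, 0, 0, 0, 0, 0, 10]
after = [0, 5, 10, 10, 15, 20, 4, 8, 12, 16]

print(summarize_snps(before, 20))
print(summarize_snps(after, 20))
```

In the toy data, nine of ten SNPs are monomorphic before gene flow and two of ten afterwards, which mimics the direction of the change reported above, though not its actual values.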

To determine if gene flow swamped locally adaptive variants, the authors identified 146 loci with allele frequencies in the pre-immigrant recipient populations that might indicate candidacy for locally adapted alleles. Post-immigration, although overall genome homogenization increased between immigrant and recipient populations, the authors found evidence for selective maintenance of some of the candidate alleles in the recipient populations in the form of an excess of pre-immigrant ancestry at these loci (Figure 4C). Unfortunately, none of these candidate loci matched previously identified loci under selection nor were any gene ontology terms enriched, but they provide interesting potential targets for future investigation.

Figure 4 – Genomic Consequences of Gene Flow
New gene flow caused overall genomic homogenization, but candidate adaptive alleles were maintained at higher than expected frequencies.
(A) PCA plot showing overall population differentiation based on polymorphic SNP loci from the RAD-seq data.
(B) Comparison of nucleotide diversity patterns along linkage group two among pre-gene flow (dark blue) and post-gene flow (light blue) Caigual (solid) and Taylor (dashed) populations and the introduction source (red). Similar patterns were found across all 23 linkage groups.
(C) Distributions of ancestry-polarized deviations in candidate loci versus frequency-matched non-candidates for both populations. In each stream, the allele frequencies of the candidate loci were significantly closer to the headwater ancestral frequency compared to a set of frequency-matched non-candidates.

This study documents the phenomenon of genetic rescue in two multi-generational wild populations, showing that contrary to expectations, gene flow does not necessarily swamp local adaptation, and actually can significantly increase fitness in the form of longevity and reproductive success, subsequently substantially increasing population size. Further, at least some locally adapted loci appear to have been maintained in both Caigual and Taylor, despite a 10-fold difference in the number of immigrants to each population, suggesting a range of gene flow rates might still allow the maintenance of local adaptation, with extremely important and interesting implications for future conservation-based introduction efforts.

References

Fitzpatrick, S.W., G.S. Bradburd, C.K. Kremer, P.E. Salerno, L.M. Angeloni, W.C. Funk (2020) Genomic and fitness consequences of genetic rescue in wild populations. Current Biology 30: 517-522.e5.

Posted in conservation, genomics, hybridization