Want to share your code?

In this line of work, we have all encountered tasks that are tedious, time-consuming, and repetitive. (Or if not, maybe give it a bit more time.)

When confronted with these situations, people tend to fall into one of two classes:

[Figure: “Geek” vs. “Non-geek”]

Though I might relabel the classes as “code-savvy” and “yet-to-be coder”, the point is that having scripts to perform the tasks we need to get done can improve our efficiency, reduce errors, allow for reproducibility, and often restore sanity. Then again, some of you may remember this xkcd:

[Figure: xkcd comic, “Task efficiency?”]

… you may fall on the side of not automating tasks because it will take longer to automate them than to do them manually. This may well be true for anyone new to or still learning a coding language. It can be a struggle. And even if you are a coding expert, according to the above chart at least, you may not be saving yourself much time by writing a personalized script for some tasks.

However! If that script were already written and available, you could jump straight to the time savings and proceed to whatever next step you have in your analysis or data formatting. So, we at the Molecular Ecologist have decided to set up a forum for people to submit any code or scripts that could be helpful to others at any point in the future. This is anything you might see as not quite worthy of its own publication but still a useful contribution to the field. Do you have a script you regularly run to convert between data formats? A quick and easy way to run a certain analysis? A way to make a common figure for a given type of data? If you’re willing to share your code, we’ll put it online for public access with credit to your name.

This is how it will work:

  1.  Write up a description of your code, and maybe run through an example for readers to learn from. Submit this with your code to us via email at molecularecologist [at] gmail.com (replace [at] with an @). Please include the name(s) to be credited as authors of the code, the language it is written in, a description of what the code does and how someone can use it (preferably via a concise readme file as well as well-commented code), and the code itself. If you want to send along an example dataset or suggest a general category of use the script would fall under, that would also be helpful.

Edit: Or, if you are already on GitHub, we have changed our account to an organization! This means members can join and add their code more easily while still being able to maintain it themselves!

  2.  We’ll then put it up with its description and any worked examples on the blog as well as store the code at our GitHub page.

  3.  Access your own and others’ code by going to our GitHub page for the Molecular Ecologist (https://github.com/TheMolecularEcologist). You can browse the different repositories by category to find code that is useful for you. Do not fear if you have never used GitHub before. If you do not want to learn Git, you can simply navigate the site online, view the files you like, and copy the code from them manually.

If you do want to learn Git, the GitHub website also has some great and very straightforward tutorials on getting started. You can use either the command line or a GUI (graphical user interface) to access the scripts stored online in different repositories (repos for short). At least initially, we are only going to allow repos to be cloned, so that no code is altered from its original state. This means you can pull down as much code as you want, but can’t push any new or edited code back up. New or edited code will have to come through us via email. And stay tuned, as Mark will write a post on linking RStudio to GitHub (for submitting R scripts to GitHub).

As a quick example: if I wanted to get the scripts from the “DataConversion” repo, on the command line I would just navigate to the location where I wanted to put the files on my computer, e.g. “cd Desktop”, then type “git clone https://github.com/TheMolecularEcologist/DataConversion.git”, and that folder of scripts would then be on my computer’s desktop.
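The same steps can be wrapped in a small shell helper if you expect to clone several repos. This is only a sketch: the function names `repo_url` and `clone_repo` are made up for illustration, and it assumes Git is installed and that the repos are still hosted under the organization URL used in the example above.

```shell
#!/bin/sh
# Organization URL taken from the clone example in the post.
ORG_URL="https://github.com/TheMolecularEcologist"

# repo_url NAME (hypothetical helper): print the clone URL for a repo.
repo_url() {
    printf '%s/%s.git\n' "$ORG_URL" "$1"
}

# clone_repo NAME [DEST] (hypothetical helper): clone the repo into DEST,
# defaulting to the current directory. The cd runs in a subshell, so your
# own working directory is unchanged afterwards.
clone_repo() {
    (cd "${2:-.}" && git clone "$(repo_url "$1")")
}
```

After loading these functions into your shell, `clone_repo DataConversion "$HOME/Desktop"` would do the same thing as the two commands described above.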

A few more things to note for this process.

  • If you are already on GitHub and would like to, you can become a member of our organization and maintain and edit your own code while keeping it in a location that may be easier for others to find.

  • We will look over all submissions before posting them, but clearly will not be able to spot all potential mistakes, especially when submissions cover tasks outside our respective areas of knowledge. Therefore we cannot guarantee foolproof methods. But that is another great part about this: if you spot an error, you can get in touch with the author and then with us to post a corrected or improved version.

  • We are likely to get multiple scripts that accomplish the same or similar goals. Some code may be better annotated, some code may be faster. In some cases it may make the most sense to combine code or simply credit multiple authors. We will work through these situations as they come up.

  • And lastly, any coding language is welcome. GitHub does not work well for Word documents or the like, so write all readme files in plain text, but scripts such as .R files are just fine.

  • We’ll start posting received submissions in the coming days, so be sure to check back here and also at our GitHub page.

We hope this will serve as a valuable resource to many, so thank you in advance for the submissions and happy coding!


About kimgilbert

Kim Gilbert is a PhD candidate in the Department of Zoology at the University of British Columbia, and can also be found on twitter @kj_gilbert.
This entry was posted in bioinformatics, community, genomics, howto, methods, next generation sequencing, phylogenetics, population genetics, quantitative genetics, R, software, theory.