I’m starting this hoping it will be quick, but thinking about it this morning, I realize there is a lot of ground I could be covering here. So here goes…
Updates: Last week I tweaked the ethogram (taxonomy) view so that entering the name of a higher-level taxon retrieves behaviors for all included (subsumed) taxa. This is implemented in the simple, inelegant way: crawl the tree and retrieve the annotations, using SPARQL for navigation, with the control logic all implemented in Java. Of course, traversing the tree has one advantage over a reasoner query to retrieve all included taxa: the results are guaranteed to come back in some sort of tree traversal order. It works (try ‘Tetragnatha’), but it is a bit slow. I’ve also configured a more capable server but haven’t deployed it yet, so be patient with these queries (some seem to require 2-3 minutes to complete; I’ll let you figure out which).
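For the curious, here is a minimal sketch of the crawl-the-tree idea. It is not the code running on the site. The two lookup methods are hypothetical stand-ins for the SPARQL queries (one for child taxa, one for annotations on a taxon), backed here by in-memory maps, and the species and behavior names are placeholders rather than real annotations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EthogramCrawl {
    // Hypothetical stand-ins for the triple store; the names below are placeholders.
    static final Map<String, List<String>> CHILDREN = new LinkedHashMap<>();
    static final Map<String, List<String>> BEHAVIORS = new LinkedHashMap<>();

    static {
        CHILDREN.put("Tetragnatha", List.of("Tetragnatha sp. A", "Tetragnatha sp. B"));
        BEHAVIORS.put("Tetragnatha", List.of("behavior-1"));
        BEHAVIORS.put("Tetragnatha sp. A", List.of("behavior-2"));
        BEHAVIORS.put("Tetragnatha sp. B", List.of("behavior-3"));
    }

    // Assumption: in a real system this would be a SPARQL query for direct child taxa.
    static List<String> childTaxa(String taxon) {
        return CHILDREN.getOrDefault(taxon, List.of());
    }

    // Assumption: in a real system this would be a SPARQL query for annotations on one taxon.
    static List<String> annotatedBehaviors(String taxon) {
        return BEHAVIORS.getOrDefault(taxon, List.of());
    }

    // Collect behaviors for a taxon and everything it subsumes, in pre-order.
    static List<String> collectBehaviors(String taxon) {
        List<String> results = new ArrayList<>();
        crawl(taxon, results);
        return results;
    }

    // Visit the taxon first, then recurse into each child in the order returned.
    private static void crawl(String taxon, List<String> results) {
        for (String behavior : annotatedBehaviors(taxon)) {
            results.add(taxon + ": " + behavior);
        }
        for (String child : childTaxa(taxon)) {
            crawl(child, results);
        }
    }

    public static void main(String[] args) {
        collectBehaviors("Tetragnatha").forEach(System.out::println);
    }
}
```

Because each taxon is visited before its children, the output order follows the tree, which is the ordering advantage mentioned above. The cost is one lookup round trip per taxon, which is probably a large part of why this approach is slow on big groups.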
Taxonomy: There’s not a lot new to report here – OpenTree has been keeping me busy these past few weeks. I have been doing some more curation tool work to support taxa outside of NCBI, and thanks to Chris Mungall and James Overton, there will soon be a new OWL rendering of the NCBI taxonomy, which should then make its way into the backend database. I’m still tracking the addition of Arachnid taxa into NCBI – the majority of updates seem to be sample records, which won’t help with behavior, though new species of ticks and spiders are trickling in as well.
Also, yesterday was Taxonomist Appreciation Day. Although I have dabbled in taxonomy informatics (TDWG, VTO, a bit in OpenTree, as well as the taxonomy work here), I would never consider myself to be a taxonomist. I do, as should any biologist, appreciate and thank the generations of taxonomists in the 250+ years since Linnaeus who have brought order and names to the millions of species we share this planet with.
Curation and Post-Publication Review: A couple of items I found on Twitter over the past few days have sparked an interesting thought. The first was a discussion of a paper on how curators of the UniProtKB database responded to a changing understanding of the activity of the SiRT-5 protein. Initially this protein was understood to exhibit deacetylase activity, based primarily on documented activity of other members of the family and some in vitro assays that demonstrated the deacetylase activity. More recent papers have documented that the in vivo activity of this protein is more likely to be desuccinylation. The paper describes how annotations in UniProtKB were modified to incorporate both classes of activity in the appropriate contexts, providing a review process for the earlier reports in light of later results. Thus the curation process provides a post-publication, albeit specialized, peer review.
This is relevant in light of this post I saw this morning on the likely limits of post-publication peer review. Now, the particular papers discussed in the UniProt example were published in high-profile journals such as Cell and Science, so that case does not speak against the 1% notion mentioned in the Dynamic Ecology post. But not all curation is focused on the sort of topics that make it into the elite 1% of published papers. My publication database does have a few papers from Science, Nature and one or two other high-profile publications. But the majority come from places such as the Journal of Arachnology, Animal Behavior, or lesser-known journals from Japan or Latin America. This leads me to a somewhat more optimistic conclusion about the future of post-publication peer review than Jeremy Fox’s.
Next week, I hope to discuss some of the papers I’m in the process of adding and possibly return to the issue of front-end data stores.