Putting reasoning in the right place (not the frontend)

Spent some time this evening refining the handling of participants and their display on the assertion entry page.  It looks like this is getting close to the point where the owl builder tool will need some attention so that assertions can actually pass through the chain from arachadmin to the online knowledge-base.

I realized this afternoon that time spent worrying about inference in the arachadmin (frontend) tool was wasted – reasoning should be reserved mostly for the owl builder, which uses OWLAPI and a reasoner (haven’t decided which yet), and, to a lesser extent, for the Sesame server on the web server.  There is a bit of reasoning, or more properly filtering, in the frontend to keep junk or out-of-scope terms out of the drop-down lists, but deciding which subsumers (parents) of used terms should be included in the knowledge base is more properly done with a full reasoner than with graph traversal hacks in Python.

I’ve noticed a few people have stumbled on this blog or the arachnolingua front page – thanks for having a look.  Meanwhile, I’m working on closing the gaps in the workflow so I can start sharing the wealth of spider behavior I’ve collected (and don’t worry, other arachnids will be represented as well).



Assertions, participants, and other artifacts

I’ve been quiet for over a month now, though not inactive with Arachnolingua.  I spent the first couple of weeks working through the forms chapter in the web2py book, which was helpful both for Arachadmin and for a couple of issues that came up with the day job.  This was all in support of making a reasonably useful page for entering assertions, which are the primary records for generating behavior instances in arachnolingua.  Along with the web2py review, there was a fair amount of database redesign as I worked through the relations among participants (animals, their parts, and environmental substrates).  Participants might be individuals or quantified (e.g., some Habronattus californicus, portion of substance granite).  While animal and anatomy terms come easily from existing taxonomy and anatomy ontologies, environmental participants will require pulling terms from one or more environment-related ontologies.  Since it looks like I have an invite to the next Phenotype RCN in February, which will focus on environment, I should have an opportunity to size up the options.

Once I had a basic version of the assertion page up (no screenshots yet, it’s very much a work in progress), I started realizing just how long the drop-down list for taxa would be.  I’m not doing anything fancy with text completion (haven’t had any success with the text completion widget in web2py), so making taxon selection more manageable has focused on reducing the length of the list by filtering out irrelevant terms.  NCBI taxonomy, as anyone who has worked with term exports knows, contains, in addition to Linnaean terms, identifiers for incompletely identified samples (e.g., Lycosidae sp.), which will often include lab identifiers.  Since these deposits are very unlikely to form the basis of any behavior observations, I’ve implemented filtering, currently just removing any children of nodes with labels of the form ‘unclassified x’.  This does not remove all the problem terms; the messier stage of filtering out terms by regex matching against labels will wait for another day.
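
For anyone curious, the ‘unclassified x’ filter can be sketched roughly as below.  This is not the actual arachadmin code; the record layout (dicts with id, label, and parent_id keys), the function name, and the sample labels are all made up for illustration, and I’m assuming that "children" should extend to every descendant of an ‘unclassified’ node while the node itself stays in the list.

```python
from collections import defaultdict


def filter_unclassified(taxa):
    """Drop every descendant of a node labelled 'unclassified ...'.

    `taxa` is a list of dicts with hypothetical keys 'id', 'label',
    and 'parent_id' (None for the root).  The 'unclassified' node
    itself is kept; only the terms beneath it are removed.
    """
    # Index child ids by parent id so descendants can be walked.
    children = defaultdict(list)
    for taxon in taxa:
        children[taxon["parent_id"]].append(taxon["id"])

    # Start from every node whose label looks like 'unclassified x'.
    stack = [t["id"] for t in taxa
             if t["label"].lower().startswith("unclassified ")]

    # Collect all descendants with an iterative depth-first walk.
    excluded = set()
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in excluded:
                excluded.add(child)
                stack.append(child)

    # Preserve the original order of the surviving terms.
    return [t for t in taxa if t["id"] not in excluded]


if __name__ == "__main__":
    # Made-up sample rows, not real NCBI identifiers.
    sample = [
        {"id": 1, "label": "Lycosidae", "parent_id": None},
        {"id": 2, "label": "unclassified Lycosidae", "parent_id": 1},
        {"id": 3, "label": "Lycosidae sp. LAB-0001", "parent_id": 2},
        {"id": 4, "label": "Pardosa", "parent_id": 1},
    ]
    for taxon in filter_unclassified(sample):
        print(taxon["label"])
```

Whether the ‘unclassified x’ node itself belongs in the drop-down is a separate decision; dropping it too would just mean seeding the excluded set with the starting nodes.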