Arachnolingua is still up, still serving pages over https. Otherwise, there hasn’t been a lot of visible work, though there have been some updates to the spider behavior term list. Happily, things seem to be moving on the NBO: after Anne Clark and I pushed some changes from the ABO into the NBO in the fall of 2017, things stayed put until recently, but now someone working under David Osumi-Sutherland has been getting updates into the ontology and has updated its build process. I’m glad someone has been in a position to do something about this.
Behind the scenes, I’m still working on the curation tool; a fair amount of reworking and backtracking has happened, and the database model is rather more OWL-like than it was. That isn’t necessarily a good thing, but it’s where things are going, and it will hopefully be easier to understand than the participant-element abstraction I had used previously.
All for now.
Well, I got the website back up in under a week. I had to build from scratch since I wasn’t able to salvage anything from the AWS instance. Of course, the data and code are all sitting on my laptop, so it might have been just a matter of configuration and reloading. However, time marches on, and since I was installing from scratch, I took the opportunity to update the OS (Ubuntu 16.04), Java (1.8), Apache (2.4), and Tomcat (minor update), and to upgrade from Sesame 2.8 to rdf4j, its replacement. One thing that didn’t get upgraded was CORS-filter. Although it was available, I discovered that Tomcat has equivalent functionality built in, just waiting to be configured. CORS-filter has served me well, and it still seems to be maintained.
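For reference, Tomcat’s built-in equivalent is `org.apache.catalina.filters.CorsFilter`, configured in `web.xml`. A minimal sketch, with illustrative parameter values (the actual allowed origins and paths for arachnolingua would differ):

```xml
<!-- Hedged example: enable Tomcat's built-in CORS filter.
     The origin and URL pattern below are placeholders, not the
     production arachnolingua configuration. -->
<filter>
  <filter-name>CorsFilter</filter-name>
  <filter-class>org.apache.catalina.filters.CorsFilter</filter-class>
  <init-param>
    <param-name>cors.allowed.origins</param-name>
    <param-value>*</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>CorsFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
```

The upside of the built-in filter is one less third-party jar to keep current when upgrading Tomcat.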
Once I had everything configured, queries seemed somewhat faster, which I assume is thanks to rdf4j and possibly the Java upgrade.
A few days later, I took advantage of some posts I had found while figuring out how to reconnect Apache and Tomcat to set things up for https, so now all requests to arachb.org and arachnolingua should be redirected to an https connection. Switching over to https wasn’t much more difficult than configuring Apache in the first place. I went with a commercial certificate provider, since I had paid for a certificate I never activated a few years ago. Hopefully the renewal will be easy.
On the inside, Apache and Tomcat are speaking over the JServ protocol (ajp13), which may help with performance as well.
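The setup described above (https redirect plus an AJP link to Tomcat) can be sketched in Apache 2.4 configuration. This is a hedged outline only: the server name is real, but the certificate paths and the proxied path are placeholders, not the actual arachb.org configuration.

```apache
# Hedged sketch: redirect plain HTTP to https, then hand selected
# paths to Tomcat over ajp13 (requires mod_ssl and mod_proxy_ajp).
<VirtualHost *:80>
    ServerName arachb.org
    Redirect permanent / https://arachb.org/
</VirtualHost>

<VirtualHost *:443>
    ServerName arachb.org
    SSLEngine on
    SSLCertificateFile    /path/to/certificate.pem   # placeholder
    SSLCertificateKeyFile /path/to/private-key.pem   # placeholder
    # Illustrative path; the real servlet context is not shown here.
    ProxyPass /query ajp://localhost:8009/query
</VirtualHost>
```

Keeping TLS termination in Apache means Tomcat never needs to see the certificate, which simplifies renewals.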
Work on the curation tool is continuing, and I think I have found the sweet-spot mix of methods, URL redirects, and jinja2 templates to get the claim editor to finally lie down. The story is all about switching between individual and class-expression participants, but I’ll save it for another time, as I need to finish the post I’m writing on my other blog on the role of machine learning in the study of behavior.
The AWS server I host on died on Wednesday. It seemed to be a slow death, though it was completely unavailable within hours of Amazon notifying me. I tried to capture images from the EBS volume, but the image I created didn’t boot. So I’m rebuilding from scratch. I have all the data stored locally, but I am taking the opportunity to update some of the software (Linux to Ubuntu 16.04, Apache to 2.4, Java to 8, Sesame to rdf4j 2). I’m sticking with Tomcat 7 on the backend, though I should probably do some experimenting with Tomcat 8, since it claims to be easier to manage. I will definitely make an image once everything is in place. There’s a first time for everything, and this is my first catastrophic failure on AWS. Live and learn.
I’m working toward having things back up in the coming week.
Arachnolingua focuses its OWL expressions on the claim (= assertion, statement) of an individual behavior event or class expression, and properties should start there and work outward. Thus: courtship_event –has_participant–> palp –part_of–> male –has_type–> Habronattus sp. There may be consequences for this decision (especially for class-level statements), but it is better to be consistent and to document the design decision here for now.
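The outward-from-the-claim chain can be sketched in a few lines of Python with plain triples. The identifiers here are illustrative stand-ins, not actual arachnolingua URIs or the real property IRIs:

```python
# Minimal sketch of the claim-centred chain: properties start at the
# behavior event and work outward. Names are illustrative only.
triples = [
    ("courtship_event_1", "has_participant", "palp_1"),
    ("palp_1", "part_of", "male_1"),
    ("male_1", "has_type", "Habronattus sp."),
]

def outward_chain(start, triples):
    """Follow properties outward from the claim's behavior event."""
    path, node = [start], start
    while True:
        nxt = [(p, o) for s, p, o in triples if s == node]
        if not nxt:
            return path
        prop, node = nxt[0]
        path += [prop, node]

print(outward_chain("courtship_event_1", triples))
# ['courtship_event_1', 'has_participant', 'palp_1', 'part_of',
#  'male_1', 'has_type', 'Habronattus sp.']
```

The point of the sketch is just the directionality: every hop is subject-to-object, so a consumer can always start at the event and never needs to traverse a property backwards.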
This should eventually make it into the ‘primer’ documents for the curation tool and the database schema as well. I wonder if there are any tools in Protege for looking at graphs of individuals – maybe lego?
I’ve been writing lots of unit tests and doing the associated refactoring. Part of this was inspired by reading through Martin’s (2008) Clean Code, which had been sitting on my shelf for a couple of years. The most useful thing I found was Martin’s admission that even he writes big, ugly functions on the first pass. Definitely a lot of cleanup in the new arachcurator editor. This is also triggering some simplification in the database: I’ve removed a many-to-many mapping table (participant2claim), since the relation is really many-to-one. I think there are a couple of other tables that will suffer the same fate.
I should get back to cleaning up my list of terms as well.
I’ve been quiet since April, but I’ve also been pretty busy. I’m still fighting with the reimplementation of claim editing (a big, messy web page that I would simplify further if I could figure out how). I have also been focusing my efforts on a new ontology specifically for spider behavior (something to fit in between the NBO/ABO and the data in arachnolingua). I gave a talk about it at the 20th International Congress of Arachnology a few weeks ago. The slides, rendered as PDF, are available here.
There is a link to the work in progress on the arachnolingua home page. It is currently just the initial Google sheet I used to collect usages across the two source texts. I am finishing the first cleanup pass over the data and will provide an updated (cleaned and better organized) sheet, linked from a proper landing page, in the coming week.
I am reworking the semantics of claims (statements about behavior dispositions or behavior events). In the existing database (mysql arachadmin on figshare), there is a table called ‘participant_type’. This table has been renamed to ‘expression_type’ in the updated psql database. This is a better name, since the values are really types of OWL expressions (e.g., some, individual, conjunction, etc.). This morning I am adding a new table called ‘participant_type’, which has two values: individual or class. A participant is either an individual or a class, although there are some funny, potentially confusing cases. For example, some portion of tissue that is part_of an individual, or an area that is part_of the surface of an individual. In these cases, the portion of tissue or surface area may not constitute the entirety of the part (so it does not comprise an anatomical structure). For the moment, I am thinking of modeling this as a class (though certainly not a universal) that can still participate in an individual event. The other option seems to be to generate an anonymous individual as an instance of that class, but that seems even more wrong. Dispositions of individuals towards a class (e.g., ‘subject tarantula showed increased consumption of crickets’) seem less problematic – it’s a statement about an individual spider, but not about an individual event.
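The individual-vs-class distinction for participants can be sketched as a tiny data model. The names below are hypothetical illustrations, not the actual arachcurator/psql schema:

```python
# Hedged sketch: every participant is tagged either 'individual' or
# 'class'. Names here are hypothetical, not the real schema.
from dataclasses import dataclass
from enum import Enum

class ParticipantType(Enum):
    INDIVIDUAL = "individual"
    CLASS = "class"

@dataclass
class Participant:
    label: str
    ptype: ParticipantType

# A portion of tissue (part_of an individual, but not a whole anatomical
# structure) is modeled as a class that can still participate in an
# individual event; the whole spider is an individual.
tissue = Participant("portion of cuticle surface", ParticipantType.CLASS)
spider = Participant("subject tarantula", ParticipantType.INDIVIDUAL)

print(tissue.ptype.value, spider.ptype.value)  # class individual
```

Note that this tag is orthogonal to the ‘expression_type’ values (some, individual, conjunction, etc.), which describe the shape of the OWL expression rather than whether the participant itself is an individual or a class.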