09 December 2012

Introducing Pictish, an image-processing library for web browsers

Boy, between RaphaëlTS, SHA-1, and Haeckel, I've been releasing an awful lot of TypeScript/JavaScript libraries lately, haven't I? Anyway, here's another!


Pictish takes advantage of canvas elements and Typed Arrays to provide fast routines for processing raster image data. Here is a rundown of the currently available functions:
  • createImageData()
  • fromFile()
  • fromHTMLImage()
  • crop()
  • flipX()
  • flipY()
  • scaleDown()
  • quarter()
  • silhouettize()
All of these functions have been tested and optimized. More information is available in the documentation at the Bitbucket site. Note that since canvas elements and Typed Arrays are based on more recent specifications, not all browsers support them. (For what it's worth, I've been testing in Chrome.)
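To give a feel for the canvas-plus-Typed-Arrays approach, here is a minimal sketch of a horizontal flip. It is not Pictish's actual code or API: the `RasterImage` interface and the `flipHorizontally` name are made up for illustration, and the interface only mimics the `width`, `height`, and `data` fields of a browser `ImageData` object so the sketch can run outside a browser.

```typescript
// Hypothetical stand-in for the browser's ImageData (not a Pictish type).
interface RasterImage {
  width: number;
  height: number;
  data: Uint8ClampedArray; // RGBA bytes, 4 per pixel, row by row
}

// Returns a horizontally mirrored copy of the image.
// Assumes data.byteOffset is a multiple of 4, which holds for ImageData.
function flipHorizontally(source: RasterImage): RasterImage {
  const { width, height } = source;
  // View the RGBA bytes as one 32-bit word per pixel,
  // so each pixel is moved with a single assignment.
  const src = new Uint32Array(source.data.buffer, source.data.byteOffset, width * height);
  const dst = new Uint32Array(width * height);
  for (let y = 0; y < height; y++) {
    const row = y * width;
    for (let x = 0; x < width; x++) {
      dst[row + (width - 1 - x)] = src[row + x];
    }
  }
  return { width, height, data: new Uint8ClampedArray(dst.buffer) };
}

// A 2x1 image: the two pixels swap places.
const flipped = flipHorizontally({
  width: 2,
  height: 1,
  data: new Uint8ClampedArray([1, 2, 3, 4, 5, 6, 7, 8]),
});
```

Copying whole 32-bit words avoids touching the four channels separately, which is the kind of saving Typed Arrays make possible.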

(The sharp-eyed may notice that last function and wonder if it might have something to do with another project of mine. The answer is yes.)

The next planned step is to create a PNG file encoder. That could take a while, though.

08 December 2012

SHA-1 with Typed Arrays

There are already SHA-1 implementations for JavaScript, but all the ones I found use strings as input. This is fine for generating hashes from small amounts of data, but not so great for large binary files. So I've created a SHA-1 library that optionally accepts ArrayBuffer objects.


Note that this will not work on browsers that do not support Typed Arrays.

I ran some tests and found that in Chrome it takes less than half a second to hash 10 MB of data.
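For readers curious what hashing an ArrayBuffer looks like, here is a compact sketch of the standard SHA-1 algorithm working directly on binary data. It is not the code of the library announced here, and the function names `rotl` and `sha1Hex` are my own for this illustration.

```typescript
// Rotate a 32-bit word left by n bits.
function rotl(x: number, n: number): number {
  return (x << n) | (x >>> (32 - n));
}

// Computes the SHA-1 digest of an ArrayBuffer as a 40-character hex string.
function sha1Hex(input: ArrayBuffer): string {
  const bytes = new Uint8Array(input);
  const bitLength = bytes.length * 8;
  // Pad with 0x80, zeros, and the 64-bit big-endian bit length,
  // so the total is a multiple of 64 bytes.
  const paddedLength = (((bytes.length + 8) >> 6) + 1) << 6;
  const padded = new Uint8Array(paddedLength);
  padded.set(bytes);
  padded[bytes.length] = 0x80;
  const view = new DataView(padded.buffer);
  view.setUint32(paddedLength - 8, Math.floor(bitLength / 0x100000000));
  view.setUint32(paddedLength - 4, bitLength >>> 0);

  let h0 = 0x67452301, h1 = 0xefcdab89, h2 = 0x98badcfe, h3 = 0x10325476, h4 = 0xc3d2e1f0;
  const w = new Int32Array(80);
  for (let offset = 0; offset < paddedLength; offset += 64) {
    // Message schedule: 16 words from the block, expanded to 80.
    for (let i = 0; i < 16; i++) w[i] = view.getInt32(offset + i * 4);
    for (let i = 16; i < 80; i++) {
      w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    }
    let a = h0, b = h1, c = h2, d = h3, e = h4;
    for (let i = 0; i < 80; i++) {
      let f: number;
      let k: number;
      if (i < 20) { f = (b & c) | (~b & d); k = 0x5a827999; }
      else if (i < 40) { f = b ^ c ^ d; k = 0x6ed9eba1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
      else { f = b ^ c ^ d; k = 0xca62c1d6; }
      const temp = (rotl(a, 5) + f + e + k + w[i]) | 0;
      e = d; d = c; c = rotl(b, 30); b = a; a = temp;
    }
    h0 = (h0 + a) | 0; h1 = (h1 + b) | 0; h2 = (h2 + c) | 0;
    h3 = (h3 + d) | 0; h4 = (h4 + e) | 0;
  }
  // Format each 32-bit word as 8 hex digits.
  return [h0, h1, h2, h3, h4]
    .map((h) => ("00000000" + (h >>> 0).toString(16)).slice(-8))
    .join("");
}
```

In a browser, the ArrayBuffer would typically come from a `FileReader` or an XHR response rather than being built by hand.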

Hope someone finds this useful! (At the very least, I will.)

02 December 2012

Haeckel: A Code Library for Browser-Based Evolutionary Diagrams

For a while now I've been writing posts with diagrams like this one, showing the evolution of cranial capacity in mangani over the past seven million years:


How did I make them? Originally it was all ad-hoc ActionScript code, but more recently I've begun to organize the code into a library and translate it into TypeScript (which, in turn, is automatically translated into JavaScript). Although this library is still in progress, I've decided it's at a stage where I can open it to the general public.

This library includes functionality for:

  • Modeling scientific concepts such as taxa, phylogeny, character states, stratigraphy, and geography.
  • Processing scientific data (notably calculating morphological distance and inferring unknown character states).
  • Rendering data into charts as Scalable Vector Graphics, using RaphaëlJS.
For a while I struggled with what to call this library. It's neither purely about science nor purely about graphics. Finally I got my inspiration from RaphaëlJS, a graphics library named after a great artist. I named my library after a man who was both a great artist and a great biologist:

19 October 2012

RaphaëlJS + TypeScript = RaphaëlTS

RaphaëlJS is a library, written in JavaScript, for creating vector graphics on the web. It's commonly used because it supports all web browsers, including the good ones (which use SVG) and Internet Explorer (which uses VML). Among other things, it's the rendering engine of the ExtJS and Sencha frameworks.

TypeScript is a new web language that extends JavaScript, adding optional strict typing. This is a big improvement on JavaScript because it allows more errors to be caught at compile-time rather than at run-time (for example, if you pass a number to a function that expects an array). However, it's not as obnoxious as some strictly-typed languages, because the typing is optional and often implicit.
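Here is a small sketch of the number-versus-array case just mentioned. The `total` function is a made-up example, not part of any library discussed here.

```typescript
// The parameter is declared as an array of numbers.
function total(values: number[]): number {
  let sum = 0; // type inferred as number; no annotation needed
  for (const value of values) {
    sum += value;
  }
  return sum;
}

const ok = total([1, 2, 3]);

// Uncommenting the next line makes the compiler reject the program,
// because a number is not assignable to number[]:
// total(42);
```

In plain JavaScript the bad call would only misbehave once it actually ran.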

I've spent some time recently playing around with TypeScript, and my verdict is that it's wonderful. I only want to use this from now on (at least for large projects). It's a great language, and, since it's an extension (or superset) of JavaScript, all existing JavaScript code works just fine in it. (This is a big advantage it has over Dart, an otherwise excellent language as well.)

But while you can just include a JavaScript library in your TypeScript projects, you don't get the full benefit of the compile-time checking (or the code-hinting) unless that library's entities have been declared. TypeScript allows for ambient declarations: defining or partially defining an entity (class, interface, method, variable, etc.) without actually creating it. As an example (and a very useful one at that), here is a TypeScript declaration file for jQuery: jQuery.d.ts
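As a minimal sketch of what an ambient declaration looks like, suppose some untyped JavaScript library exposes a global called `tinyVector`. That library, its `add` method, and the `Point` interface are all hypothetical; the assignment to `globalThis` below only stands in for the script tag that would normally load the library.

```typescript
// Shape of the values the hypothetical library works with.
interface Point {
  x: number;
  y: number;
}

// Ambient declaration: tells the compiler that a global named tinyVector
// exists and what it looks like, without emitting any JavaScript for it.
declare const tinyVector: {
  add(a: Point, b: Point): Point;
};

// Stand-in for the untyped library that a <script> tag would load.
(globalThis as any).tinyVector = {
  add: (a: any, b: any) => ({ x: a.x + b.x, y: a.y + b.y }),
};

// Calls are now checked against the declaration (and get code-hinting).
const sum = tinyVector.add({ x: 1, y: 2 }, { x: 3, y: 4 });
```

A real declaration file collects many such `declare` statements in a `.d.ts` file so they can be shared between projects.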

Today I wondered if anyone had made something like this for RaphaëlJS. I wasn't able to find one, so ... I made one. Here you go: https://bitbucket.org/keesey/raphaelts/src

If you're using the Visual Studio plugin or the TypeScript playground (other editors are bound to come out soon), you can now program type-safe RaphaëlJS code with auto-complete (=IntelliSense).

I've verified that it compiles, but I haven't fully tested it. If anyone experiences issues, please let me know. (Or feel free to fork the project.)

Hope someone else finds it useful ... I know I will!

21 August 2012

Using Morphological Distance to Determine Genera

A genus is not an empirical entity. It's a bookkeeping convention, up to the personal whims of the taxonomist. And of course this leads to a huge mess.
Homo ergaster,
from PhyloPic

Depending on the taxonomist (and, for early forms, the phylogeny), the human total group includes anywhere from one genus (Homo) to ten (Ardipithecus, Australopithecus, Homo, Kenyanthropus, Orrorin, Paranthropus, Paraustralopithecus, Praeanthropus, Sahelanthropus, and Zinjanthropus, not even mentioning obsolete ones like Telanthropus and Pithecanthropus). We could try to clean up this mess by creating phylogenetic definitions and reinterpreting the genera as clades, except that all of the type species are thought by at least some researchers to be ancestral forms (with the exceptions of Homo sapiens, Paranthropus robustus, and Zinjanthropus boisei). For example, if Australopithecus is a clade that includes Australopithecus africanus, then it might also include Homo, and genera are not allowed to overlap under the ICZN.

After playing around with morphological distances based on Strait & Grine's (2004) character matrix, it occurred to me you could base genera on morphological distance. You have to make subjective decisions as to how many genera you want and which matrix to use, but the rest follows naturally. You just look at each species and see which valid type species it's closest to. Here's what I found:

If we use all of the above genera, then the taxonomy looks like this:
  • Ardipithecus
    • Ardipithecus anamensis (Not in Praeanthropus, despite sometimes being synonymized with afarensis! Although it should be noted that ramidus is much better known now than in 2004, so this may have changed.)
    • Ardipithecus ramidus
  • Australopithecus
    • Australopithecus africanus
  • Homo
    • Homo ergaster
    • Homo habilis (although it is closer to garhi than to sapiens!)
    • Homo sapiens
  • Kenyanthropus
    • Kenyanthropus garhi (!)
    • Kenyanthropus platyops
    • Kenyanthropus rudolfensis (although it is closer to ergaster and habilis than to platyops, it is still closer to platyops than to sapiens)
  • Orrorin
    • Orrorin tugenensis (not included in the study, but this is nomenclaturally where it would go unless found to be a synonym)
  • Paranthropus
    • Paranthropus robustus
  • Paraustralopithecus
    • Paraustralopithecus aethiopicus
  • Praeanthropus
    • Praeanthropus afarensis
  • Sahelanthropus
    • Sahelanthropus tchadensis
  • Zinjanthropus
    • Zinjanthropus boisei
Not included: species that are not types and were not included in the study, like Ardipithecus kadabba (scrappy craniodental remains), Australopithecus sediba (hadn't been discovered in 2004), Homo heidelbergensis (pretty close to sapiens anyway), etc.
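The assignment rule behind these lists (each species goes into the genus whose type species it is closest to) can be sketched roughly as follows. This is an illustration, not the code I actually ran: the names `DistanceMatrix` and `assignToGenera` are made up, and the distances are invented toy numbers, NOT values derived from Strait & Grine (2004).

```typescript
type DistanceMatrix = { [species: string]: { [other: string]: number } };

// Assigns every species to the genus whose type species is nearest.
// typeSpecies maps each valid genus name to its type species.
function assignToGenera(
  distances: DistanceMatrix,
  typeSpecies: { [genus: string]: string }
): { [species: string]: string } {
  const assignment: { [species: string]: string } = {};
  for (const species of Object.keys(distances)) {
    let bestGenus = "";
    let bestDistance = Infinity;
    for (const genus of Object.keys(typeSpecies)) {
      const type = typeSpecies[genus];
      // A type species is at distance zero from itself.
      const d = species === type ? 0 : distances[species][type];
      if (d < bestDistance) {
        bestDistance = d;
        bestGenus = genus;
      }
    }
    assignment[species] = bestGenus;
  }
  return assignment;
}

// Invented distances for illustration only.
const toyDistances: DistanceMatrix = {
  sapiens: { sapiens: 0, africanus: 0.6, habilis: 0.3, afarensis: 0.7 },
  africanus: { sapiens: 0.6, africanus: 0, habilis: 0.4, afarensis: 0.2 },
  habilis: { sapiens: 0.3, africanus: 0.4, habilis: 0, afarensis: 0.5 },
  afarensis: { sapiens: 0.7, africanus: 0.2, habilis: 0.5, afarensis: 0 },
};

const genera = assignToGenera(toyDistances, {
  Homo: "sapiens",
  Australopithecus: "africanus",
});
```

"Removing" a genus then just means dropping its entry from the type-species map and running the assignment again.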

It must be said that Zinjanthropus and Paraustralopithecus are not that commonly used. If we remove Paraustralopithecus, then aethiopicus predictably falls into Zinjanthropus. If we remove Zinjanthropus as well, then both aethiopicus and boisei predictably fall into Paranthropus.

Praeanthropus is also not that widely used. If we remove that as well, we get:
Praeanthropus afarensis,
from PhyloPic
  • Ardipithecus
    • Ardipithecus anamensis
    • Ardipithecus ramidus
  • Australopithecus
    • Australopithecus afarensis
    • Australopithecus africanus
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
  • Kenyanthropus
    • Kenyanthropus garhi
    • Kenyanthropus platyops
    • Kenyanthropus rudolfensis
  • Orrorin
    • Orrorin tugenensis
  • Paranthropus
    • Paranthropus aethiopicus
    • Paranthropus boisei
    • Paranthropus robustus
  • Sahelanthropus
    • Sahelanthropus tchadensis
Kenyanthropus is a rather controversial genus. If we remove it, we get:
  • Ardipithecus
    • Ardipithecus anamensis
    • Ardipithecus garhi
    • Ardipithecus platyops
    • Ardipithecus ramidus
  • Australopithecus
    • Australopithecus afarensis
    • Australopithecus africanus
    • Australopithecus rudolfensis (still refuses to go with sapiens!)
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
  • Orrorin
    • Orrorin tugenensis
  • Paranthropus
    • Paranthropus aethiopicus
    • Paranthropus boisei
    • Paranthropus robustus
  • Sahelanthropus
    • Sahelanthropus tchadensis
If we also remove Sahelanthropus, then tchadensis goes easily into Ardipithecus. I assume tugenensis would as well, if we removed Orrorin, but that wasn't included in the study since the craniodental material is so scant. If we remove Paranthropus, its species go very, very reluctantly into Australopithecus (by which I mean despite not being that close to africanus):
Sahelanthropus tchadensis,
from PhyloPic
  • Ardipithecus
    • Ardipithecus anamensis
    • Ardipithecus garhi
    • Ardipithecus platyops
    • Ardipithecus ramidus
    • Ardipithecus tchadensis
    • Ardipithecus tugenensis?
  • Australopithecus
    • Australopithecus aethiopicus
    • Australopithecus afarensis
    • Australopithecus africanus
    • Australopithecus boisei
    • Australopithecus robustus
    • Australopithecus rudolfensis
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
Ardipithecus ramidus was originally named as Australopithecus ramidus. If we remove Ardipithecus, its species predictably end up in Australopithecus:
  • Australopithecus
    • Australopithecus anamensis
    • Australopithecus aethiopicus
    • Australopithecus afarensis
    • Australopithecus africanus
    • Australopithecus boisei
    • Australopithecus garhi
    • Australopithecus platyops
    • Australopithecus ramidus
    • Australopithecus robustus
    • Australopithecus rudolfensis
    • Australopithecus tchadensis
    • Australopithecus tugenensis?
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
Now for the final cut. What happens when we remove Australopithecus?
Pan paniscus,
from PhyloPic
  • Gorilla
    • Gorilla aethiopicus (!!! although gorilla only beats sapiens by a hair)
    • Gorilla beringei
    • Gorilla garhi (!!!)
    • Gorilla gorilla
    • Gorilla tchadensis (admittedly, this was suggested by Senut and Pickford)
  • Homo
    • Homo africanus
    • Homo boisei
    • Homo ergaster
    • Homo habilis
    • Homo platyops
    • Homo robustus
    • Homo rudolfensis
  • Pan
    • Pan afarensis (!!)
    • Pan anamensis (!!)
    • Pan paniscus
    • Pan ramidus (!!)
    • Pan troglodytes
  • incertae sedis
    • Orrorin tugenensis
Yes, some of the stem-humans get pulled in with chimpanzees or gorillas! And there seems to be little rhyme or reason as to which go where. The splitting of the "robust australopithecines" is most bizarre (although, as noted, it only takes a tiny nudge to put aethiopicus in Homo with the others).

In summary:
  • This is just one matrix, only focusing on one area of the anatomy.
  • There are huge amounts of uncertainty with some of these taxa.
  • Even without the uncertainty, there is no objective way to measure morphological distance.
  • Even if there were, it might not be a good idea to use it to determine generic boundaries.
  • Generic boundaries are stupid, anyway.

Closest to Humans: The Skull & Tooth Version

Previously I posted a diagram showing how different various primates are from humans, based on soft tissue characters. Here is a similar diagram, but using the craniodental characters from Strait & Grine's (2004) matrix. Unlike the soft-tissue diagram, this includes fossil taxa.



The black lines indicate the probable distance, as inferred from the phylogeny. The gradients show the actual range of uncertainty. (They'd also show polymorphism, but this matrix has none.) Ordering is according to probable distance; using the mean of the range of uncertainty yields slightly different results.

As in the soft-tissue diagram, chimpanzees are closer to humans than any other living primates are. But, oddly, gibbons and colobus monkeys are closer to humans than gorillas and orangutans! My guess is that this is because gorillas and orangutans are more derived from the ancestral catarrhine state than gibbons or colobus monkeys. (This probably also explains why the Paranthropus species, a.k.a. "robust australopithecines", are further than chimpanzees, although it is strange that the earliest one, P. aethiopicus, is furthest.)

To the right of chimpanzees is a very unsurprising pattern: "gracile australopithecines", then basal Homo species, then the large-brained Homo ergaster, and finally the huge-brained Homo sapiens.

You can also see how well-known the fossil crania are, ranging from the very well-known Australopithecus africanus to the crushed skull of Kenyanthropus platyops. Note that Ardipithecus ramidus is much better known now than when this study was done.

Again a disclaimer: this is not an objective measure of morphological similarity (there is no such thing), and it is definitely not a phylogenetic analysis (even if the data is taken from one).



References



  • Strait & Grine (2004). Inferring hominoid and early hominid phylogeny using craniodental characters: the role of fossil taxa. Journal of Human Evolution 47:399–452. doi:10.1016/j.jhevol.2004.08.008

17 August 2012

Refinement: Primate Anatomical Similarity

Earlier, I posted a chart showing how similar humans are to other primates (and other euarchontoglires), as measured from Diogo & Wood's (2011) soft-tissue character matrix. A problem with the earlier version was that it didn't reflect uncertainty in that matrix. (It also wouldn't show polymorphisms, although that matrix doesn't have any, anyway.) I've created a new version that shows the maximum and minimum possible distance, given the uncertainties in the matrix.
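One simple way to get such a range, sketched here under my own assumptions rather than as the exact method behind the chart: treat an unknown cell as "could be any state", so it never adds to the minimum distance but always adds to the maximum. The `Cell` type and the `distanceRange` function are hypothetical names for this illustration.

```typescript
// A cell is a known character state, or null when the state is unknown.
type Cell = number | null;

// Returns the smallest and largest possible fraction of differing characters.
function distanceRange(a: Cell[], b: Cell[]): { min: number; max: number } {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Rows must be non-empty and of equal length");
  }
  let min = 0;
  let max = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i];
    const y = b[i];
    if (x === null || y === null) {
      // Unknown: the states might match (adds nothing to min)
      // or might differ (adds one to max).
      max += 1;
    } else if (x !== y) {
      // Known and different: counts toward both bounds.
      min += 1;
      max += 1;
    }
  }
  return { min: min / a.length, max: max / a.length };
}

// Four characters: one match, one mismatch, two with an unknown state.
const range = distanceRange([0, 1, null, 1], [0, 0, 1, null]);
```

With no unknown cells, the two bounds coincide and the range collapses to a single distance.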




08 August 2012

How similar are we, anatomically, to other primates?

There is no objective way to measure anatomical similarity, but you can get a sense by converting character matrices into distance matrices. I've done this for the matrix used by Diogo & Wood (2011), which looked at soft tissue anatomy. Here is a bar chart showing how similar each taxon is to humans:


Dangit, 2011, not 2010. I'll fix it later.
As you can see, the distances for great apes are well-marked and exactly what you'd expect based on phylogeny, but past that it gets a bit fuzzy. Moving outward from the great apes we get to Old World monkeys, then gibbons (from phylogeny you'd expect gibbons first, but the difference is so minor I'm sure it's meaningless), then a mixture of non-catarrhine primates, and finally non-primates.

This figure was generated using a JavaScript library I'm developing. I'll say more later, but rest assured it will be free and open source.
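In the meantime, here is a rough sketch of the basic idea of turning a character matrix into a distance matrix. It is not the library's code: the type and function names are made up, the distance used is simply the fraction of characters that differ, and the toy states are invented rather than taken from Diogo & Wood (2011).

```typescript
type CharacterMatrix = { [taxon: string]: number[] };
type Distances = { [taxon: string]: { [other: string]: number } };

// Fraction of characters whose states differ between two taxa.
function characterDistance(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Rows must be non-empty and of equal length");
  }
  let differing = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) differing++;
  }
  return differing / a.length;
}

// Builds the full taxon-by-taxon distance matrix.
function toDistanceMatrix(matrix: CharacterMatrix): Distances {
  const taxa = Object.keys(matrix);
  const result: Distances = {};
  for (const a of taxa) {
    result[a] = {};
    for (const b of taxa) {
      result[a][b] = characterDistance(matrix[a], matrix[b]);
    }
  }
  return result;
}

// Invented character states for illustration only.
const toyMatrix: CharacterMatrix = {
  human: [1, 1, 1, 1],
  chimpanzee: [1, 1, 1, 0],
  gorilla: [1, 1, 0, 0],
  gibbon: [0, 1, 0, 0],
};
const toyResult = toDistanceMatrix(toyMatrix);
```

The bar chart corresponds to reading off a single row of such a matrix (the human one) and sorting the taxa by distance.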

Expect to see some more stuff like this on A Three-Pound Monkey Brain soon.

References


  • Diogo & Wood (2011). Soft-tissue anatomy of the primates: phylogenetic analyses based on the muscles of the head, neck, pectoral region and upper limb, with notes on the evolution of these muscles. Journal of Anatomy 219:273–359. doi:10.1111/j.1469-7580.2011.01403.x

02 May 2012

The PhyloCode Will Not Be Amended

At least for now.


In a 10-1 decision, the Committee on Phylogenetic Nomenclature voted to reject the wholesale adoption of a proposal to amend the PhyloCode that would have greatly changed how it handles species and species names. However, the CPN has decided to discuss the possibility of using some ideas in the proposal.

02 April 2012

An Idea for the EOL Phylogenetic Tree Challenge

Earlier this year, the Encyclopedia of Life announced the EOL Phylogenetic Tree Challenge. The goal: to produce "a very large, phylogenetically-organized set of scientific names suitable for ingestion into the Encyclopedia of Life as an alternate browsing hierarchy". The prize: an all-expenses-paid trip to iEvoBio 2012 in Ottawa!

This interested me greatly, because:

  1. It's exactly the sort of thing I'm working on for PhyloPic.
  2. I can't really justify paying for a trip to iEvoBio this year. (Phyloinformatics is my hobby, not my profession!)
After reading Rod Page's thoughts on the challenge, I came up with a basic idea, and started to implement it. Unfortunately, now that we're two weeks from the deadline, I'm realizing that:
  1. I do not have the time to complete it.
  2. Even if it were paid for, I can't justify a trip on my own out of town right now.
Why not? Simply put, this.

So, instead, I'm going to outline the general approach I was going to take, and if someone else wants to run with it, knock yourself out. (Just give me partial credit.)

27 February 2012

What Is Phylogenetic Nomenclature?

Sometimes when discussing the PhyloCode, I get the feeling a lot of potentially interested parties don't understand what phylogenetic nomenclature actually is. I have gone into excruciating detail on this topic elsewhere, but who wants to be excruciated? So here's a brief summary of the process of creating a phylogenetic taxonomy.

1. Declare Operational Taxonomic Units
Result: Alpha Taxonomy

The very first step is to decide what your units are. Are you dealing with individual organisms? Populations? Species? Which ones? Whatever you select, there should be an unambiguous way of referring to these taxonomic units (specimen numbers, species names, etc.).

Phylogenetic nomenclature is flexible as to how you determine and name taxonomic units. (Although the names must be relatable to those used in definitions [see Step 3].)

Example: My operational taxonomic units are the whale species Aetiocetus cotylalveus, Balaena mysticetus, Balaenoptera physalus, Delphinus delphis, and Monodon monoceros.

Operational Taxonomic Units
Silhouettes by Chris huh and T. Michael Keesey, taken from PhyloPic.
Image license: CC-BY-SA 3.0

22 February 2012

Guest Post: The Consolidation of Language


Today we have another guest post by Elaine Hirsch, this time on the thorny issue of language consolidation. On the one hand, it's a terrible tragedy that so many languages are going extinct. On the other hand, it's difficult to function as a global society when za nafrur hun tnayr nart nir nils.

The need to learn a commonly-spoken language in today's world has been accelerated by the prevalence of communication technology, which has turned the world into a global community. An increase in the use of the internet throughout the world has resulted in a small set of languages dominating the world population, resulting in the elimination of many others. The consolidation of language has become especially important in industry and business. However, language consolidation has buttressed barriers to a wide range of studies, ranging from marketing to engineering to medical transcription. This is due to the fact that the ability to communicate, regardless of culture, can mean the difference between success and failure.

15 February 2012

Amending the PhyloCode: The Species Problem

Earlier I mentioned a proposal by Cellinese, Baum, and Mishler to make a major revision to the PhyloCode, removing pretty much all mention of "species". In this post I'm going to take a high-level look at some of the proposed changes.

13 February 2012

Half a Thousand Silhouettes

Last night PhyloPic reached 500 images! Here's the 500th, a Siberian tiger (Panthera tigris altaica) by Steven Traver:


Steven submitted 71 silhouettes in the past week! (All vector, too.) I'd like to take a moment to recognize all the people who have submitted silhouettes numbering in the double digits:

26 January 2012

PhyloPic Is Back!

Last year, I launched a project called PhyloPic. The goal of this project was to create an open database of freely reusable silhouette images of organisms. Furthermore, it featured a phylogenetic taxonomy so that, if a taxon wasn't illustrated, an approximation could be found.

I launched it as a "public alpha", meaning that it wasn't complete and still had some bugs. The year turned out to be very busy for me: an awful thing happened, and a wonderful thing happened. And I didn't really have time to push PhyloPic to the next level.

Unfortunately, I hadn't built it well enough in the first place. The architecture was not optimized, and the site became extremely slow and buggy. I took it offline, hoping to release a new version in short order, but it turned out to need a lot more work than I first thought.

Happily, it is now ready again!

16 January 2012

In Anticipation: The Evolution of the Raven, in Silhouettes

Any day now there will be a relaunch of a certain project I launched last year. (Just working through some technical details.) In anticipation of that, here's the evolutionary history of the Common Raven (Corvus corax), illustrated with silhouettes:


A Proposal to Amend the PhyloCode


The draft PhyloCode has been in a pretty stable form for a while. But recently, there has been a proposal to drastically change how it handles species. You can read the proposal here: 



The first paragraph:

The overarching goal of this proposal is to remove all mention of "species" from the PhyloCode. Detailed justifications for this goal are given in a supporting paper (Cellinese, Baum, and Mishler, in review); here we present a summary of the main arguments, along with specific proposals for change.

Before I weigh in on this, I'm curious as to what other people think. Please comment below, or send comments to David Marjanović, the Secretary of the Committee on Phylogenetic Nomenclature.

UPDATE:
If anyone would like a Microsoft Word version of this document, just ask.

ANOTHER UPDATE:
I have weighed in.