08 December 2012

SHA-1 with Typed Arrays

There are already SHA-1 implementations for JavaScript, but all the ones I found use strings as input. This is fine for generating hashes from small amounts of data, but not so great for large binary files. So I've created a SHA-1 library that optionally accepts ArrayBuffer objects.


Note that this will not work on browsers that do not support Typed Arrays.

I ran some tests and found that in Chrome it takes less than half a second to hash 10 MB of data.
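To give a feel for what hashing binary data involves, here is a minimal sketch of SHA-1 that reads its input straight from an ArrayBuffer. It is not the code of the library described above; the names `sha1Hex` and `rotateLeft` are made up for this example, and it makes no attempt at being fast.

```typescript
// Hypothetical helpers for illustration; not the API of the library described above.
function rotateLeft(value: number, bits: number): number {
  return (value << bits) | (value >>> (32 - bits));
}

function sha1Hex(buffer: ArrayBuffer): string {
  const input = new Uint8Array(buffer);
  // Padding: a 0x80 byte, zeros, then the 64-bit big-endian bit length,
  // so that the total length is a multiple of 64 bytes.
  const paddedLength = (((input.length + 8) >> 6) + 1) << 6;
  const padded = new Uint8Array(paddedLength);
  padded.set(input);
  padded[input.length] = 0x80;
  const view = new DataView(padded.buffer);
  const bitLength = input.length * 8;
  view.setUint32(paddedLength - 8, Math.floor(bitLength / 0x100000000), false);
  view.setUint32(paddedLength - 4, bitLength >>> 0, false);

  let h0 = 0x67452301, h1 = 0xefcdab89, h2 = 0x98badcfe, h3 = 0x10325476, h4 = 0xc3d2e1f0;
  const w = new Int32Array(80);
  for (let offset = 0; offset < paddedLength; offset += 64) {
    // Load sixteen big-endian words, then expand them to eighty.
    for (let i = 0; i < 16; i++) w[i] = view.getInt32(offset + i * 4, false);
    for (let i = 16; i < 80; i++) {
      w[i] = rotateLeft(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    }
    let a = h0, b = h1, c = h2, d = h3, e = h4;
    for (let i = 0; i < 80; i++) {
      let f: number;
      let k: number;
      if (i < 20) { f = (b & c) | (~b & d); k = 0x5a827999; }
      else if (i < 40) { f = b ^ c ^ d; k = 0x6ed9eba1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8f1bbcdc; }
      else { f = b ^ c ^ d; k = 0xca62c1d6; }
      const temp = (rotateLeft(a, 5) + f + e + k + w[i]) | 0;
      e = d; d = c; c = rotateLeft(b, 30); b = a; a = temp;
    }
    h0 = (h0 + a) | 0; h1 = (h1 + b) | 0; h2 = (h2 + c) | 0;
    h3 = (h3 + d) | 0; h4 = (h4 + e) | 0;
  }
  // Format each 32-bit word as eight lowercase hex digits.
  return [h0, h1, h2, h3, h4]
    .map(function (word) { return ("00000000" + (word >>> 0).toString(16)).slice(-8); })
    .join("");
}
```

Passing an empty `ArrayBuffer` should give the well-known digest `da39a3ee5e6b4b0d3255bfef95601890afd80709`. For large files a real implementation would process the data in chunks rather than copying it all into one padded array.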

Hope someone finds this useful! (At the very least, I will.)

02 December 2012

Haeckel: A Code Library for Browser-Based Evolutionary Diagrams

For a while now I've been writing posts with diagrams like this one, showing the evolution of cranial capacity in mangani over the past seven million years:


How did I make them? Originally it was all ad-hoc ActionScript code, but more recently I've begun to organize the code into a library and translate it into TypeScript (which, in turn, is automatically translated into JavaScript). Although this library is still in progress, I've decided it's at a stage where I can open it to the general public.

This library includes functionality for:

  • Modeling scientific concepts such as taxa, phylogeny, character states, stratigraphy, and geography.
  • Processing scientific data (notably calculating morphological distance and inferring unknown character states).
  • Rendering data into charts as Scalable Vector Graphics, using RaphaëlJS.
For a while I struggled with what to call this library. It's neither purely about science nor purely about graphics. Finally I got my inspiration from RaphaëlJS, a graphics library named after a great artist. I named my library after a man who was both a great artist and a great biologist:

19 October 2012

RaphaëlJS + TypeScript = RaphaëlTS

RaphaëlJS is a library, written in JavaScript, for creating vector graphics on the web. It's commonly used because it supports all web browsers, including the good ones (which use SVG) and Internet Explorer (which uses VML). Among other things, it's the rendering engine of the ExtJS and Sencha frameworks.

TypeScript is a new web language that extends JavaScript, adding optional strict typing. This is a big improvement on JavaScript because it allows more errors to be caught at compile-time rather than at run-time (for example, if you pass a number to a function that expects an array). However, it's not as obnoxious as some strictly-typed languages, because the typing is optional and often implicit.
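A small sketch of what that looks like in practice. The function name `total` is made up for this example; the point is only that one annotation is explicit and the rest are inferred.

```typescript
// Explicit annotation: the parameter must be an array of numbers.
function total(values: number[]): number {
  let sum = 0; // implicitly typed as number, no annotation needed
  for (let i = 0; i < values.length; i++) {
    sum += values[i];
  }
  return sum;
}

const result = total([1, 2, 3]);

// The compiler rejects the call below before the code ever runs,
// because a number is not assignable to number[]:
// total(42);
```

In plain JavaScript the bad call would only misbehave at run-time (here it would quietly return 0, since a number has no `length`).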

I've spent some time recently playing around with TypeScript and my verdict: it's wonderful. I only want to use this from now on (at least for large projects). It's a great language, and, since it's an extension (or superset) of JavaScript, all existing JavaScript code works just fine in it. (This is a big advantage it has over Dart, an otherwise excellent language as well.)

But while you can just include a JavaScript library in your TypeScript projects, you don't get the full benefit of the compile-time checking (or the code-hinting) unless that library's entities have been declared. TypeScript allows for ambient declarations: defining or partially defining an entity (class, interface, method, variable, etc.) without actually creating it. As an example (and a very useful one at that), here is a TypeScript declaration file for jQuery: jQuery.d.ts
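Here is a minimal sketch of an ambient declaration. The library name `tinyShapes` and its `circleArea` method are invented for this example; the first statement merely stands in for an untyped JavaScript file that would normally be loaded separately.

```typescript
// Stand-in for a plain JavaScript library that a <script> tag would load.
// "tinyShapes" is a made-up name used only for this example.
(globalThis as any).tinyShapes = {
  circleArea: function (radius: any) { return Math.PI * radius * radius; }
};

// Ambient declaration: describes the shape of the global for the compiler,
// but emits no JavaScript of its own.
declare const tinyShapes: {
  circleArea(radius: number): number;
};

const area: number = tinyShapes.circleArea(2);

// With the declaration in place, this would be a compile-time error:
// tinyShapes.circleArea("2");
```

In a real project the `declare` part would live in its own `.d.ts` file, like the jQuery one mentioned above.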

Today I wondered if anyone had made something like this for RaphaëlJS. I wasn't able to find one, so ... I made one. Here you go: https://bitbucket.org/keesey/raphaelts/src

If you're using the Visual Studio plugin or the TypeScript playground (other editors are bound to come out soon), you can now program type-safe RaphaëlJS code with auto-complete (=IntelliSense).

I've verified that it compiles, but I haven't fully tested it. If anyone experiences issues, please let me know. (Or feel free to fork the project.)

Hope someone else finds it useful ... I know I will!

21 August 2012

Using Morphological Distance to Determine Genera

A genus is not an empirical entity. It's a bookkeeping convention, up to the personal whims of the taxonomist. And of course this leads to a huge mess.
Homo ergaster,
from PhyloPic

Depending on the taxonomist (and, for early forms, the phylogeny), the human total group includes anywhere from one genus (Homo) to ten (Ardipithecus, Australopithecus, Homo, Kenyanthropus, Orrorin, Paranthropus, Paraustralopithecus, Praeanthropus, Sahelanthropus, and Zinjanthropus, not even mentioning obsolete ones like Telanthropus and Pithecanthropus). We could try to clean up this mess by creating phylogenetic definitions and reinterpreting the genera as clades, except that all of the type species are thought by at least some researchers to be ancestral forms (with the exceptions of Homo sapiens, Paranthropus robustus, and Zinjanthropus boisei). For example, if Australopithecus is a clade that includes Australopithecus africanus, then it might also include Homo, and genera are not allowed to overlap under the ICZN.

After playing around with morphological distances based on Strait & Grine's (2004) character matrix, it occurred to me you could base genera on morphological distance. You have to make subjective decisions as to how many genera you want and which matrix to use, but the rest follows naturally. You just look at each species and see which valid type species it's closest to. Here's what I found:

If we use all of the above genera, then the taxonomy looks like this:
  • Ardipithecus
    • Ardipithecus anamensis (Not in Praeanthropus, despite sometimes being synonymized with afarensis! Although it should be noted that ramidus is much better known now than in 2004, so this may have changed.)
    • Ardipithecus ramidus
  • Australopithecus
    • Australopithecus africanus
  • Homo
    • Homo ergaster
    • Homo habilis (although it is closer to garhi than to sapiens!)
    • Homo sapiens
  • Kenyanthropus
    • Kenyanthropus garhi (!)
    • Kenyanthropus platyops
    • Kenyanthropus rudolfensis (although it is closer to ergaster and habilis than to platyops, it is still closer to platyops than to sapiens)
  • Orrorin
    • Orrorin tugenensis (not included in the study, but this is nomenclaturally where it would go unless found to be a synonym)
  • Paranthropus
    • Paranthropus robustus
  • Paraustralopithecus
    • Paraustralopithecus aethiopicus
  • Praeanthropus
    • Praeanthropus afarensis
  • Sahelanthropus
    • Sahelanthropus tchadensis
  • Zinjanthropus
    • Zinjanthropus boisei
Not included: species that are not types and were not included in the study, like Ardipithecus kadabba (scrappy craniodental remains), Australopithecus sediba (hadn't been discovered in 2004), Homo heidelbergensis (pretty close to sapiens anyway), etc.

It must be said that Zinjanthropus and Paraustralopithecus are not that commonly used. If we remove Paraustralopithecus, then aethiopicus predictably falls into Zinjanthropus. If we remove Zinjanthropus as well, then both aethiopicus and boisei predictably fall into Paranthropus.
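The procedure itself is simple: put each species in the genus whose type species it is closest to, then drop a genus and repeat. Here is a minimal sketch. The distances are made-up placeholders chosen to reproduce the pattern just described, not values from Strait & Grine's matrix, and the names `Genus`, `distance`, and `assignGenera` are hypothetical.

```typescript
interface Genus { name: string; typeSpecies: string; }

// Made-up placeholder distances for illustration only.
// Each pair is stored once, with the two epithets in alphabetical order.
const toyDistances: { [pair: string]: number } = {
  "aethiopicus|boisei": 0.20,
  "aethiopicus|robustus": 0.30,
  "boisei|robustus": 0.25
};

// Symmetric lookup; a species is at distance 0 from itself.
function distance(a: string, b: string): number {
  if (a === b) return 0;
  return toyDistances[a < b ? a + "|" + b : b + "|" + a];
}

// Assign each species to the genus with the nearest type species.
function assignGenera(species: string[], genera: Genus[]): { [species: string]: string } {
  const result: { [species: string]: string } = {};
  for (const s of species) {
    let best = genera[0];
    for (const g of genera) {
      if (distance(s, g.typeSpecies) < distance(s, best.typeSpecies)) best = g;
    }
    result[s] = best.name;
  }
  return result;
}

const species = ["aethiopicus", "boisei", "robustus"];
const allGenera: Genus[] = [
  { name: "Paranthropus", typeSpecies: "robustus" },
  { name: "Paraustralopithecus", typeSpecies: "aethiopicus" },
  { name: "Zinjanthropus", typeSpecies: "boisei" }
];

const withAll = assignGenera(species, allGenera);
// Remove Paraustralopithecus, then Zinjanthropus as well.
const withoutParaustralopithecus = assignGenera(
  species, allGenera.filter(function (g) { return g.name !== "Paraustralopithecus"; }));
const onlyParanthropus = assignGenera(
  species, allGenera.filter(function (g) { return g.name === "Paranthropus"; }));
```

The only subjective inputs are the list of genera you allow and the distance table; everything else follows mechanically.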

Praeanthropus is also not that widely used. If we remove that as well, we get:
Praeanthropus afarensis,
from PhyloPic
  • Ardipithecus
    • Ardipithecus anamensis
    • Ardipithecus ramidus
  • Australopithecus
    • Australopithecus afarensis
    • Australopithecus africanus
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
  • Kenyanthropus
    • Kenyanthropus garhi
    • Kenyanthropus platyops
    • Kenyanthropus rudolfensis
  • Orrorin
    • Orrorin tugenensis
  • Paranthropus
    • Paranthropus aethiopicus
    • Paranthropus boisei
    • Paranthropus robustus
  • Sahelanthropus
    • Sahelanthropus tchadensis
Kenyanthropus is a rather controversial genus. If we remove it, we get:
  • Ardipithecus
    • Ardipithecus anamensis
    • Ardipithecus garhi
    • Ardipithecus platyops
    • Ardipithecus ramidus
  • Australopithecus
    • Australopithecus afarensis
    • Australopithecus africanus
    • Australopithecus rudolfensis (still refuses to go with sapiens!)
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
  • Orrorin
    • Orrorin tugenensis
  • Paranthropus
    • Paranthropus aethiopicus
    • Paranthropus boisei
    • Paranthropus robustus
  • Sahelanthropus
    • Sahelanthropus tchadensis
If we also remove Sahelanthropus, then tchadensis goes easily into Ardipithecus. I assume tugenensis would as well, if we removed Orrorin, but that wasn't included in the study since the craniodental material is so scant. If we remove Paranthropus, its species go very, very reluctantly into Australopithecus (by which I mean despite not being that close to africanus):
Sahelanthropus tchadensis,
from PhyloPic
  • Ardipithecus
    • Ardipithecus anamensis
    • Ardipithecus garhi
    • Ardipithecus platyops
    • Ardipithecus ramidus
    • Ardipithecus tchadensis
    • Ardipithecus tugenensis?
  • Australopithecus
    • Australopithecus aethiopicus
    • Australopithecus afarensis
    • Australopithecus africanus
    • Australopithecus boisei
    • Australopithecus robustus
    • Australopithecus rudolfensis
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
Ardipithecus ramidus was originally named as Australopithecus ramidus. If we remove Ardipithecus, its species predictably end up in Australopithecus:
  • Australopithecus
    • Australopithecus anamensis
    • Australopithecus aethiopicus
    • Australopithecus afarensis
    • Australopithecus africanus
    • Australopithecus boisei
    • Australopithecus garhi
    • Australopithecus platyops
    • Australopithecus ramidus
    • Australopithecus robustus
    • Australopithecus rudolfensis
    • Australopithecus tchadensis
    • Australopithecus tugenensis?
  • Homo
    • Homo ergaster
    • Homo habilis
    • Homo sapiens
Now for the final cut. What happens when we remove Australopithecus?
Pan paniscus,
from PhyloPic
  • Gorilla
    • Gorilla aethiopicus (!!! although gorilla only beats sapiens by a hair)
    • Gorilla beringei
    • Gorilla garhi (!!!)
    • Gorilla gorilla
    • Gorilla tchadensis (admittedly, this was suggested by Senut and Pickford)
  • Homo
    • Homo africanus
    • Homo boisei
    • Homo ergaster
    • Homo habilis
    • Homo platyops
    • Homo robustus
    • Homo rudolfensis
  • Pan
    • Pan afarensis (!!)
    • Pan anamensis (!!)
    • Pan paniscus
    • Pan ramidus (!!)
    • Pan troglodytes
  • incertae sedis
    • Orrorin tugenensis
Yes, some of the stem-humans get pulled in with chimpanzees or gorillas! And there seems to be little rhyme or reason as to which go where. The splitting of the "robust australopithecines" is most bizarre (although, as noted, it only takes a tiny nudge to put aethiopicus in Homo with the others).

In summary:
  • This is just one matrix, only focusing on one area of the anatomy.
  • There are huge amounts of uncertainty with some of these taxa.
  • Even without the uncertainty, there is no objective way to measure morphological distance.
  • Even if there were, it might not be a good idea to use it to determine generic boundaries.
  • Generic boundaries are stupid, anyway.

Closest to Humans: The Skull & Tooth Version

Previously I posted a diagram showing how different various primates are from humans, based on soft tissue characters.  Here is a similar diagram, but using the craniodental characters from Strait & Grine's (2004) matrix. Unlike the soft-tissue diagram, this includes fossil taxa.



The black lines indicate the probable distance, as inferred from the phylogeny. The gradients show the actual range of uncertainty. (They'd also show polymorphism, but this matrix has none.) Ordering is according to probable distance; using the mean of the range of uncertainty yields slightly different results.

As in the soft-tissue diagram, chimpanzees are closer to humans than any other living primates are. But, oddly, gibbons and colobus monkeys are closer to humans than gorillas and orangutans! My guess is that this is because gorillas and orangutans are more derived from the ancestral catarrhine state than gibbons or colobus monkeys. (This probably also explains why the Paranthropus species, a.k.a. "robust australopithecines", are further than chimpanzees, although it is strange that the earliest one, P. aethiopicus, is furthest.)

To the right of chimpanzees is a very unsurprising pattern: "gracile australopithecines", then basal Homo species, then the large-brained Homo ergaster, and finally the huge-brained Homo sapiens.

You can also see how well-known the fossil crania are, ranging from the very well-known Australopithecus africanus to the crushed skull of Kenyanthropus platyops. Note that Ardipithecus ramidus is much better known now than when this study was done.

Again a disclaimer: this is not an objective measure of morphological similarity (there is no such thing), and it is definitely not a phylogenetic analysis (even if the data is taken from one).



References



  • Strait & Grine (2004). Inferring hominoid and early hominid phylogeny using craniodental characters: the role of fossil taxa. Journal of Human Evolution 47:399–452. doi:10.1016/j.jhevol.2004.08.008

17 August 2012

Refinement: Primate Anatomical Similarity

Earlier, I posted a chart showing how similar humans are to other primates (and other euarchontoglires), as measured from Diogo & Wood's (2011) soft-tissue character matrix. A problem with the earlier version was that it didn't reflect uncertainty in that matrix. (It also wouldn't show polymorphisms, although that matrix doesn't have any, anyway.) I've created a new version that shows the maximum and minimum possible distance, given the uncertainties in the matrix.
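One way to get such a range is sketched below. It rests on assumptions that may not match the method actually used for the chart: characters are unordered and equally weighted, each score is a set of possible states (an unknown score lists every state), and both taxa are scored for the same non-empty list of characters. The names `StateSet` and `distanceRange` are hypothetical.

```typescript
// The possible states of one character in one taxon.
// A certain score has one entry; an uncertain score has several.
type StateSet = number[];

function distanceRange(a: StateSet[], b: StateSet[]): { min: number; max: number } {
  let min = 0;
  let max = 0;
  for (let i = 0; i < a.length; i++) {
    const statesA = a[i];
    const statesB = b[i];
    // If the sets share a state, the taxa might be identical for this character.
    const overlap = statesA.some(function (s) { return statesB.indexOf(s) >= 0; });
    // Only two identical, certain scores rule out any difference.
    const forcedEqual =
      statesA.length === 1 && statesB.length === 1 && statesA[0] === statesB[0];
    if (!overlap) min += 1;
    if (!forcedEqual) max += 1;
  }
  return { min: min / a.length, max: max / a.length };
}

// Toy scores for four binary characters (not real data).
const taxonA: StateSet[] = [[0], [1], [0, 1], [1]];
const taxonB: StateSet[] = [[0], [0], [1], [0, 1]];
const range = distanceRange(taxonA, taxonB);
```

For these toy scores, one character certainly differs and two more might, so the range runs from 0.25 to 0.75.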




08 August 2012

How similar are we, anatomically, to other primates?

There is no objective way to measure anatomical similarity, but you can get a sense by converting character matrices into distance matrices. I've done this for the matrix used by Diogo & Wood (2011), which looked at soft tissue anatomy. Here is a bar chart showing how similar each taxon is to humans:


Dangit, 2011, not 2010. I'll fix it later.
As you can see, the distances for great apes are well-marked and exactly what you'd expect based on phylogeny, but past that it gets a bit fuzzy. Moving outward from the great apes we get to Old World monkeys, then gibbons (from phylogeny you'd expect gibbons first, but the difference is so minor I'm sure it's meaningless), then a mixture of non-catarrhine primates, and finally non-primates.

This figure was generated using a JavaScript library I'm developing. I'll say more later, but rest assured it will be free and open source.
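As a rough illustration of turning a character matrix into a distance matrix, here is a minimal sketch. It is not the library's code. It assumes the simplest possible measure, the fraction of comparable characters in which two taxa differ, with missing scores (`null`) skipped. The taxon names and scores are toy values, and `pairDistance` and `toDistanceMatrix` are hypothetical names.

```typescript
type CharacterMatrix = { [taxon: string]: (number | null)[] };
type DistanceMatrix = { [taxon: string]: { [taxon: string]: number } };

// Fraction of characters scored in both taxa for which the states differ.
function pairDistance(a: (number | null)[], b: (number | null)[]): number {
  let compared = 0;
  let differing = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] === null || b[i] === null) continue; // skip missing data
    compared++;
    if (a[i] !== b[i]) differing++;
  }
  return compared === 0 ? NaN : differing / compared;
}

function toDistanceMatrix(matrix: CharacterMatrix): DistanceMatrix {
  const taxa = Object.keys(matrix);
  const result: DistanceMatrix = {};
  for (const x of taxa) {
    result[x] = {};
    for (const y of taxa) {
      result[x][y] = pairDistance(matrix[x], matrix[y]);
    }
  }
  return result;
}

// Toy matrix: three taxa, four binary characters, one missing score.
const toyMatrix: CharacterMatrix = {
  focal: [0, 1, 1, 0],
  near: [0, 1, 0, 0],
  far: [1, 0, null, 1]
};
const distances = toDistanceMatrix(toyMatrix);

// Rank the other taxa by distance from the focal taxon, as in a bar chart.
const ranked = Object.keys(toyMatrix)
  .filter(function (t) { return t !== "focal"; })
  .sort(function (s, t) { return distances["focal"][s] - distances["focal"][t]; });
```

Reading off one row of the distance matrix and sorting it gives the ordering for a chart like the one above.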

Expect to see some more stuff like this on A Three-Pound Monkey Brain soon.

References


  • Diogo & Wood (2011). Soft-tissue anatomy of the primates: phylogenetic analyses based on the muscles of the head, neck, pectoral region and upper limb, with notes on the evolution of these muscles. Journal of Anatomy 219:273–359. doi:10.1111/j.1469-7580.2011.01403.x