18 October 2013

The PhyloCode Has a Deadline

As most of you probably know, the PhyloCode (more verbosely, the International Code of Phylogenetic Nomenclature) is a proposed nomenclatural code, intended as an alternative to the rank-based codes. It was first drafted in April 2000, and at that time the starting date was given as "1 January 200n". On this date the code would be enacted and published along with a companion volume, which would provide the first definitions under the code, establishing best practices and defining the most commonly-used clade names across all fields of biology.

Well, the '00s (the zeroes? the aughts?) came and went without the code being enacted. The hold-up was not the code itself, which has been at least close to its final form since 2007. (The last revision, in January 2010, was minor.) And it hasn't been the software for the registration database, which has been completed. The hold-up was the companion volume, which turned out to be a much more daunting project than expected. (And considering that the zoological code took 66 years to go from being proposed to being published, perhaps the initial estimate should have been hedged, anyway.)

At the 2008 meeting of the International Society for Phylogenetic Nomenclature (ISPN), this problem was discussed. It was decided that the companion volume should be narrowed in scope. Instead of waiting to get definitions for commonly-used clade names across all fields of biology (many of which did not even have willing authors), entries would be limited to those already in progress. Later on, a revision was also made to the editorial process to help speed things up.

Now for some news: at the website for the ISPN (recently revamped by yrs trly), there is a new progress report for Phylonyms, the companion volume to the PhyloCode. There will be at most 268 entries. Currently 186 of those (over two thirds) have already been accepted. The rest are at various stages of review. But perhaps most excitingly, there is a deadline:

The contract with University of California Press calls for the manuscript to be submitted by September 1, 2014.

Yes, folks, we will see the PhyloCode enacted in our lifetime! (Barring nuclear holocaust or alien invasion.)

06 September 2013

Solution to Rampant Monotypy: Subgenera

Genus names are stupid. They have two jobs, and they do them both poorly:

1. Refer to a taxon.
2. Form the first part of the names of all species within that taxon.

They do #1 poorly because they're defined typologically. The definition for a genus is just, "Some taxon that includes the type species." But they could do this task well if they were given phylogenetic definitions instead.

But that doesn't work, either, because it conflicts with #2. Taxa defined by phylogenetic definitions may overlap, or be empty. For #2 to work, every single species has to be part of one genus (and only one genus). Phylogenetically-defined taxa don't really work that way.

So genus names are stupid. But we have to use them, because there's no other system for naming species.

Because they refer to taxa poorly, different disciplines often have wildly different ways of using genus names. In entomology, a genus may have hundreds of species. But, increasingly in dinosaur paleontology, each genus gets one species. Nearly every single Mesozoic dinosaur genus is monotypic.

This is a pattern we see over and over in recent years:

1. A new dinosaur species is discovered.
2. Researchers do a cladistic analysis and determine that it is the sister group to another species, already named, Originalgenus oldschoolensis.
3. At this point, most researchers in other fields would name the new species something like Originalgenus noobius. But, no, even though it's barely different from O. oldschoolensis, it gets a new genus, so it's Newguy noobius.

Today's researchers do have an excuse prepared for #3. It goes like this:

1. "Sure, this analysis shows it as the sister group of Originalgenus oldschoolensis. But what if a future analysis shifts it a bit so that they no longer form a clade? Cladistic taxonomies may require it to be placed in a new genus."
2. "We sure as hell aren't going to let anyone else name that genus; not after all the work we did describing it!"

Ignoring the mild egomania in #2, this sounds reasonable enough. But this way of thinking has given us a huge number of completely redundant names, as well as pushing dinosaur paleontology into an extreme corner of the "splitter vs. lumper" debate. Isn't there a better way?

There Is a Better Way

Just give your species a new subgenus!

Taxonomy

Genus: Originalgenus Original Author 1900

Subgenus: Newguy subgen. nov.

Species: Originalgenus noobius sp. nov.

Now, as long as O. noobius continues to be regarded as the sister group (or otherwise "close enough") to O. oldschoolensis, you just keep the status quo. But if things get shaken up and O. noobius requires a different genus name, by the rules of the ICZN, it has to be Newguy. And you still get the credit!

I know, it's stupid ... but it works!

20 May 2013

PhyloPic Submissions Come in Fits and Bursts (API Example)

The recent surge of activity as PhyloPic neared its 1000th image got me to wondering about the pattern of image submissions over time. Fortunately it's very easy to collect this data using the PhyloPic API.

Step 1. Determine the number of submissions.

This is a very simple API call:
http://phylopic.org/api/a/image/count

...which yields:
{"result": 1024, "success": true}

Step 2. Pull down the submission time data for all images.

Now that we have the total number, we can grab data for all of the images at once, like so:
http://phylopic.org/api/a/image/list/0/1024

But this just yields a list of 1024 image entries that each look like this:
{"uid": "1353c901-f652-4563-941d-7b12bc7a86df"}

Not very useful. To get any actual data fields from the PhyloPic API, you have to be more specific:

http://phylopic.org/api/a/image/list/0/1024?options=submitted

Now each entry is a lot more useful:

{"uid": "1353c901-f652-4563-941d-7b12bc7a86df", "submitted": "2013-05-19 16:05:12"}
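The URLs in Steps 1 and 2 follow a simple pattern, so they can be built programmatically. Here's a minimal sketch in TypeScript. The function name `imageListUrl` is hypothetical (it isn't part of the PhyloPic API), and it only handles the single-option case shown above.

```typescript
// Base path taken from the URLs shown in this post.
const API_BASE = "http://phylopic.org/api/a";

// Hypothetical helper: builds an image-list URL for a range of images,
// optionally requesting one extra data field (e.g. "submitted").
function imageListUrl(start: number, length: number, option?: string): string {
    const url = API_BASE + "/image/list/" + start + "/" + length;
    return option ? url + "?options=" + encodeURIComponent(option) : url;
}

console.log(imageListUrl(0, 1024));
// http://phylopic.org/api/a/image/list/0/1024
console.log(imageListUrl(0, 1024, "submitted"));
// http://phylopic.org/api/a/image/list/0/1024?options=submitted
```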

Step 3. Process the data.

Once you have this, it's a pretty simple matter for a JavaScript programmer to strip out the month and tally the images. I did this and generated a bar chart using Google's Code Playground. Here it is:

(I left out May since it's not over yet. Apologies for the gaps.)
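For the curious, the tallying in Step 3 can be sketched roughly like this. This is not the original script, just an illustration in TypeScript. It assumes the entries have already been fetched and look like the example in Step 2; all but the first sample entry are made up.

```typescript
// Shape of one entry, as shown in Step 2.
interface ImageEntry {
    uid: string;
    submitted: string; // e.g. "2013-05-19 16:05:12"
}

// Counts entries per month, keyed by "YYYY-MM".
function tallyByMonth(entries: ImageEntry[]): { [month: string]: number } {
    const tally: { [month: string]: number } = {};
    for (const entry of entries) {
        // "2013-05-19 16:05:12" -> "2013-05"
        const month = entry.submitted.slice(0, 7);
        tally[month] = (tally[month] || 0) + 1;
    }
    return tally;
}

// Sample data: only the first entry appears in this post; the rest are invented.
const sample: ImageEntry[] = [
    { uid: "1353c901-f652-4563-941d-7b12bc7a86df", submitted: "2013-05-19 16:05:12" },
    { uid: "hypothetical-uid-2", submitted: "2013-05-02 09:30:00" },
    { uid: "hypothetical-uid-3", submitted: "2013-04-28 21:15:45" },
];

const byMonth = tallyByMonth(sample);
console.log(byMonth["2013-05"]); // 2
console.log(byMonth["2013-04"]); // 1
```

The resulting object maps each month to a count, which is the shape most charting tools want for a bar chart.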

PhyloPic was officially launched on 21 February 2011. Most of the submissions for that month are ones that I "presubmitted" during development. (A lot are from Scott Hartman's skeletal drawings, including the very first submission.)

Submissions were strong going into March but then completely slacked off. I'm sure a lot of this was due to technical problems — the site became incredibly slow after a while. There were major architecture flaws.

I (mostly) fixed these and relaunched in January 2012. Interest was strong, and in February PhyloPic had its best month ever. But then submissions slacked off again.

A year later, in March 2013, I was getting ready to do another major upgrade. I added dozens of images in anticipation. Then I relaunched at the very end of the month. Sure enough, April was one of the best months ever, second only to February 2012.

May 2013 is currently going strong, but looking at this trend I start to wonder: how long will it last? And although I recently swore off doing massive updates, are they actually better for driving up submissions?

11 May 2013

PhyloPic Passes a Thousand Images!

Just a little while ago, PhyloPic reached its first 1000 silhouettes! Here's the thousandth, the eusauropod dinosaur Cetiosaurus oxoniensis, by Michael P. Taylor:

(Public Domain)

Several contributors seem to have all been vying for the spot. Around the same time we got some other lovely contributions. Gareth Monger contributed this upside-down butterfly, Aglais urticae:

(Creative Commons Attribution-ShareAlike 3.0 Unported)

He missed the 1000th spot and got 1002nd. Matt Martyniuk missed it on the opposite side, with this Lambeosaurus (hadrosaurid dinosaur) at 993rd:

(Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported)

Emily Willoughby got quite close, too, and intended this rather recognizable angiosperm leaf (Cannabis sativa) for the 1000th spot. Alas, it's 1007th:

(Creative Commons Attribution-ShareAlike 3.0 Unported)

(As she noted, 420 would have been a good number as well.)

Thanks to everyone who contributed to the first thousand silhouettes! It took two years to get here — may the next thousand be even faster!

02 April 2013

Why the PhyloPic Relaunch Took So Long

Or, A Lesson in Development Strategy.

As I announced last week, my website, PhyloPic, has been relaunched with a massive update. One of the key updates is a public API for developers. A lot of people have been looking forward to this, and it was actually almost ready for release last summer. So why didn't I release it?

Failure to Branch

Basal tracheophyte.
Public Domain.

As I was writing up the documentation for the API, I learned of Bootstrap, a CSS/JavaScript framework. I realized that it could solve a lot of the design issues I was having — problems with the site on mobile devices, older browsers, etc.

What I should have done: Created a new development branch for adding Bootstrap while continuing to polish up the API branch. That way, I could have released the API shortly while still being able to work on the design issues in parallel.

What I actually did: Continued working in the same branch, ensuring that I couldn't release the API update until the Bootstrap update was complete.

Having Other Projects

By the end of summer I was mostly done with the revisions, but there was still some cleanup to do. By now some other projects I'm attached to, one with other collaborators, were suffering. So I spent most of my free time in the autumn working on those. (I have a full-time job and a toddler, so that isn't much.)

Homo habilis.
Public Domain.

Becoming Enamored of New Technology

In the autumn, Microsoft released a preview version of TypeScript, and I quickly saw that it was going to be extremely useful. So I rewrote PhyloPic's client-side code; it wasn't too hard and it made further development a lot easier. This caused some delay up-front, but I don't regret it.

Becoming Enamored of the Wrong New Technology

Around this time I also realized that I could finally do away with the last bit of Flash on the website: the Image Submission Tool. HTML5 had become mature enough to do all the image manipulation in the browser itself. I did a lot of research, learning about the Canvas, Typed Arrays, etc. And after a lot of work I actually created an image-processing workflow that works in HTML5-enabled browsers. As a bonus, I got a little standalone project out of it: Pictish.

But there were problems. One is that the best existing JavaScript library for creating PNG files doesn't use Typed Arrays — it uses strings, which means that it is slow for large files. I tried creating my own PNG encoder, or adapting that one, but soon realized it was far too much work. Another problem is that I was no longer supporting older browsers (although this was a trade-off against supporting mobile platforms, so I didn't feel too bad about it).

But there was a much more fundamental danger: doing the image-processing in the client side meant that the API had to trust the client to do it properly. What if some developer used the PhyloPic API to add images to the database but didn't do it right? That could be disastrous.

Octopus bimaculatus.
Public Domain.

I realized I would have to do things the old-fashioned way: on the server. After a bit of research, I identified ImageMagick and Inkscape as the best tools. The new methodology was so completely different that I ended up making a lot of database changes, too. Until recently, all files were stored in the database; now they're just stored as flat files. The good news is that this makes load times faster.

Doing Things the "Right" Way

Throughout all this I had been making an effort to "dogfood" my own API, i.e., to use it on the site itself. This has the advantage of making load times faster, since the basic page can be cached and then the data can be loaded in secondarily in a much smaller format. Unfortunately this meant a lot of rewrites for how the pages are rendered.

After a while, the code to generate pages from the data had gotten really complex (mostly involving on-the-fly element generation using jQuery). Around the time I was redoing the Image Submission Page, I realized my whole approach was untenable. I needed a cleaner way to divorce presentation logic from control logic.

I ended up using Knockout for the entire site. It made things a lot more manageable.

In Summary

The biggest problem was my branching model, or, rather, my lack of one. Solitary developers often fall into this trap: we think that, since we're doing all the work, there's no need to have more than a single branch of development. At work, we've been using a proper branching model, with separate branches for separate lines of work, and found it very successful. Going forward, I plan to do this on PhyloPic as well. No more massive updates where everything is different. Just incremental features and fixes.

31 March 2013

PhyloPic Launch: API, Responsive Design, etc.

On Good Friday I took PhyloPic down. On Holy Saturday, I wrestled with errors caused by incongruities between the server and dev environments. And, lo, now, on Easter Sunday I announce that PhyloPic is back! (Actually, I already announced it on Twitter, but whatever.)

How smartphone users should see PhyloPic, more or less.

Major New Features:

A Developer API (using JSON). Now other people can build applications using PhyloPic data and images. (Yes, I am dogfooding it, so most of it should be pretty well-tested.)
Responsive design (using the ever-more-ubiquitous Bootstrap) — the site is now much more useable on mobile devices.
A Links Page, showing off work that uses PhyloPic or features it in some way.
Speedier load times (in theory, anyway).
Ranks for Contributors — if you submit one image, you're a "Specialist". Two, and you're a "General". Six, and you're a "Familiar". See where this is going?
Fewer requirements — most notably, Flash is no longer required to submit images.
Handy little icons on most taxon links — now you can tell if you're clicking on Gastonia the dicotyledonous plant or Gastonia the dinosaur. (Still rolling this out to all taxa.)

Preview Screenshots

A little glimpse of what I've been working on:

10 March 2013

"Year of Macrauchenia": Third and Final "All Your Yesterdays" Entry

I made a last-minute entry for the All Your Yesterdays contest:

Year of Macrauchenia

Macrauchenia was the greatest and last of the litopterns, a clade of stem-euungulates. This bizarre Pleistocene South American herbivore is often described as having the body of a llama and the head of a tapir. However, the body is only superficially llama-like (but with gigantic elbows) and the head is not at all tapir-like. (If any part of it resembles tapirs, it's the feet.) In fact, the skull isn't like that of any living terrestrial mammal. It has extremely dorsal nares, like trunked animals and cetaceans, but it lacks any place for trunk muscles to attach.

But there is a possible analogue — another group of terrestrial herbivores with extremely dorsal nares — and they even had long necks, too! I refer, of course, to sauropod dinosaurs. Unfortunately, none are extant for comparison, but recent work has shown that, despite the dorsal placement of the nares in the skull, the external nostrils were still placed rostrally, close to the mouth, thanks to fleshy tubes. I've restored Macrauchenia similarly.

This mandala depicts the life of Macrauchenia across the seasons. At bottom, a lone Macrauchenia wanders the frozen highlands in relative comfort, having grown a shaggy winter coat. Its fleshy nostril tubes serve to warm the air before it enters the body. At right, spring is in effect — a bull courts a cow by inflating his nostril tubes, similar to a hooded seal. At top, a young calf frolics under his mother's watchful eye — his green color comes from the algae living in his fur (similar to the camouflage of those other South American indigenes, the sloths). This extra measure of color accuracy is necessary because, unlike today's ungulates, Macrauchenia must contend with predators that have excellent color vision: phorusrhacids. At left, a wary bull faces off against a Smilodon, an invading predator from the north. (It is restored after linsangs, the extant sister group to felids, instead of the felids themselves, since it is, properly, a stem-felid, not a true felid.) Macrauchenia will survive this great faunal interchange, but not for long — another invader from the north, a large primate, will be the end of it, and hence all litopterns.

28 February 2013

Another "All Your Yesterdays" entry: Denisovan, or "Polar Neandertal"

They've extended the deadline for the All Your Yesterdays contest, so I've decided to do a couple more entries. I started this one years ago as a Neandertal restoration. Since that time, new genomic discoveries showed that the speculative pigmentation was incorrect. But, other discoveries identified a new candidate for the subject matter!

Known from a few scrappy pieces, the Siberian Denisovans (Homo sp. or Homo sapiens ssp., depending on how large you like your species) are a true challenge to reconstruct. We have their entire genome, but know almost nothing about their anatomy. The few fossil elements we have are not morphologically distinct from Neandertals (Homo neanderthalensis or Homo sapiens neanderthalensis) or humans (Homo sapiens sapiens).

But the genomic facts are highly intriguing:

Some Oceanian humans have inherited up to 6% of their nuclear DNA from Denisovans (with the highest ratios in Meganesia [Australia and New Guinea]).

The nuclear DNA indicates a common ancestor with Neandertals, shortly after the split from proto-humans.

But the mitochondrial DNA indicates a motherline that branched off much earlier. (Possibly Homo erectus?)

Genes for pigments are consistent with dark skin.

Here I've imagined a Siberian Denisovan as a sort of "polar Neandertal". As with polar bears, his skin is dark, trapping heat, but his pelage is light, allowing for camouflage against the taiga and tundra. He is the last of his kind — his southern kin mixed with the strange, baby-faced people who keep invading from the west. But he does not welcome them. He will fight to his death.

27 February 2013

"Texan Mama", my "All Your Yesterdays" contest entry

I would have worked on this a bit longer, but the "All Your Yesterdays" contest deadline is tomorrow.

Texan Mama

Dimetrodon and its kin have often been described as "mammal-like reptiles", but in fact they are just as closely related to modern reptiles as we are (in terms of shared descent). Creatures like Dimetrodon, Moschops, Lystrosaurus, Cynognathus, Morganucodon, etc. are more properly termed "stem-mammals", meaning that they are not mammals, but are more closely related to mammals than to any other living organisms.

We can infer, in the absence of direct evidence, that all stem-mammals probably possessed any characteristics shared by us mammals and our closest living non-mammalian relatives, the sauropsids: turtles, tuataras, lizards (including snakes), crocodylians, and birds. But mammalian characteristics not shared by sauropsids are trickier. When did hair evolve? When did lactation evolve? We have a few clues but no definite answers.

In this piece, I have pushed fur back to an extremely early time — Dimetrodon is one of the furthest stem-mammals from Mammalia proper. While we know that a later stem-mammal, Estemmenosuchus, had glandular skin without any sign of fur, it is possible that fur evolved earlier and was simply lost or reduced in some lineages, as it has been in many lineages of placental mammal.

I have also posited parental feeding, but not, strictly speaking, lactation. Other lineages of tetrapod, including caecilians and pigeons, have evolved ways of feeding the young from foodstuffs produced by the mother. The mother Dimetrodon's sides are swollen with nutritious substances which seep out as her pups gobble them up. Is it milk? Sort of and sort of not.

Finally, I have scrupulously avoided any suggestion that these are in any way reptilian. They do retain some plesiomorphies evidenced in some reptiles, amphibians, and lungfishes, such as a sprawling gait, belly scales, and acute color vision, but they lack the dry skin and derived scales of true reptiles. Not a single sauropsid texture was used in the photocollage; instead they have the textures of Hippopotamus, Phacochoerus, Procyon, Homo, Zaglossus, Caecilia, Litoria, and even Neoceratodus. These Dimetrodon are moist, glandular creatures — not reptiles at all.

If you're wondering, here's a quick breakdown of the textures used:

Hippopotamus amphibius (hippo): general, especially mother's head and torso
Phacochoerus (warthog): general, especially mother's tail and torso
Procyon (raccoon): soles
Homo sapiens (human): general, especially mother's sail
Zaglossus (long-beaked echidna): juveniles' faces
Caecilia (caecilian): mother's sides and underside
Litoria caerulea (Australian green tree frog): general, especially mother's limbs
Neoceratodus forsteri (Queensland lungfish): juveniles' underbelly

I've also submitted a variant, done in a Pop Art style inspired by Roy Lichtenstein's work, to the related Love in the Time of Chasmosaurs: All Yesterdays contest:

Note: Yes, I do descend from a long line of Texan mothers.

15 February 2013

JSEN: JavaScript Expression Notation

That idea I was talking about yesterday? Storing mathematical expressions as JSON? I went ahead and made it as a TypeScript project and released it on GitHub:

JavaScript Expression Notation (JSEN)

Still need to complete the unit test coverage and add a couple more features. I made a change from my original post to the syntax for namespace references. (The reason? I realized I needed to be able to use "*" as a local identifier for multiplication.) ~~They work within Namespace declaration blocks, but I need to make them work at the higher level of Namespaces declaration blocks as well.~~ (Done.) ~~I also want to allow functions to be used as namespaces.~~ (Done.)

This is possible right now:

jsen.decl('my-fake-namespace', {
   'js': 'http://ecma-international.org/ecma-262/5.1',

   'x': 10,
   'y': ['js:Array', 1, 2, 3],
   'z': ['js:[]', 'y', 1]
});

jsen.eval('my-fake-namespace', 'x'); // 10
jsen.eval('my-fake-namespace', 'y'); // [1, 2, 3]
jsen.eval('my-fake-namespace', 'z'); // 2

jsen.expr('my-fake-namespace', 'x'); // 10 // Deprecated
jsen.expr('my-fake-namespace', 'y'); // Deprecated
    // ["http://ecma-international.org/ecma-262/5.1:Array", 1, 2, 3]
jsen.expr('my-fake-namespace', 'z'); // Deprecated
    // ["http://ecma-international.org/ecma-262/5.1:[]", "y", 1]

Eventually something like this will be possible as well:

Mathematical expressions as JSON (and phyloreferencing)

For Names on Nodes I did a lot of work with MathML (specifically MathML-Content), an application of XML for representing mathematical concepts. But now, as XML wanes and JSON waxes, I've started to look at ideas for porting Names on Nodes concepts over to JSON.

I've been drawing up a very basic and extensible way to interpret JSON mathematically. Each of the core JSON values translates like so:

Null, Boolean, and Number values are interpreted as themselves.
Strings are interpreted as qualified identifiers (if they include ":") or local identifiers (otherwise).
Arrays are interpreted as the application of an operation, where the first element is a string identifying the operation and the remaining elements are arguments.
Objects are interpreted either as:

a set of declarations, where each key is a [local] identifier and each value is an evaluable JSON expression (see above), or
a namespace, where each key is a URI and each value is a series of declarations (see previous).

Examples

Here's a simple object declaring some mathematical constants (approximately):

{
    "e": 2.718281828459045,
    "pi": 3.141592653589793
}

Supposing we had declared some operations (only possible in JavaScript, since JSON doesn't have functions) equivalent to those of MathML (whose namespace URI is "http://www.w3.org/1998/Math/MathML"), we could do this:

{
    "x":
        ["http://www.w3.org/1998/Math/MathML:plus",
            1,
            2
        ],
    "y":
        ["http://www.w3.org/1998/Math/MathML:sin",
            ["http://www.w3.org/1998/Math/MathML:divide",
                "http://www.w3.org/1998/Math/MathML:pi",
                2
            ]
        ]
}

Once evaluated, x would be 3 and y would be 1 (or close to it, given that this is floating-point math).
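To make the interpretation rules concrete, here's a minimal evaluator sketch in TypeScript. It is not the JSEN implementation, just an illustration of the rules listed earlier. The `operations` and `constants` tables, the `evaluate` function, and the helper `has` are all hypothetical, and the operands (1 + 2 and sin(pi / 2)) are assumed from the stated results of 3 and 1.

```typescript
const MATHML = "http://www.w3.org/1998/Math/MathML";

type Expr = null | boolean | number | string | Expr[];
type Value = null | boolean | number;
type Declarations = { [identifier: string]: Expr };

// Hypothetical operation table standing in for MathML's operations.
const operations: { [name: string]: (...args: number[]) => number } = {
    [MATHML + ":plus"]: (...args: number[]) => args.reduce((sum, n) => sum + n, 0),
    [MATHML + ":divide"]: (a: number, b: number) => a / b,
    [MATHML + ":sin"]: (a: number) => Math.sin(a),
};

// Hypothetical table of predefined constants.
const constants: { [name: string]: number } = {
    [MATHML + ":pi"]: Math.PI,
};

function has(table: object, key: string): boolean {
    return Object.prototype.hasOwnProperty.call(table, key);
}

function evaluate(expr: Expr, declarations: Declarations): Value {
    // Null, Boolean, and Number values are interpreted as themselves.
    if (expr === null || typeof expr === "boolean" || typeof expr === "number") {
        return expr;
    }
    // Strings are identifiers: look them up and evaluate what they refer to.
    if (typeof expr === "string") {
        if (has(constants, expr)) return constants[expr];
        if (has(declarations, expr)) return evaluate(declarations[expr], declarations);
        throw new Error("Unknown identifier: " + expr);
    }
    // Arrays apply an operation (first element) to arguments (the rest).
    const [name, ...args] = expr;
    if (typeof name !== "string" || !has(operations, name)) {
        throw new Error("Unknown operation: " + String(name));
    }
    const values = args.map((arg) => evaluate(arg, declarations));
    return operations[name](...(values as number[]));
}

const declarations: Declarations = {
    x: [MATHML + ":plus", 1, 2],
    y: [MATHML + ":sin", [MATHML + ":divide", MATHML + ":pi", 2]],
};

console.log(evaluate("x", declarations)); // 3
console.log(evaluate("y", declarations)); // 1, or very close to it
```

This sketch only handles numeric operations and doesn't guard against circular declarations; a real implementation would need both.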

Now for the interesting stuff. Suppose we had declared Names on Nodes operations and some taxa using LSIDs:

{
    "Homo sapiens": "urn:lsid:ubio.org:namebank:109086",
    "Ornithorhynchus anatinus": "urn:lsid:ubio.org:namebank:7094675",
    "Mammalia":
        ["http://namesonnodes.org/ns/math/2013:clade",
            ["http://www.w3.org/1998/Math/MathML:union",
                "Homo sapiens",
                "Ornithorhynchus anatinus"
            ]
        ]
}

Voilà, a phylogenetic definition of Mammalia in JSON!

I think this could be pretty useful. My one issue is the repetition of long URIs. It would be nice to have a mechanism to import them using shorter handles. Maybe something like this?

{
    "mathml":   "http://www.w3.org/1998/Math/MathML:*",
    "namebank": "urn:lsid:ubio.org:namebank:*",
    "NoN":      "http://namesonnodes.org/ns/math/2013:*",

    "Mammalia":
        ["NoN:clade",
            ["mathml:union",
                "namebank:109086",
                "namebank:7094675"
            ]
        ]
}
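
One way the shorter handles might be resolved is to expand each "handle:name" string against the import table before evaluation. The sketch below is only a guess at how that could work, written in TypeScript. The names `Handles`, `expandIdentifier`, and `expand` are hypothetical, and the sketch assumes the handle table is passed in separately from the other declarations.

```typescript
type Expr = null | boolean | number | string | Expr[];
type Handles = { [handle: string]: string };

// Expands "handle:name" to a full identifier if the handle maps to a
// pattern ending in "*"; anything else is returned unchanged.
function expandIdentifier(identifier: string, handles: Handles): string {
    const colon = identifier.indexOf(":");
    if (colon < 0) return identifier; // local identifier, nothing to expand
    const handle = identifier.slice(0, colon);
    if (!Object.prototype.hasOwnProperty.call(handles, handle)) return identifier;
    const pattern = handles[handle];
    if (pattern.charAt(pattern.length - 1) !== "*") return identifier;
    // Replace the trailing "*" with the part after the handle.
    return pattern.slice(0, -1) + identifier.slice(colon + 1);
}

// Applies the expansion to every string in an expression, recursively.
function expand(expr: Expr, handles: Handles): Expr {
    if (typeof expr === "string") return expandIdentifier(expr, handles);
    if (Array.isArray(expr)) return expr.map((item) => expand(item, handles));
    return expr;
}

const handles: Handles = {
    mathml: "http://www.w3.org/1998/Math/MathML:*",
    namebank: "urn:lsid:ubio.org:namebank:*",
    NoN: "http://namesonnodes.org/ns/math/2013:*",
};

const mammalia: Expr = ["NoN:clade", ["mathml:union", "namebank:109086", "namebank:7094675"]];
console.log(JSON.stringify(expand(mammalia, handles), null, 2));
```

Identifiers that are already fully qualified, such as "urn:lsid:ubio.org:namebank:109086", pass through unchanged because "urn" isn't a declared handle.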

Something to ponder. Another thing to ponder: what should I call this? MathON? MaSON?

28 January 2013

Using TypeScript to Define JSON Data

JSON has gradually been wearing away at XML's position as the primary format for data communication on the Web. In some ways, that's a good thing: JSON is much more compact and readable. In other ways, it's not so great: JSON lacks some of XML's features.

One of these features is document type definitions. For XML, there are a variety of formats (DTD, XML Schema, RELAX NG, etc.) for specifying exactly what your XML data looks like: what are the tag names, possible attributes, etc. JSON is a lot more loosey-goosey here.

Okay, that's not entirely true: there is JSON Schema. I've never known anyone to use it, but it's there. It's awfully verbose, though. (So are the definitional formats for XML, but it's XML — you expect it!)

I was thinking about this the other day, and I realized that there is actually a great definitional format for JSON already in existence: TypeScript! If you haven't heard of it, TypeScript is a superset of JavaScript which introduces optional static typing. And since JSON is a subset of JavaScript, TypeScript is applicable to JSON as well.

One of the great features of TypeScript is that interface implementation is implicit. In Java or ActionScript, you have to specifically say that a type "implements MyInterface". In TypeScript, if it fits, it fits. For example:

interface List
{
    length: number;
}

function isEmpty(list: List): bool
{
    return list.length === 0;
}

console.log(isEmpty("")); // true
console.log(isEmpty("foo")); // false
console.log(isEmpty({ length: 0 })); // true
console.log(isEmpty({ length: 3 })); // false
console.log(isEmpty({ size: 1})); // Compiler error!

(Note: for some reason that I can't fathom, isEmpty() doesn't work on arrays. Well, TypeScript is still in development — version 0.8.2 right now. Update: I filed this as a bug.)

Note that you can use interfaces even on plain objects. So of course you can use it to describe a JSON format. Here's an example from a project I hope to release before too long:

interface Model
{
    uid: string;
}

interface Name extends Model
{
    citationStart?: number;
    html?: string;
    namebankID?: string;
    root?: bool;
    string?: string;
    type?: string;
    uri?: string;
    votes?: number;
}

interface Taxon
{
    canonicalName?: Name;
    illustrated?: bool;
    names?: Name[];
}

Now, for example, I can declare that an API search method will return data as an array of Taxon objects (Taxon[]). And look how compact and readable it is!

Note that there is one drawback here: there is no way to enforce this at run-time. JSON Schema might be a better choice if that's what you need. But for compile-time checking and documentation, it's a pretty great tool.
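
If you do want some run-time checking without reaching for JSON Schema, a hand-written check is one option. Here's a minimal sketch, assuming a more recent TypeScript version than 0.8.2 (it uses `unknown` and type predicates, which didn't exist then). The `Name` interface is a trimmed-down version of the one above, and `isName` and `parseNames` are hypothetical helpers.

```typescript
// Trimmed-down version of the Name interface, for illustration only.
interface Name {
    uid: string;
    string?: string;
    votes?: number;
}

// Run-time check that a parsed JSON value fits the Name interface.
function isName(value: unknown): value is Name {
    if (typeof value !== "object" || value === null) {
        return false;
    }
    const candidate = value as { [key: string]: unknown };
    return typeof candidate["uid"] === "string"
        && (candidate["string"] === undefined || typeof candidate["string"] === "string")
        && (candidate["votes"] === undefined || typeof candidate["votes"] === "number");
}

// Parses JSON text and rejects anything that isn't an array of Name objects.
function parseNames(json: string): Name[] {
    const data: unknown = JSON.parse(json);
    if (!Array.isArray(data) || !data.every((item) => isName(item))) {
        throw new Error("Response is not an array of Name objects");
    }
    return data;
}

console.log(isName(JSON.parse('{"uid": "abc", "votes": 3}'))); // true
console.log(isName(JSON.parse('{"votes": 3}'))); // false
console.log(parseNames('[{"uid": "a"}, {"uid": "b", "votes": 2}]').length); // 2
```

The downside is that the check has to be kept in sync with the interface by hand, which is exactly the duplication a schema format avoids.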

24 January 2013

Saving Bootstrap Settings

The popular web page framework Bootstrap recently added a web form whereby you can customize visual settings (color scheme, fonts, etc.). Unfortunately they didn't add a way to save those settings, so if you later decide you need to tweak them and you didn't happen to just leave that web page open, you're screwed. You either have to reinvent them, go from memory, or dig through the generated files and hope you didn't miss anything.

I'm sure they plan to address this eventually, but in the meantime I created some JavaScript code to work around this: https://gist.github.com/4628506

To use this code:

Go to: http://twitter.github.com/bootstrap/customize.html
Run the script in the JavaScript console. (If you don't use a browser with a JavaScript console, you're beyond my help.)
Fill out the customization form.
You can record your settings into an object by running: var settings = record()
You can grab those as JSON by running: JSON.stringify(settings)
You can reinstate those settings later by running: play(settings)
You can save your settings to local storage by running: save()
You can retrieve your settings from local storage by running: retrieve()

I haven't fully tested this, so let me know if you run into any issues.

SIDE NOTE: This is my first gist!

02 January 2013

All Known Great Ape Individuals (Messinian to Present)

Happy 2013, everyone!

Recently I announced a code package I was working on, called Haeckel, for generating vector-based charts related to evolutionary biology. Here's an image I've created using it:

Known Great Ape Individuals

This chart represents all known hominid individuals (Hominidae = great apes, including humans and stem-humans) from the Messinian to the present, erring on the conservative side when the material is too poor to determine the exact number.

If you've been following this blog for a few years you may remember an earlier version of this. I've done a lot of refinement to the data since then. The earlier versions were dissatisfying to me because the horizontal axis was essentially arbitrary. For this version I used matrices from a phylogenetic analysis (Strait and Grine 2004, Table 3 and Appendix C) of craniodental characters to generate a distance matrix, and then inferred positions for other taxa based on phylogenetic proximity and containing clade. This is similar to the metric I used in this chart, except that it incorporates Appendix C, uses inference, and averages distance from humans against distance from [Bornean] orangutans. Don't be mistaken — this is still arbitrary. But it's a bit closer to something real.
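
For readers wondering what "generating a distance matrix" from a character matrix can look like, here's a minimal sketch in TypeScript. It is not the method used for the chart (that isn't spelled out here). It assumes one simple possibility: the distance between two taxa is the fraction of characters scored in both that differ. The taxa "A", "B", and "C" and their scores are made up.

```typescript
type Score = number | null; // null marks a missing (unscored) character
type CharacterMatrix = { [taxon: string]: Score[] };
type DistanceMatrix = { [taxon: string]: { [taxon: string]: number } };

// Fraction of mutually scored characters whose states differ.
function characterDistance(a: Score[], b: Score[]): number {
    let compared = 0;
    let differing = 0;
    const length = Math.min(a.length, b.length);
    for (let i = 0; i < length; i++) {
        const x = a[i];
        const y = b[i];
        if (x === null || y === null) continue; // skip missing data
        compared++;
        if (x !== y) differing++;
    }
    return compared === 0 ? NaN : differing / compared;
}

// Computes all pairwise distances between the taxa in the matrix.
function distanceMatrix(scores: CharacterMatrix): DistanceMatrix {
    const taxa = Object.keys(scores);
    const matrix: DistanceMatrix = {};
    for (const a of taxa) {
        matrix[a] = {};
        for (const b of taxa) {
            matrix[a][b] = characterDistance(scores[a], scores[b]);
        }
    }
    return matrix;
}

// Invented scores for three hypothetical taxa.
const scores: CharacterMatrix = {
    A: [0, 1, 1, null],
    B: [0, 0, 1, 1],
    C: [1, 0, null, 1],
};

const distances = distanceMatrix(scores);
console.log(distances["A"]["B"]); // one third: 1 of 3 comparable characters differ
console.log(distances["A"]["C"]); // 1
```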

Stray notes:

I'm pretty sure there are Pliocene stem-orangutans somewhere, right? Might have some work left to do on that data.
The dot with no taxon above "Australopithecus" is an indeterminate stem-human from Laetoli. It should probably go further left.
The Ardipithecus bubble includes the poorly-known "Australopithecus" praegens. (Although in some runs it moves outside — there's a random element to the plotting.)
The Holocene is barely visible up at the top. What a worthless epoch.
Homo floresiensis (hobbits) are far to the left of Homo sapiens because I placed them outside Clade(Homo erectus ∪ Homo sapiens).
You may recall Lufengpithecus? wushanensis as "Wushan Man", as it was originally placed in Homo erectus. (Hey, it's just teeth.)
A couple of fossil chimpanzees, lots of fossil orangutans, but no fossil gorillas. :(

(Unless you count Chororapithecus, but that's pre-Messinian. Very pre-Messinian. Suspiciously pre-Messinian....)

Look at all that overlap between Homo, Paranthropus, and Australopithecus!

I have a feeling, though, that if I added another dimension, Paranthropus and Homo would jut out in opposite directions.
Reclassifying Australopithecus sediba as Homo sediba would also decrease the overlap. (Although its position is inferred — actually scoring it might do the same thing.)
It's frustrating that the type species of Australopithecus and Paranthropus are also just about the most similar species across the two genera.

Kenyanthropus and Praeanthropus have been provisionally sunk into Australopithecus.
Should we just sink Orrorin and Sahelanthropus into Ardipithecus? Why not?
My guess is that if I added postcranial characters, the stem-humans would all shift right (humanward). Oh, for a good matrix of postcranial characters....

Update
Oh yeah, and if you want a peek at the data, go here.