20 May 2013

PhyloPic Submissions Come in Fits and Bursts (API Example)

The recent surge of activity as PhyloPic neared its 1000th image got me to wondering about the pattern of image submissions over time. Fortunately it's very easy to collect this data using the PhyloPic API.


Step 1. Determine the number of submissions.


This is a very simple API call:
http://phylopic.org/api/a/image/count

...which yields:
{"result": 1024, "success": true}


Step 2. Pull down the submission time data for all images.


Now that we have the total number, we can grab data for all of the images at once, like so:
http://phylopic.org/api/a/image/list/0/1024

But this just yields a list of 1024 image entries that each look like this:
{"uid": "1353c901-f652-4563-941d-7b12bc7a86df"}



Not very useful. To get any actual data fields from the PhyloPic API, you have to be more specific:

http://phylopic.org/api/a/image/list/0/1024?options=submitted



Now each entry is a lot more useful:

{"uid": "1353c901-f652-4563-941d-7b12bc7a86df", "submitted": "2013-05-19 16:05:12"}
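Steps 1 and 2 can be chained in code. What follows is only a minimal sketch, assuming the count response and URL pattern are exactly as shown above; `API_BASE` and `buildImageListUrl` are names made up for this illustration and are not part of the PhyloPic API.

```javascript
// Base URL copied from the example calls in this post.
const API_BASE = "http://phylopic.org/api/a";

// Hypothetical helper: takes the parsed JSON from /image/count and returns
// the URL that lists every image, optionally asking for one data field.
function buildImageListUrl(countResponse, option) {
  if (!countResponse || countResponse.success !== true) {
    throw new Error("Count request did not succeed");
  }
  const total = countResponse.result;
  if (!Number.isInteger(total) || total < 0) {
    throw new Error("Unexpected image count: " + total);
  }
  // Pattern from the post: /image/list/<start>/<length>
  let url = `${API_BASE}/image/list/0/${total}`;
  if (option) {
    url += `?options=${encodeURIComponent(option)}`;
  }
  return url;
}

// Using the count response shown in Step 1:
const listUrl = buildImageListUrl({ result: 1024, success: true }, "submitted");
console.log(listUrl);
// prints "http://phylopic.org/api/a/image/list/0/1024?options=submitted"
```

The helper only builds the URL; fetching it is left to whatever HTTP client you prefer.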


Step 3. Process the data.


Once you have this, it's a pretty simple matter for a JavaScript programmer to strip out the month and tally the images. I did this and generated a bar chart using Google's Code Playground. Here it is:

(I left out May since it's not over yet. Apologies for the gaps.)
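The month tally from Step 3 can be sketched roughly as follows. This is not the code used for the chart, just a minimal illustration, assuming every entry has a `submitted` string in the "YYYY-MM-DD hh:mm:ss" form shown above. `tallyByMonth` is a made-up name, and all sample entries except the first are invented for the example.

```javascript
// Hypothetical helper: counts entries per month, keyed by "YYYY-MM".
function tallyByMonth(entries) {
  const counts = new Map();
  for (const entry of entries) {
    // "2013-05-19 16:05:12" -> "2013-05"
    const month = entry.submitted.slice(0, 7);
    counts.set(month, (counts.get(month) || 0) + 1);
  }
  // "YYYY-MM" keys sort chronologically when compared as plain strings.
  return [...counts.entries()].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

// Sample data: the first entry is from the post, the rest are invented.
const sampleEntries = [
  { uid: "1353c901-f652-4563-941d-7b12bc7a86df", submitted: "2013-05-19 16:05:12" },
  { uid: "made-up-uid-a", submitted: "2013-04-02 09:30:00" },
  { uid: "made-up-uid-b", submitted: "2013-04-17 21:12:45" },
];

const monthlyCounts = tallyByMonth(sampleEntries);
console.log(monthlyCounts);
// prints [ [ '2013-04', 2 ], [ '2013-05', 1 ] ]
```

Months with no submissions never appear in the result, so a chart drawn straight from this output will skip them unless the empty months are filled in first.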

PhyloPic was officially launched on 21 February 2011. Most of the submissions for that month are ones that I "presubmitted" during development. (A lot are from Scott Hartman's skeletal drawings, including the very first submission.)

Submissions were strong going into March but then completely slacked off. I'm sure a lot of this was due to technical problems — the site became incredibly slow after a while. There were major architecture flaws.

I (mostly) fixed these and relaunched in January 2012. Interest was strong, and in February PhyloPic had its best month ever. But then submissions slacked off again.

A year later, in March 2013, I was getting ready to do another major upgrade. I added dozens of images in anticipation. Then I relaunched at the very end of the month. Sure enough, April was one of the best months ever, second only to February 2012.

May 2013 is currently going strong, but looking at this trend I start to wonder: how long will it last? And although I recently swore off doing massive updates, are they actually better for driving up submissions?

11 May 2013

PhyloPic Passes a Thousand Images!

Just a little while ago, PhyloPic reached its first 1000 silhouettes! Here's the thousandth, the eusauropod dinosaur Cetiosaurus oxoniensis, by Michael P. Taylor:

(Public Domain)
Several contributors seem to have all been vying for the spot. Around the same time we got some other lovely contributions. Gareth Monger contributed this upside-down butterfly, Aglais urticae:

(Creative Commons Attribution-ShareAlike 3.0 Unported)

He missed the 1000th spot and got 1002nd. Matt Martyniuk missed it on the opposite side, with this Lambeosaurus (hadrosaurid dinosaur) at 993rd:

(Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported)
Emily Willoughby got quite close, too, and intended this rather recognizable angiosperm leaf (Cannabis sativa) for the 1000th spot. Alas, it's 1007th:
(Creative Commons Attribution-ShareAlike 3.0 Unported)
(As she noted, 420 would have been a good number as well.)

Thanks to everyone who contributed to the first thousand silhouettes! It took two years to get here; may the next thousand be even faster!


02 April 2013

Why the PhyloPic Relaunch Took So Long

Or, A Lesson in Development Strategy.

As I announced last week, my website, PhyloPic, has been relaunched with a massive update. One of the key updates is a public API for developers. A lot of people have been looking forward to this, and it was actually almost ready for release last summer. So why didn't I release it?

Failure to Branch

Basal tracheophyte.
Public Domain.
As I was writing up the documentation for the API, I learned of Bootstrap, a CSS/JavaScript framework. I realized that it could solve a lot of the design issues I was having — problems with the site on mobile devices, older browsers, etc.

What I should have done: Created a new development branch for adding Bootstrap while continuing to polish up the API branch. That way, I could have released the API shortly while still being able to work on the design issues in parallel.

What I actually did: Continued working in the same branch, ensuring that I couldn't release the API update until the Bootstrap update was complete.

Having Other Projects

By the end of summer I was mostly done with the revisions, but there was still some cleanup to do. By then some other projects I'm attached to, one with other collaborators, were suffering. So I spent most of my free time in the autumn working on those. (I have a full-time job and a toddler, so that isn't much.)

Homo habilis.
Public Domain.

Becoming Enamored of New Technology

In the autumn, Microsoft released a preview version of TypeScript, and I quickly saw that it was going to be extremely useful. So I rewrote PhyloPic's client-side code. It wasn't too hard, and it made further development a lot easier. This caused some delay up-front, but I don't regret it.

Becoming Enamored of the Wrong New Technology

Around this time I also realized that I could finally do away with the last bit of Flash on the website: the Image Submission Tool. HTML5 had become mature enough to do all the image manipulation in the browser itself. I did a lot of research, learning about the Canvas, Typed Arrays, etc. And after a lot of work I actually created an image-processing workflow that works in HTML5-enabled browsers. As a bonus, I got a little standalone project out of it: Pictish.

But there were problems. One is that the best existing JavaScript library for creating PNG files doesn't use Typed Arrays — it uses strings, which means that it is slow for large files. I tried creating my own PNG encoder, or adapting that one, but soon realized it was far too much work. Another problem is that I was no longer supporting older browsers (although this was a trade-off against supporting mobile platforms, so I didn't feel too bad about it).

But there was a much more fundamental danger: doing the image processing on the client side meant that the API had to trust the client to do it properly. What if some developer used the PhyloPic API to add images to the database but didn't do it right? That could be disastrous.

Octopus bimaculatus.
Public Domain.
I realized I would have to do things the old-fashioned way: on the server. After a bit of research, I identified ImageMagick and Inkscape as the best tools. The new methodology was so completely different that I ended up making a lot of database changes, too. Until recently, all files were stored in the database; now they're just stored as flat files. The good news is that this makes load times faster.

Doing Things the "Right" Way

Throughout all this I had been making an effort to "dogfood" my own API, i.e., to use it on the site itself. This has the advantage of making load times faster, since the basic page can be cached and then the data can be loaded in secondarily in a much smaller format. Unfortunately this meant a lot of rewrites for how the pages are rendered.

After a while, the code to generate pages from the data had gotten really complex (mostly involving on-the-fly element generation using jQuery). Around the time I was redoing the Image Submission Page, I realized my whole approach was untenable. I needed a cleaner way to divorce presentation logic from control logic.

I ended up using Knockout for the entire site. It made things a lot more manageable.

In Summary

The biggest problem was my branching model, or, rather, my lack of one. Solitary developers often fall into this trap: we think that, since we're doing all the work, there's no need to have more than a single branch of development. At work, we've been using a multi-branch model and found it very successful. Going forward, I plan to do the same on PhyloPic. No more massive updates where everything is different. Just incremental features and fixes.

31 March 2013

PhyloPic Launch: API, Responsive Design, etc.

On Good Friday I took PhyloPic down. On Holy Saturday, I wrestled with errors caused by incongruities between the server and dev environments. And, lo, now, on Easter Sunday I announce that PhyloPic is back! (Actually, I already announced it on Twitter, but whatever.)

How smartphone users should see PhyloPic, more or less.

Major New Features:


  • A Developer API (using JSON). Now other people can build applications using PhyloPic data and images. (Yes, I am dogfooding it, so most of it should be pretty well-tested.)
  • Responsive design (using the ever-more-ubiquitous Bootstrap). The site is now much more usable on mobile devices.
  • A Links Page, showing off work that uses PhyloPic or features it in some way.
  • Speedier load times (in theory, anyway).
  • Ranks for Contributors — if you submit one image, you're a "Specialist". Two, and you're a "General". Six, and you're a "Familiar". See where this is going?
  • Fewer requirements — most notably, Flash is no longer required to submit images.
  • Handy little icons on most taxon links — now you can tell if you're clicking on Gastonia the dicotyledonous plant or Gastonia the dinosaur. (Still rolling this out to all taxa.)

19 March 2013

Preview Screenshots

A little glimpse of what I've been working on:




10 March 2013

"Year of Macrauchenia": Third and Final "All Your Yesterdays" Entry

I made a last-minute entry for the All Your Yesterdays contest:


Year of Macrauchenia
Macrauchenia was the greatest and last of the litopterns, a clade of stem-euungulates. This bizarre Pleistocene South American herbivore is often described as having the body of a llama and the head of a tapir. However, the body is only superficially llama-like (but with gigantic elbows) and the head is not at all tapir-like. (If any part of it resembles tapirs, it's the feet.) In fact, the skull isn't like that of any living terrestrial mammal. It has extremely dorsal nares, like trunked animals and cetaceans, but it lacks any place for trunk muscles to attach.
But there is a possible analogue: another group of terrestrial herbivores with extremely dorsal nares, and they even had long necks, too! I refer, of course, to sauropod dinosaurs. Unfortunately, none are extant for comparison, but recent work has shown that, despite the dorsal placement of the nares in the skull, the external nostrils were still placed rostrally, close to the mouth, thanks to fleshy tubes. I've restored Macrauchenia similarly.
This mandala depicts the life of Macrauchenia across the seasons. At bottom, a lone Macrauchenia wanders the frozen highlands in relative comfort, having grown a shaggy winter coat. Its fleshy nostril tubes serve to warm the air before it enters the body. At right, spring is in effect: a bull courts a cow by inflating his nostril tubes, similar to a hooded seal. At top, a young calf frolics under his mother's watchful eye. His green color comes from the algae living in his fur (similar to the camouflage of those other South American indigenes, the sloths). This extra measure of color accuracy is necessary because, unlike today's ungulates, Macrauchenia must contend with predators that have excellent color vision: phorusrhacids. At left, a wary bull faces off against a Smilodon, an invading predator from the north. (It is restored after linsangs, the extant sister group to felids, instead of the felids themselves, since it is, properly, a stem-felid, not a true felid.) Macrauchenia will survive this great faunal interchange, but not for long: another invader from the north, a large primate, will be the end of it, and hence of all litopterns.

28 February 2013

Another "All Your Yesterdays" entry: Denisovan, or "Polar Neandertal"

They've extended the deadline for the All Your Yesterdays contest, so I've decided to do a couple more entries. I started this one years ago as a Neandertal restoration. Since that time, new genomic discoveries showed that the speculative pigmentation was incorrect. But, other discoveries identified a new candidate for the subject matter!

Known from a few scrappy pieces, the Siberian Denisovans (Homo sp. or Homo sapiens ssp., depending on how large you like your species) are a true challenge to reconstruct. We have their entire genome, but know almost nothing about their anatomy. The few fossil elements we have are not morphologically distinct from Neandertals (Homo neanderthalensis or Homo sapiens neanderthalensis) or humans (Homo sapiens sapiens).
But the genomic facts are highly intriguing:
  1. Some Oceanian humans have inherited up to 6% of their nuclear DNA from Denisovans (with the highest ratios in Meganesia [Australia and New Guinea]). 
  2. The nuclear DNA indicates a common ancestor with Neandertals, shortly after the split from proto-humans.
  3. But the mitochondrial DNA indicates a motherline that branched off much earlier. (Possibly Homo erectus?)
  4. Genes for pigments are consistent with dark skin.
Here I've imagined a Siberian Denisovan as a sort of "polar Neandertal". As with polar bears, his skin is dark, trapping heat, but his pelage is light, allowing for camouflage against the taiga and tundra. He is the last of his kind — his southern kin mixed with the strange, baby-faced people who keep invading from the west. But he does not welcome them. He will fight to his death.
