A Three-Pound Monkey Brain: tools


02 April 2013

Why the PhyloPic Relaunch Took So Long

Or, A Lesson in Development Strategy.

As I announced last week, my website, PhyloPic, has been relaunched with a massive update. One of the key updates is a public API for developers. A lot of people have been looking forward to this, and it was actually almost ready for release last summer. So why didn't I release it?

Failure to Branch

Basal tracheophyte.
Public Domain.

As I was writing up the documentation for the API, I learned of Bootstrap, a CSS/JavaScript framework. I realized that it could solve a lot of the design issues I was having — problems with the site on mobile devices, older browsers, etc.

What I should have done: Created a new development branch for adding Bootstrap while continuing to polish up the API branch. That way, I could have released the API shortly while still being able to work on the design issues in parallel.

What I actually did: Continued working in the same branch, ensuring that I couldn't release the API update until the Bootstrap update was complete.

Having Other Projects

By the end of summer I was mostly done with the revisions, but there was still some cleanup to do. By now some other projects I'm attached to, one with other collaborators, were suffering. So I spent most of my free time in the autumn working on those. (I have a full-time job and a toddler, so that isn't much.)

Homo habilis.
Public Domain.

Becoming Enamored of New Technology

In the autumn, Microsoft released a preview version of TypeScript, and I quickly saw that it was going to be extremely useful. So I rewrote PhyloPic's client-side code; it wasn't too hard, and it made further development a lot easier. This caused some delay up-front, but I don't regret it.

Becoming Enamored of the Wrong New Technology

Around this time I also realized that I could finally do away with the last bit of Flash on the website: the Image Submission Tool. HTML5 had become mature enough to do all the image manipulation in the browser itself. I did a lot of research, learning about the Canvas, Typed Arrays, etc. And after a lot of work I actually created an image-processing workflow that works in HTML5-enabled browsers. As a bonus, I got a little standalone project out of it: Pictish.

But there were problems. One is that the best existing JavaScript library for creating PNG files doesn't use Typed Arrays — it uses strings, which means that it is slow for large files. I tried creating my own PNG encoder, or adapting that one, but soon realized it was far too much work. Another problem is that I was no longer supporting older browsers (although this was a trade-off against supporting mobile platforms, so I didn't feel too bad about it).
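To give a sense of what byte-oriented PNG code looks like, here is a minimal sketch of one small piece an encoder needs: the CRC-32 checksum that PNG attaches to each chunk, computed directly over a Uint8Array with no strings involved. This is an illustration only, not code from PhyloPic or from the library mentioned above, and the names `CRC_TABLE` and `crc32` are made up for the example.

```typescript
// Precompute the 256-entry CRC-32 table (reflected polynomial 0xEDB88320, as used by PNG).
const CRC_TABLE: Uint32Array = (() => {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c >>> 0;
  }
  return table;
})();

// Compute the CRC-32 of a byte array without ever building a string.
function crc32(bytes: Uint8Array): number {
  let crc = 0xFFFFFFFF;
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xFF] ^ (crc >>> 8);
  }
  return (crc ^ 0xFFFFFFFF) >>> 0; // force an unsigned 32-bit result
}
```

A full encoder would still need chunk framing and DEFLATE compression on top of this, which is where most of the work lies.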

But there was a much more fundamental danger: doing the image processing on the client side meant that the API had to trust the client to do it properly. What if some developer used the PhyloPic API to add images to the database but didn't do it right? That could be disastrous.

Octopus bimaculatus.
Public Domain.

I realized I would have to do things the old-fashioned way: on the server. After a bit of research, I identified ImageMagick and Inkscape as the best tools. The new methodology was so completely different that I ended up making a lot of database changes, too. Until recently, all files were stored in the database; now they're just stored as flat files. The good news is that this makes load times faster.

Doing Things the "Right" Way

Throughout all this I had been making an effort to "dogfood" my own API, i.e., to use it on the site itself. This has the advantage of making load times faster, since the basic page can be cached and then the data can be loaded in secondarily in a much smaller format. Unfortunately this meant a lot of rewrites for how the pages are rendered.

After a while, the code to generate pages from the data had gotten really complex (mostly involving on-the-fly element generation using jQuery). Around the time I was redoing the Image Submission Page, I realized my whole approach was untenable. I needed a cleaner way to divorce presentation logic from control logic.

I ended up using Knockout for the entire site. It made things a lot more manageable.
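Knockout is built around observable values that the page's markup binds to. The sketch below shows that general idea in plain TypeScript. It is not Knockout's API and not PhyloPic's code; `ObservableValue`, `TaxonViewModel`, and `renderTaxon` are hypothetical names used only for illustration.

```typescript
type Listener<T> = (value: T) => void;

// A tiny observable value: holds state and notifies subscribers on change.
class ObservableValue<T> {
  private value: T;
  private listeners: Listener<T>[] = [];

  constructor(initial: T) {
    this.value = initial;
  }

  get(): T {
    return this.value;
  }

  set(next: T): void {
    if (next === this.value) return; // skip redundant updates
    this.value = next;
    for (const listener of this.listeners) listener(next);
  }

  subscribe(listener: Listener<T>): void {
    this.listeners.push(listener);
  }
}

// Hypothetical view model for a taxon page: data only, no markup.
class TaxonViewModel {
  readonly name = new ObservableValue<string>("");
  readonly imageCount = new ObservableValue<number>(0);
}

// Presentation: a pure function from data to text, standing in for a template.
function renderTaxon(name: string, imageCount: number): string {
  return `${name} (${imageCount} image${imageCount === 1 ? "" : "s"})`;
}
```

The control logic only ever calls `set()` on the view model; anything that needs to redraw subscribes, so no code has to build elements by hand when data arrives.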

In Summary

The biggest problem was my branching model, or, rather, my lack of one. Solitary developers often fall into this trap: we think that, since we're doing all the work, there's no need to have more than a single branch of development. At work, we've been using a proper branching model and found it very successful. Going forward, I plan to do the same on PhyloPic. No more massive updates where everything is different. Just incremental features and fixes.

31 March 2013

PhyloPic Launch: API, Responsive Design, etc.

On Good Friday I took PhyloPic down. On Holy Saturday, I wrestled with errors caused by incongruities between the server and dev environments. And, lo, now, on Easter Sunday I announce that PhyloPic is back! (Actually, I already announced it on Twitter, but whatever.)

How smartphone users should see PhyloPic, more or less.

Major New Features:

A Developer API (using JSON). Now other people can build applications using PhyloPic data and images. (Yes, I am dogfooding it, so most of it should be pretty well-tested.)
Responsive design (using the ever-more-ubiquitous Bootstrap) — the site is now much more useable on mobile devices.
A Links Page, showing off work that uses PhyloPic or features it in some way.
Speedier load times (in theory, anyway).
Ranks for Contributors — if you submit one image, you're a "Specialist". Two, and you're a "General". Six, and you're a "Familiar". See where this is going?
Fewer requirements — most notably, Flash is no longer required to submit images.
Handy little icons on most taxon links — now you can tell if you're clicking on Gastonia the dicotyledonous plant or Gastonia the dinosaur. (Still rolling this out to all taxa.)

Saving Bootstrap Settings

The popular web page framework Bootstrap recently added a web form whereby you can customize visual settings (color scheme, fonts, etc.). Unfortunately they didn't add a way to save those settings, so if you later decide you need to tweak them and you didn't happen to just leave that web page open, you're screwed. You either have to reinvent them, go from memory, or dig through the generated files and hope you didn't miss anything.

I'm sure they plan to address this eventually, but in the meantime I created some JavaScript code to work around this: https://gist.github.com/4628506

To use this code:

Go to: http://twitter.github.com/bootstrap/customize.html
Run the script in the JavaScript console. (If you don't use a browser with a JavaScript console, you're beyond my help.)
Fill out the customization form.
You can record your settings into an object by running: var settings = record()
You can grab those as JSON by running: JSON.stringify(settings)
You can reinstate those settings later by running: play(settings)
You can save your settings to local storage by running: save()
You can retrieve your settings from local storage by running: retrieve()
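For readers curious how helpers like these can work, here is a rough sketch of the general approach. It is not the code from the gist: the signatures differ (the form fields and the storage object are passed in explicitly so the logic can run outside a browser), and `Field`, `KeyValueStore`, and `STORAGE_KEY` are assumptions made for the example.

```typescript
// Minimal stand-ins for the browser objects a real script would touch.
interface Field {
  id: string;
  value: string;
}
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type Settings = { [id: string]: string };

const STORAGE_KEY = "bootstrap-settings"; // hypothetical key name

// Copy every field's current value into a plain object.
function record(fields: Field[]): Settings {
  const settings: Settings = {};
  for (const field of fields) settings[field.id] = field.value;
  return settings;
}

// Write recorded values back into the matching fields.
function play(fields: Field[], settings: Settings): void {
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(settings, field.id)) {
      field.value = settings[field.id];
    }
  }
}

// Persist the current field values as JSON.
function save(fields: Field[], store: KeyValueStore): void {
  store.setItem(STORAGE_KEY, JSON.stringify(record(fields)));
}

// Restore previously saved values, if any exist.
function retrieve(fields: Field[], store: KeyValueStore): void {
  const raw = store.getItem(STORAGE_KEY);
  if (raw !== null) play(fields, JSON.parse(raw) as Settings);
}
```

In a browser, the page's input elements and `localStorage` already have these shapes, so they could be passed in directly.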

I haven't fully tested this, so let me know if you run into any issues.

SIDE NOTE: This is my first gist!

09 December 2012

Introducing Pictish, an image-processing library for web browsers

Boy, between RaphaëlTS, SHA-1, and Haeckel, I've been releasing an awful lot of TypeScript/JavaScript libraries lately, haven't I? Anyway, here's another!

Pictish

A Library for Processing Binary Image Data

Pictish takes advantage of canvas elements and Typed Arrays to provide fast routines for processing raster image data. Here is a rundown of the currently available functions:

createImageData()
fromFile()
fromHTMLImage()
crop()
flipX()
flipY()
scaleDown()
quarter()
silhouettize()

All of these functions have been tested and optimized. More information is available in the documentation at the BitBucket site. Note that since canvas elements and Typed Arrays are based on more recent specifications, not all browsers support them. (For what it's worth, I've been testing in Chrome.)
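As an illustration of the kind of routine listed above, here is a sketch of a horizontal flip over RGBA pixel data held in a Typed Array. It is not Pictish's implementation, and its signature is an assumption: the `Raster` interface is a plain stand-in for the browser's ImageData so the example does not depend on a canvas.

```typescript
// Plain stand-in for the browser's ImageData: RGBA bytes, row-major.
interface Raster {
  width: number;
  height: number;
  data: Uint8ClampedArray; // length = width * height * 4
}

// Return a horizontally mirrored copy of the raster; the source is left untouched.
function flipX(src: Raster): Raster {
  const { width, height, data } = src;
  const out = new Uint8ClampedArray(data.length);
  // View the same bytes as one 32-bit word per pixel, so each pixel moves in a single copy.
  // Assumes data.byteOffset is a multiple of 4, which holds for freshly allocated arrays.
  const srcPixels = new Uint32Array(data.buffer, data.byteOffset, width * height);
  const outPixels = new Uint32Array(out.buffer);
  for (let y = 0; y < height; y++) {
    const row = y * width;
    for (let x = 0; x < width; x++) {
      outPixels[row + x] = srcPixels[row + (width - 1 - x)];
    }
  }
  return { width, height, data: out };
}
```

Because whole 32-bit words are moved, the channel order inside each pixel is preserved regardless of the machine's byte order.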

(The sharp-eyed may notice that last function and wonder if it might have something to do with another project of mine. The answer is yes.)

Next planned step is to create a PNG file encoder — that could take a while, though.

08 December 2012

SHA-1 with Typed Arrays

There are already SHA-1 implementations for JavaScript, but all the ones I found use strings as input. This is fine for generating hashes from small amounts of data, but not so great for large binary files. So I've created a SHA-1 library that optionally accepts ArrayBuffer objects.

SHA-1 TypeScript/JavaScript Library

Note that this will not work on browsers that do not support Typed Arrays.

I ran some tests and found that in Chrome it takes less than half a second to hash 10 MB of data.
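For reference, here is a compact sketch of SHA-1 operating directly on bytes rather than on strings. It follows the standard algorithm but is not the code of the library linked above; the function names and the choice to return a hex string are assumptions made for the example.

```typescript
// Rotate a 32-bit word left by n bits.
function rotl(x: number, n: number): number {
  return ((x << n) | (x >>> (32 - n))) >>> 0;
}

// SHA-1 over raw bytes; returns the digest as a lowercase hex string.
function sha1(input: ArrayBuffer | Uint8Array): string {
  const bytes = input instanceof Uint8Array ? input : new Uint8Array(input);
  const bitLength = bytes.length * 8;

  // Pad with 0x80, zeros, then the 64-bit big-endian bit length, to a multiple of 64 bytes.
  const paddedLength = (((bytes.length + 8) >> 6) + 1) << 6;
  const padded = new Uint8Array(paddedLength);
  padded.set(bytes);
  padded[bytes.length] = 0x80;
  const view = new DataView(padded.buffer);
  view.setUint32(paddedLength - 8, Math.floor(bitLength / 0x100000000));
  view.setUint32(paddedLength - 4, bitLength >>> 0);

  let h0 = 0x67452301, h1 = 0xEFCDAB89, h2 = 0x98BADCFE, h3 = 0x10325476, h4 = 0xC3D2E1F0;
  const w = new Uint32Array(80);

  for (let offset = 0; offset < paddedLength; offset += 64) {
    // Load the block as sixteen big-endian words, then expand to eighty.
    for (let i = 0; i < 16; i++) w[i] = view.getUint32(offset + i * 4);
    for (let i = 16; i < 80; i++) w[i] = rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

    let a = h0, b = h1, c = h2, d = h3, e = h4;
    for (let i = 0; i < 80; i++) {
      let f: number;
      let k: number;
      if (i < 20) { f = (b & c) | (~b & d); k = 0x5A827999; }
      else if (i < 40) { f = b ^ c ^ d; k = 0x6ED9EBA1; }
      else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
      else { f = b ^ c ^ d; k = 0xCA62C1D6; }
      const temp = (rotl(a, 5) + f + e + k + w[i]) >>> 0;
      e = d; d = c; c = rotl(b, 30); b = a; a = temp;
    }
    h0 = (h0 + a) >>> 0; h1 = (h1 + b) >>> 0; h2 = (h2 + c) >>> 0;
    h3 = (h3 + d) >>> 0; h4 = (h4 + e) >>> 0;
  }

  // Format each 32-bit word as eight hex digits.
  return [h0, h1, h2, h3, h4]
    .map(h => ("00000000" + h.toString(16)).slice(-8))
    .join("");
}
```

Since the input never passes through a string, a file read into an ArrayBuffer can be hashed without any intermediate conversion.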

Hope someone finds this useful! (At the very least, I will.)

02 December 2012

Haeckel: A Code Library for Browser-Based Evolutionary Diagrams

For a while now I've been writing posts with diagrams like this one, showing the evolution of cranial capacity in mangani over the past seven million years:

How did I make them? Originally it was all ad-hoc ActionScript code, but more recently I've begun to organize the code into a library and translate it into TypeScript (which, in turn, is automatically translated into JavaScript). Although this library is still in progress, I've decided it's at a stage where I can open it to the general public.

This library includes functionality for:

Modeling scientific concepts such as taxa, phylogeny, character states, stratigraphy, and geography.
Processing scientific data (notably calculating morphological distance and inferring unknown character states).
Rendering data into charts as Scalable Vector Graphics, using RaphaëlJS.

For a while I struggled with what to call this library. It's neither purely about science nor purely about graphics. Finally I got my inspiration from RaphaëlJS, a graphics library named after a great artist. I named my library after a man who was both a great artist and a great biologist:

An Idea for the EOL Phylogenetic Tree Challenge

Earlier this year, the Encyclopedia of Life announced the EOL Phylogenetic Tree Challenge. The goal: to produce "a very large, phylogenetically-organized set of scientific names suitable for ingestion into the Encyclopedia of Life as an alternate browsing hierarchy". The prize: an all-expenses-paid trip to iEvoBio 2012 in Ottawa!

This interested me greatly, because:

It's exactly the sort of thing I'm working on for PhyloPic.
I can't really justify paying for a trip to iEvoBio this year. (Phyloinformatics is my hobby, not my profession!)

After reading Rod Page's thoughts on the challenge, I came up with a basic idea, and started to implement it. Unfortunately, now that we're two weeks from the deadline, I'm realizing that:

I do not have the time to complete it.
Even if it were paid for, I can't justify a trip on my own out of town right now.

Why not? Simply put, this.

So, instead, I'm going to outline the general approach I was going to take, and if someone else wants to run with it, knock yourself out. (Just give me partial credit.)

PhyloPic Is Back!

Last year, I launched a project called PhyloPic. The goal of this project was to create an open database of freely reuseable silhouette images of organisms. Furthermore, it featured a phylogenetic taxonomy so that, if a taxon wasn't illustrated, an approximation could be found.
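The fallback idea can be sketched in a few lines. This is only the simplest possible version, assuming each taxon points to a single parent and that "approximation" means the closest illustrated ancestor; the `Taxon` shape and the function name are hypothetical and do not reflect PhyloPic's actual schema or search logic.

```typescript
// Hypothetical record shape; not PhyloPic's actual schema.
interface Taxon {
  name: string;
  parent: Taxon | null; // null at the root
  silhouette?: string;  // image identifier, if one has been submitted
}

// Walk rootward from a taxon until the taxon itself or some ancestor has an image.
function nearestIllustrated(taxon: Taxon): Taxon | null {
  for (let t: Taxon | null = taxon; t !== null; t = t.parent) {
    if (t.silhouette !== undefined) return t;
  }
  return null; // nothing illustrated anywhere on the lineage
}
```

A fuller search might also look at illustrated descendants or sister groups, which this sketch ignores.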

Pleurosiga minima

This image is in the public domain.

I launched it as a "public alpha", meaning that it wasn't complete and still had some bugs. The year turned out to be very busy for me: an awful thing happened, and a wonderful thing happened. And I didn't really have time to push PhyloPic to the next level.

Unfortunately, I hadn't built it well enough in the first place. The architecture was not optimized, and the site became extremely slow and buggy. I took it offline, hoping to release a new version in short order, but it turned out to need a lot more work than I first thought.

Happily, it is now ready again!

PhyloPic Week 2: Lineages, Browsing, and API

Another good week for PhyloPic. There are now well over 200 silhouettes in the database. I also rolled out some new features and enhancements.

Redesigned Lineage Pages

Lineage pages now provide taxonomic and license information for each image. As a visual touch, figures now fade as they go deeper and deeper into the past. Here are a few of my favorite lineage pages so far:

Anhanguera santanae (pterosaur)
Catocala nupta (moth)
Corvus corax (avian theropod dinosaur)
Damon johnstonii (whip spider)
Giraffatitan brancai (sauropod dinosaur)
Homo sapiens sapiens (primate)
Pantodon buchholzi (butterflyfish)
Therizinosaurus cheloniformis (theropod dinosaur)

Yes, they're all bilaterian animals. There's a definite bias.

Image Browser

Now you can peruse the entire gallery much more easily, with the Image Browser. Use the arrow(s) on the side to navigate through pages of silhouettes.

Developer API

For any developers out there who want to use the PhyloPic database to create their own apps, now you can. I've provided an initial API, available both as a JSON service and an AMF service for Flex apps.

Also of news to developers: I've opened up the code base for viewing and cloning. (Still need to add the licenses, though.) It's a Django app, written in Python. Feel free to poke around.

Thanks

I'd like to thank everyone who's submitted images so far, especially FunkMonk, Scott Hartman, Matt Martyniuk and Maija Karala for their many contributions. (Each of them has submitted at least a dozen.) Thanks also to Steven Coombs, Craig Dylke, Mo Hassan, Neil Kelley, Dann Pigdon, Ville-Veikko Sinkkonen, Patrick Strutzenberger, Reka Szabo, David Tana, Michael P. Taylor, and Emily Willoughby!

21 February 2011

Introducing PhyloPic: An Open Database of Reusable Silhouettes

Ever had this problem? "Boy, I could sure use a silhouette of [some kind of organism] for this diagram I'm working on. But I can't find anything on the web! Well, except for a few images which are copyrighted...."

What if there were a website with an open database of reusable images, available under Creative Commons licenses? What if you could do phylogenetic searches, so that, even if there weren't a silhouette for the taxon in question, you could at least find something close? What if you could build images like this...

Evolution of the Aardvark

...without having to look all over the web for figures?

Well, now you can! I've launched a new site called:

PHYLOPIC

It's currently in public alpha, which means it's not quite done. So, I have some caveats:

I'm pulling most taxonomic data from uBio. It's great because it's really comprehensive. But it's also a huge mess because it stores multiple classifications, many of which are outdated and disagree with each other. (This isn't uBio's fault, as its goal is to store all these classifications, not to offer one nice, neat classification.) So you may (will) find some errata in the phylogenetic system. I'm working on cleaning it up, but there are a lot of taxonomic names out there....
It's still early on, so there are only about a hundred images in the database. It will grow over time, but don't be surprised if the closest image it has for your favorite invertebrate is some kind of indiscriminate worm.
There are some known bugs (and I don't mean Hemiptera). The Issues Page is open to all, though, so you can read the known issues and report new ones. (Please do!)

It's a work in progress, but I think it has enormous potential. And I think it's reached a state where it's ready for public use and feedback. So have a look, see what you think, and let me know! (And, if you're artistically inclined, please consider submitting some silhouettes of your own.)

26 June 2010

Names on Nodes is finally online.

A month ago I got notice that my abstract had been accepted, and that I would be demoing Names on Nodes at the iEvoBio conference's Software Bazaar on June 29. This is the first time Names on Nodes has ever truly had a hard deadline. Since it's a personal project, until now I have had the luxury of languidly rebuilding and polishing and rebuilding and polishing. But now I have to get something up. So it's up.

Names on Nodes

There's still a lot left to do, but this will have to do for now. You can load NexML and Newick files (not NEXUS for now, sorry—although, really, you should be sorry for still using it when NexML is available). You can save as MathML and export PNG image files. You can create phylogenies and phylogenetic definitions on the fly using a visual interface that emphasizes drag-and-drop. Or you can type them in (as Newick and MathML, respectively), should you prefer that.

There are still a lot of bugs, and a lot of unimplemented features. If you come across issues or if you have feature requests, please feel free to submit an issue. And if you want to look at the code, it's open source (MIT license) and available on BitBucket.

06 May 2010

PhyloPainter: Happy Little Trees

The whole Flash/Apple fracas has been rather distasteful to me. But I'm not going to dwell on that right now. Instead, I am trying to keep an open mind by trying out some of the technologies that are competing with my favored development tools. First up: HTML 5.

I'll probably write more on the topic later, but suffice it to say for now that working with HTML 5 feels like I've traveled in time back to 2001, the days of ActionScript 1.0. JavaScript is a poor language for anything complicated. Canvas has covered the basics of vector drawing well, but little else. That said, I see potential and I'm pretty certain the tools will improve.

For my first HTML 5 app, I ported some basic functionality from Names on Nodes, namely, the ability to read Newick tree strings and the ability to draw graphs. I give you:

PhyloPainter

It's a bit rough right now. For one thing, it doesn't work in Internet Explorer (despite the inclusion of a workaround JavaScript tool—the current version of IE doesn't support HTML 5 Canvas). But it's a start.
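For anyone unfamiliar with Newick strings, here is a small sketch of how one can be read into a tree structure. It is not PhyloPainter's parser; it handles only unquoted names and optional ":length" suffixes, and `NewickNode` and `parseNewick` are names invented for the example.

```typescript
interface NewickNode {
  name: string;
  length?: number; // branch length, if given
  children: NewickNode[];
}

// Parse a Newick string such as "(A,B,(C,D)E)F;" into a tree of nodes.
function parseNewick(text: string): NewickNode {
  let pos = 0;

  // Read characters up to the next structural symbol and trim whitespace.
  function readToken(): string {
    const start = pos;
    while (pos < text.length && "(),:;".indexOf(text[pos]) < 0) pos++;
    return text.slice(start, pos).trim();
  }

  // node := [ "(" node { "," node } ")" ] [ name ] [ ":" length ]
  function parseNode(): NewickNode {
    const children: NewickNode[] = [];
    if (text[pos] === "(") {
      pos++; // consume "("
      children.push(parseNode());
      while (text[pos] === ",") {
        pos++; // consume ","
        children.push(parseNode());
      }
      if (text[pos] !== ")") throw new Error(`Expected ")" at position ${pos}`);
      pos++; // consume ")"
    }
    const node: NewickNode = { name: readToken(), children };
    if (text[pos] === ":") {
      pos++; // consume ":"
      node.length = parseFloat(readToken());
    }
    return node;
  }

  const root = parseNode();
  if (text[pos] !== ";") throw new Error(`Expected ";" at position ${pos}`);
  return root;
}
```

Drawing the tree is then a matter of walking `children` recursively and assigning each node a position.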

Give it a try—paint some happy little trees!

24 August 2009

Online NEXUS File Viewer

It's been a month since my last post, but I have a very good reason for the hiatus. Namely, I was busy getting married to this woman (at the Los Angeles County Museum of Natural History) and going on our honeymoon (in Sydney, Australia).

Now that I'm back in California, time to get back to work on Names on Nodes! I've just put together a small demo of two key parts of its functionality: the reading of NEXUS files and the displaying of phylogenetic networks. Click here to see the NEXUS Viewer demo. This application opens NEXUS files and displays the trees in them as a combined phylogenetic network.
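One simple way to think about combining trees into a network is as a union of their parent-to-child edges. The sketch below assumes every node, including internal ones, carries a label that identifies it across trees. That is an assumption for illustration only; the viewer's actual method is not described here, and `Edge`, `Network`, and `combineTrees` are hypothetical names.

```typescript
type Edge = [string, string]; // [parent label, child label]
type Network = { [parent: string]: string[] };

// Union the edges of several trees into one directed graph keyed by parent label.
function combineTrees(trees: Edge[][]): Network {
  const network: Network = {};
  for (const tree of trees) {
    for (const [parent, child] of tree) {
      const children = network[parent] || (network[parent] = []);
      if (children.indexOf(child) < 0) children.push(child); // skip duplicate edges
    }
  }
  return network;
}
```

When two trees disagree about a node's parent, that node ends up with two incoming edges, which is what turns the combined result into a network rather than a tree.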

Things you need to know:

You must have a NEXUS file stored locally on your computer to use this.
That file should have a TREES section. (If not, the viewer should just display a list of operational taxonomic units.)
This could get messy for NEXUS files with lots of trees. (Although it's kind of neat-looking.)
You can move the nodes around by clicking on them, or click anywhere else to move the entire diagram.
I would dearly love to know if, for some reason, it does not work for a given file.

Enjoy!

30 May 2008

Thoughts on 2001: A Space Odyssey

Recently I saw a special screening of a 70mm print of my favorite film, 2001: A Space Odyssey. This was the second time I'd seen a 70mm screening of it and, I have to say, if you have only seen it on TV, then you haven't really seen it. The level of detail is insane. You can actually read the instructions on the zero-gravity toilet.

Apart from the detail, the feel of seeing it on a large screen is different. You realize that it's not so much a typical film as a ride. Not like a typical "roller coaster" summer blockbuster, but a long, deliberate, and intelligent ride, thoughtfully taking us from our deep past to our far future.

Seeing it with an audience is fun, too. Kubrick (the director) had a very dry sense of humor, I think, and it comes out better with an audience. I've noticed the exact same thing with his Barry Lyndon. When watching it on your own, some lines are sort of dryly amusing. But in an audience, they're hilarious. HAL's calm, persistent politeness is already amusing when watching the film on TV, but with an audience laughing, it's that much funnier (and creepier). (That said, though, the first time I saw it with an audience, they laughed too much. It's not a comedy!)

As a prediction of the future, the film failed in many ways (Pan Am?), but it's held up better than anything else from that time period. In particular I was impressed with the Australopithecus makeup; it's still reasonably consistent with what we know of stem-humans.

I thought of one objection, though, after the film was over. The film supposes that the reason for humanity's greatness, what enabled us to go from rooting around for tubers to walking on the moon, is our ability to create and use tools. The film's greatest, most dramatic moments involve tools. Think of the first one, where the ape-man thinks of the monolith, and then an idea starts to form. He plays with a leg bone, flipping it around, and then sees in his head that it could be used as a cudgel. The majestic strains of Also Sprach Zarathustra swell as the ape-man experiences a violent orgasm of intellectual discovery. That scene still gives me chills.

Tools are certainly important to us, and we are clearly the best at creating them. But is that really the reason for our success?

Since the 1960s, tool use has been observed in a number of non-human species, most famously in chimpanzees, but also in other great apes and some animals much more distant from us. Recently, New Caledonian crows have been seen to not only use tools, but fashion their own, creating hooks out of wire in order to reach out-of-the-way food. (Imagine redoing the Also Sprach Zarathustra scene with a crow.)

So we're not unique in this ability (even if we are much more proficient). It's wrong to suppose that the inspiration to use tools could have suddenly set us on our course of domination, because apes have presumably been using tools for a fairly long time.

What sets us apart, then? Arguably, it's language. Well, not just language, but discrete grammar. Many animals are capable of creating symbols, vocal or otherwise, that can be used to communicate the idea of a particular object to another member of their species. For example, vervet monkeys have different types of screeches for alerting their band to the presence of different types of predator. Honeybees have a system of dance that has something of a grammar (and is used to communicate the location of flowers), but not a discrete grammar like ours. Look at this essay: you have probably not seen most of the sentences in it before in your life, and yet you can understand it (I hope). That's the power of discrete grammar.

So the big dramatic scene should not have been when the ape-man discovered how to use a bone as a cudgel. It should have been when he told the other members of his tribe, through grunts, smacks, and gestures, that a bone could be used as a cudgel, and that they should gang up on that rival tribe. That was the true moment of power.
