The revelation was this: nomenclatural codes, bioinformatics files, publications, specimen collections, and people are all the same thing. They are authorities.
Scientific names, taxonomic units, character states, and specimens are all the same thing. They are signifiers. They each signify a taxon (a set of organisms).
Signifiers are authorized by an authority. For example, Homo sapiens is a species authorized by the International Code of Zoological Nomenclature. YPM-VP 1450 is a specimen authorized by the Yale Peabody Museum's Vertebrate Paleontology Collection. "Wings used for powered flight" is a character state authorized by Gauthier & de Queiroz (2001). "30. Number of stamens: ten or fewer" is authorized by the NEXUS file registered as M331 in TreeBASE, as are the taxonomic units Phytolaccaceae and Lardizabalaceae.
Signifiers may share the same identity. For example, Tyrannosaurus bataar (ICZN) and Tarbosaurus bataar (ICZN) signify the same taxon, no matter what. The identity is only accessible to the signifiers themselves, which means that signifiers can be equated and differentiated without disrupting references to them. (A similar identity property holds for authorities.)
Every authority may be associated with an absolute URI (uniform resource identifier). Publications (including nomenclatural codes) may be associated with DOIs, ISBNs, etc. People may be associated with OpenIDs. Anything may be associated with a web address. It's a bit trickier for NEXUS files, but I figure that they can be uniquely identified by an ad hoc URI scheme plus a SHA-1 hash of their textual data.
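As a rough illustration of the NEXUS idea, here is a minimal Python sketch that derives a URI from a file's text. The function name `biofile_uri` is made up for this example, and hashing the raw UTF-8 bytes with no whitespace or line-ending normalization is an assumption, not something the scheme is known to specify.

```python
import hashlib


def biofile_uri(nexus_text: str) -> str:
    """Return a content-derived URI for the text of a NEXUS file.

    Assumption: the text is hashed exactly as given (UTF-8 bytes),
    so any change in whitespace or line endings yields a new URI.
    """
    digest = hashlib.sha1(nexus_text.encode("utf-8")).hexdigest()
    return f"biofile:{digest}"


if __name__ == "__main__":
    # Placeholder text, not a real data set.
    print(biofile_uri("#NEXUS\nBEGIN TAXA;\nEND;\n"))
```

Because the identifier depends only on the content, two copies of the same file get the same URI wherever they are stored. A real version would probably want to normalize the text before hashing.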
Examples:
- http://www.peabody.yale.edu/collections/vp — Yale Peabody Museum: The Collections: Vertebrate Paleontology
- http://uppsaladomkyrka.se — Uppsala domkyrka (cathedral)
- urn:isbn:0080-0694/146 — The International Code of Botanical Nomenclature (Vienna Code)
- http://openid-provider.appspot.com/keesey — Timothy Michael Keesey
- http://threelbmonkeybrain.blogspot.com — Timothy Michael Keesey (also!)
- urn:isbn:0-912532-57-2/chapter1 — Gauthier & de Queiroz 2001
- biofile:5b2f349967c18006233fc89b8643ff6c57be2858 — the NEXUS file of Rodman & al. 1984
- http://www.peabody.yale.edu/collections/vp::1450 — a specimen
- http://uppsaladomkyrka.se::Carolus+Linnaeus — a specimen
- urn:isbn:0-85301-006-4::Homo+sapiens — a species
- urn:isbn:0-912532-57-2/chapter1::wings+used+for+powered+flight — a character state
- biofile:5b2f349967c18006233fc89b8643ff6c57be2858::CHARACTERS/19._Crassulacean_acid_metabolis/present_in_at_least_some_specie — a character state
- biofile:5b2f349967c18006233fc89b8643ff6c57be2858::TAXA/Menispermaceae — a taxonomic unit
- I don't have to track that much information about each thing, since that information is held in other resources. I really just need to reference other resources (authorities and signifiers) and maybe provide a convenient name for each one (a canonical name in the Names on Nodes database).
- It is possible to create an extremely flexible data model capable of accommodating just about any data set, nomenclatural act, or taxonomic opinion.
- When using Names on Nodes, you'll be able to filter out authorities you don't want to use.
- I gotta redo a lot of stuff.
- Precedence.—nexus:5b2f349967c18006233fc89b8643ff6c57be2858::TREES/Fig._2/a (a hypothetical ancestor) is ancestral to nexus:5b2f349967c18006233fc89b8643ff6c57be2858::TAXA/Caryophyllaceae according to nexus:5b2f349967c18006233fc89b8643ff6c57be2858::TREES/Fig._2.
- Inclusion.—urn:isbn:0-85301-006-4::Homo includes urn:isbn:0-85301-006-4::Homo+sapiens according to the rank-based definition authorized by urn:isbn:0-85301-006-4.
- Inclusion.—urn:isbn:0-85301-006-4::Homo+sapiens includes http://uppsaladomkyrka.se::Carolus+Linnaeus according to the rank-based definition authorized by urn:isbn:0-85301-006-4.
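The statements above can be written down as plain data. The following Python sketch is only one possible shape for it, not the actual Names on Nodes model: the class names `Signifier` and `Relation`, their fields, and the helper `exclude_authorities` are all hypothetical. It assumes a signifier URI is an authority URI and a local name joined by `::`, as in the examples, and it filters only on the source that asserts each relation.

```python
from dataclasses import dataclass

SEPARATOR = "::"  # joins an authority URI to a local name, as in the examples


@dataclass(frozen=True)
class Signifier:
    authority: str   # absolute URI of the authority
    local_name: str  # name of the signifier within that authority

    @classmethod
    def parse(cls, uri: str) -> "Signifier":
        # Split on the last "::" so an authority URI may itself contain colons.
        authority, sep, local_name = uri.rpartition(SEPARATOR)
        if not sep or not authority or not local_name:
            raise ValueError(f"not a signifier URI: {uri!r}")
        return cls(authority, local_name)

    def __str__(self) -> str:
        return f"{self.authority}{SEPARATOR}{self.local_name}"


@dataclass(frozen=True)
class Relation:
    kind: str           # "precedence" or "inclusion"
    subject: Signifier  # the ancestor, or the including signifier
    target: Signifier   # the descendant, or the included signifier
    source: str         # URI of whatever asserts the relation


def exclude_authorities(relations, unwanted):
    """Drop relations asserted by an unwanted authority or one of its signifiers."""
    def is_unwanted(uri):
        return any(uri == u or uri.startswith(u + SEPARATOR) for u in unwanted)
    return [r for r in relations if not is_unwanted(r.source)]


ICZN = "urn:isbn:0-85301-006-4"
NEXUS = "nexus:5b2f349967c18006233fc89b8643ff6c57be2858"

relations = [
    Relation("precedence",
             Signifier.parse(NEXUS + "::TREES/Fig._2/a"),
             Signifier.parse(NEXUS + "::TAXA/Caryophyllaceae"),
             NEXUS + "::TREES/Fig._2"),
    Relation("inclusion",
             Signifier.parse(ICZN + "::Homo"),
             Signifier.parse(ICZN + "::Homo+sapiens"),
             ICZN),
    Relation("inclusion",
             Signifier.parse(ICZN + "::Homo+sapiens"),
             Signifier.parse("http://uppsaladomkyrka.se::Carolus+Linnaeus"),
             ICZN),
]

# Filtering out one authority leaves only the relation taken from the tree.
kept = exclude_authorities(relations, {ICZN})
```

Keeping the source on every relation is what makes the filtering possible: nothing is simply true in this model, it is only true according to something.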
So, once this is set up, the application will be able to automatically apply phylogenetic definitions, given a certain set of relators. This set will typically include a tree (or network) and a character matrix (optionally). But it could also include many trees, or a custom phylogeny. It gets a bit complex, though, since definitions themselves are relators (mandating the inclusion of types or internal specifiers), as are contextual applications of definitions (indicating other, non-essential inclusions).
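To make "automatically apply" a little more concrete, here is a hedged Python sketch of one simple case: a node-based definition (the last common ancestor of the specifiers, plus all of its descendants) applied to a single tree. It assumes the precedence relations have already been reduced to parent-to-child pairs and that every node has at most one parent, so it does not cover networks, character-based definitions, or multiple trees. The function names and the node labels are placeholders.

```python
def ancestors_of(node, parent):
    """Return the node followed by its ancestors, nearest first."""
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


def apply_node_based(edges, specifiers):
    """Return the clade picked out by a node-based definition.

    edges: iterable of (parent, child) pairs forming a tree.
    specifiers: the internal specifiers of the definition.
    """
    edges = list(edges)
    parent = {child: par for par, child in edges}
    children = {}
    for par, child in edges:
        children.setdefault(par, []).append(child)

    # Nodes that are ancestors of (or identical to) every specifier.
    first, *rest = specifiers
    common = set(ancestors_of(first, parent))
    for spec in rest:
        common &= set(ancestors_of(spec, parent))

    # The nearest such node on the first specifier's lineage.
    mrca = next((n for n in ancestors_of(first, parent) if n in common), None)
    if mrca is None:
        raise ValueError("specifiers share no ancestor in this tree")

    # The clade is that ancestor plus everything descended from it.
    clade, stack = set(), [mrca]
    while stack:
        node = stack.pop()
        if node not in clade:
            clade.add(node)
            stack.extend(children.get(node, []))
    return clade


# Placeholder tree: root -> (x -> (A, B), C)
edges = [("root", "x"), ("root", "C"), ("x", "A"), ("x", "B")]
print(apply_node_based(edges, ["A", "B"]))
```

With a different tree the same definition can pick out a different set of nodes, which is why the choice of relators matters.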
Still some details to work out, but I think I'm on a good track here.