Features of Consortium of California Herbaria Data View
Input
- Participant data: accepted in any field format and converted to the common Consortium format; XML or tab-delimited files preferred
- Minimum standards: Unique herbarium specimen number, taxon name
- Transmitted by e-mail, CD, FTP, or memory stick
- Updatable at any interval
Common Consortium format
- Produced by scripts constructed for each participant
- Fields accommodated:
Field | Searchable? | Sortable? |
Unique herbarium specimen number | * | * |
Taxon name | * | * |
Locality | * | * |
Habitat | | |
Macromorphology | | |
Notes | | |
Latitude | * | |
Longitude | * | |
Township/range/section | | |
Elevation (f or m) | | * |
Date | * | * |
Collector | * | * |
Collector number | * | * |
County | * | * |
Annotations | | |
Type information | | |
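
As an illustration of what a per-participant conversion script might do, here is a minimal sketch in Python (the Consortium's own scripts are written in Perl and are not shown here). The participant column names, the common-format field names, and the sample rows are all hypothetical. The sketch maps one participant's tab-delimited columns onto common field names and sets aside rows that fail the minimum standards (specimen number and taxon name).

```python
import csv
import io

# Hypothetical mapping from one participant's column names
# to (equally hypothetical) common-format field names.
FIELD_MAP = {
    "AccessionNo": "specimen_number",
    "ScientificName": "taxon_name",
    "Loc": "locality",
    "Coll": "collector",
    "Cnty": "county",
}


def to_common_format(tab_text, field_map):
    """Convert tab-delimited participant rows into common-format dicts.

    Rows lacking a specimen number or taxon name are returned separately
    so that they can be reported back to the participant.
    """
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    for row in reader:
        record = {
            common: (row.get(source) or "").strip()
            for source, common in field_map.items()
        }
        if record["specimen_number"] and record["taxon_name"]:
            accepted.append(record)
        else:
            rejected.append(record)
    return accepted, rejected


# Made-up sample input: the second row has no specimen number.
SAMPLE = (
    "AccessionNo\tScientificName\tLoc\tColl\tCnty\n"
    "EX12345\tQuercus agrifolia\tnear creek\tA. Collector\tMarin\n"
    "\tQuercus lobata\tvalley floor\tB. Collector\tYolo\n"
)

accepted, rejected = to_common_format(SAMPLE, FIELD_MAP)
print(len(accepted), "accepted;", len(rejected), "rejected")
```

Keeping the mapping in a table separate from the conversion logic means that each participant would need only a new mapping rather than a new program, although real submissions would probably need per-participant special cases as well.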
Restrictions/Quality Control
- Names not in master taxon authority file excluded; authority table updated as necessary, following consultation with IPNI, Tropicos, ICPN
- Common spelling of taxon names enforced
- Common spelling of county names enforced
- Coordinates outside California box nulled
- Dates before 1750 or after today's date nulled
- Field contents sanity checked, record skipped on mismatch
- Elevations below Death Valley or above county max elevation nulled
- Any spelling of localities, collectors allowed
- Any synonymy allowed
- Errors, modifications reported to participant
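
The range checks in the list above can be sketched as follows. This is an illustrative Python sketch, not the Consortium's Perl code. The bounding-box values are an assumed approximation of California's extent, the county maxima are illustrative placeholders, and the field names are hypothetical. The lower elevation limit uses Badwater Basin in Death Valley, about 282 ft below sea level.

```python
from datetime import date

# Assumed approximate bounding box for California, in decimal degrees.
CA_BOX = {"lat_min": 32.5, "lat_max": 42.1, "lon_min": -124.5, "lon_max": -114.0}

# Badwater Basin in Death Valley lies about 282 ft below sea level.
DEATH_VALLEY_FT = -282

# Illustrative per-county ceilings in feet; a real table would cover every county.
COUNTY_MAX_FT = {"Marin": 2600, "Inyo": 14505}


def apply_quality_control(record, today=None):
    """Return a cleaned copy of the record plus a list of changes made."""
    today = today or date.today()
    cleaned = dict(record)
    changes = []

    # Coordinates outside the California box are nulled as a pair.
    lat, lon = cleaned.get("latitude"), cleaned.get("longitude")
    if lat is not None and lon is not None:
        inside = (
            CA_BOX["lat_min"] <= lat <= CA_BOX["lat_max"]
            and CA_BOX["lon_min"] <= lon <= CA_BOX["lon_max"]
        )
        if not inside:
            cleaned["latitude"] = cleaned["longitude"] = None
            changes.append("coordinates outside California box nulled")

    # Dates before 1750 or after today's date are nulled.
    collected = cleaned.get("date")
    if collected is not None and (collected.year < 1750 or collected > today):
        cleaned["date"] = None
        changes.append("date out of range nulled")

    # Elevations below Death Valley or above the county maximum are nulled.
    elevation = cleaned.get("elevation_ft")
    if elevation is not None:
        ceiling = COUNTY_MAX_FT.get(cleaned.get("county"))
        too_high = ceiling is not None and elevation > ceiling
        if elevation < DEATH_VALLEY_FT or too_high:
            cleaned["elevation_ft"] = None
            changes.append("elevation out of range nulled")

    return cleaned, changes


# A made-up record that fails all three checks.
suspect = {
    "latitude": 45.0,
    "longitude": -120.0,
    "date": date(1700, 5, 1),
    "elevation_ft": 3000,
    "county": "Marin",
}
cleaned, changes = apply_quality_control(suspect, today=date(2024, 1, 1))
for change in changes:
    print(change)
```

Returning the list of changes alongside the cleaned record reflects the last item in the list: modifications are reported to the participant rather than applied silently.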
Data updates
- Consortium contents completely replaced at refresh (ca. every one to two weeks)
- Participant data replaced en masse (i.e., not corrected or incremented) when new data received
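
The practical consequence of en masse replacement is that a record missing from a new submission disappears at the next refresh. A minimal Python sketch, with hypothetical participant codes, field names, and function names:

```python
def receive_submission(holdings, participant, new_records):
    """Replace a participant's entire record set; nothing is merged or patched."""
    updated = dict(holdings)
    updated[participant] = list(new_records)
    return updated


def refresh(holdings):
    """Rebuild the main file from scratch from each participant's current data."""
    main_file = {}
    for records in holdings.values():
        for record in records:
            main_file[record["specimen_number"]] = record
    return main_file


# Made-up starting state: one participant with two specimens.
holdings = {
    "HERB_A": [
        {"specimen_number": "A1", "taxon_name": "Quercus agrifolia"},
        {"specimen_number": "A2", "taxon_name": "Quercus lobata"},
    ],
}

# The new submission contains only A1, so A2 is gone after the next refresh.
holdings = receive_submission(
    holdings,
    "HERB_A",
    [{"specimen_number": "A1", "taxon_name": "Quercus agrifolia"}],
)
main_file = refresh(holdings)
print(sorted(main_file))
```

Under this model the participant's own database remains the single authoritative copy, which is why the Consortium side never needs to reconcile edits.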
Change tracking
- "Heteroduplicate" specimen tracking
- Canned common searches ("related searches")
- Feedback (cumulative comments page for each participant), displayed per-record, or over current intervals.
- Comparative graphs (county vs specimen numbers per participant)
- Comparison of specimen coordinates with range specified for taxon in Jepson Manual (yellow flagging)
Mapping
- Connected to BerkeleyMapper for point display
Online constituents
Databases | Main file: Herbarium number -> corresponding data |
Indexes | Various: Index item -> list of Herbarium numbers |
Translation tables | taxon name <-> taxon ID; nomenclatural synonyms |
Link tables | specimen number -> external links (e.g., specimen scan, fieldbook entry, collector biography, user comments) |
CGI scripts | searching; commenting |
Static web pages | Search form; administrative pages; graphs |
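
The relationship between the main file and the indexes can be sketched with ordinary in-memory dictionaries. The production system stores these structures with Berkeley DB and Perl; the Python below is only an assumed illustration, and the specimen numbers, field names, and function names are hypothetical. Each index maps an item (for example, a county) to the list of herbarium numbers that carry it, and a multi-field search intersects those lists.

```python
from collections import defaultdict

# Main file: herbarium number -> corresponding data (made-up records).
MAIN_FILE = {
    "EX1": {"taxon_name": "Quercus agrifolia", "county": "Marin"},
    "EX2": {"taxon_name": "Quercus agrifolia", "county": "Yolo"},
    "EX3": {"taxon_name": "Quercus lobata", "county": "Marin"},
}


def build_index(main_file, field):
    """Build one index: index item -> list of herbarium numbers."""
    index = defaultdict(list)
    for number, data in main_file.items():
        value = data.get(field)
        if value:
            index[value].append(number)
    return dict(index)


def search(indexes, **criteria):
    """Return the herbarium numbers that match every field=value criterion."""
    result = None
    for field, value in criteria.items():
        hits = set(indexes[field].get(value, []))
        result = hits if result is None else result & hits
    return sorted(result or [])


INDEXES = {field: build_index(MAIN_FILE, field) for field in ("taxon_name", "county")}

matches = search(INDEXES, county="Marin", taxon_name="Quercus agrifolia")
print(matches)  # prints ['EX1']
```

Because the indexes are derived entirely from the main file, they can be discarded and rebuilt at every refresh instead of being kept in step with individual edits.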
Software, hardware
All construction and retrieval are done with Perl scripts; Perl is available free for all operating systems.
Data are stored with the Berkeley DB database library.
The system has run unmodified on Windows NT, Mac OS X, and Linux, in each case using the Apache web server. The data, indexes, scripts, etc., for the ~2,000,000 records currently displayed occupy ~2 GB. Refreshing the entire system from common-format files requires ~15 minutes of processing time on a 2.6 GHz Mac mini.