Exemplar implementation.
To cope quickly and efficiently with large numbers of Internet-based
identification queries (target turnaround time of under 1 minute per enquiry),
an exemplar DAISY system has been implemented in the ANSI C programming
language in a POSIX environment, using Linux as the development (and main
deployment) platform.
DAISY may also be run within the Windows XP and Vista environments via a
coLinux parasite system, for example the Tumbling Dice Fedora coLinux
package. This exemplar
makes use of the P3M
(PUPS/P3 plus
MOSIX)
distributed programming environment and has been designed from the outset
as a loosely coupled
MIMD
parallel system. From a computational standpoint, two features
of the exemplar DAISY system are of particular interest:
The use of
homeostatic and
evolutionary mechanisms within
DAISY in order to make the system fault tolerant: distributed implementations
of DAISY will fail gracefully in the event of hardware failures on the
platforms running components of the system.
The form of task distribution used by the DAISY system is highly novel
and is modelled loosely on lock-and-key molecular interactions within living cells.
Information which is to be processed by components of the DAISY system is tagged with key
codes. These key codes cause components of the DAISY system whose
locks match the data keys to read the data in, process it and, if necessary, re-tag it
for downstream processing by other DAISY components. DAISY image data
may be tagged in an arbitrary manner using the ftag tool.
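The lock-and-key idea described above can be illustrated with a minimal C sketch. It is not the PUPS/P3 mechanism itself, whose details are not given here: the types `item_t` and `component_t`, the function `offer` and the key strings are all hypothetical names chosen for illustration, and keys are assumed to be plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KEY_MAX 32

/* Hypothetical work item: some data plus the key code it currently carries. */
typedef struct {
    char key[KEY_MAX];
    const char *data;
} item_t;

/* Hypothetical component: reacts only to items whose key fits its lock. */
typedef struct {
    const char *name;
    const char *lock;      /* key code this component responds to */
    const char *next_key;  /* key applied after processing, or NULL if none */
} component_t;

/*
 * Offer an item to every component in turn. The first component whose lock
 * matches the item's key "processes" it and re-tags it for the next stage.
 * Returns the index of that component, or -1 if no lock matched.
 */
int offer(item_t *item, const component_t *comps, size_t n)
{
    size_t i;

    assert(item != NULL && comps != NULL);
    for (i = 0; i < n; ++i) {
        if (strcmp(comps[i].lock, item->key) == 0) {
            /* Real processing of item->data would happen here. */
            if (comps[i].next_key != NULL) {
                assert(strlen(comps[i].next_key) < KEY_MAX);
                strcpy(item->key, comps[i].next_key);
            } else {
                item->key[0] = '\0';  /* last stage: clear the tag */
            }
            return (int)i;
        }
    }
    return -1;
}
```

In this sketch, an item tagged with a made-up key such as "unposed" would be picked up by a component whose lock is "unposed" and re-tagged for the next stage. Adding a second component with the same lock would simply give the item another potential taker, which is the property that lets throughput grow with the number of instances.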
|
Screenshot showing DAISY running as a pod of co-operating
PUPS/P3 Linux processes within a Fedora coLinux parasite, which is itself running as a Windows XP process.
|
DAISY system software components
Although fully functional DAISY implementations have additional components (for example, to generate
optimal training sets automatically from a larger pool of training imagery, and to test the current classification
accuracy of the system), the base-level exemplar DAISY system consists of four components:
|
Schematic showing dataflow through the DAISY classifier system.
The ipm, floret and vhtml applications form a pod of co-operating PUPS/P3 processes which may in principle
be located on any set of TCP/IP networked hosts. These hosts may be virtual, e.g. coLinux parasites. Because of the unique, biologically inspired lock-and-key
communication system used by DAISY, the throughput of the system may be increased simply by
adding more instances of these applications and hosts.
|
DFE graphical user interface
DFE
is a graphical front-end which is based upon the
GTK+/Gnome X toolkit.
The DFE front-end is used to capture image data (e.g. via a
CCD camera attached to a microscope, or by reading in imagery
which has previously been captured by a digital camera).
DFE then tags this data as either training images or unknown images and
feeds it to the next component of the DAISY system, the pose normalisation module, IPM.
|
Showing a specimen of Xylophanes tersa
(Linnaeus, 1771) about to be identified by an Internet-based DAISY instance delivered via
VNC to the user desktop.
|
IPM: the pose normaliser
The function of IPM is to normalise the input imagery and to resample it
to a standard size prior to classification. The pose normaliser is sensitised to appropriately tagged
input imagery. When such imagery is found, it extracts a region of interest (using an ROI file which
has been associated with the input image by DFE). The resulting pose-normalised and
resampled
image is then appropriately tagged for the next component in the DAISY system, the floret.
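The crop-and-resample part of this step can be sketched in C as follows. This is an illustration only: the ROI file format and the resampling method used by IPM are not described here, so the sketch assumes a rectangular ROI, an 8-bit greyscale image stored row by row, and nearest-neighbour resampling. The names `roi_t` and `crop_resample` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical rectangular region of interest, in source pixel coordinates. */
typedef struct {
    int x, y;  /* top-left corner */
    int w, h;  /* width and height */
} roi_t;

/*
 * Extract `roi` from an 8-bit greyscale image (row-major, src_w x src_h) and
 * resample it to out_w x out_h using nearest-neighbour sampling.
 * Returns 0 on success, or -1 if the ROI or output size is invalid.
 */
int crop_resample(const unsigned char *src, int src_w, int src_h,
                  roi_t roi,
                  unsigned char *dst, int out_w, int out_h)
{
    int ox, oy;

    assert(src != NULL && dst != NULL);
    if (roi.w <= 0 || roi.h <= 0 || out_w <= 0 || out_h <= 0)
        return -1;
    if (roi.x < 0 || roi.y < 0 ||
        roi.x + roi.w > src_w || roi.y + roi.h > src_h)
        return -1;

    for (oy = 0; oy < out_h; ++oy) {
        /* Map the output row back to a source row inside the ROI. */
        int sy = roi.y + (oy * roi.h) / out_h;
        for (ox = 0; ox < out_w; ++ox) {
            int sx = roi.x + (ox * roi.w) / out_w;
            dst[oy * out_w + ox] = src[sy * src_w + sx];
        }
    }
    return 0;
}
```

Because every image leaves this function with the same dimensions, a downstream classifier can treat each one as a fixed-length vector, which is the practical reason for resampling to a standard size.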
Floret: the NNC/PSOM classifier
The floret
is the component of the DAISY system that actually classifies unknowns.
The floret application is
multithreaded. This means that multiple floret processes, which
may be running on different physical host computers, can co-operate when classifying unknowns, thus speeding
up the DAISY classification process. In keeping with the botanical naming conventions of the
DAISY system, this multithreaded collection of florets is known as an
inflorescence. The floret application identifies the unknown if it can. If the unknown can
be identified, it is tagged for downstream processing by the vhtml application; otherwise, the
unknown image is simply discarded.
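The identify-or-discard behaviour can be sketched in C with a simple stand-in classifier. The internals of the NNC/PSOM classifier are not described here, so this sketch swaps in a plain nearest-neighbour rule over feature vectors with a rejection threshold. The function `classify`, the `REJECTED` value and the threshold parameter are assumptions made for illustration, and labels are assumed to be non-negative integers.

```c
#include <assert.h>
#include <stddef.h>

#define REJECTED (-1)  /* hypothetical "could not identify" result */

/*
 * Compare `unknown` (dim values) with n_train training vectors stored
 * contiguously in `train`. Return the label of the closest training vector
 * whose squared Euclidean distance is at most max_dist2, or REJECTED if no
 * training vector is that close.
 */
int classify(const double *unknown,
             const double *train, const int *labels,
             size_t n_train, size_t dim, double max_dist2)
{
    size_t i, j;
    int best = REJECTED;
    double best_d2 = max_dist2;

    assert(unknown != NULL && train != NULL && labels != NULL);
    for (i = 0; i < n_train; ++i) {
        double d2 = 0.0;
        for (j = 0; j < dim; ++j) {
            double diff = unknown[j] - train[i * dim + j];
            d2 += diff * diff;
        }
        if (d2 <= best_d2) {
            best_d2 = d2;
            best = labels[i];
        }
    }
    return best;
}
```

A caller following the flow described above would re-tag the image for vhtml when `classify` returns a label and drop the image when it returns `REJECTED`.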
VHTML: the virtual HTML generator
vhtml
takes identification tags (inserted into the image file by the floret subsystem)
and interrogates the web to see whether any information is available
on the organism. If information
is available, vhtml builds a page of virtual HTML,
which it then displays via a slaved dillo (web browser) client.
If no information is available, vhtml generates a stub web page which simply gives the name of the organism, and displays that.
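The fallback case, a stub page carrying only the organism name, might look something like the following C sketch. The page layout, the function names `make_stub_page` and `append_escaped`, and the fixed buffer size are assumptions made for illustration; the markup vhtml actually emits is not shown here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append src to the string in dst (capacity cap), escaping &, < and > so the
 * name is safe to embed in HTML. Returns 0 on success, -1 if it will not fit.
 */
static int append_escaped(char *dst, size_t cap, const char *src)
{
    size_t len = strlen(dst);

    for (; *src != '\0'; ++src) {
        char one[2];
        const char *rep;
        size_t rlen;

        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  one[0] = *src; one[1] = '\0'; rep = one; break;
        }
        rlen = strlen(rep);
        if (len + rlen + 1 > cap)
            return -1;
        memcpy(dst + len, rep, rlen + 1);
        len += rlen;
    }
    return 0;
}

/*
 * Write a minimal stub page naming the organism into out (capacity cap).
 * Returns 0 on success, -1 if the page does not fit.
 */
int make_stub_page(char *out, size_t cap, const char *organism)
{
    char name[256];
    int n;

    assert(out != NULL && organism != NULL);
    name[0] = '\0';
    if (append_escaped(name, sizeof name, organism) != 0)
        return -1;
    n = snprintf(out, cap,
                 "<html><head><title>%s</title></head>\n"
                 "<body><h1>%s</h1>\n"
                 "<p>No further information is available.</p>\n"
                 "</body></html>\n",
                 name, name);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

The resulting buffer could then be written to a file or pipe for the browser client to display; how vhtml hands pages to dillo is not covered by this sketch.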
|
Showing automatically generated
HTML output produced by the vhtml virtual HTML generator following identification of
Xylophanes tersa (Linnaeus, 1771) by floret (Web page courtesy of
Bill Oehlke).
|
Content Copyright (c) 2007 Tumbling Dice Ltd.
DAISY is a Tumbling Dice Ltd product.
|