The DAISY Automated Insect Identification Project



Exemplar implementation.

In order to cope quickly and efficiently with large numbers of Internet-based identification queries (target turnaround time of < 1 minute per enquiry), an exemplar DAISY system has been implemented in a POSIX environment in the ANSI C programming language, with Linux as the development (and main deployment) platform. DAISY may also be run within the Windows XP and Vista environments via a coLinux parasite system, for example the Tumbling Dice Fedora coLinux package. This exemplar makes use of the P3M (PUPS/P3 plus MOSIX) distributed programming environment and has been designed from the outset as a loosely coupled MIMD parallel system.

From a computational standpoint, two features of the exemplar DAISY system are of particular interest:

    The use of homeostatic and evolutionary mechanisms within DAISY in order to make the system fault tolerant: distributed implementations of DAISY will fail gracefully in the event of hardware failures on the platforms running components of the system.

    The form of task distribution used by the DAISY system is highly novel and modelled loosely on lock and key molecular interactions within living cells. Information which is to be processed by components of the DAISY system is tagged with key codes. These key codes cause the components of the DAISY system whose locks match the data keys to read the data in, process it, and if necessary re-tag it for downstream processing by other DAISY components. DAISY image data may be tagged in an arbitrary manner using the ftag tool.
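The lock and key idea can be illustrated with a small sketch. This is not the PUPS/P3 implementation: the `Item` and `Component` classes, the `run_pool` function and every key code shown (`unknown.raw`, `unknown.posed`, `identified`) are hypothetical names invented for illustration. Each component is sensitised to one key, processes any matching item, and may re-tag the item so that a downstream component picks it up.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Item:
    """A unit of data carrying a key code (hypothetical structure)."""
    key: str
    payload: str
    history: List[str] = field(default_factory=list)


@dataclass
class Component:
    """A processing component sensitised to one key code (its 'lock')."""
    name: str
    lock: str
    # Returns the key for the next stage, or None when processing is finished.
    process: Callable[[Item], Optional[str]]


def run_pool(pool: List[Item], components: List[Component]) -> List[Item]:
    """Match item keys against component locks until nothing is left to do."""
    finished = []
    while pool:
        item = pool.pop(0)
        match = next((c for c in components if c.lock == item.key), None)
        if match is None:
            finished.append(item)      # no component is sensitised to this key
            continue
        item.history.append(match.name)
        next_key = match.process(item)
        if next_key is None:
            finished.append(item)
        else:
            item.key = next_key        # re-tag for downstream processing
            pool.append(item)
    return finished


# The key codes below are invented for illustration only.
components = [
    Component("ipm", "unknown.raw", lambda item: "unknown.posed"),
    Component("floret", "unknown.posed", lambda item: "identified"),
    Component("vhtml", "identified", lambda item: None),
]

done = run_pool([Item("unknown.raw", "moth.img")], components)
print(done[0].history)  # ['ipm', 'floret', 'vhtml']
```

Because no component needs to know about any other, a pipeline stage can be added or replaced simply by introducing a component with a new lock.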

DAISY running on an XP coLinux parasite

Screenshot of DAISY running as a pod of co-operating PUPS/P3 Linux processes within a Fedora-coLinux parasite, which is itself running as an XP process.



DAISY system software components

Although fully functional DAISY implementations contain additional components (for example, to automatically generate optimal training sets from a larger pool of training imagery, and to test the current classification accuracy of the system), the base-level exemplar DAISY system consists of four components:

Schematic of DAISY System Architecture

Schematic showing dataflow through the DAISY classifier system. The ipm, floret and vhtml applications form a pod of co-operating PUPS/P3 processes which may in principle be located on any set of TCP/IP networked hosts. These hosts may be virtual, e.g. coLinux parasites. Because of the biologically inspired lock and key communication system used by DAISY, the throughput of the system may be increased simply by adding more instances of these applications and hosts.
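As a rough illustration of why adding instances needs no extra coordination, the sketch below runs several identical workers against one shared pool of tagged items. It is only an analogy using Python threads on a single host, not DAISY's networked PUPS/P3 mechanism; the `worker` and `run_instances` functions and the instance names are assumptions made for this example.

```python
import queue
import threading


def worker(name, inbox, results, lock):
    """One application instance: take items until the shared pool is empty."""
    while True:
        try:
            item = inbox.get_nowait()
        except queue.Empty:
            return
        with lock:
            results.append((name, item))


def run_instances(items, n_instances):
    """Process all items with n_instances identical workers."""
    inbox = queue.Queue()
    for item in items:
        inbox.put(item)
    results, lock = [], threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(f"floret-{i}", inbox, results, lock))
        for i in range(n_instances)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run_instances([f"image-{i}" for i in range(8)], n_instances=3)
processed = sorted(item for _, item in results)
```

Every item is processed exactly once regardless of how many instances are started; which instance handles which item is not determined in advance.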




    DFE graphical user interface

    DFE is a graphical front-end based upon the GTK+/Gnome X toolkit. It is used to capture image data (e.g. via a CCD camera attached to a microscope, or by reading in imagery previously captured by a digital camera). DFE then tags this data as either training images or unknown images and feeds it to the next component of the DAISY system, the pose normalisation module, IPM.


    Showing DAISY Front End Tool (DFE)

    Showing a specimen of Xylophanes tersa (Linnaeus, 1771) about to be identified by an Internet-based DAISY instance delivered via VNC to the user desktop.


    IPM: the pose normaliser

    The function of IPM is to normalise the input imagery and to resample it to a standard size prior to classification. IPM is sensitised to appropriately tagged input imagery. When such imagery is found, it extracts a region of interest (using a ROI file which has been associated with the input image by DFE). The resulting pose-normalised and resampled image is then appropriately tagged for the next component in the DAISY system, the floret.
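The two steps named above, region-of-interest extraction and resampling to a standard size, can be sketched as follows. This is an illustrative stand-in only: the ROI layout `(top, left, height, width)`, the nearest-neighbour resampling and the function names are assumptions, and any further pose normalisation that IPM performs is not shown.

```python
def extract_roi(image, roi):
    """Crop a rectangular region; roi is (top, left, height, width)."""
    top, left, height, width = roi
    return [row[left:left + width] for row in image[top:top + height]]


def resample(image, size):
    """Nearest-neighbour resample of a 2D list to size x size pixels."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[(y * src_h) // size][(x * src_w) // size] for x in range(size)]
        for y in range(size)
    ]


def normalise(image, roi, size):
    """Extract the region of interest, then resample it to the standard size."""
    return resample(extract_roi(image, roi), size)


# A 4x6 test image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(6)] for r in range(4)]
out = normalise(image, roi=(1, 2, 2, 4), size=2)
print(out)  # [[12, 14], [22, 24]]
```

Resampling every image to the same dimensions means the classifier always receives input of a fixed size, whatever the resolution of the captured image.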




    Floret: the NNC/PSOM classifier

    The floret is the component of the DAISY system that actually classifies unknowns. The floret application is multithreaded: multiple floret processes, which may be running on different physical host computers, can co-operate when classifying unknowns, thus speeding up the DAISY classification process. In keeping with the botanical naming conventions of the DAISY system, this multithreaded collection of florets is known as an inflorescence. The floret application identifies the unknown if it can. If the unknown can be identified, it is tagged for downstream processing by the vhtml application; otherwise, the unknown image is simply discarded.
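The identify-or-discard behaviour can be sketched with a plain nearest-neighbour classifier and a rejection threshold. This is a deliberately simple stand-in and not the NNC/PSOM classifier used by the floret; the feature vectors, the `max_distance` threshold, the second species label and the `identified` tag are all invented for illustration.

```python
import math


def classify(unknown, training, max_distance):
    """Return the label of the nearest training vector, or None to reject."""
    best_label, best_dist = None, math.inf
    for label, vector in training:
        dist = math.dist(unknown, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None


def route(unknown, training, max_distance=1.0):
    """Tag an identified unknown for downstream processing, else discard it."""
    label = classify(unknown, training, max_distance)
    if label is None:
        return None                      # unidentified: simply discarded
    return {"tag": "identified", "label": label}


# Made-up feature vectors; real inputs would be pose-normalised images.
training = [
    ("Xylophanes tersa", [0.9, 0.1, 0.4]),
    ("species B (hypothetical)", [0.1, 0.8, 0.7]),
]
print(route([0.85, 0.15, 0.4], training))
print(route([5.0, 5.0, 5.0], training))
```

The first call lies close to a training vector and is tagged with that label; the second is far from every training vector and is rejected.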




    VHTML: the virtual HTML generator

    vhtml takes the identification tags (inserted into the image file by the floret subsystem) and interrogates the web to see whether any information is available on the organism. If information is available, vhtml builds a page of virtual HTML which it then displays via a slaved dillo (web browser) client. If no information is available, vhtml generates a stub web page which simply gives the name of the organism, and displays that.
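The choice between a full page and a stub page can be sketched as below. The `build_page` function and the page layout are hypothetical; the web lookup and the display through dillo are omitted, so the `info` argument simply stands in for whatever the lookup returned.

```python
from html import escape


def build_page(name, info=None):
    """Return HTML for an identified organism; a stub if no info was found."""
    title = escape(name)
    body = f"<p>{escape(info)}</p>" if info else ""
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1>{body}</body></html>"
    )


# No information found: the stub page carries only the organism name.
stub = build_page("Xylophanes tersa")

# Placeholder text stands in for information returned by a web lookup.
full = build_page("Xylophanes tersa", "Placeholder description of the organism.")
```

Escaping the name and description keeps text from an external source from being interpreted as markup in the generated page.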

    Showing output of DAISY VHTML generator


    Showing I.D. Page automatically linked by DAISY VHTML Generator

    Showing automatically generated HTML output produced by the vhtml virtual HTML generator following identification of Xylophanes tersa (Linnaeus, 1771) by floret (Web page courtesy Bill Oehlke).


    Content Copyright (c) 2007 Tumbling Dice Ltd. DAISY is a Tumbling Dice Ltd product.