![]()
![]()
Paul D. Bridge1, Mark A. ONeill2 & Jamshid Fatehi3
1Mycology Section, The Herbarium, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AB and School of Biological & Chemical Sciences, Birkbeck, University of London, Malet St, London WC1E 7HX: e-mail p_bridge@hotmail.com
The molecular systematics of fungal taxa has largely been established through sequence analysis of the ribosomal RNA (rRNA) gene cluster. The cluster consists of genes for the production of the individual 5.8S, small and large ribosomal subunits, separated by internally transcribed spacer (ITS) regions. The individual copies of the cluster are arranged linearly and are separated by an intergenic spacer (IGS). DNA sequence within the subunit genes are in general relatively conserved, although they also contain some variable domains (see Bruns et al., 1992; Hibbert et al., 1992). Gene sequences, particularly from the small subunit have been widely used to establish phylogenetic relationships between genera (e.g. Hausner et al., 1993). The spacer sequences however are less conserved and have proved to be useful for species identification, and for determining relationships between closely related species (e.g. Gardes & Bruns, 1991; Nazar et al., 1991).
Although ITS sequences have proved useful in identification at the species level, the level of variability within these sequences is not consistent between different species. In some fungal species ITS sequences may differ by as little as 0.5% between different isolates, whereas in some other species sequences may vary by 15% or more between individual isolates (see Seifert et al., 1995). There has been considerable debate concerning the application of species concepts in fungi; and in the absence of a classical definition, different criteria have been accepted for different groups.
In some fungal groups ITS sequences have successfully been used to determine hierarchical relationships. One example of this is in the genus Colletotrichum, where ITS variation of 2-3% has been considered to be the normal range within a species complex, and sequence variation between species complexes being around 5% or greater (see Sreenivasaprasad et al., 1996). Although clear cut-offs can be used to describe species and species complexes in some groups, the situation may become very complicated when considering a grouping of related species with differing levels of sequence variation. Under these circumstances within species sequence variation might be expected to approach, or even overlap with, between species variation. This sort of inter-specific overlap can also give rise to problems when trying to classify groups of higher organisms which are in the process of speciation. For example, the Hyles euphorbiae complex [Lepidoptera: sphingidae] and the Bombus, Megabombus and Colletes complexes [Hymenoptera: apidae] (see for example Pittaway, 1993). A further problem can arise when comparing groups of the same taxonomic rank, which contain widely different levels of intra-group variation. The entomopathogenic fungus Metarhizium consists of both host specific and generalised pathogens, and these have been grouped at both species and variety level (Driver & Milner, 1997). Hierarchic methods were shown to provide inaccurate representations of the delineation of these taxa and the possible relationships between them and the non-hierarchic principle component analysis proved more appropriate (Bridge, 1998).
One problem that may be encountered at the level of species and variety is the direction of current and recent linear changes. The use of heirarchic mathematical approaches, such as those represented by dendrograms or cladograms, are appropriate methods to model evolutionary progressions, particularly as in most instances evolution can be assumed to have occurred in a single direction. Hennig (1966) pointed out that phylogenetic analyses were inappropriate for currently interbreeding populations, and this may be particularly relevant among organisms such as bacteria, yeasts and fungi, which are often considered to contain genetically plastic species and varieties, where lateral gene transfer may occur through a range of mechanisms from transposons to interspecific mating (see Woo et al., 1998; Brasier, 1997). A complication of lateral transfer is that it will occur between groups placed at the end points of hierarchic representations, and so it is not accurately described by extending the linear mathematics of the evolutionary line. This has recently been demonstrated by Wang & Zhang (2000) who showed that multiple lateral transfers within the rRNA genes of actinomycete bacteria gradually destroyed the evolutionary history recorded in the nucleotide sequence. There is therefore a need to investigate the use of non-directional mathematics when investigating relationships between closely related, possibly rapidly changing, taxa. Within the fungi these are often taxa described at the species, variety of pathogenic form level.
The generally established Phylogenetic Species Concept (Nixon & Wheeler, 1990) does not adequately cope with the levels of plasticity that are becoming apparent in such cases (see Goldstein & de Salle, 2000), and application of Population Aggregation Analysis (Davis & Nixon, 1992) would lead to the sub-division of many existing species. Non-hierarchic methods such as principle component analysis are basically linear in their initial comparison of variance, but provide a non-directional representation of the final results. Non-linear mathematical approaches are by definition direction independent. Specifically, they provide a method of data analysis which is independent of any a priori assumption, including that of divergent evolution from a single common progenitor. In this way they are particularly suited for the analysis of fuzzy data that is obtained from complex biological relationships. They have proven to be particularly powerful in the field of neuroinformatics (see Young et al., 1996; Hilgetag et al., 1995) and may provide a significant tool for the analysis of DNA sequence data.
Stochastic annealing with closely related DNA sequences
The legume Ascochyta complex is a group of species of the genera Ascochyta and Phoma that are significant pathogens of many leguminous hosts. Ascochyta and Phoma are primarily differentiated on the basis of their conidia. In Ascochyta the conidia are predominantly septate, whereas in Phoma the conidia are largely non septate (Boerema & Bollen, 1975). These are not exclusive characters and refer only to the majority of the conidia. In one instance, A. suboltshauseri, septate, Ascochyta like conidia are formed in planta, and non-septate, Phoma like conidia are produced in culture. Ascochyta and Phoma are anamorphic genera, and teleomorphs, where known, have been placed in the genus Didymella (For full review see Fatehi, 2000).
Various species concepts have been applied to these taxa at different times. Characteristics that have been used include original host, size and type of lesion, production of a particular metabolite and spore size and shape (e.g. Jones, 1927; Sutton, 1980). There is evidence of host preference among some isolates, but host specificity of individual species is debatable. There have also been some discrepancies in the ranks applied to taxa, with some groups considered different species by some authors, and as special pathogenic forms of a single species by others (e.g. Gossen et al., 1986). Determination of relationships in this group by molecular methods is complicated by the presence of more than one type of ITS sequence in some isolates. These multiple forms may be due to non-conserved mutation in some copies of the rRNA gene cluster, or may alternatively be the result of heterokaryotic multinucleate conidia (Fatehi & Bridge, 1998).
The complete ITS1-5.8S-ITS2
sequences from 14 isolates of Phoma
and Ascochyta species
associated with legumes, and two isolates of the presumed teleomorph
Didymella were obtained
(Fatehi, 2000), and the data was analysed by four different
approaches, UPGMA cluster analysis, parsimony, principle component
analysis and stochastic annealing.
Cluster
analysis: The sequence divergence between sequences was calculated
for all isolates by pairwise comparison of sequences. Sequence
divergence was calculated as the percentage of bases different
between each pair of sequences. The matrix of sequence divergences
was then clustered by UPGMA average linkage through the package MVSP
(Kovach Computing Services, Anglesea). Parsimony:
The most parsimonious tree was determined by analysis of the aligned
sequence data through PAUP (Swofford, 1993). An unrooted tree was
constructed for maximum parsimony using the outgroup method. Principle
component analysis: Principle component analysis was carried out on
sequences after calculation of divergence through the program
CLUSTALW. Principle component ordinations were then obtained through
the JALVIEW utility available on-line at the European Bioinformatics
Institute (EBI).
Stochastic annealing: Stochastic
annealing was undertaken through the program seriate running under
the P3) platform. For this operation
transitions (e.g. purine to purine or pyramidine to pyramidine) were
given a weight of 0.5, in comparison to transversions (pyramidine to
purine or purine to pyramidine) which were given a weight of 1. Initial
indications Four
of the 16 sequences were identical (A. pisi (is2), A. pisi
(CH3) A. pisi (CBS), A. pisi (IMI) and D. viciae),
and the divergence between the others varied between 0.36% and 8.89%
(Table 1). Plotting the frequency of the different levels of sequence
divergence did not identify any clear cutoff values that could be
investigated for species or species complex delineation.
The dendrogram obtained from the sequence divergence gave a simple
structure of one group of 10 organisms, with the remaining organisms
linked to this in increasing order of divergence.
Although there was some indication of the limit of the core group of
organisms, there was no clear delineation. The sequence obtained from
D. fabae, the teleomorph of A. fabae was linked singly
to the core group, which in turn contained a sequence from the
anamorphic A. fabae, despite the true difference between these
isolates being only 0.36%. The parsimony tree gave a very
similar arrangement of the organisms, and showed P. rabiei and
P. medicaginis as closely linked to the core group. The
ordination in the first 3 principle components (Fig 2c) gave a core
group of organisms that corresponded to part of the large cluster
seen in the cluster analysis. This group contained 10 organisms
(1-10; Table 1), with the remaining 6 organisms dispersed both from
this group and from each other. The ordination placed P. rabiei
outside this main group. The seriation programme gave a series of
reordered matrices, each of which showed minimum binding energy.
These were all largely consistent, and in every case placed the same
10 organisms grouped by PCA together at the top of the matrix. The
consensus reordering obtained from the seriate analysis is that given
in Table 1. Implications The
data set analysed in this study is unusual in that it contains a
number of different issues that are often encountered in systematic
studies. In this case the sampling sizes, while similar in terms of
specific names, can be seen to be very unequal, with comparisons
being made between a group of 10 and 6 single organisms. This is due
in part to uncertainties and inconsistencies within this group of
fungi. The generic division between Ascochyta and Phoma has
been questioned, and in the past, different species epithets have
been used to distinguish between very similar organisms isolated from
different hosts (see Punithalingam, 1979; Boerema et al.,
1993). Previous molecular studies within this group of fungi have
suggested the possibility of recent divergence or genetic exchange
among some host recognised species, particularly A. pisi and
A. fabae. This was seen as the possession of two distinct ITS
sequences within these isolates, which appeared after PCR
amplification to comprise of a major and minor type (Fatehi &
Bridge, 1998). Detailed analysis suggested that the minor ITS type in
some isolates was the same as the major type in others, and vice
versa. This may have arisen due to horizontal transfer of one
nuclear type in the multinucleate conidia, or may represent a
radiating group from a single founder taxon (Fatehi & Bridge,
1998). In this study only the major amplification product has been
sequenced. Of the four methods evaluated here, Principle Component
Analysis and seriation both give results that stress the close
relationship within the A. pisi/A. fabae complex. These
species are pathogens on peas and beans respectively, and the species
A. vicia-villosa and A. vicia-panonicia were also
isolated from Vicia species. This is in contrast to A.
rabiei, which is commonly found on chickpeas. Given the similarities in host, together with the already
reported ITS heterogeneity and overlap, it would seem reasonable to
conclude that the 10 organisms that grouped together by these methods
constituted a single taxon, probably at species level. Once this
assumption has been made, it is possible to make a more accurate
interpretation of the cluster analysis and the parsimony tree. It is
also possible to return to these representations and the original
data and determine some cutoff values that could then be applied for
the identification of further sequences. This is however only clear
when the results of the PCA and the seriation analyses have been
taken into account. Although significance levels can be applied to
branch points in dendrograms, most analyses assume that the
variability behind the branchpoint is normally distributed (see
Dudzinski, 1975). While this may be the case for a branch point
describing a single species, it is not likely to be the case for
branch points differentiating species with different levels of
variability. This highlights a problem that can occur when
considering unknown sequences, and in this case it was not possible
to accurately determine the species boundaries purely from tree-based
representations. This raises a question as to how much of the answer
needs to be known before the results can be interpreted. In reality
it is the empirical judgement of the experienced taxonomist that
provides this.
The seriation approach used here is an extension of a methodology
originally developed to seriate connection matrix bit-vectors derived
from neuroanatomical connection data of the visual areas of the
rhesus macaque cortex (see Young et al., 1995;).
The basic seriation algorithm was extended to
permit the seriation of strings of character data (e.g. FASTA format
DNA sequence data) rather than simple bit-vectors. A cost function
was included to represent the different costs of transversions and
transitions in terms of subsequent DNA binding energy. It is expected
that further extensions to the system will permit it to be used for
automated sequence alignment using stochastically controlled
inflation and erosion of the DNA sequence data. In addition, output
data can be generated in a form that can be visualised in a 2- or 3-
dimensional cluster space (using for example, an adapted version of
the xtopol application which was developed to perform similar sorts
of display task for neuroinformatic data) (Figure 1). The simple version of the
seriate application used here did however identify the association
within the core A. pisi/A.fabae group through their placement
and movement in the lowest energy re-orderings. This approach has
some similarities to the Cladistic Haplotype Aggregation as described
by Brower (1999), but in this case the distinct aggregations are
identified by low energy levels in the annealing, as opposed to
topological arrangement. To obtain a separation/grouping comparable
to that obtained from PCA with an initial simplified version is
encouraging, and suggests that further development will lead to a
powerful tool for sequence comparison. The utility of PCA for
analysing molecular data in fungi has been suggested previously for
band data from closely related taxa of the same level (Bridge, 1998).
The extraction of correlated characters as eigenvectors to obtain the
principle coordinates acts to bias the final ordination towards that
supported by the greatest number of correlated characters. This can
act as a filter in situations where correlations are low and there is
a high level of random variation, as the small number of correlated
characters will form the principle components, ahead of the
uncorrelated random variation (Bridge, 1998). In this case the
majority of the data is correlated, as the sequences are very
similar, but as this similarity extends across the data set, the
level of variability in these characters is low, and the bulk of the
variability in the data is from the random variation. However, this
would appear to be filtered by the methodology, and the result
provides a useful method for determining the significance of branches
within trees. This appears to be particularly important in situations
where significant genetic events have occurred in different
directions, and these are in turn masked by multi-directional random
variation. Such a situation could be expected where isolation or
radiation has occurred from a single meiotic population.
The approach taken here
was a pilot study based on fungal sequence data, and was further
limited by the relatively low level of sequence variation. The
non-hierarchic and non-linear methods are organism independent and so
may be expected to provide useful insights into plant or insect
groups, such as Brassicas and Hyles, where either
interspecific or varietal crosses may occur naturally in the
environment.
References
Boerema, G.H. and Bollen, G.J. 1975.
Conidiogenesis and conidial septation as differentiating criteria
between Phoma and Ascochyta. Persoonia 8: 111-144.
Boerema, G.H., Pieters, R. and Hamers,
M.E.C. 1993. Check-list for scientific names of common parasitic
fungi. Supplement Series 2c,d (additions and corrections). Fungi on
field crops: pulse (legumes), forage crops (herbage legumes),
vegetables and cruciferous crops. Netherlands Journal of Plant
Pathology 99: Supplement 1, 1-32.
Boerema, G.H., de Gruyter, J., and
Noordeloos, M.E. 1997. Contributions towards a monograph of Phoma
(coelomycetes) IV. Persoonia 16: 335-371.
Brasier, C.M. 1997. Fungal species in
practice; identifying species units in fungi. In. Pp. 135-170 in:
Claridge, M., Dawah, H. & Wilson, M. (ed.), The Units of
Biodiversity. Chapman & Hall.
Bridge, P.D. 1998. Numerical analysis of
molecular variability, a comparison of hierarchic and nonhierarchic
approaches. Pp. 291-308 in: Bridge, P.D., Couteaudier, Y. &
Clarkson, J.M. (ed.) Molecular Variability of Fungal Pathogens. CAB
International, Wallingford.
Brower, A.V.Z. 1999. Delimitation of phylogenetic species with DNA
sequences: a critique of Davis and Nixons Population
Aggregation Analysis. Systematic Biology 48: 199-213.
Bruns, T.D., Vilgalys, R., Barns, S.M.,
Gonzalez, D., Hibbett, D.S., Lane, D.J., Simon, L., Stickel, S.,
Szaro, T.M., Weisburg, W.G. and Sogin, M.L. 1992. Evolutionary
relationships within the fungi: analysis of nuclear small subunit
rRNA sequences. Molecular Phylogenetics and Evolution 1: 231-241.
Davis, J.I. and Nixon, K.C. 1992. Populations, genetic variation, and
the delineation of phylogenetic species. Systematic Biology 41:
421-435.
Driver, F. and Milner, R.J. 1998. PCR
applications to the taxonomy of entomopathogenic fungi. Pp.153-186
in: Bridge, P.D., Arora, D.K., Elander, R.P. & Reddy, C.A.
(ed.), Applications of PCR in Mycology, CAB International,
Wallingford.
Dudzinski, M.L. 1975. Principal components
analysis and its use in hypothesis generation and multiple
regression. Pp. 85-105 in: Bofinger, V.J. & Wheeler, J.L. (ed),
Developments in Field Experiment Design and Analysis. Commonwealth
Agricultural Bureaux, Slough.
Fatehi, J. 2000. PhD Thesis, University of
Reading, UK
Fatehi, J. and Bridge, P.D. 1998. Detection
of multiple rRNA-ITS regions in isolates of Ascochyta. Mycological
Research 102: 762-766.
Gardes, M. and Bruns, T.D. 1991. Rapid
characterization of ectomycorrhizae using RFLP pattern of their PCR
amplified-ITS. Mycological Society Newsletter 41: 14.
Goldstein, P.Z. and DeSalle, R. 2000.
Phylogenetic species, nested hierarchies, and character fixation.
Cladistics 16: (in press).
Gossen, B.D., Sheard, J., Reauchamp, C.J.
and Morrall, R.A.A. 1986. Ascochyta lentis renamed Ascochyta
fabae f.sp. lentils. Canadian Journal of Plant Pathology
8: 154-160.
Hausner, G., Reid, J. and Klassen, G.R.
1993. On the phylogeny of Ophiostoma, Ceratocystis s.s., and
Microascus, and relationships within Ophiostoma based
on partial ribosomal DNA sequences. Canadian Journal of Botany 71:
1249-1265.
Hennig, W. 1966. Phylogenetic Systematics.
University of Illinois Press, Urbana, USA
Hibbert, D.S. 1992. Ribosomal RNA and
fungal systematics. Transactions of the Mycological Society of Japan
33: 533-556.
Hilgetag
C.C., O'Neill, M.A. and Young, M.P. 1996. Optimal hierarchical
orderings for the primate visual system using a novel network
processor. Science 271: 776-777.
Jones, L.K. 1927. Studies of the nature and
control of blight, leaf and pod spot, and footrot of peas caused by
species of Ascochyta. Pp.46 in: Bulletin of New York State
Agricultural Experimental Station, 547.
Nazar, R.N., Hu, X., Schmidt, J., Culham,
D. and Robb, J. 1991. Potential use of PCR amplified ribosomal
intergenic sequences in the detection and differentiation of
Verticillium wilt pathogen. Physiological and Molecular Plant
Pathology 39: 1-11.
Nixon, K.C. and Wheeler, Q.D. 1990. An
amplification of the phylogenetic species concept. Cladistics 6:
211-223.
Pittaway A.R. 1993. The Hawkmoths of the
Western Palaearctic. Harley Books.
Punithalingam, E. 1979. Graminicolous
Ascochyta species. Mycological Papers 142: 1-214.
Seifert, K.A., Wingfield, B.D. and
Wingfield, M.J. 1995. A critique of DNA sequence analysis in the
taxonomy of filamentous ascomycetes and ascomycetous anamorphs.
Canadian Journal of Botany 73 (suppl. 1): S760-767.
Sreenivasaprasad, S., Mills, P.R., Meehan,
B.M. & Brown, A.E. 1996. Phylogeny and systematics of 18
Colletotrichum species based on ribosomal DNA spacer
sequences. Genome 39: 499-512.
Sutton, B. C. 1980. The Coelomycetes.
CAB International Mycological Institute: Kew.
Swofford, D.L. 1993. PAUP: Phylogenetic
Analysis Using Parsimony, version 3.1.1. Illinois Natural
History Survey, Champaign, Illinois.
Wang, Y. and Zhang, Z. 2000. Comparative sequence analyses reveal
frequent occurrence of short segments containing an abnormally high
number of non-random base variations in bacterial rRNA genes.
Microbiology 146: 2845-2854.
Woo, S.L., Noviello, C. and Lorito, M.
1998. Sources of molecular variability and applications in
characterization of the plant pathogen Fusarium oxysporum. Pp.
187-208 in Bridge, P.D., Couteaudier, Y. & Clarkson, J.M. (ed)
Molecular Variability of Fungal Pathogens. CAB International,
Wallingford
Young
M.P., Scannell, J.W., O'Neill, M.A., Hilgetag, C.C., Burns, G. and
Blakemore, C.B. 1995. Non metric multi dimensional scaling in the
analysis of neuroanatomical connection data. Philosophical
Transactions of the Royal Society B 348: 281-308.
Table 1. Sequence divergence 1. A. viciae-villosae
0 2. D. fabae 2.13 0 3. A. pisi (is2) 0.71 2.13 0 4. A. fabae 0.36 1.42 0.71 0 5. A. viciae-pannonicae 1.07 2.85 0.36 1.07 0 6. A. pisi (CBS) 0.71 2.49 0 0.71 0.36 0 7. A. pisi (CH3) 0.71 2.49 0 0.71 0.36 0 0 8. D. viciae 0.71 2.49 0 0.71 0.36 0 0 0 9. A. pisi (IMI) 0.71 2.49 0 0.71 0.36 0 0 0 0 10. Ascochyta sp. 2.13 3.56 1.42 2.13 1.78 1.42 1.42 1.42 1.42 0 11. A. rabiei 2.85 4.98 2.49 2.85 2.85 2.49 2.49 2.49 2.49 3.21 0 12. P. medicaginis 4.63 6.41 3.91 4.27 3.91 3.91 3.91 3.91 3.91 5.34 3.21 0 13. P. exigua 5.34 6.76 4.63 5.69 4.27 4.63 4.63 4.63 4.63 5.34 4.27 3.91 0 14. A. boltshauseri 5.34 7.11 4.27 5.34 4.98 4.27 4.27 4.27 4.27 4.63 5.34 5.69 4.27 0 15. A. pinodes 4.98 6.41 4.27 4.27 4.98 4.27 4.27 4.27 4.27 5.69 5.34 6.76 6.05 7.11 0 16. P. glomerata 4.98 6.41 4.63 4.63 4.98 3.91 3.91 3.9 3.91 5.34 5.69 7.11 6.05 8.89 4.63 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Figure 1.
Xtopol visulisation of consensus seriation.
![]()
1 Mycology Section, The Herbarium, Royal Botanic Gardens, Kew, Surrey TW9 3AH
2Bee Systematics and Biology Unit, Hope Entomological Collections, Parks Road, Oxford OX1 3PW
3Department of Forest Mycology and Pathology, Swedish University of Agricultural Sciences, Box 7026, 750 07 Uppsala, Sweden