      Detection of genomic islands via segmental genome heterogeneity

      research-article


          Abstract

          While the recognition of genomic islands can be a powerful mechanism for identifying genes that distinguish related bacteria, few methods have been developed to identify them specifically. Rather, identification of islands often begins with cataloging individual genes likely to have been recently introduced into the genome; regions with many putative alien genes are then examined for other features suggestive of recent acquisition of a large genomic region. When few phylogenetic relatives are available, the identification of alien genes relies on their atypical features relative to the bulk of the genes in the genome. The weakness of these ‘bottom–up’ approaches lies in the difficulty in identifying robustly those genes which are atypical, or phylogenetically restricted, due to recent foreign ancestry. Herein, we apply an alternative ‘top–down’ approach where bacterial genomes are recursively divided into progressively smaller regions, each with uniform composition. In this way, large chromosomal regions with atypical features are identified with high confidence due to the simultaneous analysis of multiple genes. This approach is based on a generalized divergence measure to quantify the compositional difference between segments in a hypothesis-testing framework. We tested the proposed genome island prediction algorithm on both artificial chimeric genomes and genuine bacterial genomes.
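          The top-down idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Jensen-Shannon divergence over single-nucleotide frequencies as a stand-in for the generalized divergence measure, and a fixed cutoff (`threshold`) in place of the hypothesis-testing framework. The names `entropy`, `best_split`, `segment`, `threshold` and `min_len` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def entropy(counts, total):
    """Shannon entropy (in bits) of a nucleotide count table."""
    if total == 0:
        return 0.0
    h = 0.0
    for base in ALPHABET:
        c = counts.get(base, 0)
        if c:
            p = c / total
            h -= p * math.log2(p)
    return h


def best_split(seq, min_len):
    """Find the split point with the largest length-weighted
    Jensen-Shannon divergence between the left and right parts.

    Returns (position, divergence); position is None when no split
    leaves both parts at least min_len long or none has positive divergence.
    """
    n = len(seq)
    total = Counter(seq)
    left = Counter()
    h_all = entropy(total, n)
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1
        if i < min_len or n - i < min_len:
            continue
        right = total - left
        d = h_all - (i / n) * entropy(left, i) - ((n - i) / n) * entropy(right, n - i)
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, threshold=0.01, min_len=50, offset=0):
    """Recursively divide seq into compositionally uniform regions.

    ASSUMPTION: a fixed divergence cutoff decides whether a split is
    kept; the article instead uses a statistical test.
    Returns a list of (start, end) half-open intervals.
    """
    pos, d = best_split(seq, min_len)
    if pos is None or d < threshold:
        return [(offset, offset + len(seq))]
    return (segment(seq[:pos], threshold, min_len, offset)
            + segment(seq[pos:], threshold, min_len, offset + pos))


if __name__ == "__main__":
    # Artificial chimera: an AT-only region joined to a GC-only region.
    chimera = "AT" * 200 + "GC" * 200
    print(segment(chimera))  # [(0, 400), (400, 800)]
```

          In this sketch a segment whose composition differs from its neighbours is isolated because it spans many positions at once, which mirrors the abstract's point that large regions are scored jointly rather than gene by gene. A real analysis would need a significance test for each split and a richer sequence model than single-base frequencies.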


                Author and article information

                Journal
                Nucleic Acids Res
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                September 2009
                9 July 2009
                Volume: 37
                Issue: 16
                Pages: 5255-5266
                Affiliations
                1 Department of Computer Science, University of California San Diego, 9500 Gilman Drive, Mail Code 0404, La Jolla, CA 92093; 2 Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260; 3 Keck Graduate Institute of Applied Life Sciences, 535 Watson Drive, Claremont; and 4 School of Mathematical Sciences, Claremont Graduate University, 711 North College Avenue, Claremont, CA 91711, USA
                Author notes
                *To whom correspondence should be addressed. Tel: +1 412 624 4204; Fax: +1 412 624 4759; Email: jlawrenc@pitt.edu
                Article
                gkp576
                10.1093/nar/gkp576
                2760805
                19589805
                d9b9101f-ffc5-42eb-b793-c0849fa13c65
                © 2009 The Author(s)

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 30 April 2009
                : 19 June 2009
                : 22 June 2009
                Categories
                Computational Biology

                Genetics
