The Global Genome Initiative (GGI) Knowledge Portal
The GGI Knowledge Portal enables scientists to identify where biodiversity genomic knowledge exists for each family of living organisms and to evaluate the relative amounts of knowledge gathered for each family. By highlighting the presence and absence of knowledge across the tree of life, the portal seeks to focus new knowledge-building efforts where they can produce the greatest impact.
Currently, the knowledge sources indexed by the GGI Knowledge Portal are Biodiversity Heritage Library (BHL) pages, Barcode of Life (BOLD) records, Encyclopedia of Life (EOL) rich pages, Global Biodiversity Information Facility (GBIF) records, Global Genome Biodiversity Network (GGBN) records, and GenBank (NCBI) sequences. The methods used to assemble the content displayed in the GGI Knowledge Portal will be described in an upcoming paper.
Future versions of the Portal will expand to contain data on all genera of living organisms, insofar as lists of genera are known, and add selected improvements to portal functionality.
The GGI Knowledge Portal is based on the infrastructure of the Encyclopedia of Life, and all data presented in the portal is sourced from EOL’s TraitBank repository.
The GGI Knowledge Portal Taxonomy is based on the list of Families of Living Organisms (FALO) compiled by Ruggiero (2014) as an extension of a seven-kingdom classification of life in Ruggiero et al. (in prep 2014). The latter work is the result of an expert panel representing the major taxonomic disciplines convened to review, revise, and update the existing incomplete Catalogue of Life (CoL) hierarchy down to order.
The FALO classification is based on a consensus view among the authors, accommodating taxonomic choices and practical compromises among diverse opinions, usages and often conflicting evidence of the boundaries between ranks and some major taxa, including kingdoms.
FALO is unique because it aims to be comprehensive, with all known species of life on earth finding a home within its classification. Obviously, FALO is just “a” classification, certainly not “THE” classification of life. Because it heuristically combines strictly phylogenetic and relatively classical taxonomies, no doubt some of the implied relationships will require revision.
Ruggiero, Michael A. (2014). Families of All Living Organisms, Version 2.0.a.15, (4/26/14). Expert Solutions International, LLC, Reston, VA. 420 pp.
Ruggiero, M., Gordon, D., Bailly, N., Bourgoin, T., Brusca, R., Cavalier-Smith, T., Guiry, M., Kirk, P., and Orrell, T. (In prep 2014). Seven Kingdoms: A Practical Higher-level Classification of All Living Organisms.
Search functions as an auto-complete. As the user types the beginning of a family name, a list of matching names from the FALO taxonomy tree is presented. Choosing a name selects the corresponding node in the taxonomy tree.
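The prefix matching behind an auto-complete of this kind can be sketched as follows. This is a minimal illustration, not the portal's implementation: the function names `build_index` and `suggest`, the case-insensitive matching, the result limit, and the short sample list of family names are all assumptions made for the example.

```python
from bisect import bisect_left

def build_index(names):
    # Hypothetical helper: pair each name with its lowercase form and sort,
    # so that all names sharing a prefix sit next to each other.
    return sorted((name.lower(), name) for name in names)

def suggest(prefix, index, limit=10):
    # Return up to `limit` names that start with `prefix` (case-insensitive).
    key = prefix.lower()
    if not key:
        return []
    # Binary search for the first entry whose lowercase name is >= the prefix.
    start = bisect_left(index, (key,))
    matches = []
    for lowered, original in index[start:]:
        if not lowered.startswith(key):
            break  # sorted order: no later entry can match
        matches.append(original)
        if len(matches) == limit:
            break
    return matches

# Illustrative sample only; the portal draws names from the FALO taxonomy tree.
sample_index = build_index(["Felidae", "Fagaceae", "Fabaceae", "Canidae", "Formicidae"])
print(suggest("fa", sample_index))
```

Sorting once and using a binary search keeps each lookup fast even for a list of many thousands of family names.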
Taxonomy tree contains the FALO classification from superkingdom down to family. The tree is expandable, collapsible, and scrollable; it lists nodes alphabetically by default and shows each node's GGI score in parentheses. Tree nodes are selectable using "+". When a node is selected, its reference, GGI data, score data, and taxonomic information are displayed. Each node has a unique URL.
Phylogeny tree is not currently available on the portal. It will be sourced from the Open Tree of Life project.
Taxon Information. Taxa below superkingdom level are associated with a classification hierarchy. Every taxon is associated with a scientific name and, if available, a preferred common name drawn from and linked to EOL. The taxon reference is the citation given for that taxon in the FALO classification. The photo shown is the exemplar or best image from EOL, hyperlinked to the appropriate EOL page. The total number of families is an integer calculated from the FALO classification and is shown only for taxa above the rank of family. A full list of FALO citations may be downloaded.
GGI Score is calculated on a scale from 0-100 as a quick guide to the relative amount of knowledge available for the taxon. The score is an average of all percentile scores (see below) for that taxon. GGI scores for higher taxa are an average of the percentile scores of taxa immediately below the higher taxon.
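The averaging described above can be illustrated with a small sketch. It takes one reading of the text as an assumption: a family's GGI score is the mean of its percentile scores, and a higher taxon's score is the mean of the GGI scores of its immediate children. The nested-dictionary layout, the function names, the taxon names, and the numbers are hypothetical and are not taken from the portal's data.

```python
from statistics import mean

def family_score(percentiles):
    # Family GGI score: average of the per-source percentile scores (0-100).
    return mean(percentiles.values())

def taxon_score(node):
    # Families carry percentile scores directly; higher taxa average the
    # scores of the taxa immediately below them (assumed interpretation).
    if "percentiles" in node:
        return family_score(node["percentiles"])
    return mean(taxon_score(child) for child in node["children"])

# Hypothetical order containing two families, with made-up percentile scores.
example_order = {
    "name": "Exampleales",
    "children": [
        {"name": "Exampleaceae",
         "percentiles": {"BHL": 80, "BOLD": 60, "EOL": 100,
                         "GBIF": 40, "GGBN": 0, "NCBI": 20}},
        {"name": "Sampleaceae",
         "percentiles": {"BHL": 90, "BOLD": 90, "EOL": 90,
                         "GBIF": 90, "GGBN": 90, "NCBI": 90}},
    ],
}
print(taxon_score(example_order))
```

Under this reading each child counts equally toward its parent, so a family in a small order has more influence on that order's score than a family in a large one.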
Source, count, percentile score. Only taxa of family rank have source, count, and percentile score data. Source is the acronym of the core source of knowledge and links to the core source website. Count is the number of records retrieved for the family from that core database; there are six core databases in all. Percentile score is the percentile rank of that family compared to all other families from that data source (using the type 8 method recommended in Hyndman and Fan, 1996), with counts of 0 receiving a score of 0, resulting in a scale of 0-100 for each data source.
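As an illustration of the percentile score, the sketch below applies the type 8 plotting position of Hyndman and Fan (1996), p = (k - 1/3) / (n + 1/3), where k is a family's rank among the n families for one data source. Several details are assumptions rather than the portal's documented method: tied counts share their average rank, families with a count of 0 are included in n, and the function name and sample counts are invented for the example.

```python
def type8_percentile_scores(counts):
    # Return a 0-100 percentile score for each count from one data source.
    n = len(counts)
    order = sorted(range(n), key=lambda i: counts[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied counts starting at sorted position i.
        j = i
        while j + 1 < n and counts[order[j + 1]] == counts[order[i]]:
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # 1-based ranks, ties averaged
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Type 8 plotting position, scaled to 0-100; counts of 0 score 0.
    return [0.0 if count == 0 else 100 * (rank - 1 / 3) / (n + 1 / 3)
            for count, rank in zip(counts, ranks)]

# Made-up record counts for five families from a single source.
print(type8_percentile_scores([0, 5, 10, 10, 200]))
```

With this formula the highest-ranked family scores slightly below 100, since (n - 1/3) / (n + 1/3) is less than 1; the portal may scale or round its values differently.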
Color coding (Legend) is applied to all scores, with scores grouped into good (green, 81-100), average (blue, 51-80) and poor (red, 0-50). Filled, half filled, and empty circles are also used as visual cues in the taxonomy tree to represent these three categories.
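The legend can be expressed as a simple lookup, sketched below. The function name is hypothetical, and two points are assumptions: fractional scores that fall between the published bands (for example 80.4) are assigned by the thresholds "above 80" and "above 50", and the circle styles map to the categories in the order listed (filled for good, half filled for average, empty for poor).

```python
def score_category(score):
    # Map a 0-100 score to (category, color, circle style) per the legend.
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score > 80:
        return ("good", "green", "filled")
    if score > 50:
        return ("average", "blue", "half filled")
    return ("poor", "red", "empty")

for example_score in (95, 64, 12):
    print(example_score, score_category(example_score))
```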
Updates: The GGI Portal is updated monthly. The date of the last update is the timestamp of the data file on the Download page.
Download: Data and references can be downloaded as an Excel file from the Download page.