OBJECTIVES


The long-term goal of this project is to create and maintain a FishBase-like information system for all non-fish aquatic organisms, ca. 400,000 spp. Of these, marine organisms (about 240,000 spp.) are the target of the current project phase. The system will not provide yet another authority list of species; rather, for each species included, it will make available the biological and ecological information necessary to conduct biodiversity and ecosystem studies, taking advantage of species lists already available on paper and electronically, and using scientific names as the ‘hook’ to organize biodiversity information.

Since the number of species is huge, SeaLifeBase has prioritized its encoding work, with short-term goals set on an annual basis. Working on one or two island ecosystems at a time, the project moves closer to its goal of assigning species to large marine ecosystems (66 ecosystems worldwide).

Our short-term objectives for 2015 are:

  1. Improve information for all important species (i.e., commercially important/exploited, threatened, invasive and charismatic species) prioritized by the Sea Around Us;
  2. Provide/increase the number of pictures for the species in objective 1;
  3. Fill in data gaps for distribution, ecology, and sizes for large taxonomic groups. Current statistics show that SeaLifeBase has data for 58% of 7 large taxonomic groups with about 70,000 species, and 59% of 2 very large taxonomic groups of about 100,000 species;
  4. Provide life history data at least for the species in objective 1;
  5. Provide well-researched marine biodiversity lists for island ecosystems, e.g., those prioritized by the Global Oceans Legacy Project of the Pew Charitable Trusts.

TYPES OF INFORMATION ENCODED


Mandatory:
  1. Currently accepted scientific names, and synonyms in the sources used;
  2. Distribution by country (i.e., within its EEZ);
  3. Distribution by FAO area;
  4. Published references used (both hard copy and online).
Actively searched for:
  1. Common names in English and other languages;
  2. Distribution by provinces/state for large countries;
  3. Distribution by ecosystems;
  4. Distribution by depth;
  5. Abundance;
  6. Maximum length and weight (unit of measure depends on the taxonomic group);
  7. Trophic ecology: food items, diet, trophic level;
  8. Habitats;
  9. Introduction and invasion;
  10. Growth parameters, length-weight relationships;
  11. Reproduction: age at first maturity, fecundity;
  12. IUCN and CITES status;
  13. Drawings and/or photos (some deep-linked).

LINKS TO OTHER DATABASES


SeaLifeBase has established links with relevant data providers (as a policy, the websites linked are preferably those that mention their sources).

The SeaLifeBase website was made public in November 2008. It was structured using the FishBase web interface. The search page accepts common names as well as scientific names as keywords leading to the species summary page. The species summary page provides basic information on the species as well as pictures, maps and links to other databases.

Maps are now provided using AquaMaps and data from the Sea Around Us.

TECHNOLOGY AND METHOD


LAMP: Linux, Apache, MySQL, PHP (respectively the operating system, web server, DBMS, and programming language).

The FishBase database and website structures were used as a shell. Graphical charts were progressively modified and fields were adapted in tables where some aspects of the taxonomic groups required changes.

The FishBase IT Team is consulted on proposed and new developments in SeaLifeBase. Changes in SeaLifeBase approved by the FishBase IT Team are adopted in FishBase, and all changes in FishBase are concurrently adopted in SeaLifeBase.

A classification to the class level, and when available to the order level, is the taxonomic backbone. It is primarily based on the Catalogue of Life/Species 2000 higher hierarchy, and then follows the Tree of Life for groups not yet covered by that hierarchy. A classification to the order level (when no stable phylogeny exists) follows primarily the Catalogue of Life/Species 2000 higher hierarchy, then ITIS, then dedicated published classifications for groups not yet in these databases.

Subspecies are not treated separately, but are mentioned in a comment field.

Searches for information are conducted in databases with relevant content (Zoological Record, ASFA, CISTI, FishLit, etc.), and the priorities for data encoding are informed by the short- and long-term objectives of SeaLifeBase.

Web searches follow a guideline strategy, starting from well-known biodiversity portals (CBD, GBIF, Diversitas, UNEP, WWF, and some dedicated websites of universities, museums, and research institutions).

Development of a web crawler to check for updates on important deep-linked websites.
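The update check such a crawler performs can be sketched as follows. This is only an illustration, not the project's crawler: it assumes that a page counts as updated when the hash of its content changes, and the names content_fingerprint, find_updated_pages and fetch are hypothetical.

```python
import hashlib
from typing import Callable, Dict, List


def content_fingerprint(page_bytes: bytes) -> str:
    """Return a SHA-256 hex digest of the page content."""
    return hashlib.sha256(page_bytes).hexdigest()


def find_updated_pages(
    urls: List[str],
    known_fingerprints: Dict[str, str],
    fetch: Callable[[str], bytes],
) -> List[str]:
    """Return the URLs whose content differs from the stored fingerprint.

    `known_fingerprints` is updated in place, so the next run compares
    against the content seen in this run. A URL seen for the first time
    is reported as updated.
    """
    updated = []
    for url in urls:
        fingerprint = content_fingerprint(fetch(url))
        if known_fingerprints.get(url) != fingerprint:
            updated.append(url)
        known_fingerprints[url] = fingerprint
    return updated
```

The fetch argument is left to the caller (for example, a small wrapper around urllib.request.urlopen), which keeps the comparison logic testable without network access. Hashing the raw content also flags purely cosmetic page changes, so a real crawler would probably normalize the page first.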

STRATEGY OF ENCODING


Scientific Names

All information at species level is hooked to scientific names. It is crucial to obtain the list of species early in the project, if possible from electronic lists.

It is also crucial to identify taxonomic references to validate the choice of the currently accepted names, and to link the names to these references when available on the web.

The lists are extracted in the following decreasing order of priority:

  • Catalogue of Life/Species 2000;
  • World Register of Marine Species (WoRMS);
  • The Sea Around Us database on marine organisms other than finfishes (turtles, sea snakes, birds, marine mammals, sea cucumbers);
  • ITIS;
  • UNESCO register of marine organisms;
  • UNEP-WCMC, CITES, IUCN;
  • OBIS/CoML initiatives;
  • ERMS (and Fauna Europaea for freshwater species), IABIN and other starting regional initiatives (e.g., in South-East Asia, North-East Pacific);
  • Country lists (e.g., Italy, Spain, New Zealand, Australia, Costa Rica, Brazil, Philippines);
  • Marine station inventories;
  • Printed compilations: FAO ASFIS database, FAO species catalogues, FAO regional guides (formerly identification sheets) and FAO country guides; simple lists and check-lists; monographs (e.g., ETI CD-ROMs);
  • Compilations on the web (Taxonomicon, Wikipedia initiative, various personal sites).

Unless synonyms are available in electronic format, they are not entered as a priority; the only synonyms encoded are those that are used in sources of other information.

The source of each name is always recorded, together with the type of source, to allow the user to assess the reliability of the name.
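The extraction order and the recording of sources described above can be sketched as a simple merge in which the first (highest-priority) list to contain a name wins. This is a minimal illustration under stated assumptions: NameRecord, merge_name_lists and the source-type label used in the example are hypothetical and do not reflect the actual SeaLifeBase tables.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class NameRecord(NamedTuple):
    scientific_name: str
    source: str
    source_type: str


def merge_name_lists(
    sources_by_priority: Iterable[Tuple[str, str, List[str]]],
) -> Dict[str, NameRecord]:
    """Merge species lists; the first source to list a name wins.

    Each element is (source name, source type, list of scientific names),
    given in decreasing order of priority.
    """
    merged: Dict[str, NameRecord] = {}
    for source, source_type, names in sources_by_priority:
        for name in names:
            key = " ".join(name.split())  # collapse stray whitespace
            if key and key not in merged:
                merged[key] = NameRecord(key, source, source_type)
    return merged


# Illustrative input: two sources in decreasing priority.
lists = [
    ("Catalogue of Life/Species 2000", "electronic database", ["Chelonia mydas"]),
    ("WoRMS", "electronic database", ["Chelonia  mydas", "Holothuria atra"]),
]
merged = merge_name_lists(lists)
```

Here Chelonia mydas keeps the Catalogue of Life/Species 2000 as its recorded source, while Holothuria atra, found only in the second list, is attributed to WoRMS.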

Common Names

The common names in English and other languages are entered only when available in electronic format or from compilations. However, some groups whose common names are well known, e.g., marine mammals, are prioritized.

Some common names were already entered in Species 2000 by a FishBase team member.

The lists are extracted in the following decreasing order of priority:

  1. Species 2000;
  2. The Sea Around Us database on marine organisms other than finfishes (turtles, sea snakes, birds, marine mammals, and sea cucumbers);
  3. ITIS;
  4. OBIS/CoML initiatives;
  5. IABIN and other regional initiatives (e.g., South-East Asia, North-East Pacific);
  6. Country lists (e.g., Italy, Spain, New Zealand, Australia, Costa Rica, Brazil, Philippines);
  7. Printed compilations: FAO species catalogues, FAO regional guides (formerly identification sheets), FAO country guides, simple lists and check-lists and monographs.

Distributions

In addition to country, the state/provincial levels are considered. A geographic standard was established for the marine areas on the same model as the TDWG geographic standard for the terrestrial areas.

Distribution by country and subdivisions: from printed compilations and monographs (FAO publications first), from published distribution maps, from country lists, in that order.

Distribution by FAO areas: from FAO publications, from published distribution maps, from distribution by country above, in that order.

Distribution by ecosystem: from distribution by FAO area and country above and from printed compilations. Distribution by Large Marine Ecosystems will be by oceans and then by principal seas.

Distribution by depth: from printed compilations and monographs (FAO publications first).
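The "in that order" rules above amount to a fallback chain: each source is consulted in turn and the first one with data is used. The sketch below illustrates this for distribution by FAO area, with derivation from the country distribution as the last resort. All data, the country-to-area table and the function names are hypothetical placeholders, not SeaLifeBase content.

```python
from typing import Callable, Optional, Sequence, Set, Tuple

Lookup = Callable[[str], Optional[Set[str]]]


def first_available(
    species: str, lookups: Sequence[Tuple[str, Lookup]]
) -> Tuple[Optional[str], Set[str]]:
    """Try each lookup in order; return (source label, areas) from the first with data."""
    for label, lookup in lookups:
        areas = lookup(species)
        if areas:
            return label, areas
    return None, set()


# Illustrative data only; none of it comes from SeaLifeBase.
fao_publications = {"Species A": {"37"}}
distribution_maps = {"Species B": {"27", "37"}}
countries = {"Species C": {"Italy"}}
country_to_fao = {"Italy": {"37"}}  # hypothetical lookup table


def from_countries(species: str) -> Optional[Set[str]]:
    """Derive FAO areas from the species' country distribution, if any."""
    areas: Set[str] = set()
    for country in countries.get(species, ()):
        areas |= country_to_fao.get(country, set())
    return areas or None


chain = [
    ("FAO publications", fao_publications.get),
    ("published distribution maps", distribution_maps.get),
    ("derived from country distribution", from_countries),
]
```

Returning the label of the source that supplied the data, alongside the data itself, keeps the origin of each distribution record traceable.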

Maximal Size

These data are crucial for biodiversity and ecosystem studies, but are rarely available from electronic sources.

This information is extracted on an opportunistic basis, mainly from FAO publications and printed monographs. Targeted species searches were performed for important species, e.g., threatened, invasive and commercially important species.

Conservation Status

The IUCN and CITES databases are explored and linked for the threatened and commercial status of species, respectively.

Habitats

This information is rarely available from electronic sources. Moreover, various standards are used, and may depend on the taxonomic group. The FishBase standard is used after it is reassessed and completed for invertebrates.

Targeted and Prioritized Taxonomic Groups

Taxonomic groups at the phylum to class levels are classified as small, medium or large.

This encoding strategy has changed since the completion of all the small and many of the medium groups in 2007. Each encoder’s weekly programme now consists of encoding data for 50% of the remaining groups and encoding life history parameters for targeted species groups. In addition, the 5 remaining encoders are each in charge of special topics, viz.: faunal lists, life-history parameters, ecological parameters, pictures and targeted reference-searches.

Note that the rapid completion and availability of results/data on small and medium groups were psychologically important to the encoders and the donor alike, as they measure and assess the achievements of the project. Moving forward from encoding scientific names to encoding, e.g., life-history parameters, gave the encoders a sense of accomplishment in spite of the huge tasks still ahead. Our achievable short-term targets and their completion provide us with milestones against which we measure our accomplishments. This strategy has so far proven useful.

Data Encoding Progress Indicators

Some indicators of data encoding progress were established at the beginning of the project, reflecting the completion of data encoding by taxonomic group and the advancement of the project relative to the expected number of species.

It is important to consider these indicators at various taxonomic levels, from phylum to species: we can show rapid completion at the phylum to family levels, whereas completion at the genus and species levels is a long-term goal.
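An indicator of this kind can be sketched as the percentage of expected taxa already encoded at each taxonomic level. The function name and the counts in the example are hypothetical; only the figure of about 240,000 expected marine species comes from the objectives above.

```python
from typing import Dict, Tuple


def completion_by_level(
    counts: Dict[str, Tuple[int, int]],
) -> Dict[str, float]:
    """Map each taxonomic level to the percentage of expected taxa encoded.

    `counts` maps a level name to (encoded, expected). Levels with no
    expected taxa are skipped to avoid dividing by zero.
    """
    return {
        level: round(100.0 * encoded / expected, 1)
        for level, (encoded, expected) in counts.items()
        if expected > 0
    }


# Made-up counts, for illustration only.
progress = completion_by_level(
    {
        "phylum": (30, 30),
        "family": (900, 1000),
        "species": (60000, 240000),
    }
)
```

With these illustrative counts the indicator is 100.0 at phylum level, 90.0 at family level and 25.0 at species level, which mirrors the pattern described above: rapid completion at higher levels, slower progress at species level.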