This is a pure Python
CGI-based implementation of a taxonomically intelligent species
search engine. It searches biological databases for a taxonomic
name. The search is done "on the fly" using web services
(SOAP/XML) or URL APIs.
Synonyms and higher taxa for a taxon name are retrieved using
the Catalogue of Life.
Display of snippets from Wikipedia
articles makes use of a Dapper
transform. A link to the original article is also displayed.
Keyword extraction uses the Term Extraction service from FiveFilters.org for extracting terms from the contents of Wikipedia articles.
Queries to NCBI are performed using the Entrez
Programming Utilities. The ESearch
tool is used to look up a taxon name and, if the name is found,
a second Entrez tool is called to get basic statistics on what NCBI holds for
that taxon. Links to external information resources for the taxon
are retrieved using the ELink tool.
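As a rough illustration of the ESearch step, the sketch below builds an E-utilities query URL for the taxonomy database and pulls the hit count and record IDs out of an ESearch XML reply. It is written for Python 3 and is not taken from the e-Species source; the function names and the hand-written sample reply are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch tool.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(name, db="taxonomy"):
    """Return an ESearch URL that looks up `name` in an Entrez database."""
    return ESEARCH_URL + "?" + urlencode({"db": db, "term": name})


def parse_esearch(xml_text):
    """Return (count, ids) extracted from an ESearch XML reply."""
    root = ET.fromstring(xml_text)
    count = int(root.findtext("Count", default="0"))
    ids = [node.text for node in root.findall("./IdList/Id")]
    return count, ids


# Hand-written sample reply, trimmed to the elements read above.
SAMPLE_REPLY = """<eSearchResult>
  <Count>1</Count>
  <IdList><Id>9606</Id></IdList>
</eSearchResult>"""

if __name__ == "__main__":
    print(build_esearch_url("Homo sapiens"))
    print(parse_esearch(SAMPLE_REPLY))
```

In a live query the URL would be fetched (for example with `urllib.request.urlopen`) and the response body passed to `parse_esearch`; a count of zero means the name was not found.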
Distribution maps for a taxon are retrieved from GBIF using code inspired by the Species
Distribution Widget written by Tim Robertson and Dave Martin.
Google Images is used
to find up to five images for the query term.
Documents are retrieved from Google Scholar. The script
extracts references by screen scraping, since Google has not
released any API for Google Scholar.
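Screen scraping of this kind can be done with the standard library's HTML parser. The sketch below (Python 3) collects the title and link of every anchor nested inside an `<h3>` element. That markup pattern, the class and function names, and the sample HTML are assumptions made for illustration; they are not Google Scholar's actual page structure or the e-Species code.

```python
from html.parser import HTMLParser


class TitleLinkParser(HTMLParser):
    """Collect (title, url) pairs for <a> tags nested inside <h3> tags."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._in_h3 = False
        self._href = None   # href of the anchor currently being read
        self._text = []     # text fragments of that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_h3 = True
        elif tag == "a" and self._in_h3:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.results.append(("".join(self._text).strip(), self._href))
            self._href = None
        elif tag == "h3":
            self._in_h3 = False


def extract_references(html):
    """Return the (title, url) pairs found in an HTML string."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up result page; only the first two links sit inside <h3>.
SAMPLE_HTML = """
<div><h3><a href="http://example.org/paper1">A revision of <i>Aus bus</i></a></h3></div>
<div><h3><a href="http://example.org/paper2">Notes on Aus</a></h3></div>
<p><a href="http://example.org/ignored">Not a reference</a></p>
"""

if __name__ == "__main__":
    for title, url in extract_references(SAMPLE_HTML):
        print(title, url)
```

A scraper like this breaks whenever the page layout changes, which is the usual cost of working without an API.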
Rod Page wrote the original iSpecies
taxonomically based search engine, which also uses web services. David Shorthouse has
written an iSpecies
Clone, which uses JSON.
The e-Species search engine has been developed on an IBM-PC
compatible machine running Linux Ubuntu 8.04 Hardy Heron and
The e-Species source code is released under the terms of the GNU General Public
License, and is available from SourceForge.
- Version 1.00, 29th Jun 08: Initial public release
- Version 1.01, 6th Jul 08: Added the Yahoo!
Spelling Suggestion service to propose a
spelling correction for a given name.
- Version 1.02, 10th Jul 08: Improved handling of synonym
status and fixed a bug in spelling suggestion.
- Version 1.03, 11th Jul 08: Added a method to class
COLSearch to check for the existence of a taxon name.
- Version 1.04, 31st Jul 08: Added automated tagging from Yahoo!
Term Extraction for the Wikipedia snippet.
- Version 1.05, 1st Aug 08: Added a method to class
NCBISearch to return a list of external information
resources for the search name.
- Version 1.06, 11th Aug 08: Added a function to strip out
markup tags from the Wikipedia snippet.
- Version 1.07, 5th Sep 08: Fixed a bug in handling
Unicode characters in the author of a taxon name returned
- Version 1.08, 9th Sep 08: Renamed class
YahooSearchImage to YahooSearch and added functions
spellingSuggestion (renamed to spellCheck) and
termExtraction (renamed to termExtract) as new methods.
- Version 1.09, 21st Oct 08: Removed the dependency on the Set
module, using tuple instead, and fixed a problem with the
display of image thumbnails from Yahoo search.
- Version 1.10, 19th Mar 09: Rewrote class
GoogleScholarSearch to remove the dependency on the BeautifulSoup
module, using an HTMLParser instead, and included a
default value for class YahooSearch number of results
- Version 1.11, 25th Mar 09: Improved handling of returned
references from Google Scholar and rewrote class
WikipediaSearch to make use of Dapper
client-side search form validation
- Version 1.13, 21st Jul 09: Added stylesheet for better
form display and minor fixes
- Version 1.14, 14th Apr 10: Adjusted for changes in CoL
- Version 1.15, 29th Jul 11: Removed the Yahoo search class
because it made useless calls to deprecated Yahoo web services
and replaced GoogleScholarSearch with a new
GoogleSearch class to search from both Google Scholar and
- Version 1.16, 2nd Aug 11: Improved Google Images search
- Version 1.17, 14th Jul 12: Fixed a bug in the query string
variable in class CoLSearch and updated URL to the latest
Annual Checklist version website
- Version 1.18, 20th Sep 13: Added new routine for retrieving
images from Google and other minor improvements
- Version 1.19, 16th Aug 14: Substituted a FiveFilters proxy
web service for deprecated calls to Yahoo! Search
- Make use of synonyms in the searches, merging the results
from searches using different names, and present those
together (as suggested by Rod Page in the iPhylo
- Allow searches using common names
- Include images from Flickr
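The first planned item, merging results obtained under different names, could take a shape like the following Python 3 sketch. It assumes each search returns an iterable of hashable hits; `merge_search_results`, `fake_search` and the reference labels are hypothetical names used only for illustration, not part of e-Species.

```python
def merge_search_results(search, names):
    """Run `search` for each name and merge the hits, dropping duplicates.

    `search` is any callable that takes a name and returns an iterable of
    hashable hits. The first occurrence of each hit is kept, in order.
    """
    merged = []
    seen = set()
    for name in names:
        for hit in search(name):
            if hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged


# Stand-in for a real search backend: a fixed table of name -> hits.
FAKE_INDEX = {
    "Panthera onca": ["ref-2", "ref-3"],
    "Felis onca": ["ref-1", "ref-2"],
}


def fake_search(name):
    """Return the canned hits for `name`, or an empty list if unknown."""
    return FAKE_INDEX.get(name, [])


if __name__ == "__main__":
    names = ["Panthera onca", "Felis onca"]  # accepted name, then a synonym
    print(merge_search_results(fake_search, names))
```

Listing the accepted name first keeps its hits at the top of the merged list, with hits found only under synonyms appended after them.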
Thanks to Rod Page for his implementation tips on the iSpecies Blog and
overall inspiration, to Edinaldo Nelson dos Santos-Silva and Projeto Biotupé for continuing support, to Eduardo
Dalcin for crash testing and pointing out several flaws, to Flavio Coelho and other
members of the PyScience-Brasil
discussion list for support and constructive comments, and to Douglas Soares de
Andrade for providing patches and creating an svn trunk for
the e-Species source code.
Send comments and suggestions to Mauro J. Cavalcanti