APDD: the Archaeal Pathogen Detection Database
Welcome to the APDD.

This database is designed to facilitate the detection of contaminant sequences in the human branch of dbEST that may be derived from Archaea. In this way, we hope to detect the presence of novel organisms in human tissue that may be pathogens or symbionts.

We describe the database as "a dataset containing the top BLAST hit to the non-redundant nucleotide database using query sequences from the human EST database that are similar to archaeal sequences". Many of the ESTs in the database are derived from genuine human genes. However, the filtering strategy is deliberately non-aggressive, which increases the odds of detecting foreign transcripts in the human branch of dbEST.

Brief summary of database construction:

  • BLAT search human EST (query) v. all archaeal nt sequences in GenBank
  • Parse BLAT output to obtain ESTs with similarity to archaeal sequence
  • BLAST (or MEGABLAST) those ESTs v. nr-nt database
  • Extract top BLAST hit, use to obtain taxonomic ID via Entrez query, append and import to MySQL
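
The parsing steps above can be sketched roughly as follows. This is a minimal illustration, not the code used to build APDD: it assumes the BLAT results are in the standard 21-column PSL format and the BLAST results are in 12-column tabular format, and the function names and the `min_matches` cutoff are hypothetical. The Entrez taxonomy lookup and the MySQL import are left out.

```python
# Column positions in the standard 21-column PSL format written by BLAT.
PSL_MATCHES, PSL_QNAME = 0, 9


def ests_with_archaeal_similarity(psl_text, min_matches=50):
    """Return the names of query ESTs with at least min_matches matching bases.

    min_matches is a hypothetical cutoff, not the value used for APDD.
    """
    hits = set()
    for line in psl_text.splitlines():
        fields = line.split("\t")
        # Skip blank lines and the optional PSL header (first field not numeric).
        if len(fields) < 21 or not fields[PSL_MATCHES].isdigit():
            continue
        if int(fields[PSL_MATCHES]) >= min_matches:
            hits.add(fields[PSL_QNAME])
    return hits


def top_blast_hits(tabular_text):
    """Map each query to its highest-bitscore hit in 12-column tabular BLAST output.

    Assumed column order: query, subject, % identity, alignment length,
    mismatches, gap opens, q.start, q.end, s.start, s.end, e-value, bit score.
    """
    best = {}
    for line in tabular_text.splitlines():
        fields = line.split("\t")
        # Skip comment lines and anything that is not a full 12-column record.
        if line.startswith("#") or len(fields) < 12:
            continue
        query, subject = fields[0], fields[1]
        hsp_length, bitscore = int(fields[3]), float(fields[11])
        if query not in best or bitscore > best[query]["bitscore"]:
            best[query] = {
                "subject": subject,
                "hsp_length": hsp_length,
                "bitscore": bitscore,
            }
    return best
```

A low `min_matches` value corresponds to the non-aggressive filtering described above: more ESTs pass through to the BLAST step, at the cost of more genuine human sequences in the result.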

Published in: Bioinformatics 20(15): 2361-2362.

Human EST with top BLAST hit to:

  • Archaea: 36
  • Bacteria: 632
  • Eukarya (non-human): 8146
  • Other (virus, vector etc.): 5226
 