FAQ


Data and web-server version

  • Code version v0.4.1-9-g50c10d54fd
  • Data version 2020_11_19 (code v0.1.7-003704ff, DBs 2020_11_19)
  • Code https://github.com/VGalata/plsdb
  • Tested browsers 68.0.3440.84, 61.0.1, 11.1.2

Download

All relevant data files can be downloaded here. The provided archive also includes a README with information on the other included files and instructions on how to use the Mash sketches and BLAST database locally.

Publication

For more information about the resource, please read our publication.

Valentina Galata, Tobias Fehlmann, Christina Backes, Andreas Keller; PLSDB: a resource of complete bacterial plasmids, Nucleic Acids Res., 2018 Oct 31, doi: 10.1093/nar/gky1050

Please note that some pipeline steps may have changed to resolve issues arising during data updates. Modifications can include updated versions of the tools and data used for plasmid annotation, and minor changes in the processing steps. For more information, see the code repository referenced above, which also contains a file listing the most important changes in the pipeline. Additionally, a file listing the plasmid records removed, added, and changed relative to the previous version is included in the download.

Data

Overview of the main processing steps

  • Plasmids: Plasmid records were searched for in the NCBI nucleotide database, separately for the sources INSDC and RefSeq. The collected records were deduplicated and filtered to remove potentially incomplete sequences and mislabelled chromosomal sequences.
  • Associated BioSample records were queried to retrieve the sample isolation location and source. The locations were mapped using OpenCage Geocoder.
  • Taxonomic lineages were collected for taxa associated with the obtained plasmid records from the NCBI taxonomy database.
  • FASTAs: FASTA files were downloaded from the NCBI nucleotide database for each plasmid record.
  • Annotations: Specific genes were searched for in the record FASTAs using ARG-ANNOT, CARD, ResFinder, and VFDB. Additionally, PlasmidFinder and pMLST were used to characterize the records.
  • Files for sequence search: Mash was used to create sketches from plasmid FASTAs, and BLAST was used to create a nucleotide BLAST database.
  • 2D embedding: Sketches created by Mash were used to estimate pairwise distances between plasmids. Then, UMAP was applied to compute an embedding of the plasmids in 2D.
The data processing pipeline makes use of the PubMLST website developed by Keith Jolley (Jolley & Maiden 2010, BMC Bioinformatics, 11:595) and sited at the University of Oxford; the development of that website was funded by the Wellcome Trust.
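The distance step of the 2D embedding can be illustrated with a small sketch. It assumes the pairwise distances were written by `mash dist` in its standard five-column, tab-separated format (reference ID, query ID, distance, p-value, shared hashes) and collects them into a symmetric matrix. The record names and values in the example are made up, and the function name `read_mash_dist` is hypothetical. The UMAP settings used for PLSDB are not stated here, so the closing comment shows only one possible way to continue.

```python
import io


def read_mash_dist(handle):
    """Collect `mash dist` output lines into (ids, symmetric distance matrix)."""
    index = {}   # sequence ID -> row/column position
    pairs = {}   # (reference ID, query ID) -> Mash distance
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        # Columns: reference ID, query ID, distance, p-value, shared hashes
        ref, query, dist, _pvalue, _shared = line.split("\t")
        for name in (ref, query):
            index.setdefault(name, len(index))
        pairs[(ref, query)] = float(dist)

    n = len(index)
    # Assumption: pairs absent from the input get 1.0, the largest Mash distance.
    matrix = [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]
    for (ref, query), dist in pairs.items():
        i, j = index[ref], index[query]
        matrix[i][j] = matrix[j][i] = dist
    ids = sorted(index, key=index.get)
    return ids, matrix


# Made-up example input with two hypothetical plasmid records.
example = io.StringIO(
    "pA.fna\tpA.fna\t0\t0\t1000/1000\n"
    "pA.fna\tpB.fna\t0.0222766\t0\t456/1000\n"
    "pB.fna\tpB.fna\t0\t0\t1000/1000\n"
)
ids, matrix = read_mash_dist(example)

# With the third-party packages numpy and umap-learn installed, the matrix
# could then be embedded in 2D, for example:
#   umap.UMAP(n_components=2, metric="precomputed").fit_transform(numpy.array(matrix))
```

Treating missing pairs as distance 1.0 is a choice made for this sketch; it matters only when the input was produced with a distance cutoff.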

Graphical representation of the workflow
[Figure: PLSDB workflow]

Included tools for sequence search

  • Mash (Mash paper, Mash screen pre-print, repository)
    Version used by the web-server: 2.1.1
    CMD: mash sketch -S 42 -k 21 -s 1000{individually} -o {query_msh_noext} {query_fa} && mash dist {plasmids_msh} {query_msh} -v {max_pvalue} -d {max_dist} > {output}
    CMD: mash screen {plasmids_msh} {query_fa} -v {max_pvalue} -i {min_ident}{winner_takes_all} > {output}
  • BLASTn (official website)
    Version used by the web-server:
    • blastn: 2.9.0+; Package: blast 2.9.0, build Mar 11 2019 15:20:05
    • tblastn: 2.9.0+; Package: blast 2.9.0, build Mar 11 2019 15:20:05
    CMD:
    • blastn -query {query_fa} -task blastn -db {plasmids_db} -out {output} -evalue 1 -perc_identity {min_ident} -qcov_hsp_perc {min_cov} -outfmt '6 qseqid sseqid qstart qend sstart send evalue bitscore pident qcovs qcovhsp'
    • tblastn -query {query_fa} -task tblastn -db {plasmids_db} -out {output} -evalue 1 -qcov_hsp_perc {min_cov} -outfmt '6 qseqid sseqid qstart qend sstart send evalue bitscore pident qcovs qcovhsp'
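As a rough illustration of how the command templates and the tabular BLAST output above could be handled in a local script, the following sketch fills the `mash screen` template with `str.format` and parses one line of BLAST output using the column names from the `-outfmt` string. The file names and the example hit are made up, the helper names `fill` and `parse_blast_line` are hypothetical, and the expansion of `{winner_takes_all}` to ` -w` (the winner-takes-all flag of `mash screen`) is an assumption about how the web server fills that placeholder.

```python
import shlex

# Template and column list copied from the commands listed above.
MASH_SCREEN = ("mash screen {plasmids_msh} {query_fa} -v {max_pvalue} "
               "-i {min_ident}{winner_takes_all} > {output}")
BLAST_COLUMNS = ("qseqid sseqid qstart qend sstart send "
                 "evalue bitscore pident qcovs qcovhsp").split()


def fill(template, flags=None, **values):
    """Fill a command template; values are shell-quoted, flags inserted as-is."""
    fields = {name: shlex.quote(str(value)) for name, value in values.items()}
    fields.update(flags or {})
    return template.format(**fields)


def parse_blast_line(line):
    """Turn one tab-separated BLAST hit into a dict keyed by column name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(BLAST_COLUMNS):
        raise ValueError("unexpected number of columns: %d" % len(values))
    hit = dict(zip(BLAST_COLUMNS, values))
    for name in ("qstart", "qend", "sstart", "send"):
        hit[name] = int(hit[name])
    for name in ("evalue", "bitscore", "pident", "qcovs", "qcovhsp"):
        hit[name] = float(hit[name])
    return hit


# Hypothetical file names; " -w" is an assumed value for the optional flag.
cmd = fill(
    MASH_SCREEN,
    flags={"winner_takes_all": " -w"},
    plasmids_msh="plsdb.msh",
    query_fa="my query.fasta",
    max_pvalue=0.1,
    min_ident=0.99,
    output="screen.tsv",
)

# Made-up hit line in the column order given by the -outfmt string.
hit = parse_blast_line(
    "query1\tNZ_XX000001.1\t1\t500\t1200\t1699\t0.0\t924\t100.000\t100\t100\n"
)
```

Quoting the values keeps paths with spaces intact when the command is passed to a shell; optional flags are kept separate so that an empty flag does not become an empty quoted argument.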

Used libraries

  • Bootstrap
  • Font Awesome icons
  • Highcharts
  • Krona
  • underscore.js
  • w2ui.js
  • jquery.auto-complete.js

Support

If you encounter any problems or have questions about the data, feel free to open an issue. When reporting an error, please specify the web server version used, the error message, and a description of your request (e.g. job ID and input data).


Disclaimer

This website is provided "as is" without any warranties.

Uploaded data
If you submit data for sequence search, we assume that you have the right to upload this data to this web-server. The uploaded sequences are stored only temporarily and are deleted upon job completion.