PLSDB

FAQ

Data and web-server version

Code version v0.4.1-415-g32ba67b900
Data version 2023_11_03_v2 (code v.2023_11_03_v2, DBs 2023_11_03_v2)
Code https://github.com/CCB-SB/plsdb
Tested browsers 92.0.4515.131, 91.02, 14.1.2

Download

All relevant data files can be downloaded here. The provided archive also includes a README that describes the other included files and explains how to use the Mash sketches and the BLAST database locally.

Publication

For more information about the resource, please read our updated publication and original publication.

Georges P Schmartz, Anna Hartung, Pascal Hirsch, Fabian Kern, Tobias Fehlmann, Rolf Müller, Andreas Keller; PLSDB: advancing a comprehensive database of bacterial plasmids, Nucleic Acids Res., 2021 Nov 25, doi: 10.1093/nar/gkab1111
Valentina Galata, Tobias Fehlmann, Christina Backes, Andreas Keller; PLSDB: a resource of complete bacterial plasmids, Nucleic Acids Res., 2018 Oct 31, doi: 10.1093/nar/gky1050

Please note that some pipeline steps may have changed to resolve issues that arose during data updates. Modifications can include updated versions of the tools and data used for plasmid annotation, and minor changes to the processing steps. For more information, see the code repository referenced above, which also contains a file listing the most important changes to the pipeline. Additionally, the download includes a file listing the plasmid records that were removed, added, or changed relative to the previous version.

Data

Overview of the main processing steps

Plasmids: Plasmid records were retrieved from the NCBI nucleotide database for each of the two sources, INSDC and RefSeq. The collected records were deduplicated and filtered to remove potentially incomplete records and chromosomal sequences potentially mislabelled as plasmids.
Associated BioSamples were queried to retrieve the isolation location and source of each sample. The locations were mapped using OpenCage Geocoder.
Taxonomic lineages were collected for taxa associated with the obtained plasmid records from the NCBI taxonomy database.
FASTAs: FASTA files were downloaded from the NCBI nucleotide database for each plasmid record.
Annotations: Specific genes were searched in the record FASTAs using ARG-ANNOT, CARD, ResFinder, and VFDB. Additionally, PlasmidFinder and pMLST were used to characterize the records.
Files for sequence search: Mash was used to create sketches from plasmid FASTAs, and BLAST was used to create a nucleotide BLAST database.
2D embedding: Sketches created by Mash were used to estimate pairwise distances between plasmids. Then, UMAP was applied to compute an embedding of the plasmids in 2D.
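The step from Mash output to an embedding input can be illustrated with a minimal Python sketch. It assumes the standard five-column, tab-separated output of `mash dist` (reference ID, query ID, distance, p-value, shared hashes) and builds a symmetric distance matrix from it. The function name `mash_dist_to_matrix` and the choice to fill unreported pairs with the maximum Mash distance of 1 are illustrative assumptions, not necessarily what the PLSDB pipeline does.

```python
from typing import Dict, Iterable, List, Tuple


def mash_dist_to_matrix(lines: Iterable[str]) -> Tuple[List[str], List[List[float]]]:
    """Build a symmetric distance matrix from `mash dist` tabular output.

    Hypothetical helper: each non-empty line is expected to hold five
    tab-separated fields (reference ID, query ID, distance, p-value,
    shared hashes). Only the first three fields are used here.
    """
    ids: List[str] = []
    seen = set()
    pairs: Dict[Tuple[str, str], float] = {}

    for line in lines:
        line = line.strip()
        if not line:
            continue
        ref, query, dist = line.split("\t")[:3]
        # Remember IDs in order of first appearance.
        for name in (ref, query):
            if name not in seen:
                seen.add(name)
                ids.append(name)
        pairs[(ref, query)] = float(dist)

    index = {name: i for i, name in enumerate(ids)}
    n = len(ids)

    # Assumption: pairs that were not reported (e.g. filtered out by a
    # distance threshold) are set to 1.0, the maximum Mash distance.
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 0.0
    for (ref, query), dist in pairs.items():
        i, j = index[ref], index[query]
        matrix[i][j] = dist
        matrix[j][i] = dist

    return ids, matrix
```

A matrix of this kind could then be converted to a NumPy array and passed to a UMAP implementation that accepts precomputed distances (for example, umap-learn with `metric="precomputed"` and `n_components=2`). The exact UMAP settings used for PLSDB are not stated here.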

The data processing pipeline makes use of the PubMLST website developed by Keith Jolley (Jolley & Maiden 2010, BMC Bioinformatics, 11:595) and sited at the University of Oxford; the development of that website was funded by the Wellcome Trust.

Graphical representation of the workflow

Included tools for sequence search

Mash (Mash paper, Mash screen pre-print, repository)
Version used by the web-server: 2.3
CMD: mash sketch -S 42 -k 21 -s 1000{individually} -o {query_msh_noext} {query_fa} && mash dist {plasmids_msh} {query_msh} -v {max_pvalue} -d {max_dist} > {output}
CMD: mash screen {plasmids_msh} {query_fa} -v {max_pvalue} -i {min_ident}{winner_takes_all} > {output}
BLASTn (official website)
Version used by the web-server:
- blastn: 2.14.1+; Package: blast 2.14.1, build Aug 1 2023 12:22:01
- tblastn: 2.14.1+; Package: blast 2.14.1, build Aug 1 2023 12:22:01
CMD:
- blastn -query {query_fa} -task blastn -db {plasmids_db} -out {output} -evalue 1 -perc_identity {min_ident} -qcov_hsp_perc {min_cov} -outfmt '6 qseqid sseqid qstart qend sstart send evalue bitscore pident qcovs qcovhsp'
- tblastn -query {query_fa} -task tblastn -db {plasmids_db} -out {output} -evalue 1 -qcov_hsp_perc {min_cov} -outfmt '6 qseqid sseqid qstart qend sstart send evalue bitscore pident qcovs qcovhsp'
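When the BLAST commands above are run locally against the downloaded database, the resulting table can be read with a few lines of Python. The sketch below assumes the tab-separated output produced by the `-outfmt '6 ...'` option shown above, with columns in exactly that order. The function names `parse_blast_tab` and `best_hit_per_query` are hypothetical helpers introduced for illustration and are not part of PLSDB or BLAST.

```python
import csv
from typing import Dict, Iterable, List

# Column order taken from the -outfmt string in the commands above.
COLUMNS = [
    "qseqid", "sseqid", "qstart", "qend", "sstart", "send",
    "evalue", "bitscore", "pident", "qcovs", "qcovhsp",
]
INT_FIELDS = {"qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"evalue", "bitscore", "pident", "qcovs", "qcovhsp"}


def parse_blast_tab(lines: Iterable[str]) -> List[Dict[str, object]]:
    """Parse BLAST tabular (outfmt 6) lines into a list of typed dictionaries."""
    hits: List[Dict[str, object]] = []
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(row)}")
        hit: Dict[str, object] = dict(zip(COLUMNS, row))
        for key in INT_FIELDS:
            hit[key] = int(row[COLUMNS.index(key)])
        for key in FLOAT_FIELDS:
            hit[key] = float(row[COLUMNS.index(key)])
        hits.append(hit)
    return hits


def best_hit_per_query(hits: Iterable[Dict[str, object]]) -> Dict[str, Dict[str, object]]:
    """Keep, for each query sequence, the hit with the highest bit score."""
    best: Dict[str, Dict[str, object]] = {}
    for hit in hits:
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best
```

For a file on disk, `parse_blast_tab(open(path))` would work in the same way, since the parser accepts any iterable of lines. Selecting by bit score is only one possible ranking; identity or coverage could be used instead.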

Used libraries

Bootstrap
Font Awesome icons
Highcharts
Krona
underscore.js
w2ui.js
jquery.auto-complete.js
Kablammo

Support

If you encounter any problems or have questions about the data, feel free to open an issue. When reporting an error, please specify the web server version used, the error message, and a description of your request (e.g. job ID and input data).

Disclaimer

This website is provided "as is" without any warranties.

Uploaded data

If you submit data for sequence search, we assume that you have the right to upload this data to this web-server. The uploaded sequences are stored only temporarily and are deleted upon job completion.