TY - JOUR
T1 - HPViewer
T2 - Sensitive and specific genotyping of human papillomavirus in metagenomic DNA
AU - Hao, Yuhan
AU - Yang, Liying
AU - Galvao Neto, Antonio
AU - Amin, Milan R.
AU - Kelly, Dervla
AU - Brown, Stuart M.
AU - Branski, Ryan C.
AU - Pei, Zhiheng
N1 - Publisher Copyright: © The Author(s) 2018.
PY - 2018/6/15
Y1 - 2018/6/15
N2 - Motivation: Shotgun DNA sequencing provides sensitive detection of all 182 HPV types in tissue and body fluid. However, existing computational methods either produce false positives, misidentifying HPV types because of sequences shared among HPV, human and prokaryotes, or produce false negatives because they identify HPV from assembled contigs, which requires a large abundance of HPV reads. Results: We designed HPViewer with two custom HPV reference databases, masking simple repeats and homology sequences respectively, and one homology distance matrix to hybridize these two databases. It directly identified HPV from short DNA reads rather than assembled contigs. Using 100 100 simulated samples, we revealed that HPViewer was robust for samples containing either a high or a low number of HPV reads. Using 12 shotgun sequencing samples from respiratory papillomatosis, HPViewer was equal to VirusTAP and Vipie, and better than HPVDetector, with respect to specificity, and was the most sensitive method in the detection of HPV types 6 and 11. We demonstrated that contig-based approaches had disadvantages in the detection of HPV. In 1573 sets of metagenomic data from 18 human body sites, HPViewer identified 104 types of HPV in a body-site-associated pattern and 89 types of HPV co-occurring in one sample with other types of HPV. We demonstrated that HPViewer was sensitive and specific for HPV detection in metagenomic data. Availability and implementation: HPViewer can be accessed at https://github.com/yuhanH/HPViewer/. Supplementary information: Supplementary data are available at Bioinformatics online.
AB - Motivation: Shotgun DNA sequencing provides sensitive detection of all 182 HPV types in tissue and body fluid. However, existing computational methods either produce false positives, misidentifying HPV types because of sequences shared among HPV, human and prokaryotes, or produce false negatives because they identify HPV from assembled contigs, which requires a large abundance of HPV reads. Results: We designed HPViewer with two custom HPV reference databases, masking simple repeats and homology sequences respectively, and one homology distance matrix to hybridize these two databases. It directly identified HPV from short DNA reads rather than assembled contigs. Using 100 100 simulated samples, we revealed that HPViewer was robust for samples containing either a high or a low number of HPV reads. Using 12 shotgun sequencing samples from respiratory papillomatosis, HPViewer was equal to VirusTAP and Vipie, and better than HPVDetector, with respect to specificity, and was the most sensitive method in the detection of HPV types 6 and 11. We demonstrated that contig-based approaches had disadvantages in the detection of HPV. In 1573 sets of metagenomic data from 18 human body sites, HPViewer identified 104 types of HPV in a body-site-associated pattern and 89 types of HPV co-occurring in one sample with other types of HPV. We demonstrated that HPViewer was sensitive and specific for HPV detection in metagenomic data. Availability and implementation: HPViewer can be accessed at https://github.com/yuhanH/HPViewer/. Supplementary information: Supplementary data are available at Bioinformatics online.
UR - http://www.scopus.com/inward/record.url?scp=85049068368&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty037
DO - 10.1093/bioinformatics/bty037
M3 - Article
C2 - 29377990
AN - SCOPUS:85049068368
SN - 1367-4803
VL - 34
SP - 1986
EP - 1995
JO - Bioinformatics
JF - Bioinformatics
IS - 12
ER -