The free version of the PeakTrace Basecaller™ (PeakTrace Online) has been updated to PeakTrace 6.10. This is a major update of the PeakTrace Basecaller and it includes a large number of enhancements and new features. The Auto PeakTrace RP™ and PeakTrace:Box™ will be released over the next few days.
Has the KB Basecaller been Validated?
We occasionally get asked if the PeakTrace Basecaller™ has ever been independently validated. The answer is that no one has ever published an independent validation of any Sanger DNA sequencing basecaller. This includes the widely used KB Basecaller™.
The most complete published study on basecaller validation is Ewing and Green’s 1998 paper on the validity of phred [1]. This study was performed by the same team that developed phred [2], so while very thorough, it is far from an independent study.
The KB Basecaller™ has no published studies on its validation beyond a single poster at Advances in Genome Biology and Technology (AGBT) in 2004 by the team that developed KB [3]. While this poster showed that KB gave more Q20+ bases than phred, these extra Q20+ bases didn’t seem to help much when the data was used (i.e. in assemblies – see Figure 5 on this poster).
The ABI 3500 DX model has been approved for clinical testing by the FDA [4]. However, the validation of KB done for this approval was performed by ABI and is limited to the ABI 3500 DX model when using POP6™, BigDye™ v1.1, the 3500 Dx Series Data Collection Software 1.1, and one of two run conditions [5]. If you are running any other sequencing instrument, polymer, BigDye, or run condition, then KB has not gone through the FDA validation process. You can’t assume that KB will give valid results under your run conditions just because it has been validated for a particular instrument, consumable set, and run conditions.
It is totally understandable why ABI (ThermoFisher these days) has only undertaken the FDA approval process for the ABI 3500 DX and only under a very limited range of conditions. The FDA process takes years, costs millions of dollars, and needs to be done separately for each instrument, polymer, terminator chemistry, capillary length, and run module. Even for a company as large as ThermoFisher it does not make sense to go through the FDA process for every sequencer it has ever sold.
The closest to an independent study of the KB Basecaller we have been able to find is a paper published by Hyman et al. in 2010 [6]. They did not perform a full validation (as was done by Ewing and Green for phred), but they did find that KB gave more Q20+ bases than phred. They also found that these extra Q20+ bases did not help them in their application of identifying bacterial species from the sequence data [6].
In defence of the KB Basecaller, our own validation studies show that KB is basically fine except for a few problems in the Q20 to Q30 range, where it tends to overpredict the actual quality [7]. For almost all non-clinical applications this is not a major issue, but it may explain why there has been little or no benefit seen from the greater number of Q20+ bases that KB provides over phred.
The bottom line is that KB (like phred) has been validated by the people who wrote it, but no one has published an independent validation of either of these basecallers except, ironically, us [7]. If you demand that a basecaller be independently validated before you will use it, then the only independent study you can turn to is ours of KB (we have also validated phred, but this has not been made public). If you trust us to validate KB, then you can trust us to validate PeakTrace too.
So where does this leave the validation of the PeakTrace Basecaller? Nucleics has extensively validated PeakTrace [7] in exactly the same manner as was done by the developers of the phred [1] and KB basecallers [3]. PeakTrace is currently being used in 93 facilities around the world to basecall tens of millions of traces per year. Our customers would not pay the extra cost for using PeakTrace – after all they get the KB Basecaller for free with the instrument – if they did not believe it provided real and significant improvement.
Of course the best validation is always your own validation. For almost all applications it does not matter if, for example, a predicted Q31 base has a true Q31 error rate (or if it is actually Q29 or Q33). What matters is the number of actual errors, the usable read length, and that bad bases are not called as good bases. We have yet to see an application (even clinical) where it matters if a base is given a Q score more accurate than to the nearest Q10.
Luckily it is very easy to do a quick, do-it-yourself validation of PeakTrace. Just grab a handful of good traces of known sequence, run them through the free online PeakTrace service, then BLAST both the KB and PeakTrace sequence data to compare the errors and aligned bases found with each basecaller. This will quickly tell you which basecaller is best for your data. If you follow this process we are sure you will be impressed with how PeakTrace performs.
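As a rough illustration of the comparison described above, here is a minimal Python sketch that counts aligned bases and errors for a basecalled read against a known reference. It is only a sketch under stated assumptions: it uses `difflib` from the Python standard library as a simple stand-in for BLAST, and the function name `compare_to_reference` and the short sequences are hypothetical, made up for the example. For real traces you would still want to use BLAST or another proper aligner.

```python
from difflib import SequenceMatcher


def compare_to_reference(read: str, reference: str) -> dict:
    """Count aligned bases and errors between a read and its known reference.

    Assumption: difflib is used as a crude stand-in for BLAST. Unaligned
    sequence before the first match and after the last match is ignored,
    which roughly mimics a local alignment.
    """
    matcher = SequenceMatcher(None, reference.upper(), read.upper(), autojunk=False)
    opcodes = matcher.get_opcodes()
    equal_positions = [k for k, op in enumerate(opcodes) if op[0] == "equal"]
    if not equal_positions:
        return {"aligned_bases": 0, "errors": 0}

    first, last = equal_positions[0], equal_positions[-1]
    aligned_bases = 0
    errors = 0
    for tag, i1, i2, j1, j2 in opcodes[first:last + 1]:
        if tag == "equal":
            aligned_bases += i2 - i1
        else:
            # Substitutions, insertions and deletions all count as errors.
            errors += max(i2 - i1, j2 - j1)
    return {"aligned_bases": aligned_bases, "errors": errors}


# Hypothetical toy data: one reference and two basecalls of the same trace.
reference = "GATTACAGGCTTAACCGTTG"
call_a = "GATTACAGGTTTAACCGTTG"   # full length, one miscalled base
call_b = "GATTACAGGCTTAAC"        # shorter read, no miscalled bases

for name, read in (("call_a", call_a), ("call_b", call_b)):
    print(name, compare_to_reference(read, reference))
```

Running the same function over the KB and PeakTrace basecalls of each trace gives the two numbers that matter here: how many bases align and how many of them are wrong.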
References
1. Ewing B, Green P. (1998). Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186-194.
2. Ewing B, Hillier L, Wendl MC, Green P. (1998). Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175-185.
3. Longer Reads and More Robust Assemblies with the KB Basecaller (2004).
4. http://www.fda.gov/downloads/BiologicsBloodVaccines/BloodBloodProducts/ApprovedProducts/SubstantiallyEquivalent510kDeviceInformation/UCM339686.pdf
5. https://tools.thermofisher.com/content/sfs/brochures/3500-dx-series-genetic-analyzer-cs2.pdf
6. Hyman RW, Jiang H, Fukushima M, Davis RW. (2010). A direct comparison of the KB™ Basecaller and phred for identifying the bases from DNA sequencing using chain termination chemistry. BMC Research Notes. 3:257.
7. Tillett D. (2010). Validation of the PeakTrace Basecaller.
Additional 38 DNA sequencing service reviews added
We have added another 38 reviews of DNA sequencing service providers. These include:
- Acadia Research Laboratories
- Avance Biosciences
- Clemson University Genomics Institute
- East Tennessee State University Core Facility
- Emory University Integrated Genomics Core
- Epoch Life Science
- Eurofins Genomics USA
- Florida International University DNA Core Facility
- HudsonAlpha Institute for Biotechnology Genome Sequencing Center
- Lone Star Labs, Inc.
- North Carolina State University Genomic Sciences Laboratory
- Oblique Bio
- Omega Bioservices
- Pennington Biomedical Research Center Genomics Core
- Texas A&M AgriLife Research Laboratory for Genome Technology
- Texas A&M College of Veterinary Medicine & Biomedical Sciences DNA Technologies Core Lab
- Texas Tech University Center for Biotechnology & Genomics
- University of Alabama Heflin Center for Genomic Science
- University of Arkansas DNA Resource Center
- University of Arkansas for Medical Sciences DNA Sequencing Core Facility
- University of Florida Interdisciplinary Center for Biotechnology Research
- University of Georgia Genomics Facility
- University of Kentucky Advanced Genetic Technologies Center
- University of Miami Miller School of Medicine Center for Genome Technology
- University of Mississippi Medical Center Cancer Molecular and Genomics Core
- University of North Carolina at Chapel Hill Genome Analysis Facility
- University of Oklahoma Health Sciences Center LMBCR
- University of Tennessee Molecular Biology Resource Facility
- University of Tennessee Health Science Center Molecular Resource Center
- University of Texas at Austin DNA Core Facility
- University of Texas MD Anderson Cancer Center Sequencing and Microarray Facility
- University of Texas Medical Branch Protein Chemistry Laboratory
- University of Texas Health Science Center at San Antonio Nucleic Acids Core Facility
- University of Virginia Biomolecular Research Facility
- University of Virginia Genomics Core Facility
- UT Southwestern Medical Center McDermott Center Sequencing Core
- Virginia Tech Bioinformatics Institute
- Virginia Commonwealth University Nucleic Acids Research Facilities
We hope to have all the reviews finished in the next week or so. If your facility or company is not on the list please get in contact with us.
Why do only 5 of the 93 DNA sequencing facilities using PeakTrace acknowledge using it?
If you look through our list of DNA sequencing facility reviews you will notice that only 5 of the more than 300 facilities listed publicly acknowledge using PeakTrace, despite 93 sites using it (as of October 2015). Why won’t the other 88 let us say they are using PeakTrace? It is not because they think the product is bad, or because we have not asked them!
The simple answer is they don’t want their competitors to know about PeakTrace. The DNA sequencing service market is very competitive and the facilities using PeakTrace want to keep it all to themselves. The last thing they want is their competitors to start using PeakTrace. When you have access to software that can turn your traces from the top image (called with the KB basecaller) into the bottom image (called with the PeakTrace basecaller) then you know you are onto something fantastic that gives your business a massive edge. If you are at all rational you don’t want your competitors using PeakTrace too. Such are the woes of a business selling a product that is truly unique and valuable.
If you want to see what PeakTrace can do for your traces and facility then trial our free PeakTrace Online service, or download the PeakTrace Whitepaper. If you have any questions about how PeakTrace can help your facility then please get in contact with us.
Updated DNA Sequencing Service Provider Reviews
We have been updating the reviews of DNA sequencing service providers on our DNA Sequencing Service Reviews page. The new reviews include the following facilities and companies. If your facility or company is not on the list we are happy to add it – just get in contact with us.
New Reviews
- BCH IDDRC Molecular Genetics Core
- BGI Americas
- Bio Basic
- Cornell University Institute of Biotechnology
- Creative Genomics
- Dana-Farber/Harvard Cancer Center DNA Core Facility
- Dartmouth College Molecular Biology Shared Resource
- Duke University DNA Analysis Facility
- Georgetown University DNA Sequencing and Fragment Sizing Service
- GENEWIZ Inc. DNA Sequencing Service
- Harvard Medical School Biopolymers Facility
- Hubbard Center for Genome Studies DNA Sequencing Facility
- Huck Institutes for Life Sciences at Penn State
- Johns Hopkins University DNA Analysis Facility
- Macrogen USA
- Marshall University Genomics Core Facility
- Napcore The Children’s Hospital of Philadelphia
- New Jersey Medical School Molecular Resource Facility
- Penn State Hershey Molecular Genetics Core Facility
- Roswell Park Cancer Institute DNA Sequencing Facility
- Sloan-Kettering Cancer Center Core Facility
- Sequegen DNA Sequencing Service
- University of Delaware Sequencing & Genotyping Center
- University of Albany Center for Functional Genomics
- University of Pennsylvania DNA Sequencing Facility
- University of Pittsburgh DNA Sequencing Core
- University of Rochester Functional Genomics Center
- Wadsworth Center Applied Genomic Technologies Core
- West Virginia University Genomics Core Facility
- Vermont Cancer Center DNA Sequencing Core
- Yale W.M. Keck DNA Sequencing Facility
- Arizona State University DNASU Sequencing Core
- Bio Applied Technologies Joint, Inc
- Brigham Young University DNA Sequencing Center
- City of Hope Integrative Genomics Core
- Colorado State University DNA Sequencing Facility
- Fred Hutchinson Cancer Research Center Genomics Resource
- HIBM Research Group
- High Throughput Genomics Center
- Nevada Genomics Center
- Oregon State University CGRB Core Facility
- Polymorphic DNA Technologies
- Quintara Biosciences
- Rancho Santa Ana Botanic Garden Core Genetics Facility
- Seattle BioMed Sequencing Core Facility
- Scripps Research Institute Center for Protein and Nucleic Acid Research
- SeqXcel
- Sorenson Genomics
- Stanford University Medical Center PAN Facility
- Synthetic Genomics
- UC Berkeley DNA Sequencing Facility
- UC Davis DNA Sequencing Facility
- UC Riverside Institute for Integrative Genome Biology Genomics Core
- UC San Francisco Genomics Core Facility
- University of Alaska Fairbanks DNA Core Facility
- University of Arizona Genetics Core
- University of Colorado DNA Sequencing & Analysis Core
- University of Hawaii Pacific Biosciences Research Center
- University of Montana Murdock DNA Sequencing Facility
- University of Nevada, Las Vegas Genomics Core Facility
- University of Utah DNA Sequencing Core Facility
- University of Washington School of Pharmacy DNA Sequencing and Gene Analysis Center
- Utah State University Center for Integrated BioSystems
- BloodCenter of Wisconsin Molecular Biology Core
- Case Western Reserve University Genomic Sequencing Core
- Cincinnati Children’s Hospital DNA Sequencing and Genotyping Core
- Cleveland Clinic Genomics Core
- DNA Analysis LLC
- Ingenbio
- Indiana University Molecular Biology Institute
- Indiana University School of Medicine
- Iowa State University DNA Facility
- ISU Molecular Research Core Facility
- Kansas State University
- Loyola University DNA Core
- Mayo Clinic Molecular Biology Core
- Miami University Center for Bioinformatics & Functional Genomics
- Medical College of Wisconsin Human and Molecular Genetics Center
- Michigan State University Research Technology Support Facility
- Northwestern University Genomics Core Facility
- Ohio State University Plant-Microbe Genomics Facility
- Ohio State University Comprehensive Cancer Center
- Ohio University Genomics Facility
- Oklahoma Medical Research Foundation DNA Sequencing Facility
- Purdue University Genomics Core Facility
- Southern Illinois University Core
- Stowers Institute for Medical Research
- Taueret Laboratories
- Washington University Protein and Nucleic Acid Chemistry Laboratories
- University of Chicago Genomics Facility
- University of Cincinnati Core Facility
- University of Illinois at Chicago DNA Services Facility
- University of Illinois at Urbana High-Throughput Sequencing and Genotyping Unit
- University of Iowa DNA Facility
- University of Minnesota DNA Sequencing and Analysis Facility
- University of Nebraska Medical Center DNA Sequencing and Genotyping Core Facility
- University of Notre Dame Genomics and Bioinformatics Core Facility
- University of Wisconsin DNA Sequencing Facility
- Wayne State University Applied Genomics Technology Center
- WestCore – Black Hills State University
- Acadia Research Laboratories
- Clemson University Genomics Institute
- East Tennessee State University Core Facility
- Eurofins Genomics USA
- North Carolina State University Genomic Sciences Laboratory
Auto PeakTrace RP and the N base threshold
Among the many changes introduced with PeakTrace 6 was an improvement in the PeakTrace basecalling. As a result, some bases that had previously been called as Q6 are now called as Q5. Since these bases are unlikely to be useful (a Q5 base is expected to be wrong about 32% of the time, while a Q6 base is expected to be wrong about 25% of the time), this change is of minor importance. It can, however, cause an issue if you are using a very old version of Auto PeakTrace RP (pre-5.70) and have the N base threshold set to 0. If you are, you will now see some N bases in your basecall.
If this is an issue for you, the best solution is to upgrade to Auto PeakTrace 6 RP, since there are a number of new features that can’t be accessed using old versions of Auto PeakTrace RP. Alternatively, you can work around this change by setting the N base threshold to 1 rather than 0.
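The error rates quoted above for Q5 and Q6 bases follow from the standard phred definition of a quality score, Q = -10 log10(P), where P is the probability that the base call is wrong. A minimal Python sketch of the conversion (the function name is ours, chosen for illustration):

```python
def phred_error_probability(q: float) -> float:
    """Expected error probability for a phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


for q in (5, 6, 20, 30):
    print(f"Q{q}: {phred_error_probability(q):.1%}")
# Q5: 31.6%
# Q6: 25.1%
# Q20: 1.0%
# Q30: 0.1%
```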
Validating the PeakTrace Basecaller
We often get asked if the improvement seen by using PeakTrace is real. While this question is best answered by doing your own investigations (try the free version of PeakTrace on your own traces and see), it is something we have investigated in depth.
Determining if a basecaller is accurate and provides real improvement is not a simple process – Ewing & Green had two back-to-back papers in Genome Research answering this very question for phred [1, 2]. We followed the same approach to validate both PeakTrace and KB. This study found that PeakTrace not only offered significantly more alignable Q20+ and total bases than KB, it was also more accurate at predicting the true error rate than KB [3].
This study was performed using earlier versions of PeakTrace (4.25) and KB (1.2), so it underestimates the benefits of using PeakTrace. Since 2010 PeakTrace has continued to improve, while the most recent release of KB (1.4.1) provides basically the same read length as KB 1.2.
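To make concrete what "predicting the true error rate" means, the sketch below compares predicted quality scores with the error rate actually observed at each score. It is written in the spirit of this kind of validation, not as a reproduction of the published analyses: the function name `observed_quality`, the input format (pairs of predicted Q score and whether the base was wrong) and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def observed_quality(calls):
    """Map each predicted Q score to the Q score implied by its observed errors.

    `calls` is an iterable of (predicted_q, is_error) pairs, one per base,
    where is_error is True if the base disagrees with the known sequence.
    Returns {predicted_q: observed_q}; observed_q is None when no errors
    were seen at that score (too few bases to estimate an error rate).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for predicted_q, is_error in calls:
        totals[predicted_q] += 1
        if is_error:
            errors[predicted_q] += 1

    result = {}
    for q in sorted(totals):
        rate = errors[q] / totals[q]
        result[q] = -10 * math.log10(rate) if rate > 0 else None
    return result


# Hypothetical toy data: 100 bases predicted as Q20 with 1 error,
# and 10 bases predicted as Q30 with no errors.
toy_calls = [(20, False)] * 99 + [(20, True)] + [(30, False)] * 10
print(observed_quality(toy_calls))
```

A basecaller that predicts quality well gives observed values close to the predicted ones. A basecaller that overpredicts quality gives observed values below the predicted ones.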
References
1. Ewing B, Hillier L, Wendl MC, Green P. (1998). Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175-185.
2. Ewing B, Green P. (1998). Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186-194.
3. Tillett D. (2010). Validation of the PeakTrace Basecaller.
Auto PeakTrace 6.01 Released
We have released Auto PeakTrace 6.01 and PeakTrace 6.01 for Linux. This is a minor update that fixes a small issue with the clear range trim when combined with other trim settings. The changes are:
- New feature. Select the edited or original base call from the input trace (command line only).
- Fixes an issue where the clear range trim may not work.
These updates are available on the Nucleics Downloads page. If you have not upgraded to Auto PeakTrace 6 you will need to follow the instructions on how to upgrade to Auto PeakTrace 6.
More papers mentioning the use of PeakTrace
Here are some new publications that mention the use of PeakTrace in their methods sections. While most people using PeakTrace are unaware they are using it as it is being performed automatically for them by their DNA sequencing service provider, it is always good to see small-scale users making use of the free online version of PeakTrace.
Auto PeakTrace 6 RP Software Released
We have released Auto PeakTrace 6 RP. This is a major update to our PeakTrace RP software and offers a number of improvements and new features as well as the usual fixes. The full list of changes is:
- New Feature. Clean baseline.
- New Feature. Extra normalization.
- Parallel installation of Auto PeakTrace 6 RP with Auto PeakTrace RP.
- Improved handling of capillaries that run slower than expected.
- Improved signal to noise determination.
- Improved trimming of noise level data from traces.
- Improved detection of failed traces.
- Improved basecalling of traces without KB quality scores.
- Improved basecalling of short traces.
- Improved q average trim.
- Additional error messages to help identify the cause of problems.
- Updated visual design and help manual.
- Bug fixes.
Many of these changes will be automatically applied to users of previous versions of Auto PeakTrace RP as they are controlled by the remote server software. However, the two new features (clean baseline and extra normalization) can only be utilised via Auto PeakTrace 6 RP. This major update is available from the Nucleics Downloads page and it is highly recommended that all users of PeakTrace RP upgrade to this version.