DNA sequencing reviews

Has the KB Basecaller been Validated?

October 17, 2015 By Daniel

Mapping comparison of PeakTrace and KB

We occasionally get asked whether the PeakTrace Basecaller™ has ever been independently validated. The answer is that no one has ever published an independent validation of any Sanger DNA sequencing basecaller. This includes the widely used KB Basecaller™.

The most complete published study on basecaller validation is Ewing and Green’s 1998 paper on the validity of phred [1]. This study was performed by the same team that developed phred [2], so while very thorough, it is far from an independent study.

The KB Basecaller™ has no published studies on its validation beyond a single poster at Advances in Genome Biology and Technology (AGBT) in 2004 by the team that developed KB [3]. While this poster showed that KB gave more Q20+ bases than phred, these extra Q20+ bases didn’t seem to help much when the data was used (i.e. in assemblies – see Figure 5 on this poster).

The ABI 3500 DX model has been approved for clinical testing by the FDA [4]. However, the validation of KB done for this approval was performed by ABI and is limited to the ABI 3500 DX model when using POP6™, BigDye™ v1.1, the 3500 Dx Series Data Collection Software 1.1, and one of two run conditions [5]. If you are running any other sequencing instrument, polymer, BigDye, or run condition, then KB has not gone through the FDA validation process. You can’t assume that KB will give valid results under your run conditions just because it has been validated for a particular instrument, consumable set, and run conditions.

It is totally understandable why ABI (ThermoFisher these days) has only undertaken the FDA approval process for the ABI 3500 DX and only under a very limited range of conditions. The FDA process takes years, costs millions of dollars, and needs to be done separately for each instrument, polymer, terminator chemistry, capillary length, and run module. Even for a company as large as ThermoFisher it does not make sense to go through the FDA process for every sequencer it has ever sold.

The closest thing to an independent study of the KB Basecaller we have been able to find is a paper published by Hyman et al. in 2010 [6]. They did not perform a full validation (as was done by Ewing and Green for phred), but they did find that KB gave more Q20+ bases than phred. They also found that these extra Q20+ bases did not help them in their application of identifying bacterial species from the sequence data [6].

In defence of the KB Basecaller, our own validation studies show that KB is basically fine except for a few problems in the Q20 to Q30 range, where it tends to overpredict the actual quality [7]. For almost all non-clinical applications this is not a major issue, but it may explain why there has been little or no benefit seen from the greater number of Q20+ bases that KB provides over phred.
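To make "overpredicting the actual quality" concrete, here is a minimal sketch of how a predicted Q score can be checked against the error rate actually observed for bases given that score. The function name and the counts are hypothetical and chosen only for illustration; they are not figures from our validation study [7].

```python
import math

def observed_quality(n_bases: int, n_errors: int) -> float:
    """Phred-scale quality implied by an observed error count."""
    if n_errors == 0:
        return float("inf")  # no errors seen, so no finite estimate
    return -10 * math.log10(n_errors / n_bases)

# Hypothetical counts per predicted Q score: (bases called, bases wrong).
bins = {20: (10000, 150), 30: (10000, 10)}

for predicted_q, (n_bases, n_errors) in bins.items():
    actual_q = observed_quality(n_bases, n_errors)
    status = "overpredicted" if actual_q < predicted_q else "ok"
    print(f"predicted Q{predicted_q}: observed Q{actual_q:.1f} ({status})")
```

A basecaller overpredicts in a bin when the observed quality comes out lower than the Q score it assigned.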

The bottom line is that KB (like phred) has been validated by the people who wrote it, but no one has published an independent validation of either of these basecallers except, ironically, us [7]. If you demand that a basecaller be independently validated before you will use it, then the only independent study you can turn to is ours of KB (we have also validated phred, but this has not been made public). If you trust us to validate KB, then you can trust us to validate PeakTrace too.

So where does this leave the validation of the PeakTrace Basecaller? Nucleics has extensively validated PeakTrace [7] in exactly the same manner as was done by the developers of the phred [1] and KB [3] basecallers. PeakTrace is currently being used in 93 facilities around the world to basecall tens of millions of traces per year. Our customers would not pay the extra cost of using PeakTrace (after all, they get the KB Basecaller for free with the instrument) if they did not believe it provided a real and significant improvement.

Of course the best validation is always your own validation. For almost all applications it does not matter if, for example, a predicted Q31 base has a true Q31 error rate (or if it is actually Q29 or Q33). What matters is the number of actual errors, the usable read length, and that bad bases are not called as good bases. We have yet to see an application (even clinical) where it matters if a base is given a Q score more accurate than to the nearest Q10.
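For a sense of scale, a quality score Q corresponds to an error probability of 10^(-Q/10), which is the standard phred definition. The short sketch below (the function name is ours, for illustration only) converts the three scores mentioned above into expected errors per 1000 bases.

```python
def error_probability(q: float) -> float:
    """Error probability implied by a phred-scale quality score."""
    return 10 ** (-q / 10)

for q in (29, 31, 33):
    per_thousand = error_probability(q) * 1000
    print(f"Q{q}: about {per_thousand:.2f} expected errors per 1000 bases")
```

The three values work out to roughly 1.3, 0.8, and 0.5 errors per 1000 bases, a spread that is hard to detect in most practical applications.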

Luckily it is very easy to do a quick, do-it-yourself validation of PeakTrace. Just grab a handful of good traces of known sequence, run them through the free online PeakTrace service, then BLAST both the KB and PeakTrace sequence data against the known sequence to compare the errors and aligned bases found with each basecaller. This will quickly tell you which basecaller is best for your data. If you follow this process we are sure you will be impressed with how PeakTrace performs.
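If you would rather script the comparison, the idea can be sketched in a few lines. This is a simplified stand-in for BLAST, not a replacement: it counts the substitutions, insertions, and deletions (the edit distance) between a known reference and each basecall. The sequences and the names `basecaller_A` and `basecaller_B` are made up for illustration and are not real KB or PeakTrace output.

```python
def edit_distance(ref: str, call: str) -> int:
    """Levenshtein distance: each substitution, insertion or deletion costs 1."""
    prev = list(range(len(call) + 1))
    for i, ref_base in enumerate(ref, start=1):
        curr = [i]
        for j, call_base in enumerate(call, start=1):
            cost = 0 if ref_base == call_base else 1
            curr.append(min(prev[j] + 1,          # base missing from the call
                            curr[j - 1] + 1,      # extra base in the call
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def compare_basecallers(ref: str, calls: dict) -> dict:
    """Number of errors against the known reference for each basecall."""
    return {name: edit_distance(ref, seq) for name, seq in calls.items()}

# Hypothetical known sequence and two hypothetical basecalls of it.
reference = "ACGTTGCAAGGCTTAACCGT"
calls = {
    "basecaller_A": "ACGTTGCAAGGCTTAACCGT",  # identical to the reference
    "basecaller_B": "ACGTTGCATGGCTAACCGT",   # one substitution, one deletion
}

for name, errors in compare_basecallers(reference, calls).items():
    print(f"{name}: {errors} errors over {len(reference)} reference bases")
```

For real traces the reads should be trimmed to the region that overlaps the reference first, since unaligned ends would otherwise be counted as errors. BLAST handles this for you by reporting only the aligned region.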

References

[1] Ewing B, Green P. (1998). Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186-194.
[2] Ewing B, Hillier L, Wendl MC, Green P. (1998). Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175-185.
[3] Longer Reads and More Robust Assemblies with the KB Basecaller (2004).
[4] http://www.fda.gov/downloads/BiologicsBloodVaccines/BloodBloodProducts/ApprovedProducts/SubstantiallyEquivalent510kDeviceInformation/UCM339686.pdf
[5] https://tools.thermofisher.com/content/sfs/brochures/3500-dx-series-genetic-analyzer-cs2.pdf
[6] Hyman RW, Jiang H, Fukushima M, Davis RW. (2010). A direct comparison of the KB™ Basecaller and phred for identifying the bases from DNA sequencing using chain termination chemistry. BMC Research Notes. 3:257.
[7] Tillett D. (2010). Validation of the PeakTrace Basecaller.

Additional 38 DNA sequencing service reviews added

October 6, 2015 By Daniel

We have added another 38 reviews of DNA sequencing service providers. These include

We hope to have all the reviews finished in the next week or so. If your facility or company is not on the list please get in contact with us.

Why do only 5 of the 93 DNA sequencing facilities using PeakTrace acknowledge using it?

October 5, 2015 By Daniel

KB Basecaller

PeakTrace Basecaller

If you look through our list of DNA sequencing facilities reviews you will notice that only 5 of the more than 300 facilities listed publicly acknowledge using PeakTrace, despite 93 sites using it (as of October 2015). Why won’t the other 88 let us say they are using PeakTrace? It is not because they think the product is bad, or that we have not asked them!

The simple answer is they don’t want their competitors to know about PeakTrace. The DNA sequencing service market is very competitive and the facilities using PeakTrace want to keep it all to themselves. The last thing they want is their competitors to start using PeakTrace. When you have access to software that can turn your traces from the top image (called with the KB basecaller) into the bottom image (called with the PeakTrace basecaller) then you know you are onto something fantastic that gives your business a massive edge. If you are at all rational you don’t want your competitors using PeakTrace too. Such are the woes of a business selling a product that is truly unique and valuable.

If you want to see what PeakTrace can do for your traces and facility then trial our free PeakTrace Online service, or download the PeakTrace Whitepaper. If you have any questions about how PeakTrace can help your facility then please get in contact with us.

Updated DNA Sequencing Service Provider Reviews

October 4, 2015 By Daniel

We have been updating the reviews of DNA sequencing service providers on our DNA Sequencing Service Reviews page. The new reviews include the following facilities and companies. If your facility or company is not on the list we are happy to add it – just get in contact with us.

New Reviews

Updated DNA Sequencing Service Reviews

August 4, 2014 By Daniel

We have started to update the DNA sequencing service reviews, add new facilities, and remove facilities that are no longer doing Sanger DNA sequencing. We have identified 325 facilities worldwide that offer Sanger DNA sequencing services and will be adding and updating reviews on all of them over the next couple of months. This is quite a big undertaking, but if you would like to have your facility added to this list please let us know.

Review of DNA sequencing service facilities

December 17, 2005 By Daniel

We have added a review of more than 143 DNA sequencing facilities that offer general DNA sequencing services. In addition, we have added a guide to selecting a DNA sequencing service provider. We hope you find these guides and reviews helpful.