The production of nucleic acid sequences by automatic DNA sequencer machines is always associated with some base-calling errors. In order to produce a high-quality DNA sequence from a molecule of interest, researchers normally sequence the same sample many times. Considering base-calling errors as rare events, re-sequencing the same molecule and assembling the reads produced are frequently thought to be a good way to generate reliable sequences.
When analyzing sequencing reads, it is important to distinguish between putative correct and wrong bases. An open question is how a PHRED quality value is capable of identifying the miscalled bases and if there is a quality cutoff that allows mapping of most errors. Considering the fact that a low quality value does not necessarily indicate a miscalled position, we decided to investigate if window-based analyses of quality values might better predict errors.