DNA decoding inaccuracy (Introduction)
How much DNA is deciphered properly? Only some.

"Biology has a dirty little secret. It's a well-known secret among those who deal with sequenced-genome data intensively, but I suspect many non-biologists are unaware of the problem, which is: Much of the existing genome data (for sequenced genomes, ranging from bacteria to human DNA) is either corrupt or misannotated. 'Junk DNA' probably doesn't exist in living cells. But it certainly exists in published genomes."

"A substantial portion of published genome data is suspect, at this point, either because of contamination issues, technical problems surrounding DNA sequencing technology, or faulty gene annotation. An example is the Oryza sativa indica (rice) genome, which inexplicably contains at least 10% of the genome of the bacterium Acidovorax citrulli. There's also a Culex (mosquito) genome with a complete copy of Wolbachia embedded. The genome of Rothia mucilaginosa DY-18 contains over 300 genes incorrectly annotated in antisense orientation (as does the genome of Burkholderia pseudomallei strain 1710b, a truly execrable train-wreck of a genome)."

"The annotation accuracy problem is getting worse by the year (see graph above). Devos and Valencia estimated in 2001 that misannotation levels could be as high as 37%. More recently, Schnoes et al. (2009) concluded that 'function prediction error (i.e., misannotation) is a serious problem in all but the manually curated database Swiss-Prot,' and yet Artamonova et al. (2005) found that for five types of annotation entries, even the vaunted UniProt/SwissProt database had an error rate between 33% and 43%. So even the best manually curated database is full of errors."

Source: http://asserttrue.blogspot.co.uk/2014/05/biologys-dirty-little-secret.html

If the published genome data are this unreliable, then conclusions about evolutionary history drawn from those data are suspect as well.