Error-free software is extremely rare. A syntax error as simple as a missing quotation mark can be enough to prevent a program from executing properly, and in general, errors near the beginning of the source code are likely to propagate downstream. The presence of software errors is often attributed to inadequate testing prior to release, including insufficient verification and validation across different computing environments, parameters and use cases. When choosing software for our applications, it is important to note that the popularity or usage of a tool is not always indicative of its quality.

As big data continues to transform medicine and research, diagnostic software products, ranging from personalized healthcare apps to genetic testing software, are becoming an integral part of healthcare. In particular, bioinformatics software now plays an important role in the advancement of precision medicine, which ultimately affects healthcare decisions for patients. In clinical settings, errors in medical software, known as Software as a Medical Device (SaMD), cannot be tolerated. To increase clinical adoption of genomics-based diagnostics, we must build better scientific tools and validate them rigorously to ensure safe use in patient care.

Recent case studies of scientific software errors

  • Lack of software development training - A 2021 benchmarking study of 48 scientific software tools showed that while three of the five top-scoring tools were developed by researchers with formal training in computer science, the six worst-scoring tools were developed by self-taught programmers with no formal training (Zapletal et al., 2021).
  • Unexpected software behaviour - Unexpected, operating system-dependent file-ordering behaviour in Python’s glob module caused miscalculated results in over 150 published studies (Bhandari Neupane et al., 2019).
  • Improper data formatting - Microsoft Excel silently converts some gene symbols to dates and floating-point numbers, affecting 19.6% of publications with supplementary Excel files containing gene symbols. The most affected journals include Nucleic Acids Research, Genome Biology, Nature Genetics, Genome Research, Genes & Development and Nature (Lewis, 2021).
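The glob case study above is easy to reproduce: glob.glob() makes no guarantee about the order of the paths it returns, which varies across operating systems and filesystems. A minimal sketch (using hypothetical file names) of the defensive fix, sorting the results explicitly:

```python
import glob
import os
import tempfile

# Create a few example data files in a temporary directory.
tmpdir = tempfile.mkdtemp()
for name in ["b.txt", "a.txt", "c.txt"]:
    open(os.path.join(tmpdir, name), "w").close()

# glob.glob() returns paths in arbitrary, OS-dependent order, so any script
# that pairs results by position (as the Willoughby-Hoye scripts did) must
# not rely on it.
unsorted_paths = glob.glob(os.path.join(tmpdir, "*.txt"))

# Sorting makes the order deterministic across platforms.
sorted_paths = sorted(unsorted_paths)
print([os.path.basename(p) for p in sorted_paths])  # ['a.txt', 'b.txt', 'c.txt']
```

A one-line sorted() call is all that separates reproducible output from the platform-dependent results described above.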
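Likewise, the Excel gene-symbol problem can be caught before submission with a simple screen. The sketch below is a hypothetical, non-exhaustive heuristic (the function name and prefix list are illustrative, not from any standard library) that flags symbols resembling dates, such as SEPT2 or MARCH1:

```python
import re

# Hypothetical heuristic: month-like prefix followed by digits, the pattern
# Excel tends to auto-convert to a date. Not an exhaustive list.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)\d+$",
    re.IGNORECASE,
)

def flag_date_like_symbols(symbols):
    """Return the gene symbols that Excel is likely to convert to dates."""
    return [s for s in symbols if DATE_LIKE.match(s)]

print(flag_date_like_symbols(["SEPT2", "MARCH1", "TP53", "BRCA1"]))
# ['SEPT2', 'MARCH1']
```

Running such a check on supplementary tables before upload, or distributing gene lists as plain text with columns explicitly typed as strings, avoids the silent conversions reported by Lewis (2021).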

Open-access tools for software quality evaluation

  • SoftWipe assesses adherence to coding standards in scientific tools written in C or C++. It scores software based on inherent characteristics of the code, such as the number of compiler warnings, inconsistent or non-standard code formatting, and the degree of code duplication. However, it does not evaluate code correctness or validate the software against reference standards.
  • Benchtop compares software outputs against a reference to evaluate metrics such as concordance and accuracy. In contrast to SoftWipe, Benchtop does not assess code quality; instead, it highlights differences in outputs caused by changes in the code or the computing environment.
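The output-comparison approach can be illustrated with a minimal sketch. The helper below is hypothetical (it is not Benchtop's actual API) and assumes variant calls are represented as simple strings; it computes precision and recall of a tool's calls against a reference "truth" set:

```python
def concordance_metrics(tool_calls, truth_calls):
    """Compute simple precision/recall of a tool's output against a reference set.

    Hypothetical helper for illustration; real benchmarking tools use
    richer representations and matching rules.
    """
    tool, truth = set(tool_calls), set(truth_calls)
    true_positives = tool & truth
    precision = len(true_positives) / len(tool) if tool else 0.0
    recall = len(true_positives) / len(truth) if truth else 0.0
    return {"precision": precision, "recall": recall}

# One shared call, one false positive, one missed call:
print(concordance_metrics(["chr1:100A>T", "chr2:200G>C"],
                          ["chr1:100A>T", "chr3:300C>G"]))
# {'precision': 0.5, 'recall': 0.5}
```

Comparing these metrics before and after a code or environment change is the core idea: the code itself is never inspected, only the agreement of its outputs with a reference.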

In addition to these software evaluation tools, the open-source development community (e.g., on GitHub) is also valuable in allowing users to use and test the software. The community can examine the source code, report bugs, raise questions or suggest improvements, minimizing undetected software errors and accelerating troubleshooting and optimization.


1. Bhandari Neupane, J., Neupane, R. P., Luo, Y., Yoshida, W. Y., Sun, R., & Williams, P. G. (2019). Characterization of Leptazolines A-D, Polar Oxazolines from the Cyanobacterium Leptolyngbya sp., Reveals a Glitch with the "Willoughby-Hoye" Scripts for Calculating NMR Chemical Shifts. Organic Letters, 21(20), 8449–8453.

2. Lewis, D. (2021). Autocorrect errors in Excel still creating genomics headache. Nature, 10.1038/d41586-021-02211-4. Advance online publication.

3. Zapletal, A., Höhler, D., Sinz, C., & Stamatakis, A. (2021). The SoftWipe tool and benchmark for assessing coding standards adherence of scientific software. Scientific Reports, 11(1), 10015.