For the majority of individuals and businesses worldwide, Microsoft Excel is the go-to spreadsheet software, and everyone from students to trillion-dollar enterprises uses it every day.
But scientists are now blaming Microsoft Office’s spreadsheet package for errors in academic papers. In fact, according to a report by BioMed Central, as many as 20% of scientific papers that include genomic data entered into Excel spreadsheets could be seriously flawed.
However, the real issue has more to do with the fact that some scientists and researchers just don’t know how to use Microsoft Office very well. That, and Excel’s obsession with automatically reformatting entries into what it thinks might be right.
“For example, gene symbols such as SEPT2 (Septin 2) and MARCH1 [Membrane-Associated Ring Finger (C3HC4) 1, E3 Ubiquitin Protein Ligase] are converted by default to ‘2-Sep’ and ‘1-Mar’, respectively.”
In another setback for scientists, the BioMed report also noted that its authors had uncovered several instances “where gene symbols were converted to dates in supplementary data of recently published papers (e.g. ‘SEPT2’ converted to ‘2006/09/02’). This suggests that gene name errors continue to be a problem in supplementary files accompanying articles.”
The BioMed report found that 704 of the papers its authors studied contained gene name errors created by Excel.
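To make the kind of error concrete, here is a minimal Python sketch of how a column of gene symbols could be screened for the date-like values quoted above (‘2-Sep’, ‘1-Mar’, ‘2006/09/02’). It is not the method used in the report. The function names and the two patterns are assumptions chosen for illustration, and they only cover the formats mentioned in this article.

```python
import re

# Assumed patterns: only the two date-like shapes quoted in the article.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DAY_MONTH = re.compile(rf"^\d{{1,2}}-({MONTHS})$", re.IGNORECASE)  # e.g. "2-Sep"
YEAR_FIRST = re.compile(r"^\d{4}/\d{2}/\d{2}$")                    # e.g. "2006/09/02"


def looks_like_converted_date(cell: str) -> bool:
    """Return True if a cell resembles a gene symbol that was turned into a date."""
    value = cell.strip()
    return bool(DAY_MONTH.match(value) or YEAR_FIRST.match(value))


def flag_suspect_symbols(symbols):
    """Return the entries in a gene-symbol column that look like dates."""
    return [s for s in symbols if looks_like_converted_date(s)]


if __name__ == "__main__":
    # Hypothetical column mixing intact symbols with converted ones.
    column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2006/09/02", "SEPT2"]
    print(flag_suspect_symbols(column))  # ['2-Sep', '1-Mar', '2006/09/02']
```

A check like this can only flag suspicious cells. It cannot say with certainty which gene a converted value originally referred to, so flagged entries would still need to be checked against the original data.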
Microsoft told the BBC:
“Excel offers a wide range of options, which customers with specific needs can use to change the way their data is represented.”
Excel is not alone in having issues with genomic data. The study found that the conversion issues were also present in other spreadsheet software, such as Apache OpenOffice Calc. The problem did not, however, seem to occur in Google Sheets.
At the end of the day, however, the real issue isn’t with Excel itself, but with the fact that some of the world’s smartest and most highly educated minds just don’t know how to use a spreadsheet, or at least don’t know how to change a spreadsheet’s default settings. Ironic and funny as that is, it’s a very real and serious issue.
“Inadvertent gene symbol conversion is problematic because these supplementary files are an important resource in the genomics community that are frequently reused.”