Microsoft Excel-led Errors in Genomics Papers Underscore Importance of End User Computing Management

By Ian Cleaver, VP Global Professional Services, ClusterSeven

The recent story that 20 per cent of academic papers on gene research based on data collated in Excel spreadsheets contain errors underscores two things. First, the scientific community relies primarily on the Excel spreadsheet for acquiring data. Second, there appear to be minimal, if any, controls for monitoring the accuracy and integrity of data in Excel.

While this revelation is astounding, such errors are easily overlooked given the many thousands of data points generated by next-generation gene sequencing programmes. With the volume of data handled in spreadsheets and other End User Computing (EUC) applications (databases, modelling tools), scientific research organisations need to establish best-practice controls across the lifecycle of critical spreadsheets. Doing so prevents systemic errors from proliferating across the EUC landscape and into the more advanced systems that researchers use, such as graphical and statistical analysis tools.

Applying a monitoring and control framework to spreadsheets and other files across the EUC landscape is imperative. Rules-based controls applied to data and to specific areas of spreadsheets mean that any changes are automatically monitored and alerts are sent to designated individuals, giving them visibility of the modifications. Such an approach also enables researchers to detect anomalous behaviour, providing early warning and facilitating corrective action.
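To make the idea of a rules-based control concrete, the following is a minimal sketch in Python, not a description of any particular product. It assumes a spreadsheet can be captured as a snapshot mapping cell addresses to values. The names `Alert` and `detect_changes`, the cell addresses, the sample values and the recipient address are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """One change to a monitored cell, addressed to a designated individual."""
    cell: str
    old: object
    new: object
    recipient: str


def detect_changes(before, after, monitored_cells, recipient):
    """Compare two snapshots (dicts of cell address -> value).

    Only cells named in monitored_cells are checked; an Alert is
    produced for each one whose value differs between snapshots.
    """
    alerts = []
    for cell in monitored_cells:
        old, new = before.get(cell), after.get(cell)
        if old != new:
            alerts.append(Alert(cell, old, new, recipient))
    return alerts


# Illustrative snapshots: a text identifier in A2 has been silently
# reformatted between two saved versions of the same file.
before = {"A2": "SEPT2", "B2": 0.42, "C2": 17}
after = {"A2": "2-Sep", "B2": 0.42, "C2": 17}

alerts = detect_changes(before, after, ["A2", "B2"], "data.steward@example.org")
for a in alerts:
    print(f"ALERT to {a.recipient}: {a.cell} changed from {a.old!r} to {a.new!r}")
```

In this sketch the rule is simply "flag any change to a monitored cell". A fuller control framework would layer further rules on top, such as expected data types or permitted value ranges, and would route alerts through whatever notification channel the organisation already uses.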

The Excel spreadsheet, a tried and tested workhorse, is here to stay. It offers users the flexibility and agility they need to acquire and manipulate data, something other systems do not offer as easily. But clearly, this EUC tool isn’t without its shortcomings. Excel risk is a reality. However, the risk can be mitigated by instituting simple, fundamental safeguards. Today, proven technology exists to enable life sciences organisations to streamline the process of managing data change across the lifecycle of critical spreadsheets and other EUC files. Implementing such EUC control frameworks is a no-brainer, given the wide-reaching repercussions of Excel-based errors for the scientific community and the cutting-edge research it performs.