Can Excel Data Fuel an Enterprise-Grade Analytics Strategy?
Modern cloud-based data technologies are revolutionising how businesses make informed decisions based on trusted data. These technologies offer immense benefits, from seamless scalability to advanced process automation. Despite these advancements, most organisations still suffer from the proliferation of Excel out in the wild, and it’s not going anywhere anytime soon.
Excel continues to be the go-to tool for most knowledge workers, mainly because of its general availability, its familiarity, and the ease with which users can input and manipulate data before adding calculations and analysing it. Excel’s versatility encourages creativity, leading to the development of critical Shadow Data Processes that operate outside standard procedures. Whilst important to business operations, these shadow processes pose considerable risks that challenge data governance and control, ultimately undermining trust in the data. Let’s delve into these risks and explore how Excel-based processes can live harmoniously with enterprise-grade data management.
The Six Key Risks of Excel Shadow Data Processes
Risk 1: Quality, Completeness and Validity
Many Excel functions, tools, and techniques exist to help users input complete, clean data, but for every approach we’ve seen, we’ve also seen an end-user creatively circumvent the ‘controls’ that have been put in place. Cell validations can easily be overwritten by mistake with a simple copy-and-paste action. Columns can be added or, worse, removed too easily. Sheets are renamed, combined, or duplicated. Dates are entered in a wide range of formats.
The versatility and accessibility of Excel are at the heart of its poor data quality and completeness. Users can quickly open a file and enter values that are not allowable and do not adhere to master data management standards. They can just as easily right-click and delete important information. Excel is hard to govern, and this often leads to poor data quality.
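To make these quality problems concrete, here is a minimal Python sketch of the kind of checks a governed process might run on rows read from a spreadsheet. The column names, the list of allowed regions, and the choice of ISO-style dates as the standard are all hypothetical, chosen purely for illustration; real rules would come from your own master data standards.

```python
from datetime import datetime

# Hypothetical master-data rules: these names and values are illustrative only.
ALLOWED_REGIONS = {"North", "South", "East", "West"}
REQUIRED_COLUMNS = ["region", "order_date", "amount"]
DATE_FORMAT = "%Y-%m-%d"  # assumed standard: year-month-day


def validate_rows(rows):
    """Return a list of (row_number, problem) tuples for rows that break the rules."""
    problems = []
    for number, row in enumerate(rows, start=1):
        # Catch columns that were removed or renamed in the sheet.
        missing = [col for col in REQUIRED_COLUMNS if col not in row]
        if missing:
            problems.append((number, f"missing columns: {missing}"))
            continue
        # Catch values that are not in the allowed list.
        if row["region"] not in ALLOWED_REGIONS:
            problems.append((number, f"invalid region: {row['region']!r}"))
        # Catch dates that do not follow the assumed standard format.
        try:
            datetime.strptime(row["order_date"], DATE_FORMAT)
        except (ValueError, TypeError):
            problems.append((number, f"invalid date: {row['order_date']!r}"))
    return problems


# Each dict stands in for one spreadsheet row.
sample = [
    {"region": "North", "order_date": "2024-03-01", "amount": 120.0},
    {"region": "Nrth", "order_date": "01/03/2024", "amount": 75.5},
    {"region": "South", "amount": 10.0},
]
for number, problem in validate_rows(sample):
    print(f"row {number}: {problem}")
```

The point of the sketch is that the rules live in code rather than in cell validations, so a copy-and-paste cannot silently overwrite them.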
Risk 2: Data Silos
The core issue with Excel-related data quality is the fact that the process relies on human interaction. Excel data is often human-generated, and those who input data are doing so for their own use; they are often not thinking about how the data can be systematically collected and stored in a database where its value can be leveraged by their colleagues. This leads to data being created for the needs of an individual running a ‘shadow’ process, ignoring the benefits that the data may bring to the wider business.
Furthermore, we see the same data being collected by many people within the same organisation, duplicating effort and reducing the overall productivity of the workforce. The productivity problem is amplified when data is collected and input into Excel at different cadences, which can lead to two people returning different answers to the same question. Unpicking the difference takes time and reduces overall confidence and trust in the data.
Excel can hamper an organisation’s ability to produce a single picture of its position without a small army of people trawling file systems to provide data and insights. Silos prevent the organisation from building a comprehensive view of itself and often prevent the creation of automated processes that would free up its most valuable resources to make decisions.
Risk 3: Security
Data holds answers and insights on many important business topics, many of which are highly confidential and can give an organisation an advantage over its competitors. It is therefore paramount that the data is kept secure and governed, so that only those with permission and authority can access it.
Excel does have security features, but they are not robust or enterprise-grade. Workbooks and sheets can be locked with a password, but unlocking them does not require the user to authenticate through an Identity Provider. Passwords are often shared.
Equally, whilst the use of file storage has hugely matured over the last decade, with many organisations adopting enterprise-wide solutions such as OneDrive, Google Drive Enterprise, Box (and more!), Excel files are still commonly shared via email as attachments. When combined with poor user-based security, emailing files that contain sensitive data increases an organisation’s exposure to a damaging data leak. Emailing Excel files also contributes to the issues discussed under ‘Data Silos’, creating multiple versions of the truth.
Risk 4: History and Versioning
Some of the most powerful analytics require daily snapshots of data, allowing consumers to understand what has changed in their data since the last time they looked at their reporting and analytics. Storing historical data in Excel makes it extremely challenging to track changes through time and understand how data is changing and drifting on a longitudinal basis. Volume and computing constraints present the largest challenges; however, the discipline needed to enter and process the data at regular intervals also contributes.
Versioning the data presents its own challenges. When working with data, it is critical that a user understands where the data has come from, who owns it, and how frequently it is updated. With Excel, keeping track of data as it moves through emails and file storage systems can make it nearly impossible to identify the most up-to-date dataset to work with. Managing file proliferation, tracking changes, and merging updates before tracing them back to their owners is a tedious, manual task that must be undertaken regularly.
Risk 5: Volumes
Excel will always have limitations on the amount of data it can store: current versions of Excel are limited to 1,048,576 rows and 16,384 columns per worksheet.
Whilst this is a hard limit, we rarely see files that reach it, because other volume-based challenges arise well before the cap on rows and columns. Although there is a cloud version of Excel, most work is still done locally in the desktop application. Files with large volumes become cumbersome and can take a long time simply to open. Furthermore, performance degradation is common where some of the most basic formulas leave the user with