The Cloud BI Transition: BI Key Players and their changing roles

I’ve previously talked around how Azure is changing traditional BI approaches, and the various architectural frameworks we’re going to see coming out of it. But what about the existing job roles within BI Teams and related structures? How are they going to change with the new ways of working that are emerging?

It’s a question we need to start considering now as cloud architectures become more commonplace and business are more willing to trust their data to cloud providers. From my recent work in piecing together new architectures and frameworks, I’ve been fortunate enough to have these conversations with people who are currently evaluating their position and their future training needs. From these conversations, and my own research, I’ve put some thoughts together around the “traditional” BI roles and how they’re changing:

BI Architect – In many ways, this is the role that changes the most. We now have a wealth of options for our data, each with their own specialist uses. The BI Architect needs to be familiar with both new functionality and old, able to identity the most relevant technology choice for each. Whereas we would previously be able to host the majority of our different systems within a single, cover-all SQL Server architecture, we will now find certain structures to be more aligned with the performance profiles of specific Azure components and less so with others. Business Users expecting lightning-fast dashboard performance would not benefit from a Data Lake system, whereas a Data Scientist would be unable to work if confined to a pre-defined, data model. A small, lightweight data mart of several Gb would likely perform worse in a full Azure DW system, given the data would be split across so many distributions the overhead of aggregating the results would outweigh the parallelism gains – in this case we would introduce an Azure SQL DB or Azure SQL VM to cater for these smaller marts.

Infrastructure Specialist – Gone are the days of the infrastructure specialist needing to know the install parameters of the SQL Server and the best disk configurations to use in different scenarios. We’re now focusing on security models, network architecture and automation and scaling management. The PaaS and IaaS systems differ greatly in their approach to security, with IaaS requiring traditional networking, setting up VLANs/security layers and PaaS components each having their own firewall layers with individual exception management. The infrastructure specialist should also be advising/designing the Azure subscription setup itself, connections to other subscriptions, perhaps managing expressroute and gateway connections back to the on-premise systems. There is also the considerations of whether to extend Active Directory into the Azure domain, making the Azure estate more an extension of the company’s internal network.

Data Modeller – The end role here doesn’t change dramatically, many previous design principles are still the case in the new approaches we’ve outlines. However there are some additional performance considerations that they will need to build into their designs. Azure DataWarehouse, for example, fundamentally relies on minimising data movements that occur when querying data. A snow-flaked model, or a model with several very high cardinality dimensions, might find performance degrades significantly, when it may have been the most performant design in a traditional multidimensional cube.

BI Developer – This role will still include many of the traditional tools, very strong SQL skills, an understanding of data movement & transformation technologies and excellent data visualisation skills. However, the traditional “stack” skills of SSIS/SSAS/SSRS are extended and augmented with the additional tools at our disposal – components such as Stream Analytics, Data Factory, Event Hubs and IoT sensor arrays could all easily fall into the domain of BI yet require radically different skillsets. Exposure to C# and the .NET framework becomes more valuable in extending systems beyond the basic BI stack. The management of code is essential as these environments grow – being able to rapidly deploy systems and being confident in the development process is vital in order to get the most out of cloud technologies.

Data Scientist – For the first time, users with advanced analytical skills have a place in the architecture to allow for experimentation, ad-hoc analysis and integration with statistical tools. This free-form analysis outside of strict development protocols accelerates the business’s access to the insight and understanding locked within their collected data. Many of these insights will mature into key measures for the business and can be built into the more stable, curated data models

Data Steward – If anything, the importance of a nominated data steward grows as we introduce systems designed for ad-hoc analysis. Without governance and controls around how data is stored within a data lake, it can quickly become a dumping ground for anything and everything. Some critics see data lakes as a dystopian future with uncontrolled “swamps” of data that grow meaningless over time. Whilst “store everything” is a fundamental tenant of the data lake mentality, everything stored should be carefully catalogued and annotated for maximum usefulness – the importance of this should not be underestimated. Our steward should embody the meticulous collector, not the disorganised hoarder.

Database Administrator – The introduction of PaaS services as our main components change this role dramatically, but they remain a core member of the business intelligence team. The common DBA tasks of security management, capacity and growth planning, performance optimisation and system monitoring are all very much a part of day-to-day life. Certain tasks such as backup & recovery are taken away as services Microsoft provide but the additional skills needed to manage performance on these new technologies are critical. Data Lakes produce large amounts of output as a by-product of running jobs and queries, these need to be cleaned and maintained over time. Access levels to different areas of the lakes will be a growing concern as our lake models mature. Finally, the tuning of high-performance queries, whether in U-SQL or Azure DW now require a whole new set of skills to analyse and optimise. PowerShell, a traditional tool of the DBA, becomes hugely powerful within Azure as it is the key to managing system automation – scaling systems up, down, on and off requires a reasonable grasp of PowerShell if you want to get the most out of your systems.

BI Analyst – Somewhere in between our Data Scientists, Developers and Consumers, we have our BI Analysts. Whereas previously they may have been expert cube users, building reports and dashboards for end users to consume using the BI systems provisioned, they now have far more autonomy. PowerBI and other reporting technologies, whilst being touted as the silver-bullet for all self-service reporting needs, can deliver far more when in the hands of an experienced reporting analyst. Essentially, the analyst still acts as the champion of these tools, pushing data exploitation and exposure, except they now have the ability to deliver far more powerful systems than before.

Data Consumer – The business user is, by far, the beneficiary of these tools and systems. With a flexible, scalable architecture defined, as well as different streams of data management and exploitation, there will be many benefits for those who need to gain insight and understanding from the company’s various data sources. The data models supporting self-service tools will benefit from faster performance and can include data sets that were previously size-prohibitive, giving the data consumers instant access to wider models. If the datasets are too complex or new to be exposed, they can contact specialists such as the BI Analyst or Data Scientist to investigate the data on their behalf. These manual-analysis tasks, because the architecture is built and designed to support them, should be more maintainable for a BI team to provide as an ongoing service.

Conclusions

In many ways, things aren’t changing that much – we still need this mix of people in our BI team in order to succeed. But as always, technology is moving on and people must be willing to move along with it. Many of the design patterns and techniques we’ve developed over the past years may no longer apply as new technology emerges to disrupt the status quo. These new technologies and approaches bring with them new challenges, but the lessons of the past are essential in making the most of the technology of the future.

The people who will be thriving in this new environment are those who are willing to challenge previous assumptions, and those who can see the new opportunities presented by the changing landscapes – both in terms of delivering value quicker and more efficiently, and finding value where previously there was none.