Blending Disciplines: Microsoft's Bold Solution for the Data Responsibility Crisis

Have you ever wondered how multinationals manage the massive amounts of data they generate daily? Mostly, they don't, at least not by design or by technical means. It's a colossal task that involves not only managing and protecting data but also deploying IT in line with the privacy and data laws of several countries. Enter Microsoft Purview, a suite of tools designed to ease this burden by acting as a central portal for data classification, protection, and risk management.

The Meaning of “Purview”

In the Oxford Learner's Dictionaries, 'purview' is defined as the limits of what a person or organization is responsible for, or what is dealt with by a document or law. At Microsoft, we chose this name for our suite of tools to signify our commitment to enabling organizations to manage their data responsibilities efficiently.

The Intersection of Data Governance, Security, and Privacy

In a world where data breaches and privacy violations regularly make headlines, it's important to understand the intricate connections between data governance, security, and privacy. Although these challenges revolve around responsible data management, they each stem from distinct disciplines.

Data Governance: Organizing the Data Deluge

The era of big data saw companies and governments accumulate massive amounts of information, creating an urgent need for proper data organization. When people use the term ‘data governance’, they mostly refer to organizing data at the infrastructure and platform levels, such as with Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) offerings in Azure. Data scientists, engineers, and server specialists use these cloud-based computers to prepare data for business intelligence, research, and app or product development. However, over the last two decades data security was often overlooked in that process, leading to numerous breaches and an increasing demand for protecting the (unstructured) data lakes and (structured) data warehouses that feed into these processes.

Data Security: Safeguarding Our Digital Assets

Initially, data security efforts revolved around identity and access management to ensure that only authorized users could access internal data sources. But securing IaaS and PaaS systems with firewalls and ‘demilitarized zones’ wasn't enough. Even with these security measures in place, humans still make lots of mistakes with data. Information was and is often (over)shared by employees using popular Software as a Service (SaaS) products like Slack, Microsoft Word, Google Docs, Microsoft Teams, Dropbox, email, and so on that can be accessed from anywhere on any device. This gives us near-infinite vectors for malicious and unintentional data leaks.

To address this issue, security experts implemented Virtual Private Networks (VPNs) to create secure tunnels for data transmission over the internet. They also maintained enterprise versions of these productivity-boosting SaaS apps so that access for work purposes can be monitored, reducing the likelihood of malicious or accidental data sharing when working from home or a coffee shop. But even if these methods improve our ability to ringfence our organizational capital, there remains the issue of how we use (not just share) data appropriately and protect our digital rights…

Data Privacy and the Concept of Information Protection

The right to privacy extends beyond securing data as an asset. Society demands that personal data in files, chats, and documents not be used or shared without consent, or without a reasonable expectation on the part of the data subject named in the file or the data owner who created it. But how can an organization detect wrongful data sharing if it can't identify where this data lives or how many people it has been shared with?

This challenge led Microsoft to develop an evolving data classification engine called Microsoft Purview Information Protection. This technology leverages a combination of information rights management, pattern recognition to identify a sensitive group of characters, and the capability to tag files with properties indicating their confidential or sensitive nature.

Whether a file tagged as sensitive is stored in SharePoint (data at rest), a message containing someone's social security number is being shared via Teams (data in motion), or a list of hundreds of credit card numbers is being edited in an Excel spreadsheet (data in use), the content is flagged as sensitive no matter where it lives in the organizational IT environment.
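To make the pattern-recognition idea concrete, here is a minimal sketch in Python. It is not how Purview is implemented. The regular expressions, the Luhn checksum step, and the names `classify` and `sensitivity_tag` are my own assumptions for illustration: find a sensitive group of characters, then attach a tag to the content.

```python
import re

# Hypothetical patterns for illustration only; these are NOT Purview's
# actual sensitive information type definitions.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def classify(text: str) -> set[str]:
    """Return the set of (assumed) sensitive info types found in `text`."""
    found = set()
    if SSN_PATTERN.search(text):
        found.add("ssn")
    # A checksum cuts down false positives from arbitrary digit runs.
    if any(luhn_valid(m.group()) for m in CARD_PATTERN.finditer(text)):
        found.add("credit_card")
    return found


def sensitivity_tag(text: str) -> str:
    """Map detections to a hypothetical tag stored as a file property."""
    return "Sensitive" if classify(text) else "General"
```

The same `classify` call could run on a stored file, a chat message, or a spreadsheet cell, which is the point of the paragraph above: the check follows the content, not the location.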

This has been a major step-change in data security and privacy: if a computer can recognize and classify a text or file without human intervention, it can automatically restrict access to it and block the file from being shared or saved on a personal cloud storage app. Brought together, this is Microsoft's approach to information protection, and it certainly helps when you have billions of files and texts being created and shared in organizations every day.
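Once content carries a tag, enforcement can be automated. The sketch below shows that second step under stated assumptions: the label names, the destination names, and the `evaluate` function are hypothetical and do not reflect Purview's actual policy model.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    GENERAL = "general"
    SENSITIVE = "sensitive"


class Action(Enum):
    ALLOW = "allow"
    BLOCK = "block"


# Assumed set of organization-managed locations (illustrative only).
MANAGED_DESTINATIONS = frozenset({"sharepoint", "teams"})


@dataclass(frozen=True)
class ShareRequest:
    file_label: Label
    destination: str  # e.g. "sharepoint" or "personal_cloud"


def evaluate(request: ShareRequest) -> Action:
    """Block sensitive content from leaving managed locations."""
    leaving_org = request.destination not in MANAGED_DESTINATIONS
    if request.file_label is Label.SENSITIVE and leaving_org:
        return Action.BLOCK
    return Action.ALLOW
```

For example, `evaluate(ShareRequest(Label.SENSITIVE, "personal_cloud"))` returns `Action.BLOCK`, while the same file shared to `"sharepoint"` is allowed. No person has to review the request, which is what makes this workable at the scale of billions of files.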

Bridging the Disciplines

Before my time working at the company, innovators like Satya Nadella and Brad Smith at Microsoft saw potential in merging the concepts I've explored here: identity and access management, secure data sharing, and data classification to protect sensitive information. These tools were built in response to regulations like the General Data Protection Regulation (GDPR), a landmark data privacy law in the European Union. Leadership very soon realized we could abstract these integration efforts away from customers and make their data security, privacy, and governance journey far more seamless.

As I write in 2023, we've evolved these technologies into a comprehensive solution set that gives organizations and employees unparalleled control over their data, and helps them treat it with respect. Organizations can roll out the latest of these capabilities across their computing environment as easily as downloading the latest version of an iPhone app.

We call this 'cloud-native' or 'built-in' data security and privacy, and the technologies remain customizable to the purpose of each organization. Still, the governance part of the equation is a people and process challenge rather than only a technology challenge. For many years we've been asking each other 'What should you do with data?', not just 'What can you do with data?'. The difference today is not in the choices to be made; it's that these capabilities have made it possible to act on them.


Thanks to Jan Willem Roks (Technical Specialist at Microsoft) for reviewing my blog. He suggested I do a "How to get started with Purview" blog. That's a great idea, I'll get to it someday I hope. Consider the images Microsoft's, but the opinions my own.