Lessons From IAM: Governing Data in the Cloud
What it means to apply IAM principles to the data cloud
Identity and Access to Data
Identity and access management (IAM) is the set of technology and processes that grant access to the right company assets, to the right people, at the right time, and for the right reason. In my twenty years of IAM experience, I have seen the full evolution from web single sign on in the early 2000s, to identity provisioning in 2004, identity governance and administration in 2005, and finally identity and access intelligence and automation driven by “identity fabrics” in 2019.
It is time for IAM concepts to be applied to the data cloud. At ALTR, we see a large trend of increased complexity, maintenance, and operating costs for ensuring people have access to the right data, for the right reasons, and at the right time. Applying IAM concepts to data can simplify this process and reduce your administrative burden.
Treat Data Access Controls Like IAM
Just as IAM platforms centrally manage identities and their access to applications, so should a central data governance and security system manage access to sensitive data. Sounds neat, right? Well, it's a bit more complicated than that. Just as Identity is moving towards a multi-cloud model, so is data. This means that data is distributed across multiple data clouds like Snowflake, AWS (Amazon Web Services) Redshift, and Google BiqQuery. This shift into a multi-data cloud architecture requires a platform that has the following characteristics:
- Simple – Simple to use by line of business line users. You do not shouldn’t necessarily need to be an experienced cybersecurity professional or data security engineer to set up, configure, and get value from the platform.
- Distributed (Snowflake, AWS, Google) – The platform must support ease of connectivity and integration to the major data cloud platforms.
- Controlled from a single platform and pane of glass – Centralized management but distributed control is key to enforce common governance policies across data cloud platforms.
- Intelligence is built in – Intelligence-driven data security should deliver insights which drive policy and automation.
- Performance as king – Maintaining an adequate level of data access performance while observing data access and protecting against a variety of threats such as a credentialed breach.
- Delivered as a service – The centralized but distributed data governance and security system must be delivered with zero code and zero on-premises footprint.
It is All About the Roles, Tags, and Grants
A cloud native data governance and security system uses a cloud service provider’s (AWS, GCP, Azure) IAM roles to grant privileges on data warehouses, schemas, and table rows and columns via policy tags. These grants based on IAM roles allow for proper user or application operations on sensitive data.
A data security strategy that combines a multi-level (warehouse, schema, table, rows, columns) approach in an easy to implement, scale, and manage strategy is the “north star” of any sensitive data protection program. Answering key questions on establishing this multi-level model and augmenting it with secure views and functions are key to ensuring a solid strategy against massive data exposure and exfiltration.
Identity Is No Good Without Context
Having a strategy to map your Identity model to your sensitive data is great, but now you need to think about context. This approach is the “dynamic” nature of responding to potential threats. To gain context, you need a broader view of identity, data sources, security controls, and what governance rules apply.
By connecting identity, governance, and security together, you can gain much more granular views into and control over how data is used.
End to End Data Protection Use Case
Let us look at an end-to-end use case. In this sample use case, we set up a data catalog service to discover data in Snowflake, classify sensitive data, and notify ALTR of sensitive data for consumption governance and protection. Here are the five simple steps to take for this use case.
- Discover data from the Snowflake warehouse, schema, and tables. Automatically look for and classify sensitive data. This data could be any PII (Personal Identifiable Information), PHI (Protected Health Information), or data deemed sensitive by regulatory requirements such as GDPR (General Data Protection Regulation) or CCPA (California Consumer Protection Act).
- Leverage In ALTR for , gaining data consumption intelligence based on the discovered data and consumption patterns from users and applications. With this intelligence, we will understand who is accessing sensitive data and why.
- After identifying consumption patterns, we can use ALTR to govern access to sensitive data. We then place limits on data consumption, protecting data against credentialed threats.
- The last step is to further protect sensitive data by replacing it with mapless and keyless tokens using ALTR. This approach allows for the utmost security by giving you a way to tokenize data without using complex key management systems and requirements that make cryptographic alternatives hard to maintain and scale.
This end-to-end use case can be scaled to multiple data cloud platforms to govern and protect sensitive data distributed across cloud data platforms. ALTR provides the central data governance and security control point to manage policy once and affect data across your organization, significantly reducing complexity and cost for data protection.
To learn more about how ALTR can help your business, check out the latest demo from ALTR CTO, James Beecham, here.