Where to Activate Tokenization on Your Path to the Data Cloud

3 risk-based models for tokenizing sensitive data in your cloud data warehouse

Published on Nov 10, 2021

In my last post, I talked about why we think tokenization is the ideal solution for data security in the cloud. But where do you start tokenizing data? I see three main models for tokenization on your cloud data journey. The best choice for your company will depend on the risks you’re facing.

Level 1: Tokenization just before moving to the cloud data warehouse

If you’re concerned about protecting sensitive data in your destination database, often a cloud data warehouse like Snowflake, you’re not alone. While you may feel confident in the security measures in place in your own datacenter, storing sensitive data in the cloud is a different ballgame.

The first issue might be that you’re consolidating sensitive data that was spread across multiple databases, each with its own siloed security and login requirements, into one central repository. A bad actor, dishonest employee or hacker then needs to get into only one location to reach all of your sensitive data, which makes the risk much bigger. This leads to the second issue: as more and more data is stored in prominent cloud data warehouses, they have become a top target for bad actors and nation states. Why should they target Salesforce or Workday separately when all the same data can be found in one place? The third concern might be privileged access: Snowflake employees or your own Snowflake admins could, but really shouldn’t, have access to the sensitive data in your offsite database.

In these cases, it makes sense for you to choose “Level 1 Tokenization”: tokenize data just before it goes into the cloud. Because the cloud database stores only tokens, you ensure that only the people you authorize have access to the plaintext data.
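To make Level 1 concrete, here is a minimal sketch of tokenizing sensitive columns just before a load. Everything in it is an assumption for illustration: the `TokenVault` class, the `tokenize_row` helper and the column names are hypothetical, and an in-memory dictionary stands in for a real, secured token vault. It is not ALTR's implementation.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault; a real vault would be a secured service."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so one value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, carries no information
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str, authorized: bool) -> str:
        # Unauthorized callers only ever see the token itself.
        if not authorized:
            return token
        return self._token_to_value[token]


SENSITIVE_COLUMNS = {"ssn", "card_number"}  # assumed column names


def tokenize_row(row: dict, vault: TokenVault) -> dict:
    """Replace sensitive fields with tokens before the row leaves for the cloud."""
    return {
        column: vault.tokenize(value) if column in SENSITIVE_COLUMNS else value
        for column, value in row.items()
    }


vault = TokenVault()
row = {"name": "Ada", "ssn": "123-45-6789"}
cloud_row = tokenize_row(row, vault)  # this is what the warehouse would store
```

In this sketch the warehouse only ever receives `cloud_row`, so a warehouse admin who queries the table sees tokens, while an authorized caller can still recover the original value through the vault.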

Level 2: Tokenization before moving through the ETL Process

As you’re planning your path to the cloud, you may be concerned about data as soon as it leaves the secure walls of your datacenter. This is especially challenging for CISOs who’ve spent years hardening the security of their perimeter only to have control wrested away as sensitive data is moved to cloud data warehouses they don’t own. If you’re working with an outside ETL (extract, transform, load) provider to help you prepare, combine, and move your data, that provider is the first step outside your perimeter and the first cause for concern. Even though you hired them, without years of built-up trust, you may not want them to have access to sensitive data. Or it may be out of your hands: you may have agreements or contracts with your customers specifying that you can’t let any vendor or other entity have access without written consent.

In this case, “Level 2 Tokenization” is appropriate. This takes one step back in the data transfer path and tokenizes sensitive data before it even reaches the ETL provider. Instead of connecting directly to the source database, the provider connects through the tokenization software, which returns tokens in place of the sensitive values. ALTR partners with SaaS-based ETL providers like Matillion to make this seamless for enterprises.
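The Level 2 idea of reading through the tokenization layer rather than straight from the source can be sketched as follows. This is an illustration under stated assumptions, not a description of ALTR's or Matillion's software: `TokenizingReader` is a hypothetical class, SQLite stands in for the source database, and the token map is kept in memory.

```python
import secrets
import sqlite3


class TokenizingReader:
    """Hypothetical stand-in for tokenization software in front of a source DB."""

    def __init__(self, conn, sensitive_columns):
        self._conn = conn
        self._sensitive = set(sensitive_columns)
        self._tokens = {}  # plaintext -> token; a real vault would live elsewhere

    def _tokenize(self, value):
        if value not in self._tokens:
            self._tokens[value] = "tok_" + secrets.token_hex(8)
        return self._tokens[value]

    def fetch(self, query):
        """Run the query and yield rows with sensitive columns tokenized."""
        cursor = self._conn.execute(query)
        columns = [description[0] for description in cursor.description]
        for row in cursor:
            yield {
                column: self._tokenize(value) if column in self._sensitive else value
                for column, value in zip(columns, row)
            }


# Source database inside the perimeter (SQLite used purely for the demo).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, email TEXT)")
source.execute("INSERT INTO customers VALUES (1, 'ada@example.com')")

# The ETL provider is handed the reader, never the raw connection.
reader = TokenizingReader(source, sensitive_columns=["email"])
etl_rows = list(reader.fetch("SELECT id, email FROM customers"))
```

The design point is simply where the boundary sits: the ETL side receives `etl_rows`, which already contain tokens, so plaintext never crosses to the vendor.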

Level 3: Full end-to-end tokenization

If you’re a very large financial institution classified as a “critical vendor” by the US government, you’re familiar with the rigorous security required. This includes ensuring that ultra-sensitive financial data is highly protected: no unauthorized users, inside or outside the enterprise, can have access to that data, no matter where it is. You already have this nailed down on-prem, but we’re living in the 21st century and everyone from marketing to operations is saying “you have to go to the cloud.” In this case, you’ll need “Level 3 Tokenization”: full end-to-end tokenization, from all your onsite databases all the way through to your cloud data warehouse.

As you can imagine, this can be a complex task. It requires tokenization across multiple on-premises systems before even starting the data transfer process. The upside is that it can also shine a light on who’s accessing your data, wherever it is. People throughout the company who relied on sensitive data to do their jobs will quickly let you know when the next report they run returns nothing but tokens. That becomes a benefit: it surfaces and stops “dark access” to sensitive data.

Our customers found this so valuable that ALTR created a tool called “the observer.” They can hook it up to a source database to get a report of every connection, every user, every IP address and every query run against that database. The process forces companies to understand the full lifecycle of their data, where it’s being used and by whom, and helps ensure sensitive data is truly secure.
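As a rough illustration of the kind of report described above, the sketch below rolls a list of access events up by user, IP address and query. The `AccessEvent` record and `summarize` function are hypothetical names invented for this example; the post does not describe how the observer captures or formats its data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """One observed query against the source database (assumed shape)."""
    user: str
    ip: str
    query: str


def summarize(events):
    """Aggregate observed events into a simple access report."""
    return {
        "connections": len(events),
        "by_user": Counter(event.user for event in events),
        "by_ip": Counter(event.ip for event in events),
        "queries": sorted({event.query for event in events}),
    }


events = [
    AccessEvent("reporting_svc", "10.0.0.5", "SELECT ssn FROM customers"),
    AccessEvent("reporting_svc", "10.0.0.5", "SELECT ssn FROM customers"),
    AccessEvent("jsmith", "10.0.0.9", "SELECT email FROM customers"),
]
report = summarize(events)
```

A report like this is what lets a team spot an unexpected user or address reading a sensitive column before tokenization is switched on.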

The power of ALTR’s tokenization platform throughout your cloud data path

ALTR’s tokenization platform provides unique data security benefits across your entire path to the cloud. Our SaaS-based approach means we can cover data wherever it’s located: on-premises, in the cloud or even in other SaaS-based software like Salesforce. It also lets us deliver innovations like new token formats or new security features more quickly, with no upgrade needed on your side. Our tokenization solutions range from the most fundamental level all the way up to PCI Level 1 compliant, allowing companies to choose the best balance of speed, security and cost for their business. We’ve also invested heavily in IP that enables our database driver to connect transparently and keep data usable while tokenized. The drivers can, for example, perform the lookups and joins needed to keep applications that were not built with tokenization in mind running.
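One general reason tokenized data can stay usable for lookups and joins is that deterministic tokens preserve equality: the same input always produces the same token. The sketch below shows that property with a keyed hash (HMAC-SHA256). This is a common technique chosen for illustration, not a description of how ALTR's drivers work; `det_token` and the demo key are assumptions, and a keyed hash on its own is not reversible the way a vault-backed token is.

```python
import hashlib
import hmac

KEY = b"demo-key-not-for-production"  # assumption: a real key would be managed securely


def det_token(value: str) -> str:
    """Deterministic token: equal inputs give equal tokens, so joins still match."""
    digest = hmac.new(KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


# Two tables that were tokenized independently but with the same key.
orders = [
    {"customer": det_token("ada@example.com"), "total": 40},
    {"customer": det_token("bob@example.com"), "total": 15},
]
profiles = [
    {"customer": det_token("ada@example.com"), "tier": "gold"},
]

# Join on the tokenized column without ever seeing the email address.
tier_by_customer = {profile["customer"]: profile["tier"] for profile in profiles}
joined = [
    {**order, "tier": tier_by_customer[order["customer"]]}
    for order in orders
    if order["customer"] in tier_by_customer
]
```

Here `joined` contains the one order whose tokenized customer matches a profile, which is the behavior an application needs to keep running on tokenized columns.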

With tokenization from ALTR, you can bring sensitive data safely into the cloud to get full analytic value from it, while meeting contractual security requirements or the steepest regulatory challenges.
