A data professional who delivers transformed data sets that have been tested, documented, and code-reviewed.
- In addition to delivering transformed data sets, these individuals apply version control and continuous integration to the analytics code base.
- On the data team, the Analytics Engineer typically takes on responsibilities that fall between those of a Data Engineer and a Data Analyst.
Massive amounts of data processed by complex systems and tools to decipher patterns or trends that provide business insights to an enterprise.
Business Intelligence (BI) Tools
Analysts and managers use BI tools to help inform business insights. This software collects and digests a large amount of structured and unstructured data from many sources, both internal and external.
- Examples: Tableau, Power BI, Oracle NetSuite, Qlik, SAS, Yellowfin BI.
- This software gives the user the ability to analyze, manage and combine broad sets of data, including ad-hoc analysis and querying, enterprise reporting, online analytical processing (OLAP), etc.
Chief Information Security Officer (CISO)
The Chief Information Security Officer is a senior executive responsible for aligning security with the organization’s business goals and for protecting the organization’s information technologies and assets.
Cloud Data Warehouse (CDW)
This is a SaaS, remote storage system where data is transmitted, stored, backed up, overseen, and made accessible to data users within an organization.
- With a Cloud Data Warehouse, users usually pay for storage based on consumption, typically billed monthly.
- Examples: Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse Analytics.
Cloud Data Security
Cloud data security is the practice of using a combination of applications, policies, and controls to protect data stored in the cloud from threats such as unauthorized access, leaks, theft, and exposure.
A system positioned between a client and a data center, web server, or SaaS application. It is a go-between for the client and server/application. It allows for secure access while protecting from various threats.
A form of data security that ensures that only specified users can access specific columns associated with their role or department.
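A minimal sketch of this idea, assuming a hypothetical mapping from roles to permitted columns. The role names, column names, and the `filter_columns` helper are illustrative only and do not reflect any particular product's API:

```python
# Hypothetical role-to-column permissions; all names are illustrative.
ALLOWED_COLUMNS = {
    "finance": {"customer_id", "card_number", "balance"},
    "marketing": {"customer_id", "email"},
}


def filter_columns(row: dict, role: str) -> dict:
    """Return only the columns the given role is permitted to see."""
    allowed = ALLOWED_COLUMNS.get(role, set())  # unknown roles see nothing
    return {col: val for col, val in row.items() if col in allowed}


row = {
    "customer_id": 7,
    "email": "a@example.com",
    "card_number": "4111111111111111",
    "balance": 120.5,
}
marketing_view = filter_columns(row, "marketing")
```

Here the marketing role receives only `customer_id` and `email`; the card number and balance never leave the filter.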
Credentialed Access Threat
A credentialed access threat (sometimes also called a “privileged access threat”) occurs when a bad actor utilizes authorized credentials to gain access to valuable information within an organization, creating significant damage to the enterprise.
The Data Analyst draws informed conclusions and uncovers useful, data-driven insights by inspecting, cleaning, transforming, and modeling data.
- The Data Analyst studies and interprets data to extract the business knowledge needed to resolve a targeted problem.
The Data Architect creates and sets policies for the storage and access of data, aligns data sources within an organization, integrates new technologies into the data system, and designs and manages the organization’s data systems.
- Although sometimes grouped with the Data Engineer or the Data Steward, the Data Architect has a distinct role: this data professional is responsible for creating and maintaining the overall data architecture for the organization.
Database Administrator (DBA)
The Database Administrator monitors data operations so that all systems perform as intended. This essential data persona provides operational support as needed.
- A Database Administrator is most often referred to as a “DBA.”
- The DBA is responsible for understanding and managing the overall database environment within the organization.
Database Management System (DBMS)
This is system software that defines, retrieves, manipulates, and updates data within databases.
- Four commonly cited types of DBMS are relational, hierarchical, network, and object-oriented.
A data breach occurs when data is intentionally or unintentionally accessed, viewed, obtained, or removed by an unapproved or malicious service, application, or data user.
A collection of applications that are combined and used to analyze and collect information on the enterprise level.
The Data Engineer builds data structures that collect, organize, and convert raw data into usable information for data scientists and analysts to interpret. Data engineers prepare data for additional analytical or operational use.
- Often the Data Engineer’s primary function is to construct data pipelines to unite information from various originating systems.
Data governance is the process of overseeing the integrity, security, usability, and availability of data within an enterprise so that business goals and regulatory requirements can be met.
The process of gathering, storing, and utilizing data in a secure and cost-efficient manner so that business decisions can be acted upon.
Data Mesh is a style of data platform design that can work with a large variety of data types by giving each business domain self-service ownership of its data.
- This approach follows domain-driven design, a flexible, scalable approach to software development in which the structure and language of the code match the corresponding business domain.
A rule or set of rules that indicates how employees are able to interact with data within the company.
The team within an organization that helps with decision-making and information procurement for the company, uncovering insights and information important to the organization’s growth. The Data Team follows and analyzes the progress of products and assets within the company.
- Most typically, the Data Team is made up of one or more of the following: Data Scientist, Data Engineer, and Data Analyst, though this varies from organization to organization.
An individual who accesses datasets for research, statistical, or business purposes.
Dynamic Data Masking (DDM)
The act of altering sensitive data so that it has little to no value to anyone who accesses it without authorization but remains usable and valuable to those with authorized access. Abbreviated as “DDM.”
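The masking behavior described above can be sketched as follows. This is an illustration under stated assumptions, not a production masking engine: the `mask_value` and `read_ssn` helpers and the "show the last four characters" rule are hypothetical choices.

```python
def mask_value(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)  # too short to reveal anything safely
    hidden = len(value) - visible
    return "*" * hidden + value[hidden:]


def read_ssn(ssn: str, authorized: bool) -> str:
    """Masking is applied at read time; the stored value is never changed."""
    return ssn if authorized else mask_value(ssn)


stored_ssn = "123-45-6789"
masked = read_ssn(stored_ssn, authorized=False)
clear = read_ssn(stored_ssn, authorized=True)
```

The "dynamic" part is that the same stored value yields different results depending on who is reading it.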
This team manages and implements governance concerns and objectives, designs the governance structure, selects and adopts the new technologies the organization needs, and puts in place measurements, protocols, and controls for sensitive data sets.
- The Governance Team allows enterprises to remain agile and compliant with ever-changing legislation while upholding the necessary thresholds of data governance and control.
A set of data that provides information about various other data.
Modern Data Stack
The modern data stack is a collection of tools designed to help organizations and enterprises become more efficient, save money, and become more data-driven.
Payment Card Industry compliance (PCI)
These requirements ensure that any organization that processes, stores, or transmits credit card information maintains a secure environment.
Protected Health Information (PHI)
This is information pertaining to any area of healthcare, from care to prescriptions to payment, that can be associated with a specific person.
Personally Identifiable Information (PII)
This is any information that can be tied to an identifiable individual.
- Address, social security number, and birth date are all examples of PII.
Role-based Access Control (RBAC)
RBAC is a means of computer security which restricts access to a network based on the person's role within the organization.
- One of the major functions of RBAC is to define who has access to what data, and when, within the organization.
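A minimal sketch of the RBAC idea, assuming hypothetical users, roles, and actions. Permissions attach to roles rather than to individual users, so changing a user's role changes what they can do:

```python
# Hypothetical roles and the actions each role may perform.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "engineer": {"read", "write"},
    "admin": {"read", "write", "grant"},
}

# Hypothetical user-to-role assignments.
USER_ROLES = {"dana": "analyst", "eli": "engineer"}


def is_allowed(user: str, action: str) -> bool:
    """Allow an action only if the user's role grants it (deny by default)."""
    role = USER_ROLES.get(user)
    return action in ROLE_PERMISSIONS.get(role, set())
```

For example, `is_allowed("dana", "write")` is `False` because the analyst role grants only `read`, and a user with no assigned role is denied everything.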
With Row-Level Policy, users have access to a table without having access to all rows in that table. Database Administrators define policies to control how sensitive data is displayed and operated on based on role, and Row-Level Policy makes the governance and security of that data more granular.
- Row-Level Policy allows administrators to filter access for their users based on specific attributes of the role under which each user is categorized within the organization.
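The row filtering described above can be sketched as follows, assuming a hypothetical `User` type with a `region` attribute and a small in-memory table. In a real warehouse the policy would be enforced by the database itself; this only illustrates the logic:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    role: str
    region: str


# Hypothetical table; every user may query it, but not every row.
SALES = [
    {"order_id": 1, "region": "EMEA", "amount": 100},
    {"order_id": 2, "region": "APAC", "amount": 250},
    {"order_id": 3, "region": "EMEA", "amount": 75},
]


def visible_rows(user: User, rows: list) -> list:
    """Admins see every row; other users see only rows for their own region."""
    if user.role == "admin":
        return list(rows)
    return [row for row in rows if row["region"] == user.region]


emea_analyst = User(name="dana", role="analyst", region="EMEA")
emea_rows = visible_rows(emea_analyst, SALES)
```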
A team of professionals such as a CISO (Chief Information Security Officer), Security Engineer, Security Manager, and more, who work together to effectively analyze, manage, and execute tasks and technology that keep an organization’s data and systems protected.
Data or data sets that are determined to need greater protection and require carefully controlled access.
Software as a Service (SaaS)
SaaS is a means of delivering and licensing software in which users access it online through a subscription or membership rather than installing it on a local machine.
- ALTR’s free platform is an example of a modern SaaS product.
SnowSQL
SnowSQL is the command-line client used to connect to Snowflake and run SQL queries.
Structured Query Language (SQL)
Structured Query Language is a domain-specific programming language. Its purpose is to manage data within a relational database management system (RDBMS) or, in a relational data stream management system (RDSMS), to execute stream processing.
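As a small illustration of SQL managing data in a relational database, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table and its contents are made up for the example:

```python
import sqlite3

# An in-memory SQLite database stands in for a relational database.
conn = sqlite3.connect(":memory:")

# Define a table, insert rows, then query them, all through SQL statements.
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EMEA", 100.0), ("APAC", 250.0), ("EMEA", 75.0)],
)
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
conn.close()
```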
This form of governance policy enables the user to control access to multiple columns simultaneously within a policy. Governance on tagged columns is enforced automatically, eliminating the need to update policies as newly tagged data is introduced, saving valuable time for the data team.
- Where a specific tag is applied to a user or role within an organization, the associated governance policy will deploy automatically without the hands-on user needing to write or re-write access policy.
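A minimal sketch of a tag-based policy, assuming hypothetical column tags and a policy keyed by tag rather than by column. The names `COLUMN_TAGS`, `TAG_POLICIES`, and `can_read` are illustrative only:

```python
# Hypothetical tags attached to columns.
COLUMN_TAGS = {
    "email": {"pii"},
    "ssn": {"pii"},
    "order_total": set(),
}

# Policies are written once per tag: tag -> roles allowed to read.
TAG_POLICIES = {"pii": {"compliance"}}


def can_read(column: str, role: str) -> bool:
    """A role may read a column only if every tag on it permits that role.

    Untagged (or unknown) columns are readable; a tag with no policy denies all.
    """
    tags = COLUMN_TAGS.get(column, set())
    return all(role in TAG_POLICIES.get(tag, set()) for tag in tags)


# Tagging a new column brings it under the existing policy automatically;
# TAG_POLICIES itself is not edited.
COLUMN_TAGS["phone"] = {"pii"}
```

After the last line, `can_read("phone", "analyst")` is `False` even though no policy was written for `phone` specifically.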
Tokenization is the process of replacing actual sensitive data elements with random and non-sensitive data elements (tokens) that have no exploitable value, allowing for a heightened level of data security during the data migration process.
- Because tokens have no mathematical relationship to the original values, they cannot be broken down, deciphered, or decrypted without access to the tokenization system. This makes Tokenization a valuable tool for any organization handling highly sensitive data (credit card data, PHI, PII, etc.) that is moving from one database to another (typically on-prem to cloud-based).
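A minimal sketch of vault-style tokenization, assuming a hypothetical in-memory `TokenVault` class. A real system would keep the mapping in a hardened, access-controlled store; this only shows that the token is random and that the original value is recoverable solely through the vault:

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to original values."""

    def __init__(self) -> None:
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a random token for the value, reusing any existing token."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        # The token is random, so it carries no information about the value.
        token = "tok_" + secrets.token_hex(16)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look the original value up; raises KeyError for unknown tokens."""
        return self._token_to_value[token]


vault = TokenVault()
card_number = "4111111111111111"
token = vault.tokenize(card_number)
```

Only the token needs to travel with the migrated data; the card number stays behind in the vault.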