Show Menu
Cheatography

Snowflake SnowPro Core Cheat Sheet (DRAFT) by

Cheatsheet about snowflake

This is a draft cheat sheet. It is a work in progress and is not finished yet.

What is Snowflake?

Snowflake is a Data Solution provided as Softwa­re-­as-­a-S­ervice (SaaS). It’s not available on-premise. It combines a new SQL query engine with an innovative archit­ecture natively designed for the cloud.

Use cases

Snowflake is optimal as Data Warehouse, Data Lake, Data Exchange, Data Apps, Data Science, and Data Engine­ering.

Snowflake Use Cases

Capacity Options

On-Demand
Fixed rate for the consumed services
Pre-paid
Cheaper, but commitment to Snowflake

Micro-­par­titions

All data in Snowflake tables are automa­tically divided into micro-­par­tit­ions, contiguous units of storage between 50 and 500MB of uncomp­ressed data, organized in a columnar way.

They are immutable, meaning they cannot be changed once created.

Pruning process

Technique to analyze the smallest number of micro-­par­titions to solve a query. It retrieves all the necessary data to give a solution without looking at all the micro-­par­tit­ions, saving a lot of time to return the result. You can find a real example here

Load Data

Bulk Load
Loading batches of data from files already available at any stage into Snowflake tables
Continuous Load
Load small volumes of data (micro­-ba­tches) and increm­entally make them available for analysis.

Other Concepts

Parnet Connect
Technology & Solution partners
Compliance
HITRUST / HIPAA, ISO/IEC 27001, FedRAMP Moderate, PCI-DSS, etc
Data Market­place
For providers to buy or sell their datasets. Free, Person­alized, and Paid Listings
Column Level Security
Dynamic Data Masking & External Tokeni­zation

Sequences

Use case
Generate unique numbers across sessions and statements
nextval
Function to generate a set of distinct values

File Formats

Structured Data
CSV. Fastest way to load data.
Semi-s­tru­ctured Data
JSON, Parquet, XML, Avro, ORC
FLATTEN
Convert semi-s­tru­ctured data to a relational repres­ent­ation

Access Management Approaches

Discre­tionary Access Control (DAC)
Each object has an owner who can, in turn, grant access to that object
Role-Based Access Control (RBAC)
Access privileges are assigned to roles, which are, in turn, given to users
 

Editions

Standard
Enterprise
Business Critical
Virtual Private Snowflake

Cloud Providers

Amazon Web Services
Azure
Google Cloud Platform

Connect to Snowflake

Web Interface
SnowSQL (CLI Client)
ODBC
JDBC
SDK for Node, Python, Kafka, Go, and more!

Snowflake Objects

Account
Must be unique.
Warehouse
Virutal Machine to execute queries. Compute Part.
Database
Logical Collection of Schemas.
Schema
Logical Collection of Objects. The Public schema and the Inform­ati­on_­Schema are created when creating a Database.

Snowflake Objects

Types of tables

Permanent
Transient
Temporary
External

Types of views

Regular
Materi­alized
Secure

Stage Metadata

METADA­TA$­FIL­ENAME
Name of the staged data file the current row belongs to.
METADA­TA$­FIL­E_R­OW_­NUMBER
Row number for each record in the container staged data file.

Fail-Safe

Use Cases
It ensures historical data is protected in the event of a system failure or other catast­rophic event
Retention Period
NON-CO­NFI­GURABLE 7-day period
Example
No, you cannot recover this data alone; you MUST ask Snowflake support
Note: Fail-Safe requires additional storage, which will be reflected in your monthly storage charges