Created
September 6, 2025 17:23
-
-
Save shinyaa31/e6efbcba3937f38b07c23081c895b3f5 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| /*************************************************************************************************** | |
| Asset: Zero to Snowflake - Getting Started with Snowflake | |
| Version: v1 | |
| Copyright(c): 2025 Snowflake Inc. All rights reserved. | |
| **************************************************************************************************** | |
| Getting Started with Snowflake | |
| 1. Virtual Warehouses & Settings | |
| 2. Using Persisted Query Results | |
| 3. Basic Data Transformation Techniques | |
| 4. Data Recovery with UNDROP | |
| 5. Resource Monitors | |
| 6. Budgets | |
| 7. Universal Search | |
| ****************************************************************************************************/ | |
| -- Before we start, run this query to set the session query tag. | |
| ALTER SESSION SET query_tag = '{"origin":"sf_sit-is","name":"tb_zts","version":{"major":1, "minor":1},"attributes":{"is_quickstart":1, "source":"tastybytes", "vignette": "getting_started_with_snowflake"}}'; | |
| -- We'll begin by setting our Worksheet context. We will set our database, schema and role. | |
| USE DATABASE tb_101; | |
| USE ROLE accountadmin; | |
| /* 1. Virtual Warehouses & Settings | |
| ************************************************************** | |
| User-Guide: | |
| https://docs.snowflake.com/en/user-guide/warehouses-overview | |
| ************************************************************** | |
| Virtual Warehouses are the dynamic, scalable, and cost-effective computing power that lets you | |
| perform analysis on your Snowflake data. Their purpose is to handle all your data processing needs | |
| without you having to worry about the underlying technical details. | |
| Warehouse parameters: | |
| > WAREHOUSE_SIZE: | |
| Size specifies the amount of compute resources available per cluster | |
| in a warehouse. The available sizes range from X-Small to 6X-Large. | |
| Default: 'XSmall' | |
| > WAREHOUSE_TYPE: | |
| Defines the type of virtual warehouse, which dictates its architecture and behavior | |
| Types: | |
| 'STANDARD' for general purpose workloads | |
| 'SNOWPARK_OPTIMIZED' for memory-intensive workloads | |
| Default: 'STANDARD' | |
| > AUTO_SUSPEND: | |
| Specifies the period of inactivity after which the warehouse will automatically suspend itself. | |
| Default: 600s | |
| > INITIALLY_SUSPENDED: | |
| Determines whether the warehouse starts in a suspended state immediately after it is created. | |
| Default: TRUE | |
| > AUTO_RESUME: | |
| Determines whether the warehouse automatically resumes from a suspended state when a query is directed to it. | |
| Default: TRUE | |
| With that, let's create our first warehouse! | |
| */ | |
| -- Let's first look at the warehouses that already exist on our account that you have access privileges for | |
| SHOW WAREHOUSES; | |
| /* | |
| This returns the list of warehouses and their attributes: name, state (running or suspended), type, and size | |
| among many others. | |
| We can also view and manage all warehouses in Snowsight. To access the warehouse page, click the Admin button | |
| on the Navigation Menu, then click the 'Warehouses' link in the now expanded Admin category. | |
| Back on the warehouse page, we see a list of the warehouses on this account and their attributes. | |
| */ | |
| -- You can easily create a warehouse with a simple SQL command | |
| CREATE OR REPLACE WAREHOUSE my_wh | |
| COMMENT = 'My TastyBytes warehouse' | |
| WAREHOUSE_TYPE = 'standard' | |
| WAREHOUSE_SIZE = 'xsmall' | |
| MIN_CLUSTER_COUNT = 1 | |
| MAX_CLUSTER_COUNT = 2 | |
| SCALING_POLICY = 'standard' | |
| AUTO_SUSPEND = 60 | |
| INITIALLY_SUSPENDED = true | |
| AUTO_RESUME = false; | |
| /* | |
| Now that we have a warehouse, we should specify that this Worksheet uses this warehouse. We can | |
| do this either with a SQL command, or in the UI. | |
| */ | |
| -- Use the warehouse | |
| USE WAREHOUSE my_wh; | |
| /* | |
| We can try running a simple query, however, you will see an error message in the results pane, informing | |
| us that our warehouse MY_WH is suspended. Try it now. | |
| */ | |
| SELECT * FROM raw_pos.truck_details; | |
| /* | |
| An active warehouse is required for running queries, as well as all DML operations, so we'll | |
| need to resume our warehouse if we want to get insights from our data. | |
| The error message also came with a suggestion to run the SQL command: | |
| 'ALTER warehouse MY_WH resume'. Let's do just that! | |
| */ | |
| ALTER WAREHOUSE my_wh RESUME; | |
| /* | |
| We'll also set AUTO_RESUME to TRUE so we can avoid having to manually | |
| resume the warehouse should it suspend again. | |
| */ | |
| ALTER WAREHOUSE my_wh SET AUTO_RESUME = TRUE; | |
| --The warehouse is now running, so lets try to run the query from before | |
| SELECT * FROM raw_pos.truck_details; | |
| -- Now we are able to start running queries on our data | |
| /* | |
| Next, let's take a look at the power of warehouse scalability in Snowflake. | |
| Warehouses in Snowflake are designed for scalability and elasticity, giving you the power | |
| to adjust compute resources up or down based on workload needs. | |
| We can easily scale up our warehouses on the fly with a simple ALTER WAREHOUSE statement. | |
| */ | |
| ALTER WAREHOUSE my_wh SET warehouse_size = 'XLarge'; | |
| --Let's now take a look at the sales per truck. | |
| SELECT | |
| o.truck_brand_name, | |
| COUNT(DISTINCT o.order_id) AS order_count, | |
| SUM(o.price) AS total_sales | |
| FROM analytics.orders_v o | |
| GROUP BY o.truck_brand_name | |
| ORDER BY total_sales DESC; | |
| /* | |
| With the Results panel open, take a quick look at the toolbar in the top right. Here we see options to search, | |
| select columns, view query details and duration stats, view column stats, and download results. | |
| Search - Use search terms to filter the results | |
| Column selection - Enable/disable columns to display in the results | |
| Query details - Contains information related to the query like the SQL text, rows returned, query ID, the | |
| role and warehouse it was executed with. | |
| Query duration - Breaks down how long it took for the query to run by compilation, provisioning and execution times. | |
| Column stats - Displays data relating to the distributions of the columns on the results panel. | |
| Download results - Export and download the results as csv. | |
| */ | |
| /* 2. Using Persisted Query Results | |
| ******************************************************************* | |
| User-Guide: | |
| https://docs.snowflake.com/en/user-guide/querying-persisted-results | |
| ******************************************************************* | |
| Before we proceed, this is a great place to demonstrate another powerful feature in Snowflake: | |
| the Query Result Cache. | |
| When we first ran the above query it took several seconds to complete, even with an XL warehouse. | |
| Run the same 'sales per truck' query above and note the total run time in the Query Duration pane. | |
| You'll notice that it took several seconds the first time you ran it to only a few hundred milliseconds | |
| the next time. This is the query result cache in action. | |
| Open the Query History panel and compare the run times between the first time the query was run for the second time. | |
| Query Result Cache overview: | |
| - Results are retained for any query for 24 hours, however the timer is reset any time the query is executed. | |
| - Hitting the result cache requires almost no compute resources, ideal for frequently run reports or dashboards | |
| and managing credit consumption. | |
| - The cache resides in the Cloud Services Layer, meaning it is logically separated from individual warehouses. | |
| This makes it globally accessible to all virtual warehouses & users within the same account. | |
| */ | |
| -- We'll now start working with a smaller dataset, so we can scale the warehouse back down | |
| ALTER WAREHOUSE my_wh SET warehouse_size = 'XSmall'; | |
| /* 3. Basic Transformation Techniques | |
| Now that our warehouse is configured and running, the plan is to get an understanding of the distribution | |
| of our trucks' manufacturers, however, this information is embedded in another column 'truck_build' that stores | |
| information about the year, make and model in a VARIANT data type. | |
| VARIANT data types are examples of semi-structured data. They can store any type of data including OBJECT, | |
| ARRAY and other VARIANT values. In our case, the truck_build stores a single OBJECT which contains three distinct | |
| VARCHAR values for year, make and model. | |
| We'll now isolate all three properties into their own respective columns to allow for simpler and easier analytics. | |
| */ | |
| SELECT truck_build FROM raw_pos.truck_details; | |
| /* Zero Copy Cloning | |
| The truck_build column data consistently follows the same format. We'll need a separate column for 'make' | |
| to more easily perform quality analysis on it. The plan is to create a development copy of the truck table, add new columns | |
| for year, make, and model, then extract and store each property from the truck build VARIANT object into these new columns. | |
| Snowflake's powerful Zero Copy Cloning lets us create identical, fully functional and separate copies of database | |
| objects instantly, without using any additional storage space. | |
| Zero Copy Cloning leverages Snowflake's unique micropartition architecture to share data between the cloned object and original copy. | |
| Any changes to either table will result in new micropartitions created for only the modified data. These new micro-partitions are | |
| now owned exclusively by the owner, whether its the clone or original cloned object. Basically, any changes made to one table, | |
| will not be to either the original or cloned copy. | |
| */ | |
| -- Create the truck_dev table as a Zero Copy clone of the truck table | |
| CREATE OR REPLACE TABLE raw_pos.truck_dev CLONE raw_pos.truck_details; | |
| -- Verify successful truck table clone into truck_dev | |
| SELECT TOP 15 * | |
| FROM raw_pos.truck_dev | |
| ORDER BY truck_id; | |
| /* | |
| Now that we have a development copy of the truck table, we can start by adding the new columns. | |
| Note: To run all three statements at once, select them and click the blue 'Run' button at the top right of the screen, or use your keyboard. | |
| Mac: command + return | |
| Windows: Ctrl + Enter | |
| */ | |
| ALTER TABLE raw_pos.truck_dev ADD COLUMN IF NOT EXISTS year NUMBER; | |
| ALTER TABLE raw_pos.truck_dev ADD COLUMN IF NOT EXISTS make VARCHAR(255); | |
| ALTER TABLE raw_pos.truck_dev ADD COLUMN IF NOT EXISTS model VARCHAR(255); | |
| /* | |
| Now let's update the new columns with the data extracted from the truck_build column. | |
| We will use the colon (:) operator to access the value of each key in the truck_build | |
| column, then set that value to its respective column. | |
| */ | |
| UPDATE raw_pos.truck_dev | |
| SET | |
| year = truck_build:year::NUMBER, | |
| make = truck_build:make::VARCHAR, | |
| model = truck_build:model::VARCHAR; | |
| -- Verify the 3 columns were successfully added to the table and populated with the extracted data from truck_build | |
| SELECT year, make, model FROM raw_pos.truck_dev; | |
| -- Now we can count the different makes and get a sense of the distribution in our TastyBytes food truck fleet. | |
| SELECT | |
| make, | |
| COUNT(*) AS count | |
| FROM raw_pos.truck_dev | |
| GROUP BY make | |
| ORDER BY make ASC; | |
| /* | |
| After running the query above, we notice a problem in our dataset. Some trucks' makes are 'Ford' and some are 'Ford_', | |
| giving us two different counts for the same truck manufacturer. | |
| */ | |
| -- First we'll use UPDATE to change any occurrence of 'Ford_' to 'Ford' | |
| UPDATE raw_pos.truck_dev | |
| SET make = 'Ford' | |
| WHERE make = 'Ford_'; | |
| -- Verify the make column has been successfully updated | |
| SELECT truck_id, make | |
| FROM raw_pos.truck_dev | |
| ORDER BY truck_id; | |
| /* | |
| The make column looks good now so let's SWAP the truck table with the truck_dev table | |
| This command atomically swaps the metadata and data between two tables, instantly promoting the truck_dev table | |
| to become the new production truck table. | |
| */ | |
| ALTER TABLE raw_pos.truck_details SWAP WITH raw_pos.truck_dev; | |
| -- Run the query from before to get an accurate make count | |
| SELECT | |
| make, | |
| COUNT(*) AS count | |
| FROM raw_pos.truck_details | |
| GROUP BY | |
| make | |
| ORDER BY count DESC; | |
| /* | |
| The changes look great. We'll perform some cleanup on our dataset by first dropping the truck_build | |
| column from the production database now that we have split that data into three separate columns. | |
| Then we can drop the truck_dev table since we no longer need it. | |
| */ | |
| -- We can drop the old truck build column with a simple ALTER TABLE ... DROP COLUMN command | |
| ALTER TABLE raw_pos.truck_details DROP COLUMN truck_build; | |
| -- Now we can drop the truck_dev table | |
| DROP TABLE raw_pos.truck_details; | |
| /* 4. Data Recovery with UNDROP | |
| Oh no! We accidentally dropped the production truck table. 😱 | |
| Luckily we can use the UNDROP command to restore the table back to its state before being dropped. | |
| UNDROP is part of Snowflake's powerful Time Travel feature and allows for the restoration of dropped | |
| database objects within a configured data retention period (default 24 hours). | |
| Let's restore the production 'truck' table ASAP using UNDROP! | |
| */ | |
| -- Optional: run this query to verify the 'truck' table no longer exists | |
| -- Note: The error 'Table TRUCK does not exist or not authorized.' means the table was dropped. | |
| DESCRIBE TABLE raw_pos.truck_details; | |
| --Run UNDROP on the production 'truck' table to restore it to the exact state it was in before being dropped | |
| UNDROP TABLE raw_pos.truck_details; | |
| --Verify the table was successfully restored | |
| SELECT * from raw_pos.truck_details; | |
| -- Now drop the real truck_dev table | |
| DROP TABLE raw_pos.truck_dev; | |
| /* 5. Resource Monitors | |
| *********************************************************** | |
| User-Guide: | |
| https://docs.snowflake.com/en/user-guide/resource-monitors | |
| *********************************************************** | |
| Monitoring compute usage and spend is critical to any cloud-based workflow. Snowflake provides a simple | |
| and straightforward way to track warehouse credit usage with Resource Monitors. | |
| With Resource Monitors you define credit quotas and then trigger certain actions on | |
| associated warehouses upon reaching defined usage thresholds. | |
| Actions the resource monitor can take: | |
| -NOTIFY: Sends an email notification to specified users or roles. | |
| -SUSPEND: Suspends the associated warehouses when a threshold is reached. | |
| NOTE: Running queries are allowed to complete. | |
| -SUSPEND_IMMEDIATE: Suspends the associated warehouses when a threshold is reached and | |
| cancels all running queries. | |
| Now, we'll create a Resource Monitor for our warehouse my_wh | |
| Let's quickly set our account level role in Snowsight to accountadmin; | |
| To do so: | |
| - Click the User Icon in the bottom left of the screen | |
| - Hover over 'Switch Role' | |
| - Select 'ACCOUNTADMIN' in the role list panel | |
| Next we will use the accountadmin role in our Worksheet | |
| */ | |
| USE ROLE accountadmin; | |
| -- Run the query below to create the resource monitor via SQL | |
| CREATE OR REPLACE RESOURCE MONITOR my_resource_monitor | |
| WITH CREDIT_QUOTA = 100 | |
| FREQUENCY = MONTHLY -- Can also be DAILY, WEEKLY, YEARLY, or NEVER (for a one-time quota) | |
| START_TIMESTAMP = IMMEDIATELY | |
| TRIGGERS ON 75 PERCENT DO NOTIFY | |
| ON 90 PERCENT DO SUSPEND | |
| ON 100 PERCENT DO SUSPEND_IMMEDIATE; | |
| -- With the Resource Monitor created, apply it to my_wh | |
| ALTER WAREHOUSE my_wh | |
| SET RESOURCE_MONITOR = my_resource_monitor; | |
| /* 6. Budgets | |
| **************************************************** | |
| User-Guide: | |
| https://docs.snowflake.com/en/user-guide/budgets | |
| **************************************************** | |
| In the previous step we configured a Resource Monitor that allows for monitoring | |
| credit usage for Warehouses. In this step we will create a Budget for a more holistic | |
| and flexible approach to managing costs in Snowflake. | |
| While Resource Monitors are tied specifically to warehouse and compute usage, Budgets can be used | |
| to track costs and impose spending limits on any Snowflake object or service and notify users | |
| when the dollar amount reaches a specified threshold. | |
| */ | |
| -- Let's first create our budget | |
| CREATE OR REPLACE SNOWFLAKE.CORE.BUDGET my_budget() | |
| COMMENT = 'My Tasty Bytes Budget'; | |
| /* | |
| Before we can configure our Budget we need to verify an email address on the account. | |
| To verify your email address: | |
| - Click the User Icon in the bottom left of the screen | |
| - Click Settings | |
| - Enter your email address in the email field | |
| - Click 'Save' | |
| - Check your email and follow instructions to verify email | |
| NOTE: if you don't receive an email after a few minutes, click 'Resend Verification' | |
| With our new budget now in place, our email verified and our account-level role set to accountadmin, | |
| lets head over to the Budgets page in Snowsight to add some resources to our Budget. | |
| To get to the Budgets page in Snowsight: | |
| - Click the Admin button on the Navigation Menu | |
| - Click the first item 'Cost Management' | |
| - Click the 'Budgets' tab | |
| If prompted to select a warehouse, select tb_dev_wh, otherwise, ensure your warehouse is set to | |
| tb_dev_wh from the warehouse panel at the top right of the screen. | |
| On the budgets page we see metrics about our spend for the current period. | |
| In the middle of the screen shows a graph of the current spend with forecasted spend. | |
| At the bottom of the screen we see our 'MY_BUDGET' budget we created earlier. Click | |
| that to view the Budget page | |
| Clicking the '<- Budget Details' at the top right of the screen reveals the | |
| Budget Details panel. Here we can view information about our budget and all | |
| of the resources attached to it. We see there are no resources monitored so let's add some now. | |
| Click the 'Edit' button to open the Edit Budget panel; | |
| - Keep budget name the same | |
| - Set the spending limit to 100 | |
| - Enter the email you verified earlier | |
| - Click the '+ Tags & Resources' button to add a couple of resources | |
| - Expand Databases, then TB_101, then check the box next to the ANALYTICS schema | |
| - Scroll down to and expand 'Warehouses' | |
| - Check the box for 'TB_DE_WH' | |
| - Click 'Done' | |
| - Back in the Edit Budget menu, click 'Save Changes' | |
| */ | |
| /* 7. Universal Search | |
| ************************************************************************** | |
| User-Guide | |
| https://docs.snowflake.com/en/user-guide/ui-snowsight-universal-search | |
| ************************************************************************** | |
| Universal Search allows you to easily find any object in your account, plus explore data products in the Marketplace, | |
| relevant Snowflake Documentation, and Community Knowledge Base articles. | |
| Let's try it now. | |
| - To use Universal Search, begin by clicking 'Search' in the Navigation Menu | |
| - Here we see the Universal Search UI. Let's put in our first search term. | |
| - Enter 'truck' into the search bar and observe the results. The top sections are categories | |
| of relevant objects on your account, like databases, tables, views, stages, etc. Below your | |
| database objects you can see sections for relevant marketplace listings and documentation. | |
| - You may also provide search terms in natural language to describe what you're looking for. If we wanted to | |
| know where to start looking to answer which truck franchise has the most return customers, we could search | |
| something like 'Which truck franchise has the most loyal customer base?' Clicking 'View all >' button next to | |
| the 'Tables & Views' section will allow us to view all of the relevant tables and views relevant to our query. | |
| Universal Search returns several tables and views from different schemas. Note also how the relevant columns | |
| are listed for each object. These are all excellent starting points for data-driven answers about return customers. | |
| */ | |
| ------------------------------------------------------------------------- | |
| --RESET-- | |
| ------------------------------------------------------------------------- | |
| -- Drop created objects | |
| DROP RESOURCE MONITOR IF EXISTS my_resource_monitor; | |
| DROP TABLE IF EXISTS raw_pos.truck_dev; | |
| -- Reset truck details | |
| CREATE OR REPLACE TABLE raw_pos.truck_details | |
| AS | |
| SELECT * EXCLUDE (year, make, model) | |
| FROM raw_pos.truck; | |
| DROP WAREHOUSE IF EXISTS my_wh; | |
| -- Unset Query Tag | |
| ALTER SESSION UNSET query_tag; |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment