@sfc-gh-vsekar
Last active February 16, 2026 06:49

Just The Gist : Snowflake Dashboard on data sharing usage

This gist is used for building a Streamlit-based dashboard app on the various data sharing usage views provided by Snowflake.

https://docs.snowflake.com/en/sql-reference/data-sharing-usage/

These views offer telemetry insights into how consumers are using and adopting the data product or app. They are available to the marketplace provider, who can use them to gather insights and improve the product they are offering.

Instead of sharing a code base, I am adopting the pattern of sharing the project requirement document (PRD), which can be used by AI coding tools like Cortex Code to generate the executable code. This also gives the user the opportunity to make changes per their own criteria.

File: PRD_DATASHARING.md

This is the project requirement document (PRD) that was used to generate the Streamlit application. The version shared here is a refined version of the original PRD, which was raw and unformatted, but the core aspects of the project are captured here.

You can refer to this doc in plan mode and ask Cortex Code to generate the assets for you. Derive your own version by further modifying and enhancing it.


File: AGENTS.md

This file is my own version of the various rules to be followed by the coding agent during the implementation phase. Feel free to adopt or change it as you please.

Product Requirements Document

Snowflake Data Sharing Usage Dashboard

Version: 1.0
Purpose: Build a dashboard to help Snowflake Marketplace data providers understand how their listings are being consumed by customers.


1. Introduction

1.1 Background

Snowflake Marketplace enables data providers to share datasets with consumers across Snowflake accounts. However, understanding adoption patterns, consumer behavior, and usage trends requires querying multiple system views and synthesizing disparate metrics.

1.2 Objective

Build an interactive dashboard that provides data providers with actionable insights into:

  • Which data products are most popular
  • How consumers engage with listings (views, clicks, installs)
  • What specific tables and columns are being queried
  • Trends over time and consumer distribution

1.3 Target Users

  • Data product managers
  • Marketplace listing owners
  • Data engineering teams responsible for shared datasets

2. Data Sources

Snowflake provides usage telemetry through the SNOWFLAKE.DATA_SHARING_USAGE schema. The dashboard should leverage these four views:

2.1 LISTING_CONSUMPTION_DAILY

Tracks daily job/query activity by consumers.

| Column | Description |
| --- | --- |
| EXCHANGE_NAME | Marketplace exchange |
| LISTING_NAME | Display name of the listing |
| LISTING_GLOBAL_NAME | Unique global identifier |
| SNOWFLAKE_REGION | Consumer's Snowflake region |
| CONSUMER_ACCOUNT_LOCATOR | Consumer account identifier |
| CONSUMER_ACCOUNT_NAME | Consumer account name |
| EVENT_DATE | Date of activity |
| JOB_COUNT | Number of jobs/queries executed |
| DAU/WAU/MAU_28 | Daily/Weekly/Monthly active users |

Documentation: https://docs.snowflake.com/en/sql-reference/data-sharing-usage/listing-consumption-daily

2.2 LISTING_TELEMETRY_DAILY

Tracks engagement events (views, clicks, get/request actions).

| Column | Description |
| --- | --- |
| LISTING_NAME | Display name of the listing |
| LISTING_GLOBAL_NAME | Unique global identifier |
| EVENT_DATE | Date of event |
| EVENT_TYPE | Category: CLICKS, LISTING_AUTOFULFILMENT, etc. |
| ACTION | Specific action: GET_STARTED, GET_COMPLETED, etc. |
| EVENT_COUNT | Number of events |
| CONSUMER_ACCOUNTS_REACHED_COUNT | Unique consumers reached |

Documentation: https://docs.snowflake.com/en/sql-reference/data-sharing-usage/listing-telemetry-daily

2.3 LISTING_EVENTS_DAILY

Tracks consumer lifecycle events (installs, uninstalls, purchase requests).

| Column | Description |
| --- | --- |
| LISTING_DISPLAY_NAME | Display name of the listing |
| LISTING_GLOBAL_NAME | Unique global identifier |
| EVENT_DATE | Date of event |
| EVENT_TYPE | GET, REQUEST, PURCHASE, UNINSTALL |
| SNOWFLAKE_REGION | Consumer's region |
| CONSUMER_ACCOUNT_LOCATOR | Consumer account identifier |
| CONSUMER_ACCOUNT_NAME | Consumer account name |
| CONSUMER_EMAIL_HASH | Hashed consumer email |

Documentation: https://docs.snowflake.com/en/sql-reference/data-sharing-usage/listing-events-daily

2.4 LISTING_ACCESS_HISTORY

Provides object and column-level query access details.

| Column | Description |
| --- | --- |
| QUERY_DATE | Date of query |
| QUERY_TOKEN | Unique query identifier |
| LISTING_GLOBAL_NAME | Unique global identifier |
| SHARE_NAME | Name of the share |
| CONSUMER_ACCOUNT_LOCATOR | Consumer account identifier |
| CONSUMER_ACCOUNT_NAME | Consumer account name |
| SHARE_OBJECTS_ACCESSED | JSON array of accessed objects/columns |

Important: The SHARE_OBJECTS_ACCESSED column contains nested JSON that must be parsed to extract table and column names:

[
  {
    "objectName": "DB.SCHEMA.TABLE_NAME",
    "columns": [
      {"columnName": "COLUMN_A"},
      {"columnName": "COLUMN_B"}
    ]
  }
]

Documentation: https://docs.snowflake.com/en/sql-reference/data-sharing-usage/listing-access-history


3. Technical Considerations

3.1 Performance: Snapshot Approach

The DATA_SHARING_USAGE views can contain years of historical data, making direct queries slow. The recommended approach is to create snapshot tables that contain only the relevant time window.

Example SQL:

-- Create snapshot of consumption data for the past 90 days
CREATE OR REPLACE TRANSIENT TABLE listing_consumption_daily_snapshot AS
SELECT *
FROM snowflake.data_sharing_usage.listing_consumption_daily
WHERE event_date >= CURRENT_DATE() - 90;

Recommendation: Create transient snapshot tables for all four views. Use a configurable date range (e.g., 90 days for production, 5 days for development/demo).
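Since the same snapshot pattern applies to all four views, a small helper can generate the DDL with a configurable window. This is a sketch: the `<view>_snapshot` naming follows the convention above, and note that LISTING_ACCESS_HISTORY filters on QUERY_DATE rather than EVENT_DATE.

```python
# Generate CREATE TABLE statements for transient snapshot tables
# covering a configurable retention window.
VIEWS = {
    "listing_consumption_daily": "event_date",
    "listing_telemetry_daily": "event_date",
    "listing_events_daily": "event_date",
    "listing_access_history": "query_date",  # uses QUERY_DATE, not EVENT_DATE
}

def snapshot_ddl(view: str, date_column: str, days: int = 90) -> str:
    """Return the DDL for one transient snapshot table."""
    return (
        f"CREATE OR REPLACE TRANSIENT TABLE {view}_snapshot AS\n"
        f"SELECT *\n"
        f"FROM snowflake.data_sharing_usage.{view}\n"
        f"WHERE {date_column} >= CURRENT_DATE() - {days};"
    )

def all_snapshot_ddls(days: int = 90) -> list[str]:
    """DDL for all four snapshot tables with one shared date range."""
    return [snapshot_ddl(v, col, days) for v, col in VIEWS.items()]

if __name__ == "__main__":
    for ddl in all_snapshot_ddls(days=5):  # e.g. 5 days for a dev/demo run
        print(ddl, end="\n\n")
```

Each generated statement can then be executed through a Snowpark session or the Python connector during the refresh step.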

3.2 JSON Parsing for Access History

Extracting table and column access from the nested JSON requires LATERAL FLATTEN:

-- Extract object and column names from access history
WITH base AS (
  SELECT 
    query_date,
    share_name,
    consumer_account_name,
    f.value:objectName::VARCHAR AS object_name,
    f.value:columns AS columns
  FROM listing_access_history_snapshot AS t,
       LATERAL FLATTEN(input => t.share_objects_accessed) AS f
)
SELECT 
  b.query_date,
  b.share_name,
  b.consumer_account_name,
  b.object_name,
  c.value:columnName::VARCHAR AS column_name
FROM base AS b,
     LATERAL FLATTEN(input => b.columns) AS c;
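If flattening is done client side instead (for example after fetching snapshot rows into Python), the same extraction is a small loop over the parsed JSON. This sketch assumes the input shape shown in the SHARE_OBJECTS_ACCESSED example above:

```python
import json

def flatten_access(share_objects_accessed: str) -> list[tuple[str, str]]:
    """Expand a SHARE_OBJECTS_ACCESSED JSON string into
    (object_name, column_name) pairs, mirroring the double
    LATERAL FLATTEN in the SQL above."""
    pairs = []
    for obj in json.loads(share_objects_accessed):
        name = obj.get("objectName")
        for col in obj.get("columns", []):
            pairs.append((name, col["columnName"]))
    return pairs

raw = ('[{"objectName": "DB.SCHEMA.TABLE_NAME", '
       '"columns": [{"columnName": "COLUMN_A"}, {"columnName": "COLUMN_B"}]}]')
print(flatten_access(raw))
# → [('DB.SCHEMA.TABLE_NAME', 'COLUMN_A'), ('DB.SCHEMA.TABLE_NAME', 'COLUMN_B')]
```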

3.3 Recommended Tech Stack

  • Language: Python
  • Visualization: Streamlit (or alternative: Dash, Panel, Gradio)
  • Snowflake Connectivity: Snowpark or snowflake-connector-python
  • Package Management: UV, Poetry, or pip

4. Functional Requirements

4.1 Page 1: Executive Overview (Home)

Purpose: Provide a high-level summary of all data sharing activity.

KPIs to Display:

| Metric | Calculation |
| --- | --- |
| Total Listings | COUNT(DISTINCT listing_global_name) across all views |
| Total Consumers | COUNT(DISTINCT consumer_account_name) from consumption |
| Total Jobs | SUM(job_count) from consumption |
| Total Events | SUM(event_count) from telemetry |
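The KPI calculations above can be sketched in plain Python over rows fetched from the snapshot tables. This is illustrative only: rows are dicts keyed by lowercase column name, and for brevity the listing count here spans just the consumption and telemetry inputs.

```python
def overview_kpis(consumption_rows, telemetry_rows):
    """Compute the Executive Overview KPIs from snapshot rows
    (each row is a dict keyed by lowercase column name)."""
    listings = ({r["listing_global_name"] for r in consumption_rows}
                | {r["listing_global_name"] for r in telemetry_rows})
    return {
        "total_listings": len(listings),
        "total_consumers": len({r["consumer_account_name"]
                                for r in consumption_rows}),
        "total_jobs": sum(r["job_count"] for r in consumption_rows),
        "total_events": sum(r["event_count"] for r in telemetry_rows),
    }
```

In the app itself the same aggregations would normally be pushed down into SQL (see Section 6) and only the results rendered as Streamlit metrics.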

Visualizations:

  1. Top Data Products - Bar chart showing top 5-10 listings by total job count, with indicator of unique consumer count
  2. Consumption Trend - Line/area chart of daily job counts over the analysis period
  3. Telemetry Breakdown - Sunburst or treemap showing EVENT_TYPE → ACTION hierarchy

Interactivity: Display the analysis date range prominently.


4.2 Page 2: Consumption Analysis

Purpose: Analyze job/query activity patterns by listing and consumer.

Filters:

  • Listing selector (dropdown)

KPIs:

| Metric | Description |
| --- | --- |
| Total Jobs | Sum of job_count for selected listing |
| Unique Consumers | Distinct consumer accounts |
| Active Days | Days with at least one job |
| Avg Daily Users | Average DAU |

Visualizations:

  1. Jobs Over Time - Area/bar chart of daily job counts
  2. Top Consumers - Bar chart of top 10 consumers by job count
  3. User Engagement - Display 1-day, 7-day, and 28-day unique user metrics
  4. Regional Distribution - Pie/donut chart of jobs by Snowflake region

4.3 Page 3: Telemetry & Conversion Funnel

Purpose: Understand engagement patterns and conversion rates.

Filters:

  • Listing selector (dropdown)

KPIs:

| Metric | Description |
| --- | --- |
| Total Events | Sum of event_count |
| Consumer Touches | Unique consumer accounts reached |
| Active Listings | Listings with telemetry data |

Visualizations:

  1. Event Breakdown - Treemap showing EVENT_TYPE → ACTION hierarchy with counts
  2. Conversion Funnel - Funnel chart showing progression:
    • Listing Views → Clicks → GET Started → GET Completed
    • Include conversion percentages between stages
  3. Trend by Event Type - Multi-line chart showing event types over time
  4. Click-Through Rate Table - Table showing per-listing metrics:
    • Clicks, Gets Started, Gets Completed, CTR %
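The funnel math is simple once the four stage totals are known. A sketch, assuming the stage names listed above (verify the actual EVENT_TYPE/ACTION values against your own telemetry data):

```python
def funnel_with_conversion(views, clicks, gets_started, gets_completed):
    """Return funnel stages with stage-to-stage conversion percentages.
    Conversion is None for the first stage and whenever the prior
    stage count is zero (to avoid division by zero)."""
    stages = [
        ("Listing Views", views),
        ("Clicks", clicks),
        ("GET Started", gets_started),
        ("GET Completed", gets_completed),
    ]
    rows, prev = [], None
    for name, count in stages:
        pct = round(100.0 * count / prev, 1) if prev else None
        rows.append({"stage": name, "count": count, "conversion_pct": pct})
        prev = count
    return rows

for row in funnel_with_conversion(1000, 250, 100, 80):
    print(row)
```

The resulting rows map directly onto a funnel chart (e.g., Plotly `go.Funnel`), with `conversion_pct` shown as the stage label.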

4.4 Page 4: Events Timeline

Purpose: Track consumer lifecycle events (installs, uninstalls, requests).

Filters:

  • Listing selector (dropdown)
  • Event type selector (multi-select)

KPIs:

| Metric | Description |
| --- | --- |
| Total Events | Count of all events |
| GETs (Installs) | Count of GET events |
| Uninstalls | Count of UNINSTALL events |
| Requests | Count of REQUEST events |

Visualizations:

  1. Event Timeline - Stacked bar chart by date, colored by event type
  2. Events by Type - Pie chart of event distribution
  3. Events by Listing - Bar chart of top listings by event count
  4. Consumer Activity Matrix - Pivot table showing event counts per consumer by event type
  5. Recent Events Table - Sortable table showing recent events with details (date, type, listing, consumer, region)

4.5 Page 5: Access History & Usage Heatmap

Purpose: Understand which specific tables and columns are being queried.

Filters:

  • Data Product (share name) selector
  • Consumer selector

KPIs:

| Metric | Description |
| --- | --- |
| Total Queries | COUNT(DISTINCT query_token) |
| Unique Consumers | Distinct consumer accounts |
| Active Days | Days with query activity |

Visualizations:

  1. Table & Column Heatmap - When a specific data product is selected:
    • Y-axis: Table names
    • X-axis: Column names
    • Color intensity: Access frequency
    • Sort by total usage (most accessed at top/left)
  2. Most Accessed Tables - Top 5 list with counts
  3. Most Accessed Columns - Top 5 list with counts
  4. Query Activity Over Time - Bar chart of daily query counts
  5. Top Objects - Bar chart of most accessed objects
  6. Access by Object Type - Pie chart (TABLE, VIEW, etc.)
  7. Consumer Breakdown - Bar chart of queries per consumer
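The table/column heatmap is a pivot of the flattened access rows. A sketch of building the matrix, where the input is a list of (table, column) pairs such as those produced by the Section 6.6 query:

```python
from collections import Counter

def heatmap_matrix(access_pairs):
    """Pivot (table_name, column_name) access pairs into a dense matrix:
    one row per table, one column per column name, cell = access count.
    Both axes are sorted by total usage, descending, so the most
    accessed items appear first (top/left)."""
    counts = Counter(access_pairs)
    table_totals, column_totals = Counter(), Counter()
    for (tbl, col), n in counts.items():
        table_totals[tbl] += n
        column_totals[col] += n
    tables = [t for t, _ in table_totals.most_common()]
    columns = [c for c, _ in column_totals.most_common()]
    matrix = [[counts.get((t, c), 0) for c in columns] for t in tables]
    return tables, columns, matrix
```

The returned axes and matrix can be handed to whatever heatmap primitive the chosen chart library provides (e.g., Plotly `go.Heatmap`).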

5. Non-Functional Requirements

5.1 Usability

  • Dashboard should load within 5 seconds when using snapshot tables
  • All charts should be interactive (hover for details, click to filter where appropriate)
  • Filters should apply dynamically without page refresh

5.2 Configuration

  • Database, schema, and connection settings should be configurable (environment variables or config file)
  • Snapshot table names should be configurable
  • Date range for analysis should be visible to users

5.3 Code Quality

  • Separate data loading logic from visualization logic
  • Use caching where appropriate to avoid redundant queries
  • Include logging for debugging

6. Sample Implementation Queries

6.1 Get Date Range from Consumption Data

SELECT 
    MIN(event_date) AS min_date,
    MAX(event_date) AS max_date,
    DATEDIFF(day, MIN(event_date), MAX(event_date)) AS days_span
FROM listing_consumption_daily_snapshot;

6.2 Top Listings by Consumption

SELECT 
    listing_global_name,
    SUM(job_count) AS total_jobs,
    COUNT(DISTINCT consumer_account_name) AS unique_consumers
FROM listing_consumption_daily_snapshot
GROUP BY listing_global_name
ORDER BY total_jobs DESC
LIMIT 10;

6.3 Daily Consumption Trend

SELECT 
    event_date,
    SUM(job_count) AS total_jobs
FROM listing_consumption_daily_snapshot
GROUP BY event_date
ORDER BY event_date;

6.4 Telemetry by Event Type and Action

SELECT 
    event_type,
    action,
    SUM(event_count) AS total_events
FROM listing_telemetry_daily_snapshot
GROUP BY event_type, action
ORDER BY total_events DESC;

6.5 Conversion Funnel Metrics

SELECT 
    SUM(CASE WHEN event_type = 'LISTING_VIEW' THEN event_count ELSE 0 END) AS views,
    SUM(CASE WHEN event_type = 'CLICKS' THEN event_count ELSE 0 END) AS clicks,
    SUM(CASE WHEN action = 'GET_STARTED' THEN event_count ELSE 0 END) AS gets_started,
    SUM(CASE WHEN action = 'GET_COMPLETED' THEN event_count ELSE 0 END) AS gets_completed
FROM listing_telemetry_daily_snapshot;

6.6 Build Table/Column Access Heatmap Data

WITH parsed_access AS (
    SELECT 
        share_name,
        SPLIT_PART(f.value:objectName::VARCHAR, '.', -1) AS table_name,
        c.value:columnName::VARCHAR AS column_name,
        COUNT(*) AS access_count
    FROM listing_access_history_snapshot AS t,
         LATERAL FLATTEN(input => t.share_objects_accessed) AS f,
         LATERAL FLATTEN(input => f.value:columns) AS c
    WHERE share_name = '<selected_share>'
    GROUP BY share_name, table_name, column_name
)
SELECT * FROM parsed_access
ORDER BY access_count DESC;

7. Success Criteria

A complete implementation should demonstrate:

  1. Data Pipeline: Successfully creates and queries snapshot tables
  2. Executive View: Home page with KPIs and top listings visualization
  3. Drill-Down Capability: Each page allows filtering by listing
  4. Conversion Insights: Telemetry page shows a meaningful funnel visualization
  5. Granular Access Analysis: Access History page displays table/column heatmap
  6. Interactivity: Charts respond to filter changes; hover shows details
  7. Performance: Pages load quickly using snapshot approach

8. Out of Scope

The following are explicitly not required:

  • User authentication or role-based access
  • Real-time streaming updates
  • Automated snapshot refresh scheduling
  • Export functionality
  • Mobile optimization
  • Multi-language support

This PRD provides requirements for building a Snowflake Data Sharing Usage Dashboard. Implementations may vary in UI design, chart library choices, and additional features.

Development guidelines

Follow these guidelines as you develop / implement the codebase.

  • Develop the code using Python
  • Use UV for library management.
  • For visualization, use Streamlit.
  • There is no need to package and distribute the application.
  • The UV environment is already set up and has some libraries installed; do not create it from scratch.
  • All source should be stored in the src/ folder.
  • Do not create solution documentation unless I specifically ask you to; do not waste credits unnecessarily.
  • All logs should be stored in the logs/ folder.
  • Specifications are stored in the spec/ folder.
  • IGNORE the contents of the temp/ folder; do not even read or process it. It is a temporary store for my own notes, dummy scripts, etc.
  • All tasks and activities should be tracked in the file TASKS.md. Update the task status as you make progress.
  • Keep track of suggestions, issues you faced, changes in implementation direction, etc. in the file HANDOFF.md.
  • Refer to HANDOFF.md for insights and context from previous sessions.
  • If the plan or implementation direction changes (for example, due to unforeseen errors), update TASKS.md accordingly with the new steps.
  • All data used for testing and demos should be stored in the data/ folder.

Environment and configurations

Environment-specific information is stored in a local '.env' file. Inspect this file to understand the various configurations.

Environment Variable Hardcoding

Do not embed environment-specific values (e.g., hostnames, roles, warehouse names) in source code, scripts, or documentation. Always load them from the project_config.yml file. If a scenario exists where this is not possible, consult me on the next plan of action.

Scripting

  • Prefer bash scripts stored in the bin/ folder over running commands one at a time. The output of the commands in a script should be logged to a file in the logs/ folder.
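A sketch of that pattern (the script name, log name, and the placeholder step are hypothetical):

```shell
#!/usr/bin/env bash
# bin/create_snapshots.sh -- run a batch of commands and log all output.
set -euo pipefail

mkdir -p logs
LOG_FILE="logs/create_snapshots_$(date +%Y%m%d_%H%M%S).log"

{
  echo "=== snapshot refresh started: $(date) ==="
  # Each step's stdout/stderr lands in the log file via tee below.
  echo "step 1: placeholder (e.g. 'uv run python src/create_snapshots.py')"
  echo "=== finished: $(date) ==="
} 2>&1 | tee "$LOG_FILE"
```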

NO Inline Script Execution

DO NOT execute Python code inline using any of these patterns:

  • python -c "..."
  • python3 -c "..."
  • uv run python -c "..."
  • Heredoc patterns like python << EOF ... EOF

INSTEAD, always:

  1. Create a proper Python script file in src/ or bin/ folder
  2. Run the script using uv run python path/to/script.py

This applies even for "quick" one-off tasks like:

  • Modifying JSON/notebook files
  • Checking file contents
  • Data transformations
  • Any multi-line Python logic

Exception: Simple single-line expressions for verification are acceptable, e.g., python3 -c "print('hello')"

Implementation aspects

IMPORTANT: This code base is a demonstration for self-learning only. Keep the implementation simple, understandable, and easy to read. For now, do not worry about:

  • Security
  • Rigorous exception handling
  • Test code
  • Performance, etc.

If in doubt, ask yourself the question, will this code be understandable by a junior developer? Keep it simple.

Disallowed Commands

The following commands are not allowed; confirm with me before proceeding:

  • rm and its options
  • rmdir
  • DROP on databases, compute pools, and other Snowflake objects that are not tables or views