cnolanminich/skills_assessment.md

# Project & Session Comparison

## Project Comparison: `testing-new-skills` vs `testing-new-skills-2`

### Structural Differences

| Aspect | Project 1 (`testing-new-skills`) | Project 2 (`testing-new-skills-2`) |
| --- | --- | --- |
| dbt project location | Inside `src/.../defs/dbt_project/` | Top-level `dbt_project/` |
| dbt mart models | `account_360`, `pipeline_summary`, `lead_conversion_funnel` | `fct_sales_pipeline`, `dim_account_health`, `fct_lead_conversion` |
| Custom components | 5 (incl. `ScheduledJobComponent`) | 4 (no scheduling component) |
| dlt approach | Subclasses `DltLoadCollectionComponent` (built-in) | Fully custom component with inline `dlt.pipeline()` |
| Demo mode | All components have `demo_mode: bool` toggle | No demo mode; generates data directly |
| GCloud Function | 3 assets per function (execution/status/result) + sensor triggers jobs | 2 asset groups (orchestrated/observed) + observation sensor |
| Google Drive | Supports transform queries in config | Direct DuckDB reads, no transform layer |
| Schedules | 3 scheduled jobs (daily ingestion, daily Snowflake, weekly Drive) | None |
| Salesforce pipeline | Separate `salesforce_pipeline.py` with dlt source definition | Data generation embedded in component class |
| Snowflake share | Has `ShareTarget` model, demo mode logging | Direct implementation, no demo mode |
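
To make the demo-mode row concrete, here is a minimal sketch of what a `demo_mode: bool` toggle implies for a component: two code paths behind one flag. Only the flag name comes from the comparison; `SnowflakeShareConfig`, `share_name`, and `run_share` are hypothetical names for illustration, not code from either project.

```python
from dataclasses import dataclass


@dataclass
class SnowflakeShareConfig:
    """Hypothetical config; only the demo_mode flag comes from the comparison."""

    share_name: str
    demo_mode: bool = False


def run_share(config: SnowflakeShareConfig) -> str:
    """Describe the action taken, without touching any external system."""
    if config.demo_mode:
        # Demo branch: report the intended action and make no external call.
        return f"[demo] would create share {config.share_name}"
    # Live branch: a real component would call the warehouse here.
    raise NotImplementedError("live path omitted in this sketch")
```

Every component carrying such a flag has both branches to write and debug, which is the extra code surface that Project 2 avoided by generating data directly.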


### dbt Model Differences

The mart models take different analytical angles:

- **Project 1:** `account_360` (full account view), `pipeline_summary` (stage/fiscal aggregation), `lead_conversion_funnel` (source/industry conversion rates)
- **Project 2:** `fct_sales_pipeline` (opportunity-level with deal tiering), `dim_account_health` (health scoring with Platinum/Gold/Silver/Bronze tiers), `fct_lead_conversion` (individual lead funnel tracking)

Project 2's models are more granular (row-per-entity), while Project 1's are more aggregated (summary metrics).

## Session Comparison

| Metric | Project 1 Session | Project 2 Session |
| --- | --- | --- |
| Session ID | `66a81ae4` | `ee3d6fea` |
| Total tokens | ~15.9M | ~5.0M (3.2x fewer) |
| Output tokens | 46,745 | 28,117 |
| Cache read tokens | 15.2M | 4.8M |
| Messages | 210 | 104 |
| Duration | 3 days (Mar 6-9, with revisits) | ~21 minutes (Mar 9) |
| Tool calls | ~140 | ~60 |
| Errors/retries | ~13 (import issues, pip not found, dlt API mismatches) | 3 (path resolution, env vars, zsh glob) |
| Skill used | `dagster-demo` → `dagster-expert` (orchestrator chain) | `dagster-expert` directly |
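
The "3.2x fewer" figure can be rechecked from the numbers in the table. The sketch below copies the table values (several are approximations marked `~` in the source) and computes the Project 1 to Project 2 ratio for each metric; `SESSION_METRICS` and `ratios` are names introduced here for illustration.

```python
# Values copied from the session comparison table as (Project 1, Project 2).
# Token, tool-call, and error counts are approximate in the source.
SESSION_METRICS = {
    "Total tokens": (15.9e6, 5.0e6),
    "Output tokens": (46_745, 28_117),
    "Cache read tokens": (15.2e6, 4.8e6),
    "Messages": (210, 104),
    "Tool calls": (140, 60),
    "Errors/retries": (13, 3),
}


def ratios(metrics: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return Project 1 / Project 2 for each metric."""
    return {name: p1 / p2 for name, (p1, p2) in metrics.items()}


if __name__ == "__main__":
    for name, ratio in ratios(SESSION_METRICS).items():
        print(f"{name}: {ratio:.1f}x")
```

Total tokens come out at 15.9 / 5.0 = 3.18, which rounds to the 3.2x quoted in the table. Messages and tool calls differ by roughly 2x, so most of the token gap comes from cache reads rather than from message count alone.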


### Approach Differences

|  | Project 1 | Project 2 |
| --- | --- | --- |
| Strategy | Orchestrator skill (`dagster-demo`) provided a 5-step workflow; heavy upfront reference reading (~10+ docs), then systematic execution | Direct `dagster-expert` skill invocation; targeted reference reads, then dove into building |
| API exploration | Spent time introspecting dagster-dlt Python APIs at runtime (`uv run python -c "from dagster_dlt import ..."`) to find correct imports | Avoided built-in dlt component entirely; wrote custom component with raw `dlt` library |
| Component design | Tried to subclass existing components (`DltLoadCollectionComponent`) and hit import/API issues | Built all custom components from scratch using `dg.Component` base class |
| Error recovery | Multiple cycles of `dg check defs` → fix imports → re-check | Fewer errors; mostly env var / path issues, resolved quickly |
| Parallelism | More sequential tool calls | Aggressive parallel tool calls (scaffold 4 components simultaneously, read 5 files at once) |

## Key Takeaway

Project 2 used about 3.2x fewer total tokens (~5.0M vs. ~15.9M) and finished in ~21 minutes, whereas Project 1 spanned 3 days with revisits. The main factors:

1. **Avoided complex subclassing:** writing fully custom components with raw libraries (dlt, snowflake-connector, Google APIs) was simpler than trying to extend Dagster's built-in integration components.
2. **Skipped the orchestrator skill:** going directly to `dagster-expert` cut overhead.
3. **More parallel tool calls:** scaffolding 4 components in one shot instead of sequentially.
4. **Fewer errors:** the simpler component architecture meant fewer import/API compatibility issues.
5. **No demo mode abstraction:** less code surface to debug.