@systay
Created January 16, 2026 07:50

Analysis Report: predicates.JoinPredicate

Location: go/vt/vtgate/planbuilder/operators/predicates/predicate.go:29

The Problem Being Solved

During query planning, join predicates need to be split and pushed down early to compute accurate routing costs. However, if routes later merge, we need to restore the original predicate. This creates a tension:

  1. Early planning: a.col = b.col must become :bv1 = b.col on the RHS, with the value of a.col supplied from the LHS as the bind variable :bv1
  2. Route merging: If both sides merge under one route, we need a.col = b.col back
  3. Final SQL: We need to emit the right form depending on whether merging happened

The JoinPredicate type solves this by providing stable identity for a predicate whose underlying expression changes during planning.


What the Type Looks Like

type JoinPredicate struct {
    ID      ID
    tracker *Tracker
}

It implements sqlparser.Expr and sqlparser.Visitable, acting as a wrapper/proxy around the real expression.

The Tracker maintains the actual expressions:

type Tracker struct {
    lastID      ID
    expressions map[ID]sqlparser.Expr
}
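
To make the indirection concrete, here is a minimal, self-contained sketch of how such a tracker could behave. The method names NewJoinPredicate, Get, Set, and Current come from the call sites quoted in this report; their bodies, the NewTracker constructor, the bool result of Get, and the use of a plain string in place of sqlparser.Expr are all assumptions made for illustration.

```go
package main

import "fmt"

// Expr is a stand-in for sqlparser.Expr so the sketch compiles on its own.
type Expr string

// ID is the stable handle that survives expression changes.
type ID int

// Tracker maps a stable ID to whatever form the predicate currently has.
type Tracker struct {
	lastID      ID
	expressions map[ID]Expr
}

// JoinPredicate carries only the ID and a pointer back to the tracker.
type JoinPredicate struct {
	ID      ID
	tracker *Tracker
}

// NewTracker is a hypothetical constructor for this sketch.
func NewTracker() *Tracker {
	return &Tracker{expressions: make(map[ID]Expr)}
}

// NewJoinPredicate registers expr under a fresh ID and returns the wrapper.
func (t *Tracker) NewJoinPredicate(expr Expr) *JoinPredicate {
	t.lastID++
	t.expressions[t.lastID] = expr
	return &JoinPredicate{ID: t.lastID, tracker: t}
}

// Get reports the current expression; the bool result is an assumption.
func (t *Tracker) Get(id ID) (Expr, bool) {
	expr, ok := t.expressions[id]
	return expr, ok
}

// Set swaps the expression stored behind an existing ID.
func (t *Tracker) Set(id ID, expr Expr) {
	t.expressions[id] = expr
}

// Current resolves the wrapper to whatever the tracker holds right now.
func (jp *JoinPredicate) Current() Expr {
	return jp.tracker.expressions[jp.ID]
}

func main() {
	tracker := NewTracker()
	jp := tracker.NewJoinPredicate(":bv1 = b.col")
	fmt.Println(jp.ID, jp.Current())

	// The wrapper itself is unchanged, but it now resolves to the original.
	tracker.Set(jp.ID, "a.col = b.col")
	fmt.Println(jp.ID, jp.Current())
}
```

The point of the design is visible in main: every holder of jp sees the new expression after Set, without the operator tree being rewritten.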

The Lifecycle

Phase 1: ApplyJoin Creation (apply_join.go:151-168)

When a join predicate is pushed down to RHS:

func (aj *ApplyJoin) AddJoinPredicate(ctx *plancontext.PlanningContext, expr sqlparser.Expr, pushDown bool) {
    // Abridged: the setup of preds, lhsID, and rhs is elided.
    for _, pred := range preds {
        col := breakExpressionInLHSandRHS(ctx, pred, lhsID)  // Split: a.col = b.col → LHS: a.col, RHS: :bv1 = b.col
        if pushDown {
            newPred := ctx.PredTracker.NewJoinPredicate(col.RHSExpr)  // Wrap RHS expr
            col.JoinPredicateID = &newPred.ID                         // Remember the ID
            rhs = rhs.AddPredicate(ctx, newPred)                      // Push wrapped predicate
        }
        aj.JoinPredicates.add(col)  // Store original + broken-up info
    }
}

At this point:

  • Original: a.col = b.col (stored in applyJoinColumn.Original)
  • Tracker[ID]: :bv1 = b.col (the broken-up RHS expression)
  • The JoinPredicate wrapper is pushed down to RHS for routing decisions

Phase 2: Routing Decisions

When the RHS evaluates routing, it unwraps the JoinPredicate:

// route.go:131-134
pred, isJP := in.(*predicates.JoinPredicate)
if isJP {
    expr = pred.Current()  // Get the actual expression from tracker
}

The same unwrapping appears in sharded_routing.go:226-229 and SQL_builder.go:121-124.
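
Because the same type assertion recurs at each of these sites, it could be folded into one helper. The sketch below is only an illustration of that idea: unwrap and Literal are hypothetical names, Expr stands in for sqlparser.Expr, and the JoinPredicate here holds its expression directly instead of looking it up through the Tracker.

```go
package main

import "fmt"

// Expr is a stand-in for sqlparser.Expr.
type Expr interface{ String() string }

// Literal is a hypothetical plain expression used only for this sketch.
type Literal string

func (l Literal) String() string { return string(l) }

// JoinPredicate is reduced to a direct reference to its current form;
// the real type resolves the expression through the Tracker.
type JoinPredicate struct{ current Expr }

func (jp *JoinPredicate) String() string { return "JP:" + jp.current.String() }

// Current returns the expression the wrapper stands for.
func (jp *JoinPredicate) Current() Expr { return jp.current }

// unwrap (hypothetical helper) returns the tracked expression for a
// JoinPredicate and passes every other expression through unchanged.
func unwrap(in Expr) Expr {
	if jp, ok := in.(*JoinPredicate); ok {
		return jp.Current()
	}
	return in
}

func main() {
	plain := Literal("b.col = 1")
	wrapped := &JoinPredicate{current: Literal(":bv1 = b.col")}

	fmt.Println(unwrap(plain))   // plain expressions pass through
	fmt.Println(unwrap(wrapped)) // wrappers resolve to their current form
}
```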

Phase 3: Route Merging (query_planning.go:152-175)

When attempting to merge routes, predicates are temporarily restored:

for _, col := range aj.JoinPredicates.columns {
    if col.JoinPredicateID != nil {
        id := *col.JoinPredicateID
        oldExpr, _ := ctx.PredTracker.Get(id)     // Save: :bv1 = b.col
        original[id] = oldExpr
        ctx.PredTracker.Set(id, col.Original)     // Restore: a.col = b.col
    }
}
// ... attempt merge ...
defer func() {
    if res == NoRewrite {
        for id, expr := range original {
            ctx.PredTracker.Set(id, expr)         // Rollback if merge failed
        }
    }
}()

Phase 4: SQL Building (SQL_builder.go:604-610)

If we still have an ApplyJoin (routes didn't merge):

func buildApplyJoin(op *ApplyJoin, qb *queryBuilder) {
    preds := slice.Map(op.JoinPredicates.columns, func(jc applyJoinColumn) sqlparser.Expr {
        if jc.JoinPredicateID != nil {
            qb.ctx.PredTracker.Skip(*jc.JoinPredicateID)  // Mark as "done"
        }
        return jc.Original  // Use original for the JOIN ON clause
    })
    // ...
}

Skip() sets the tracked expression to nil, which keeps it out of the RHS SQL; the original predicate is emitted in the JOIN ON clause instead.
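
A small sketch of the effect of skipping, assuming the RHS SQL builder simply drops predicates whose tracked expression is nil. Only Skip is named in the source; rhsPredicates, Raw, and the reduced Tracker are hypothetical stand-ins.

```go
package main

import "fmt"

type ID int

// Expr is a stand-in for sqlparser.Expr; a nil Expr means "skipped".
type Expr fmt.Stringer

// Raw is a hypothetical expression type holding SQL text.
type Raw string

func (r Raw) String() string { return string(r) }

type Tracker struct{ expressions map[ID]Expr }

// Skip marks the predicate as consumed by setting its expression to nil.
func (t *Tracker) Skip(id ID) { t.expressions[id] = nil }

// rhsPredicates (hypothetical) collects the predicates to emit on the RHS,
// leaving out any that were skipped.
func rhsPredicates(t *Tracker, ids []ID) []string {
	var out []string
	for _, id := range ids {
		if expr := t.expressions[id]; expr != nil {
			out = append(out, expr.String())
		}
	}
	return out
}

func main() {
	t := &Tracker{expressions: map[ID]Expr{
		1: Raw(":bv1 = b.col"),
		2: Raw(":bv2 = b.other"),
	}}
	t.Skip(1) // predicate 1 is emitted in the JOIN ON clause instead
	fmt.Println(rhsPredicates(t, []ID{1, 2}))
}
```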


Usage Sites Summary

| File | Line | Purpose |
|------|------|---------|
| apply_join.go | 161 | Create wrapper when pushing predicate to RHS |
| join.go | 154 | Create wrapper for CTE predicates |
| union.go | 122-126 | Clone for each UNION source |
| route.go | 131 | Unwrap to get actual expression |
| sharded_routing.go | 226 | Unwrap for vindex analysis |
| SQL_builder.go | 121, 607 | Unwrap for SQL output, skip when using original |
| evalengine/translate.go | 551 | Unwrap for expression evaluation |
| query_planning.go | 159-172 | Swap expressions during merge attempts |
| join_merging.go | 268 | Restore original after successful merge |
| cte_merging.go | 87 | Restore original for CTE merging |

Is This a Good Solution?

Strengths

  1. Solves a real problem - The need to track predicates through transformation is genuine
  2. Stable identity - The ID remains constant even as the expression changes
  3. Transparent unwrapping - Implements sqlparser.Expr, so most code can ignore it
  4. Reversible - Can restore original expression when routes merge
  5. Centralized state - Single Tracker in PlanningContext manages all predicates

Weaknesses

  1. Hidden mutability - The expression changes behind the scenes, which can be surprising
  2. Manual lifecycle management - Must remember to call Skip(), Set(), Get() at the right times
  3. Debugging difficulty - Stack traces show JoinPredicate, but the real expression lives in the Tracker
  4. Implicit coupling - Many places must know to unwrap JoinPredicate
  5. No compile-time safety - Easy to forget to handle JoinPredicate in new code

Alternative Approaches

Alternative 1: Explicit State Machine

Instead of mutable indirection, track predicate state explicitly:

type JoinPredicateState struct {
    Original     sqlparser.Expr  // a.col = b.col
    PushedDown   sqlparser.Expr  // :bv1 = b.col
    CurrentForm  PredicateForm   // enum: Original, PushedDown, Skipped
}

type PredicateForm int
const (
    FormOriginal PredicateForm = iota
    FormPushedDown
    FormSkipped
)

func (s *JoinPredicateState) Current() sqlparser.Expr {
    switch s.CurrentForm {
    case FormOriginal:
        return s.Original
    case FormPushedDown:
        return s.PushedDown
    case FormSkipped:
        return nil
    }
    return nil  // unknown form; also required because the switch has no default case
}

Pros:

  • Explicit state transitions
  • All forms available without mutation
  • Easier to debug (can see all forms at once)
  • No global tracker needed

Cons:

  • Larger struct stored everywhere
  • Still need to integrate with sqlparser.Expr interface
  • Requires threading state through more places
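
The "explicit state transitions" benefit could be enforced by a single method that validates each change of form. This is a sketch only: Transition and errSkipped are hypothetical, strings stand in for sqlparser.Expr, and the rule that a skipped predicate cannot be revived is an assumption about what the planner would want.

```go
package main

import (
	"errors"
	"fmt"
)

type PredicateForm int

const (
	FormOriginal PredicateForm = iota
	FormPushedDown
	FormSkipped
)

// JoinPredicateState keeps both forms; strings stand in for sqlparser.Expr.
type JoinPredicateState struct {
	Original    string
	PushedDown  string
	CurrentForm PredicateForm
}

var errSkipped = errors.New("predicate already skipped")

// Current returns the active form, or "" once the predicate is skipped.
func (s *JoinPredicateState) Current() string {
	switch s.CurrentForm {
	case FormOriginal:
		return s.Original
	case FormPushedDown:
		return s.PushedDown
	default:
		return ""
	}
}

// Transition (hypothetical) changes the active form and refuses to revive
// a predicate that has already been skipped.
func (s *JoinPredicateState) Transition(to PredicateForm) error {
	if s.CurrentForm == FormSkipped && to != FormSkipped {
		return errSkipped
	}
	s.CurrentForm = to
	return nil
}

func main() {
	s := &JoinPredicateState{
		Original:    "a.col = b.col",
		PushedDown:  ":bv1 = b.col",
		CurrentForm: FormPushedDown,
	}
	fmt.Println(s.Current())

	fmt.Println(s.Transition(FormOriginal), s.Current()) // routes merged
	fmt.Println(s.Transition(FormSkipped))               // consumed by SQL building
	fmt.Println(s.Transition(FormPushedDown))            // rejected
}
```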

Alternative 2: Copy-on-Write with Lineage

Track predicate transformations as a chain:

type PredicateLineage struct {
    Current  sqlparser.Expr
    Previous *PredicateLineage  // nil if this is the original
}

func (p *PredicateLineage) Original() sqlparser.Expr {
    for p.Previous != nil {
        p = p.Previous
    }
    return p.Current
}

Pros:

  • Immutable - each transformation creates new node
  • Full history available
  • No central tracker

Cons:

  • Memory overhead for lineage chain
  • Doesn't integrate with sqlparser.Expr interface naturally
  • Still need to coordinate "which form to use" decisions
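
A usage sketch of the lineage idea, showing that a transformation allocates a new node and leaves the earlier one untouched. Derive is a hypothetical constructor, and strings again stand in for sqlparser.Expr.

```go
package main

import "fmt"

// PredicateLineage links each form of a predicate to the one before it.
type PredicateLineage struct {
	Current  string
	Previous *PredicateLineage // nil if this is the original
}

// Original walks back to the first node in the chain.
func (p *PredicateLineage) Original() string {
	for p.Previous != nil {
		p = p.Previous
	}
	return p.Current
}

// Derive (hypothetical) returns a new node for the transformed expression
// without modifying p.
func (p *PredicateLineage) Derive(next string) *PredicateLineage {
	return &PredicateLineage{Current: next, Previous: p}
}

func main() {
	root := &PredicateLineage{Current: "a.col = b.col"}
	pushed := root.Derive(":bv1 = b.col")

	fmt.Println(pushed.Current)    // the pushed-down form
	fmt.Println(pushed.Original()) // recovered by walking the chain
	fmt.Println(root.Current)      // the root node was not mutated
}
```

Note that holders of root never observe the pushed-down form, which is exactly the coordination cost listed under the cons: callers must decide which node to hold.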

Alternative 3: Predicate Registry with Versioning

Similar to current but with explicit versions:

type PredicateRegistry struct {
    predicates map[ID]*VersionedPredicate
}

type VersionedPredicate struct {
    versions []sqlparser.Expr  // index 0 = original, higher = transformations
    active   int               // which version is "current"
}

func (vp *VersionedPredicate) Push(expr sqlparser.Expr) {
    vp.versions = append(vp.versions, expr)
    vp.active = len(vp.versions) - 1
}

func (vp *VersionedPredicate) Restore() {
    vp.active = 0
}

Pros:

  • Full history preserved
  • Explicit restore operation
  • Can inspect all versions for debugging

Cons:

  • More complex than current solution
  • Still requires wrapper type for sqlparser.Expr interface
  • Overkill if we only ever have 2 forms (original + pushed-down)
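
For comparison with the current Set-based restore, here is how the versioned variant might be used. NewVersioned and Current are hypothetical additions to the type sketched above, and strings stand in for sqlparser.Expr.

```go
package main

import "fmt"

// VersionedPredicate keeps every form; index 0 is the original.
type VersionedPredicate struct {
	versions []string
	active   int
}

// NewVersioned (hypothetical) starts the history with the original form.
func NewVersioned(original string) *VersionedPredicate {
	return &VersionedPredicate{versions: []string{original}}
}

// Push records a new form and makes it the active one.
func (vp *VersionedPredicate) Push(expr string) {
	vp.versions = append(vp.versions, expr)
	vp.active = len(vp.versions) - 1
}

// Restore reactivates the original without discarding later versions.
func (vp *VersionedPredicate) Restore() {
	vp.active = 0
}

// Current (hypothetical) returns the active version.
func (vp *VersionedPredicate) Current() string {
	return vp.versions[vp.active]
}

func main() {
	vp := NewVersioned("a.col = b.col")
	vp.Push(":bv1 = b.col")
	fmt.Println(vp.Current())

	vp.Restore()
	fmt.Println(vp.Current(), len(vp.versions)) // history is still intact
}
```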

Recommendation

The current solution is pragmatic and appropriate for the problem it solves. The indirection pattern is a reasonable trade-off given:

  1. The expression genuinely needs to change form during planning while keeping a stable identity
  2. Integration with sqlparser.Expr interface is required
  3. The number of transformation states is small (original → pushed-down → skipped)

Potential improvements without major redesign:

  1. Add FormatFast debugging - Already done (shows JP(id):expr)
  2. Document the lifecycle - A comment block showing the phases would help
  3. Consider making Skip return the original - Slightly cleaner API for SQL building
  4. Add assertions - Panic if accessing a skipped predicate unexpectedly

The main risk is forgetting to handle JoinPredicate in new code that processes expressions. A linter or code review checklist item could help catch this.
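
The assertion suggested in improvement 4 might look like the following. This is a sketch under assumptions: MustGet is a hypothetical method, not part of the existing Tracker API, and the reduced Tracker and Raw types exist only to make the example self-contained.

```go
package main

import "fmt"

type ID int

// Expr is a stand-in for sqlparser.Expr; a nil Expr means "skipped".
type Expr fmt.Stringer

// Raw is a hypothetical expression type holding SQL text.
type Raw string

func (r Raw) String() string { return string(r) }

type Tracker struct{ expressions map[ID]Expr }

// Skip marks the predicate as consumed.
func (t *Tracker) Skip(id ID) { t.expressions[id] = nil }

// MustGet (hypothetical) returns the tracked expression and panics if the
// ID is unknown or the predicate has already been skipped.
func (t *Tracker) MustGet(id ID) Expr {
	expr, ok := t.expressions[id]
	if !ok {
		panic(fmt.Sprintf("join predicate %d is not tracked", id))
	}
	if expr == nil {
		panic(fmt.Sprintf("join predicate %d was skipped", id))
	}
	return expr
}

func main() {
	t := &Tracker{expressions: map[ID]Expr{1: Raw(":bv1 = b.col")}}
	fmt.Println(t.MustGet(1))

	t.Skip(1)
	defer func() { fmt.Println("recovered:", recover()) }()
	t.MustGet(1) // panics: the predicate was already consumed
}
```

Call sites that legitimately expect a skipped predicate would keep using the non-panicking accessor, so the assertion only fires where a nil expression indicates a planner bug.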


References

  • go/vt/vtgate/planbuilder/operators/predicates/predicate.go - The type definition
  • go/vt/vtgate/planbuilder/operators/predicates/tracker.go - The registry
  • go/vt/vtgate/planbuilder/operators/apply_join.go:151-168 - Creation site
  • go/vt/vtgate/planbuilder/operators/query_planning.go:145-175 - Merge/restore logic
  • go/vt/vtgate/planbuilder/operators/SQL_builder.go:604-610 - Final consumption
  • go/vt/vtgate/planbuilder/operators/README.md:180-183 - Brief documentation