Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save jeje-andal/0a5748b883f752d0ede81ad7acc6a91b to your computer and use it in GitHub Desktop.

Select an option

Save jeje-andal/0a5748b883f752d0ede81ad7acc6a91b to your computer and use it in GitHub Desktop.
PR Review: Distributed Tracing Implementation (163)

📋 PR Review Metadata

┌─────────────────────────────────────────────────────────────────┐ │ ✅ File: C:\Documents\Notes\AI Answer\PR-Review-Gitea-AKServer-AKServer-163-20250313-001245.md │ │ 🔗 PR: https://git.andalsoftware.com/Andal.Kharisma/AK.Server/pulls/163 │ │ 📊 Decision: REQUEST CHANGES │ │ 📁 Files: 16 changed │ │ ⚠️ Risk: HIGH │ └─────────────────────────────────────────────────────────────────┘

Use this path for next step: C:\Documents\Notes\AI Answer\PR-Review-Gitea-AKServer-AKServer-163-20250313-001245.md


PR Review Report: Distributed Tracing Implementation

PR: #163 - Distributed Tracing Across AK.Server Repository: Andal.Kharisma/AK.Server Platform: Gitea Author: (From patch analysis) Review Date: 2026-03-13


Executive Summary

This PR implements distributed tracing across the AK.Server codebase, adding W3C Trace Context propagation through background task queues, workflow activities, and HTTP service calls. The implementation spans 16 files across Attendance, Orchestrator, and Persistence modules.

Overall Assessment: The PR contains a critical bug in PostGraphQLRequest.cs that ignores the explicit traceParent parameter, breaking cross-service distributed tracing. Additionally, the W3CTraceContextMiddleware changes introduce duplicate span creation. These issues must be resolved before merge.

Decision: REQUEST CHANGES


Stack Analysis

Backend (.NET/C#) Changes

Module Breakdown:

Module Files Purpose
Attendance 3 Background task queue tracing
Orchestrator 8 Workflow activity tracing, HTTP client tracing
Persistence 5 Core tracing infrastructure

Technology Patterns:

  • ActivitySource: Used for creating custom activities
  • W3C Trace Context: traceparent and tracestate header propagation
  • ActivityContext: For serializing trace context across async boundaries
  • using statements: Proper activity disposal

Critical Issues (Must Fix)

1. [CRITICAL] PostGraphQLRequest.cs Ignores Explicit traceParent Parameter

Location: src/Persistence/Helper/PostGraphQLRequest.cs (lines 481-498)

Problem: The refactored code completely ignores the traceParent method parameter and always uses Activity.Current:

// NEW CODE (BROKEN):
string? traceParentToUse = null;
if (Activity.Current != null)
{
    var traceId = Activity.Current.TraceId.ToString();
    var spanId = Activity.Current.SpanId.ToString();
    var flags = (Activity.Current.ActivityTraceFlags & ActivityTraceFlags.Recorded) != 0 ? "01" : "00";
    traceParentToUse = $"00-{traceId}-{spanId}-{flags}";
}

if (!string.IsNullOrEmpty(traceParentToUse))
{
    client.HttpClient.DefaultRequestHeaders.Add("traceparent", traceParentToUse);
}

Impact:

  • Breaks distributed tracing for all cross-service calls from workflow activities
  • Workflow activities explicitly pass TraceParent from the workflow context, but it's ignored
  • Each service call starts a new trace instead of continuing the existing one

Fix:

// CORRECTED CODE:
string? traceParentToUse = traceParent; // Use passed parameter first

// Only fall back to Activity.Current if no explicit traceParent provided
if (string.IsNullOrEmpty(traceParentToUse) && Activity.Current != null)
{
    var traceId = Activity.Current.TraceId.ToString();
    var spanId = Activity.Current.SpanId.ToString();
    var flags = (Activity.Current.ActivityTraceFlags & ActivityTraceFlags.Recorded) != 0 ? "01" : "00";
    traceParentToUse = $"00-{traceId}-{spanId}-{flags}";
}

if (!string.IsNullOrEmpty(traceParentToUse))
{
    client.HttpClient.DefaultRequestHeaders.Add("traceparent", traceParentToUse);
}

2. [HIGH] W3CTraceContextMiddleware Creates Duplicate Activities

Location: src/Persistence/Infrastructure/Tracing/W3CTraceContextMiddleware.cs (lines 637-639)

Problem: The middleware now creates a new Activity instead of enriching the existing one:

// NEW CODE (PROBLEMATIC):
activityToUse = new Activity("HTTP " + context.Request.Method + " " + context.Request.Path)
    .SetParentId(ActivityTraceId.CreateFromString(traceId), ActivitySpanId.CreateFromString(spanId))
    .Start();

Impact:

  • ASP.NET Core already creates an Activity for each request
  • This creates a second Activity, causing duplicate spans in traces
  • Breaks the parent-child relationship chain
  • May cause sampling inconsistencies

Fix: Remove the Activity creation logic and revert to enriching Activity.Current:

// CORRECTED CODE:
if (!string.IsNullOrEmpty(traceParent) && Activity.Current != null)
{
    _logger.LogDebug("Extracted traceparent from request: {TraceParent}", traceParent);

    // ASP.NET Core with OpenTelemetry already extracts traceparent automatically
    // Just enrich the current activity with additional context
    Activity.Current.SetTag("http.request_id", context.TraceIdentifier);
    Activity.Current.SetTag("http.route", context.Request.Path);

    if (!string.IsNullOrEmpty(traceState))
    {
        Activity.Current.SetTag("w3c.tracestate", traceState);
    }

    var customerId = context.User?.Identity?.Name ??
                    context.Request.Headers["X-Customer-ID"].FirstOrDefault();
    if (!string.IsNullOrEmpty(customerId))
    {
        Activity.Current.SetTag("customer.id", customerId);
    }
}

3. [MEDIUM] QueryGraphQLService.cs - Inconsistent traceParent Logic

Location: src/Orchestrator/GraphQL/Services/Implementations/QueryGraphQLService.cs (lines 458-466)

Problem: The code has inconsistent logic for generating traceparent:

var traceParentToUse = traceParent;
if (string.IsNullOrEmpty(traceParentToUse) && Activity.Current != null)
{
    var traceId = Activity.Current.TraceId.ToString();
    var spanId = Activity.Current.SpanId.ToString();
    var flags = (Activity.Current.ActivityTraceFlags & ActivityTraceFlags.Recorded) != 0 ? "01" : "00";
    traceParentToUse = $"00-{traceId}-{spanId}-{flags}";
}

Issue: This duplicates the logic from QueueTracePropagator.GetTraceParent(). Consider using the helper method for consistency.

Fix:

var traceParentToUse = traceParent ?? QueueTracePropagator.GetTraceParent();

Positive Findings

1. Proper Activity Disposal

All new Activity creations use using statements, ensuring proper disposal:

// GOOD: QueuedHostedService.cs
using var activity = workItem.ParentActivityContext.HasValue
    ? ActivitySource.StartActivity(...)
    : ActivitySource.StartActivity(...);

// GOOD: FinalizeProcessActivity.cs
using var activity = QueueTracePropagator.StartQueueActivity(
    "FinalizeProcessActivity",
    input.processParam.TraceParent);

2. Exception Recording with Context

The PR adds proper exception recording with contextual tags:

// GOOD: Exception handling in activities
catch (DataAccessException dae)
{
    QueueTracePropagator.RecordException(dae, new Dictionary<string, object?> { ["customer.id"] = input.customerId });
    throw new DataAccessException(errorLog, dae);
}

3. W3C Format Compliance

The traceparent format follows W3C specification:

  • 00-{traceId}-{spanId}-{flags}
  • traceId: 32 hex characters
  • spanId: 16 hex characters
  • flags: 00 or 01

4. Background Task Queue Trace Propagation

The WorkItem class now includes ActivityContext for proper trace propagation across the background task queue:

public class WorkItem
{
    public string Message { get; set; }
    public List<object> Data { get; set; }
    public string CustomerId { get; set; }

    /// <summary>
    /// Captured ActivityContext from the parent request for distributed tracing.
    /// </summary>
    public ActivityContext? ParentActivityContext { get; set; }
}

Code Quality Issues

1. Inconsistent Null-Conditional Operator Usage

Location: src/Attendance/Infrastructure/QueueChannel/BackgroundTaskQueueService.cs (line 17)

// Current:
workItem.ParentActivityContext ??= Activity.Current?.Context;

// Issue: If Activity.Current is null, this sets null. Consider explicit check.

2. Missing Documentation on DaprTracePropagationHandler

Location: src/Persistence/Infrastructure/Tracing/DaprTracePropagationHandler.cs

The new handler is registered but lacks integration tests or documentation on how it interacts with Dapr's built-in tracing.

3. Hardcoded Dapr Sidecar URL

Location: src/Orchestrator/GraphQL/Program.cs (lines 394-399)

services.AddHttpClient("DaprSidecar", client =>
{
    client.BaseAddress = new Uri("http://127.0.0.1:50001/");
    client.DefaultRequestHeaders.Add("dapr-app-id", "orchestrator");
})
.AddHttpMessageHandler<DaprTracePropagationHandler>();

The Dapr sidecar URL should be configurable via environment variables or configuration.


Cross-Stack Impact Analysis

Dependencies:

Attendance Module
  └── BackgroundTaskQueueService ──> WorkItem (with ActivityContext)
        └── QueuedHostedService ──> ActivitySource.StartActivity()

Orchestrator Module
  └── Workflow Activities ──> QueueTracePropagator.StartQueueActivity()
        ├── ProcessAttendanceActivity
        ├── ProcessHRMSActivity
        ├── ProcessPayrollActivity
        └── FinalizeProcessActivity
              └── PostGraphQLRequest.PostToGraphQL*() [BROKEN - ignores traceParent]

Persistence Module
  ├── QueueTracePropagator (shared utility)
  ├── W3CTraceContextMiddleware [PROBLEMATIC - duplicate activities]
  ├── GraphQLTracingInterceptor
  └── DaprTracePropagationHandler

Breaking Changes:

Component Change Type Impact
WorkItem Property Added Non-breaking - new nullable property
ProcessResult Property Added Non-breaking - new nullable property
PostGraphQLRequest Behavior Change BREAKING - ignores traceParent parameter
W3CTraceContextMiddleware Behavior Change BREAKING - creates duplicate activities

Testing Recommendations

Required Tests:

  1. Integration Test: Cross-Service Trace Propagation

    [Fact]
    public async Task WorkflowActivity_PreservesTraceContext_AcrossServiceCalls()
    {
        // Arrange: Start with a known trace ID
        var expectedTraceId = ActivityTraceId.CreateRandom();
    
        // Act: Execute workflow activity that calls another service
    
        // Assert: Verify traceparent header contains expected trace ID
    }
  2. Unit Test: PostGraphQLRequest Uses Explicit TraceParent

    [Fact]
    public async Task PostGraphQLRequest_UsesExplicitTraceParent_WhenProvided()
    {
        var explicitTraceParent = "00-abc123...-def456...-01";
    
        await PostGraphQLRequest.PostToGraphQLSAIS(query, customerId, null, token, explicitTraceParent);
    
        // Assert: HTTP request contains explicitTraceParent, not Activity.Current
    }
  3. Test: No Duplicate Activities Created

    [Fact]
    public async Task Middleware_DoesNotCreateDuplicateActivities()
    {
        var activityCountBefore = Activity.Current?.Source.Name;
    
        await _middleware.InvokeAsync(context);
    
        // Assert: Only one activity per request
    }

Security Considerations

  1. Trace ID Exposure: The PR adds X-Trace-Id and X-Trace-Parent headers to HTTP responses. This is standard practice but verify these don't expose sensitive internal information.

  2. CORS Headers: The PR exposes tracing headers via CORS:

    .WithExposedHeaders("X-Trace-Id", "X-Trace-Parent");

    This is correct for allowing frontend applications to access trace IDs.


Performance Impact

Aspect Impact Notes
Memory Low Activities properly disposed with using
CPU Low Minimal overhead from Activity creation
Network Low Small increase in header size (~55 bytes per request)
Database None No DB changes

Prioritized Recommendations

Must Fix (Blocking Merge):

  1. [CRITICAL] Fix PostGraphQLRequest.cs to use explicit traceParent parameter
  2. [HIGH] Revert W3CTraceContextMiddleware.cs to enrich Activity.Current instead of creating new Activities

Should Fix (Before Merge):

  1. [MEDIUM] Use QueueTracePropagator.GetTraceParent() in QueryGraphQLService.cs for consistency
  2. [MEDIUM] Make Dapr sidecar URL configurable

Nice to Have (Future PR):

  1. [LOW] Add integration tests for cross-service trace propagation
  2. [LOW] Add documentation for Dapr trace propagation

Quality Metrics

Metric Score Notes
Code Quality 75/100 Good patterns, but critical bugs introduced
Test Coverage N/A No tests visible in diff
Documentation 70/100 Good XML comments, missing architecture docs
Breaking Changes 2 Both unintentional and must be fixed
Performance 90/100 Proper disposal, minimal overhead

Overall Quality Score: 65/100 (would be 85+ without the critical bugs)


Next Steps

  1. Fix critical bugs in PostGraphQLRequest.cs and W3CTraceContextMiddleware.cs
  2. Add integration tests for cross-service trace propagation
  3. Verify fix with end-to-end trace visualization (Jaeger/Zipkin)
  4. Re-review after fixes applied

Appendix: Files Changed Summary

File Module Change Type
BackgroundTaskQueueService.cs Attendance Modified - Add trace context capture
QueuedHostedService.cs Attendance Modified - Add Activity creation
WorkItem.cs Attendance Modified - Add ActivityContext property
FinalizeProcessActivity.cs Orchestrator Modified - Add tracing
GetCustomerDataActivity.cs Orchestrator Modified - Add tracing
ProcessAttendanceActivity.cs Orchestrator Modified - Add tracing
ProcessHRMSActivity.cs Orchestrator Modified - Add tracing
ProcessPayrollActivity.cs Orchestrator Modified - Add tracing
SendSSEActivity.cs Orchestrator Modified - Add tracing
Program.cs Orchestrator Modified - Add middleware and handlers
DateProcessingToolService.cs Orchestrator Modified - Use QueueTracePropagator
HealthCheckService.cs Orchestrator Modified - Trace handling
QueryGraphQLService.cs Orchestrator Modified - Trace generation
PostGraphQLRequest.cs Persistence Modified - CRITICAL BUG
DaprTracePropagationHandler.cs Persistence New file
GraphQLTracingInterceptor.cs Persistence Modified - Add trace to context
QueueTracePropagator.cs Persistence Modified - W3C format
W3CTraceContextMiddleware.cs Persistence Modified - PROBLEMATIC
WorkflowInput.cs Persistence Modified - Add TraceParent property

Review complete. File path shown in metadata block above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment