@skohari
Created May 19, 2025 19:50
vertexai deets
import logging
import vertexai
from vertexai.preview import rag

# Enable detailed logging to see the actual URLs
logging.basicConfig(level=logging.DEBUG)
logging.getLogger('google.cloud.aiplatform').setLevel(logging.DEBUG)
logging.getLogger('urllib3.connectionpool').setLevel(logging.DEBUG)

# Your current setup
vertexai.init(
    project="your-project-id",
    location="your-location",
    api_endpoint="https://your-enterprise-endpoint.com"
)

# Try the RAG operation and observe the logs
try:
    print("Attempting RAG corpus creation...")
    corpus = rag.create_corpus(display_name="test-corpus")
except Exception as e:
    print(f"RAG failed: {e}")
    # The logs above will show you what URL it's trying to reach
skohari commented May 19, 2025

import os
import logging
import vertexai
from vertexai.preview import rag

# Set up detailed gRPC logging
os.environ['GRPC_VERBOSITY'] = 'DEBUG'
os.environ['GRPC_TRACE'] = 'all'

# Also enable Python logging
logging.basicConfig(level=logging.DEBUG)

try:
    vertexai.init(
        project="your-project-id",
        location="your-location",
        api_endpoint="https://your-enterprise-endpoint.com"
    )

    # This will show in logs what hostname it's trying to resolve
    corpus = rag.create_corpus(display_name="test")

except Exception as e:
    print(f"Error: {e}")
    # Check the debug logs for the actual hostname being resolved

skohari commented May 19, 2025

from google.auth import default
from google.cloud import resourcemanager_v3
from google.iam.v1 import iam_policy_pb2

def check_rag_permissions(project_id):
    """Check if you have the permissions needed for RAG."""
    try:
        # Get the current user's credentials
        credentials, _ = default()

        # Permissions needed for RAG
        rag_permissions = [
            "aiplatform.ragCorpora.create",
            "aiplatform.ragCorpora.get",
            "aiplatform.ragCorpora.list",
            "aiplatform.ragFiles.upload",
            "aiplatform.ragFiles.import",
        ]

        # Test permissions using Cloud Resource Manager
        client = resourcemanager_v3.ProjectsClient(credentials=credentials)

        request = iam_policy_pb2.TestIamPermissionsRequest(
            resource=f"projects/{project_id}",
            permissions=rag_permissions,
        )

        response = client.test_iam_permissions(request=request)

        print("RAG Permission Check Results:")
        for permission in rag_permissions:
            has_permission = permission in response.permissions
            status = "✅ GRANTED" if has_permission else "❌ DENIED"
            print(f"  {permission}: {status}")

        return len(response.permissions) == len(rag_permissions)

    except Exception as e:
        print(f"Error checking permissions: {e}")
        return False

# Check permissions
has_rag_permissions = check_rag_permissions("your-project-id")

skohari commented May 20, 2025

Vendor Document Assessment System: Technical Documentation

System Overview

The Vendor Document Assessment System is a GenAI-powered application designed to evaluate vendor documentation against domain-specific assessment criteria. The system employs a Retrieval-Augmented Generation (RAG) approach to provide accurate, context-aware evaluations across multiple domains. This document details the system architecture, data flow, and key components that enable this functionality.

Architecture Components

1. FastAPI Backend

The core of the system is built on FastAPI, providing a high-performance, asynchronous API framework that handles all incoming requests and orchestrates the document processing workflow. Key features include:

  • RESTful API endpoints for document submission, assessment requests, and results retrieval
  • Asynchronous processing to handle concurrent document assessment requests
  • JWT-based authentication and role-based access control
  • Comprehensive request validation and error handling
  • Prometheus metrics and health check endpoints for monitoring
  • Swagger/OpenAPI documentation automatically generated for API endpoints
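As a rough illustration of the request validation the API layer performs, here is a stdlib sketch; the real service would use FastAPI's pydantic models, and the field names and domain set here are assumptions:

```python
from dataclasses import dataclass, field

ALLOWED_DOMAINS = {"security", "compliance", "technical"}  # assumed domain set

@dataclass
class AssessmentRequest:
    """Hypothetical payload for an assessment request."""
    document_id: str
    domain: str
    questions: list = field(default_factory=list)

    def validate(self):
        """Return a list of validation errors; empty means the request is valid."""
        errors = []
        if not self.document_id:
            errors.append("document_id is required")
        if self.domain not in ALLOWED_DOMAINS:
            errors.append(f"unknown domain: {self.domain!r}")
        if not self.questions:
            errors.append("at least one question is required")
        return errors

# A valid request produces no errors; an invalid one lists everything that failed.
ok = AssessmentRequest("doc-123", "security", ["Is data encrypted at rest?"])
bad = AssessmentRequest("", "finance", [])
```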

2. Salesforce Integration

The system integrates with Salesforce through its API to:

  • Poll for new vendor document submissions at configurable intervals
  • Retrieve document metadata including vendor information, document type, and relevant categories
  • Update assessment status in Salesforce upon completion
  • Maintain synchronization between systems using idempotent operations and event tracking

The integration uses OAuth 2.0 for secure authentication and implements connection pooling and rate limiting to ensure optimal performance while respecting Salesforce API constraints.
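The idempotent synchronization mentioned above can be sketched as a processed-event ledger; the event shape is an assumption, and a real integration would fetch events from the Salesforce REST API with OAuth 2.0 credentials and persist the ledger durably:

```python
class DocumentSync:
    """Track processed Salesforce event IDs so polling is idempotent."""

    def __init__(self):
        self._processed = set()  # in production this would be durable storage

    def process_events(self, events):
        """Handle only events not seen before; return the newly handled IDs."""
        handled = []
        for event in events:
            event_id = event["id"]
            if event_id in self._processed:
                continue  # duplicate delivery: skip, making retries safe
            self._processed.add(event_id)
            handled.append(event_id)
        return handled

sync = DocumentSync()
batch = [{"id": "evt-1"}, {"id": "evt-2"}]
first = sync.process_events(batch)   # both events handled
second = sync.process_events(batch)  # replayed poll: nothing handled twice
```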

3. Document Processing Pipeline

Upon document retrieval, the system performs minimal preprocessing before leveraging Vertex AI RAG Engine's capabilities:

  • Format Detection: Automatically identifies document formats (PDF, DOCX, TXT, etc.)
  • Basic Content Validation: Ensures documents are not corrupted and meet basic quality criteria
  • Metadata Enrichment: Adds vendor metadata, document type, and classification information
  • GCS Upload: Uploads documents directly to Google Cloud Storage with appropriate organization
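Format detection of the kind described can be approximated with file signatures (magic bytes); this simplified sketch covers only the formats named above:

```python
def detect_format(data: bytes, filename: str = "") -> str:
    """Best-effort format detection from magic bytes, falling back to the extension."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04") and filename.lower().endswith(".docx"):
        return "docx"  # DOCX is a ZIP container; disambiguate via extension
    try:
        data.decode("utf-8")
        return "txt"   # decodable text is treated as plain text
    except UnicodeDecodeError:
        return "unknown"

fmt = detect_format(b"%PDF-1.7 sample", "report.pdf")
```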

The system minimizes custom processing since Vertex AI RAG Engine handles the technical aspects of:

  • Document chunking
  • Text extraction from various formats
  • Embedding generation
  • Vector storage
  • Semantic retrieval

This approach significantly reduces system complexity and maintenance overhead while leveraging Google's optimized RAG implementation.

4. Google Cloud Storage Integration

Processed documents and their metadata are stored in Google Cloud Storage:

  • Documents are organized in a hierarchical bucket structure (vendor/category/document-id)
  • Content is stored in both raw and processed formats
  • Vector embeddings are stored alongside text chunks
  • Cloud Storage Object Lifecycle Management policies automate retention and deletion
  • Access is managed via IAM roles and signed URLs for temporary access
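The hierarchical layout (vendor/category/document-id) can be built deterministically; this sketch shows the naming convention only, with the sanitization rule and `raw`/`processed` variants assumed:

```python
import re

def object_path(vendor: str, category: str, document_id: str, variant: str = "raw") -> str:
    """Build a GCS object name following the vendor/category/document-id layout."""
    def slug(part: str) -> str:
        # Lowercase and replace anything outside [a-z0-9-] so paths stay predictable
        return re.sub(r"[^a-z0-9-]+", "-", part.lower()).strip("-")
    return f"{slug(vendor)}/{slug(category)}/{slug(document_id)}/{variant}"

path = object_path("Acme Corp", "Security", "DOC_42")
```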

5. Vertex AI RAG Implementation

The system leverages Google Cloud's Vertex AI RAG Engine to abstract away many of the complex technical processes:

  • Direct Document Integration: The RAG Engine can directly read documents from the Google Cloud Storage bucket, eliminating the need for custom embedding and chunking implementations
  • Managed Vector Database: Uses Vertex AI's built-in vector storage capabilities for efficient similarity search
  • Automatic Chunking and Embedding: Vertex AI RAG Engine handles the technical aspects of document chunking and embedding generation
  • Custom Prompting: Domain-specific prompt templates are used to frame assessment questions
  • Context Retrieval: Top-K relevant chunks are retrieved based on semantic similarity
  • Response Generation: A large language model generates assessments based on the retrieved context
  • Response Validation: Output is validated against assessment criteria for compliance and quality
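Vertex AI RAG Engine performs retrieval itself, but the Top-K semantic similarity step it implements can be illustrated in plain Python over toy two-dimensional embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts most similar to the query embedding."""
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c["embedding"]), reverse=True)
    return [c["text"] for c in scored[:k]]

chunks = [
    {"text": "encryption policy", "embedding": [1.0, 0.0]},
    {"text": "office locations",  "embedding": [0.0, 1.0]},
    {"text": "key management",    "embedding": [0.9, 0.1]},
]
results = top_k([1.0, 0.05], chunks, k=2)
```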

6. Assessment Engine

The assessment component evaluates vendor documentation against predefined criteria:

  • Maintains a library of assessment templates for different domains (security, compliance, technical, etc.)
  • Dynamically selects relevant questions based on document type and vendor category
  • Employs a scoring framework for quantitative evaluation
  • Implements confidence scoring for uncertainty quantification
  • Provides evidence citations linking assessments to specific document sections
  • Supports human-in-the-loop review for assessments below confidence thresholds
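The confidence-threshold routing to human review can be sketched as follows; the threshold value and result shape are assumptions:

```python
CONFIDENCE_THRESHOLD = 0.75  # assumed cutoff for auto-approval

def route_assessment(assessment):
    """Send low-confidence assessments to human review; pass the rest through."""
    if assessment["confidence"] >= CONFIDENCE_THRESHOLD:
        return {**assessment, "status": "auto-approved"}
    return {**assessment, "status": "pending-human-review"}

high = route_assessment({"question": "Is data encrypted?", "score": 4, "confidence": 0.92})
low = route_assessment({"question": "Is MFA enforced?", "score": 2, "confidence": 0.40})
```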

Data Flow

  1. Document Intake: System polls Salesforce API or receives webhook notifications about new vendor documents
  2. Document Retrieval: Document metadata and file locations are obtained from Salesforce
  3. Processing: Documents undergo the processing pipeline stages
  4. Storage: Processed documents and metadata are stored in Google Cloud Storage
  5. RAG Corpus Creation: Vertex AI API creates/updates the RAG corpus with new document information
  6. Assessment Request: The system receives assessment requests with domain-specific questions
  7. Context Retrieval: Relevant document sections are retrieved from the RAG corpus
  8. Assessment Generation: Vertex AI generates assessments based on retrieved context
  9. Result Storage: Assessment results are stored and linked to the original documents
  10. Result Delivery: Results are returned via API and optionally pushed to Salesforce
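The steps above can be strung together as a simple orchestration loop; every stage here is a stub standing in for the real Salesforce, Cloud Storage, and Vertex AI calls:

```python
def run_pipeline(document, stages):
    """Pass a document record through each named stage in order, collecting a trace."""
    trace = []
    for name, stage in stages:
        document = stage(document)
        trace.append(name)
    return document, trace

stages = [
    ("retrieve", lambda d: {**d, "content": "..."}),           # from Salesforce
    ("store",    lambda d: {**d, "gcs_uri": "gs://bucket/x"}),  # to Cloud Storage
    ("index",    lambda d: {**d, "indexed": True}),             # RAG corpus update
    ("assess",   lambda d: {**d, "assessment": "compliant"}),   # Vertex AI
]
doc, trace = run_pipeline({"id": "doc-1"}, stages)
```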

Deployment Architecture

The system is deployed as containerized microservices in OpenShift Container Platform (OCP) clusters:

  • Separate services for API handling, document processing, and assessment generation
  • Autoscaling through OpenShift's HorizontalPodAutoscaler based on request load and processing queue depth
  • Deployment strategies leveraging OpenShift's DeploymentConfig for zero-downtime updates
  • Multi-cluster deployment with OpenShift's federation capabilities for failover and resilience
  • Service mesh implementation using OpenShift Service Mesh for secure service-to-service communication
  • Route configuration for external API access with TLS termination
  • Persistent volumes for stateful components backed by enterprise storage
  • Integration with OpenShift's built-in monitoring and logging stack
  • CI/CD pipeline integration through OpenShift Pipelines (Tekton)

Security Considerations

  • All data in transit is encrypted using TLS 1.3
  • All data at rest is encrypted using Google-managed encryption keys
  • API access requires authentication via JWT tokens
  • Fine-grained authorization through RBAC policies
  • Audit logging for all document access and assessment operations
  • Regular vulnerability scanning and dependency updates
  • Sensitive information handling compliant with GDPR and other relevant regulations

Monitoring and Observability

  • Prometheus metrics for system performance and health
  • Structured logging with correlation IDs for request tracing
  • Alerting based on error rates, latency, and queue depth
  • Dashboard visualizations for system status and assessment metrics
  • Automated anomaly detection for unusual system behavior

Disaster Recovery

  • Regular backups of configuration and assessment templates
  • Point-in-time recovery capabilities for document storage
  • Documented recovery procedures for various failure scenarios
  • Regular disaster recovery testing
  • Multi-region redundancy for critical components

