Date: 2026-01-16
Status: ✅ COMPLETED
Files Modified:
src/photos/tableManager.ts, src/photos/index.ts
Changes:
- Modified markAsProcessing() to return boolean instead of void
- Added WHERE clause to prevent updating records already in "processing" status
- Only allows update if status is not "processing" OR if stuck for more than 3 minutes
- Returns true if successfully marked, false if already being processed
- Updated processListing() to check return value and skip if already being processed
Impact: Prevents multiple service instances from processing the same listing simultaneously.
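The claim logic described above behaves like a compare-and-set on the record's status. The following is a minimal in-memory sketch, not the actual tableManager.ts code: the record shape, the processingStartedAt field, and the SQL shown in the comment (including table and column names) are assumptions for illustration. In the real service the atomicity would come from the database's conditional UPDATE rather than from this function.

```typescript
// Hypothetical record shape; the real table columns are not shown in this summary.
type Status = "pending" | "processing" | "completed" | "failed";

interface PhotoRecord {
  status: Status;
  processingStartedAt: number | null; // epoch milliseconds
}

const STUCK_TIMEOUT_MS = 3 * 60 * 1000; // 3 minutes, per the change list

// Assumed SQL equivalent (illustrative names only):
//   UPDATE photo_processing
//      SET status = 'processing', processing_started_at = NOW()
//    WHERE listing_id = ?
//      AND (status <> 'processing'
//           OR processing_started_at < NOW() - INTERVAL 3 MINUTE)
// The caller would return true when exactly one row was affected.
function markAsProcessing(
  records: Map<string, PhotoRecord>,
  listingId: string,
  now: number,
): boolean {
  const record = records.get(listingId);
  if (record === undefined) return false;

  const startedAt = record.processingStartedAt;
  const isStuck =
    record.status === "processing" &&
    startedAt !== null &&
    now - startedAt > STUCK_TIMEOUT_MS;

  // Refuse the claim while another instance holds a fresh "processing" lock.
  if (record.status === "processing" && !isStuck) return false;

  record.status = "processing";
  record.processingStartedAt = now;
  return true;
}

// Caller side: skip the listing when the claim fails.
function processListing(
  records: Map<string, PhotoRecord>,
  listingId: string,
  now: number,
): string {
  if (!markAsProcessing(records, listingId, now)) {
    return `Skipping listing ${listingId}: Already being processed by another instance`;
  }
  return `Processing listing ${listingId}`;
}
```

Returning a boolean lets the caller distinguish "claimed" from "someone else has it" without a separate read, which is what closes the race window between checking and updating.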
Files Modified:
src/photos/index.ts
Changes:
- Changed recoverOrphanedRecords() from private to public
- Reduced stuck timeout from 15 minutes to 3 minutes
- Added periodic recovery check every 60 loops (~5 minutes)
- Recovery now runs continuously during service operation, not just at startup
Impact: Stuck records are now recovered every ~5 minutes instead of only at service restart.
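The periodic check can be sketched as a simple loop counter. This is an illustration only: the 5-second loop delay is an assumption inferred from "60 loops (~5 minutes)", and recoveryLoops is a hypothetical helper that simulates the main loop rather than the service's real code.

```typescript
const RECOVERY_INTERVAL_LOOPS = 60; // from the change list
const ASSUMED_LOOP_DELAY_MS = 5000; // assumption: inferred from 60 loops taking about 5 minutes

// True on every 60th loop iteration (60, 120, 180, ...).
function shouldRunRecovery(loopCount: number): boolean {
  return loopCount > 0 && loopCount % RECOVERY_INTERVAL_LOOPS === 0;
}

// Simulate the main loop and record the iterations on which recovery would fire.
function recoveryLoops(totalLoops: number): number[] {
  const fired: number[] = [];
  for (let loop = 1; loop <= totalLoops; loop++) {
    // The service would call recoverOrphanedRecords() at this point.
    if (shouldRunRecovery(loop)) fired.push(loop);
  }
  return fired;
}

// Interval implied by the assumed loop delay, in minutes.
const recoveryIntervalMinutes =
  (RECOVERY_INTERVAL_LOOPS * ASSUMED_LOOP_DELAY_MS) / 60000;
```

Because the interval is counted in loops rather than wall-clock time, the real gap between recovery runs grows when individual loops take longer than the assumed delay.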
Files Modified:
src/photos/photoProcessor.ts
Changes:
- Added 2-minute timeout to RETS photo fetch operation
- Uses Promise.race() to race fetch against timeout
- Throws clear error message if timeout occurs
Impact: Prevents indefinite hangs when RETS server is unresponsive.
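A common way to build this kind of timeout with Promise.race() is shown below. It is a sketch under assumptions, not the code in photoProcessor.ts: withTimeout is a hypothetical helper name, and fetchPhotosFromRets in the usage comment stands in for the real RETS call.

```typescript
const RETS_FETCH_TIMEOUT_MS = 2 * 60 * 1000; // 2 minutes, per the change list

// Rejects with a descriptive error if `work` does not settle within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timeout after ${ms / 1000}s`));
    }, ms);
  });

  // Whichever promise settles first wins; always clear the timer afterwards
  // so a finished fetch does not keep the process alive.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Usage (fetchPhotosFromRets is a hypothetical stand-in):
//   const photos = await withTimeout(
//     fetchPhotosFromRets(listing),
//     RETS_FETCH_TIMEOUT_MS,
//     "RETS photo fetch",
//   );
```

Note that Promise.race() only stops waiting; it does not cancel the underlying request. If the RETS client supports cancellation, aborting the request on timeout would also release the connection.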
Files Modified:
src/photos/photoProcessor.ts
Changes:
- Added failure tracking (failedCount)
- Calculates success rate for each listing
- Throws error if all photos fail
- Throws error if success rate < 50%
- Logs warning if any photos fail but success rate is acceptable
Impact: Prevents listings from being marked as "completed" with empty or incomplete photo data.
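The thresholds above can be expressed as a small decision function. This is a sketch: evaluatePhotoResults is a hypothetical name, the message wording follows the log messages listed later in this document, and treating a listing with zero photos as "nothing to evaluate" is an assumption.

```typescript
const MIN_SUCCESS_RATE = 0.5; // 50% of photos must process successfully

// Returns a warning string for partial failures, null when everything
// succeeded, and throws when the listing must not be marked as completed.
function evaluatePhotoResults(totalCount: number, failedCount: number): string | null {
  if (totalCount === 0) return null; // assumption: nothing to evaluate

  const successCount = totalCount - failedCount;
  const successRate = successCount / totalCount;
  const percent = Math.round(successRate * 100);

  if (successCount === 0) {
    throw new Error(`All ${totalCount} photos failed to process`);
  }
  if (successRate < MIN_SUCCESS_RATE) {
    throw new Error(
      `Only ${successCount}/${totalCount} photos processed successfully ` +
        `(${percent}% success rate, minimum 50% required)`,
    );
  }
  if (failedCount > 0) {
    return `Warning: ${failedCount}/${totalCount} photos failed to process (${percent}% success rate)`;
  }
  return null;
}
```

With these rules a listing where exactly half the photos succeed still passes with a warning, since the check rejects only rates strictly below 50%.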
Files Modified:
src/photos/index.ts
Changes:
- Added validation step before marking listing as completed
- Verifies manifest file exists on CDN/S3
- Verifies manifest can be read
- Verifies photo count in manifest matches processed count
- Only marks as completed if all validations pass
Impact: Ensures data integrity before marking listings as completed.
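The three checks can be sketched as one pure function over the manifest contents. The manifest shape (a JSON object with a photos array) and the validateManifest name are assumptions; fetching the file from CDN/S3 is left out, with null standing for "file not found".

```typescript
// Hypothetical manifest shape; the real format is not shown in this summary.
interface Manifest {
  photos: unknown[];
}

type ValidationResult =
  | { ok: true; photoCount: number }
  | { ok: false; reason: string };

// `raw` is the manifest body as fetched from storage, or null if it is missing.
function validateManifest(raw: string | null, expectedCount: number): ValidationResult {
  // 1. Existence check.
  if (raw === null) return { ok: false, reason: "manifest file not found" };

  // 2. Readability check.
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "manifest could not be read" };
  }
  const photos = (parsed as Partial<Manifest> | null)?.photos;
  if (!Array.isArray(photos)) {
    return { ok: false, reason: "manifest has no photo list" };
  }

  // 3. Photo count check against the number of photos processed.
  if (photos.length !== expectedCount) {
    return { ok: false, reason: `expected ${expectedCount} photos, found ${photos.length}` };
  }
  return { ok: true, photoCount: photos.length };
}
```

A caller would mark the listing as completed only when the result has ok set to true, and otherwise log the reason and leave the record eligible for retry.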
Files Modified:
src/photos/index.ts, src/photos/tableManager.ts, scripts/unstick-processing.ts
Changes:
- Reduced stuck processing timeout from 15 minutes to 3 minutes across all components:
  - recoverOrphanedRecords() query
  - markAsProcessing() WHERE clause
  - getListingsNeedingProcessing() queries (both ACTIVE and SOLD)
  - unstick-processing.ts script queries
Impact: Faster recovery of stuck records - from 15 minutes to 3 minutes.
Files Modified:
src/photos/index.ts, src/photos/photoProcessor.ts
Changes:
- Increased default batch size from 1 to 10
- Updated both service config and DEFAULT_CONFIG
Impact: Service can now process up to 10 listings in parallel, better utilizing server resources.
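One way to process a batch of listings in parallel while isolating per-listing failures is sketched below. The shape of DEFAULT_CONFIG, the processBatch name, and the handler signature are assumptions for illustration, not the service's actual code.

```typescript
// Assumed shape; only the batch size value comes from this summary.
const DEFAULT_CONFIG = { batchSize: 10 };

// Runs up to `batchSize` handlers concurrently and counts the outcomes.
// A failure in one listing does not reject the whole batch.
async function processBatch<T>(
  items: T[],
  handler: (item: T) => Promise<void>,
  batchSize: number = DEFAULT_CONFIG.batchSize,
): Promise<{ succeeded: number; failed: number }> {
  const batch = items.slice(0, batchSize);

  const outcomes = await Promise.all(
    batch.map(async (item) => {
      try {
        await handler(item);
        return true;
      } catch {
        return false;
      }
    }),
  );

  const succeeded = outcomes.filter((ok) => ok).length;
  return { succeeded, failed: outcomes.length - succeeded };
}
```

Catching inside each mapped task keeps Promise.all from rejecting on the first error, so the remaining listings in the batch still run to completion.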
- Stuck Processing Detection: 3 minutes (was 15 minutes)
- RETS Photo Fetch: 2 minutes (new)
- Orphaned Record Recovery: Every ~5 minutes (was only at startup)
- Minimum Success Rate: 50% of photos must process successfully
- Batch Size: 10 listings processed in parallel (was 1)
- Manifest file existence check
- Manifest readability check
- Photo count verification
- Race condition prevention
Before deploying to production, test:
- Race Condition Fix

      # Run two instances simultaneously and verify no duplicate processing
      bun src/index.ts &
      bun src/index.ts &

- Stuck Record Recovery

      # Verify stuck records are recovered within 5 minutes
      # Monitor logs for recovery messages

- RETS Timeout

      # Simulate slow RETS server and verify 2-minute timeout

- Validation

      # Process a listing and verify all validations pass
      bun run scripts/debug-photo-processing.ts <SystemID> <PropertyType>

- Batch Processing

      # Verify 10 listings are processed in parallel
      # Monitor resource usage (CPU, memory)

Watch for these log messages:
- ✅ Validation passed: X photos verified
- ✅ Reset X orphaned records to failed status for retry
- ⏭️ Skipping listing X: Already being processed by another instance
- ⚠️ Warning: X/Y photos failed to process (Z% success rate)
- ⚠️ Found X orphaned processing records
- ❌ All X photos failed to process
- ❌ Only X/Y photos processed successfully (Z% success rate, minimum 50% required)
- ❌ RETS photo fetch timeout after 120s
- ❌ Manifest validation failed: expected X photos, found Y
If issues occur, revert these commits:
- Batch size can be reduced via config without code change
- Timeout values can be adjusted in the code
- Validation can be temporarily disabled by commenting out the validation block
- Recovery Time: stuck-record timeout is 80% shorter (3 min vs 15 min)
- Throughput: up to 10x increase (10 parallel vs 1)
- Reliability: Significantly reduced stuck records
- Data Integrity: 100% validation before completion
- CPU: Moderate increase due to parallel processing
- Memory: Moderate increase (10x concurrent Sharp operations)
- Network: Moderate increase (10x concurrent RETS requests)
Monitor server resources after deployment and adjust batch size if needed.
- ✅ Deploy to staging environment
- ✅ Monitor for 24 hours
- ✅ Review logs for any errors or warnings
- ✅ Verify stuck records are recovered promptly
- ✅ Check resource usage (CPU, memory, network)
- ✅ Deploy to production if all tests pass
- All changes are backward compatible
- No database schema changes required
- Service can be deployed without downtime
- Existing stuck records will be recovered automatically at service startup or at the next periodic recovery check (every ~5 minutes)