Skip to content

Instantly share code, notes, and snippets.

@akaak
Created March 10, 2026 02:23
Show Gist options
  • Select an option

  • Save akaak/d9114e5935dabe0d49e121deea2b8c67 to your computer and use it in GitHub Desktop.

Select an option

Save akaak/d9114e5935dabe0d49e121deea2b8c67 to your computer and use it in GitHub Desktop.
eRegs: MongoDB → PostgreSQL Data Migration Plan

MongoDB → PostgreSQL Data Migration Plan

Context

The existing eRegs production app stores data in MongoDB (5 collections: users, employers, usersubscriptions, transactions, employees). The new eRegs app uses PostgreSQL via Prisma. This plan migrates all paid subscriber data — users, organizations, subscriptions, transactions, and drivers — into the new schema. The scripts must be idempotent (safe to run multiple times) to support staging validation and production execution.


Phase 1: Schema Additions (Prisma)

Edit prisma/schema.prisma then run npx prisma migrate dev --name add-migration-fields.

1.1 Add to User model

mongoId       String?  @unique  // MongoDB _id — idempotency key
accountNumber String?  @unique  // Human-readable ID (SAN-0071, EC5LFY5LD4)

1.2 Add to Organization model

mongoId          String?  @unique  // MongoDB Employer._id
billingFrequency String?           // "YEARLY" | "MONTHLY"

1.3 Add to Subscription model

mongoId          String?  @unique  // MongoDB usersubscription._id
billingFrequency String?           // "YEARLY" | "MONTHLY"

1.4 Add to Driver model

mongoId String?  @unique  // MongoDB employee._id — idempotency key
userId  String?  @unique  // FK → User.id (employee's PG User row)
user    User?    @relation(fields: [userId], references: [id], onDelete: SetNull)

Also add reverse relation to User: driverProfile Driver?

1.5 New Transaction model

enum TransactionType   { SUBSCRIPTION INVITE }
enum TransactionStatus { PAID PENDING FAILED REFUNDED }

model Transaction {
  id                      String            @id @default(cuid())
  mongoId                 String?           @unique
  mongoUserSubscriptionId String?           // denormalized ref to usersubscription._id
  userId                  String?
  user                    User?             @relation(fields: [userId], references: [id], onDelete: SetNull)
  orgId                   String?
  org                     Organization?     @relation(fields: [orgId], references: [id], onDelete: SetNull)
  stripeInvoiceId         String?
  stripeInvoiceUrl        String?
  stripeEventId           String?
  stripePaymentIntentId   String?
  stripeReceiptUrl        String?
  totalAmountCents        Int?
  unitPriceCents          Int?
  quantity                Int?
  type                    TransactionType
  status                  TransactionStatus
  invoiceDate             DateTime?
  createdAt               DateTime          @default(now())
  updatedAt               DateTime          @updatedAt

  @@index([userId])
  @@index([orgId])
  @@index([stripeInvoiceId])
}

Also add reverse relations to User and Organization:

// In User:         transactions Transaction[]
// In Organization: transactions Transaction[]

Phase 2: .env.datamigration File

Location: .env.datamigration (covered by .gitignore via .env*)

# Target PostgreSQL
DATABASE_URL="postgresql://USER:PASS@HOST:5432/DBNAME?sslmode=require"

# Source MongoDB
MONGODB_URI="mongodb+srv://mvpuser:...@cluster0.fxismfr.mongodb.net/eregsmvp_stagging"

# Migration control
DRY_RUN="false"     # "true" = log only, no writes
VERBOSE="false"     # "true" = per-document logs
LIMIT=""            # Max docs per collection for test runs ("" or "0" = no limit)

# Logging
LOG_FILE="scripts/data-migration/migration.log"

Phase 3: Script File Structure

All in scripts/data-migration/

lib/
  mongo.ts        # MongoClient singleton + typed collection accessors + TS interfaces
  prisma.ts       # PrismaClient singleton using .env.datamigration DATABASE_URL
  logger.ts       # MigrationLogger: console+file output, stats tracking
  transform.ts    # Pure field-mapping functions (testable, no DB calls)

00-validate-source.ts   # Read-only pre-flight: MongoDB counts + FK sanity
01-migrate-users.ts     # MongoDB users (EMPLOYER type only) → PG User
02-migrate-orgs.ts      # MongoDB employers → PG Organization; update User.organizationId
03-migrate-subs.ts      # MongoDB usersubscriptions → PG Subscription
04-migrate-txns.ts      # MongoDB transactions → PG Transaction
05-migrate-drivers.ts   # MongoDB employees → PG User (role=PRO) + Driver
06-validate-target.ts   # Post-migration SQL validation queries

run-all.ts              # Orchestrator: runs 01→05 in order, halts on failure

output/                 # Created at runtime
  receipt-urls.csv      # S3 PDF URLs from employee receipts (manual review)
  migration.log         # Appended each run

Pattern: Each script follows sync-local.ts convention — new PrismaPg({ connectionString }) + new PrismaClient({ adapter }). Each exports default async function main() and auto-runs when executed directly.

Invocation:

# Add mongodb driver (one-time)
npm install --save-dev mongodb

# Pre-flight
npx tsx --env-file=.env.datamigration scripts/data-migration/00-validate-source.ts

# Dry run
DRY_RUN=true npx tsx --env-file=.env.datamigration scripts/data-migration/run-all.ts

# Real run
npx tsx --env-file=.env.datamigration scripts/data-migration/run-all.ts

# Post-validation
npx tsx --env-file=.env.datamigration scripts/data-migration/06-validate-target.ts

Phase 4: Migration Order (Dependency Graph)

01: MongoDB users (EMPLOYER) → PG User
        ↓
02: MongoDB employers → PG Organization  (needs User.id)
        ↓              ↓
03: Subscriptions   05: Drivers          (need Organization.id)
        ↓
04: Transactions                          (needs User.id + Organization.id)

Phase 5: Idempotency Strategy

Every script checks mongoId before inserting. If found → skip (log "already migrated"). Also catch Prisma P2002 (unique constraint) as a secondary skip guard for fields like email and stripeCustomerId.

Collection PG Model Idempotency Key
users User User.mongoId
employers Organization Organization.mongoId
usersubscriptions Subscription Subscription.mongoId
transactions Transaction Transaction.mongoId
employees Driver Driver.mongoId

Phase 6: Field Mapping

6.1 users (EMPLOYER only) → User

MongoDB PostgreSQL Transform
_id mongoId .toString()
email email lowercase + trim
password password copy (bcrypt portable)
name.firstName + lastName name concat with space
userType = "EMPLOYER" role "FLEET_MANAGER"
accountNumber accountNumber copy as-is
deleted = true deletedAt set to updatedAt
createdAt, updatedAt same parse to Date
passwordResetLink, __v, employee, employer skip

Note: Employee users (EMPLOYEEE) are handled in step 05.

6.2 employersOrganization + update User

MongoDB PostgreSQL Transform
_id Organization.mongoId .toString()
user (→ User._id) ownerId look up User by mongoId
company name trim; fallback: email domain
numberOfInvitesRemaining freeInvitesLeft direct int
billingFrequency billingFrequency copy
tiralPeriodUsed (typo!) trialEndsAt false → createdAt+14d; true → createdAt
stripe.customer (log only) Subscription is authoritative
cardDetails SKIP PCI data — never read or log
access, price, numberOfTotalInvitesRemaining skip

After creating Org, update User.organizationId = org.id.

6.3 usersubscriptionsSubscription

MongoDB PostgreSQL Transform
_id mongoId .toString()
employer (→ Employer._id) orgId look up Org by mongoId
user (→ User._id) userId look up User by mongoId
stripeDetails.customerId stripeCustomerId copy
stripeDetails.id (sub_xxx) stripeSubscriptionId copy
stripeDetails.planId stripePriceId copy
stripeDetails.status status see enum map below
billingFrequency billingFrequency copy
cancelAtPeriodEnd cancelAtPeriodEnd boolean
startDate currentPeriodStart parse to Date
endDate currentPeriodEnd parse to Date
subscriptionType plan "employer"FLEET_MANAGER
cancelDate billingNotes append "Legacy cancel date: YYYY-MM-DD"
transaction, stripeDetails.paymentMethod skip

Status map: activeACTIVE, trialingTRIALING, past_duePAST_DUE, canceled/cancelledCANCELED, pausedPAUSED, else→TRIALING (warn).

6.4 transactionsTransaction

MongoDB PostgreSQL Transform
_id mongoId .toString()
user (→ User._id) userId look up User by mongoId
employer (→ User._id of employer!) orgId look up User→get ownedOrg.id
stripeHostedInvoiceUrl stripeInvoiceUrl copy
stripeInvoiceId stripeInvoiceId copy
totalAmount totalAmountCents copy directly (already cents: 4000 = $40.00)
status (int: 1=paid) status 1→PAID, 0→PENDING, -1→FAILED
invoiceDate invoiceDate parse to Date
type type "subscription"SUBSCRIPTION, "invite"INVITE
stripeEventId stripeEventId invite type only
paymentIntentId stripePaymentIntentId invite type only
stripeReceiptUrl stripeReceiptUrl invite type only
unit_price unitPriceCents copy directly (already cents)
quantity quantity invite type only
userSubscription mongoUserSubscriptionId denormalized string ref

6.5 employees → PG User (role=PRO) + Driver

Step A: Create PG User row for the employee:

MongoDB (users join) PostgreSQL User Transform
users._id mongoId .toString()
users.email email lowercase + trim
users.password password copy (bcrypt portable)
users.name.firstName + lastName name concat
userType = "EMPLOYEEE" role "PRO"
users.accountNumber accountNumber copy (10-char alphanumeric)
users.deleted = true deletedAt set to updatedAt

Step B: Create Driver record linked to PG User:

MongoDB employees PostgreSQL Driver Transform
_id mongoId .toString()
PG User.id from Step A userId FK to new User row
employer (→ employer User._id) orgId User.mongoId → User.ownedOrg.id
phone phone trim; null if empty
email from users join email from joined users doc
name from users join name firstName + lastName
dateAcknowledged non-null status ACKNOWLEDGED; set acknowledgedAt
accountStatus = true + no ack status ACTIVE
else status PENDING
dateInvited createdAt parse to Date
dateAcknowledged acknowledgedAt parse to Date
phone exists, no email inviteMethod SMS; else EMAIL
receipt (S3 info) write to output/receipt-urls.csv
passwordSetup, access, company, __v skip

Phase 7: Key Edge Cases

  1. EMPLOYEEE typo — script 05 handles both EMPLOYEE and EMPLOYEEE; both create PG User (role=PRO) + Driver row
  2. cardDetails — explicitly destructured-out and never read: const { cardDetails: _PCI, ...safeDoc } = employer
  3. tiralPeriodUsed typo — read exactly as written from MongoDB
  4. employees.employer is a User._id, not Employer._id — resolution chain: employee.employer (Mongo User _id) → PG User.mongoIdPG User.ownedOrganization.id
  5. Duplicate Stripe IDs — catch P2002 on stripeCustomerId; log warning, don't fail run
  6. Missing cross-references — if joined User doc not found in MongoDB, log warning and use partial data (don't abort)
  7. deleted: true users — still migrated; deletedAt set to updatedAt
  8. S3 receipt PDFs — collected into output/receipt-urls.csv for manual follow-up

Phase 8: Post-Migration Validation (06-validate-target.ts)

Count parity (compare against 00-validate-source output):

SELECT COUNT(*) FROM "User" WHERE "mongoId" IS NOT NULL;
-- repeat for Organization, Subscription, Transaction, Driver

FK integrity (all should return 0):

-- Orgs with invalid ownerId
SELECT COUNT(*) FROM "Organization" o
WHERE o."mongoId" IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM "User" u WHERE u.id = o."ownerId");

-- Subscriptions with invalid orgId
SELECT COUNT(*) FROM "Subscription" s
WHERE s."mongoId" IS NOT NULL AND s."orgId" IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM "Organization" o WHERE o.id = s."orgId");

-- Drivers with invalid orgId
SELECT COUNT(*) FROM "Driver" d
WHERE d."mongoId" IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM "Organization" o WHERE o.id = d."orgId");

Business logic warnings (non-zero = investigate):

-- Orgs without a Subscription
SELECT COUNT(*) FROM "Organization" o
WHERE o."mongoId" IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM "Subscription" s WHERE s."orgId" = o.id);

-- ACKNOWLEDGED drivers missing acknowledgedAt
SELECT COUNT(*) FROM "Driver"
WHERE "mongoId" IS NOT NULL AND status = 'ACKNOWLEDGED' AND "acknowledgedAt" IS NULL;

-- Duplicate Stripe customer IDs
SELECT "stripeCustomerId", COUNT(*) FROM "Subscription"
WHERE "stripeCustomerId" IS NOT NULL
GROUP BY "stripeCustomerId" HAVING COUNT(*) > 1;

-- Transaction amounts out of range
SELECT COUNT(*) FROM "Transaction"
WHERE "totalAmountCents" IS NOT NULL
  AND ("totalAmountCents" < 0 OR "totalAmountCents" > 1000000);

Critical Files

File Action
prisma/schema.prisma Add new fields + Transaction model
scripts/data-migration/lib/mongo.ts New — MongoDB connection + typed interfaces
scripts/data-migration/lib/transform.ts New — all pure mapping functions
scripts/data-migration/lib/logger.ts New — MigrationLogger class
scripts/data-migration/lib/prisma.ts New — PrismaClient using .env.datamigration
scripts/data-migration/00-validate-source.ts New — pre-flight read-only check
scripts/data-migration/01-migrate-users.ts New
scripts/data-migration/02-migrate-orgs.ts New
scripts/data-migration/03-migrate-subs.ts New
scripts/data-migration/04-migrate-txns.ts New
scripts/data-migration/05-migrate-drivers.ts New
scripts/data-migration/06-validate-target.ts New — post-migration validation
scripts/data-migration/run-all.ts New — orchestrator
.env.datamigration New — gitignored config
package.json Add mongodb dev dependency

Reference pattern for Prisma client init: scripts/sync-local.ts (uses PrismaPg adapter).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment