Skip to content

Instantly share code, notes, and snippets.

Show Gist options
  • Select an option

  • Save arfar-x/2776bdeb59240522ec7411cfb7cf6448 to your computer and use it in GitHub Desktop.

Select an option

Save arfar-x/2776bdeb59240522ec7411cfb7cf6448 to your computer and use it in GitHub Desktop.
Join (Left outer join) operation in MongoDB
/*
من یک فیچری دارم بنام رسانه، هر کاربر میتونه یک رسانه بسازه و تعیین کنه که نوع اون چی باشه. هر رسانه فقط یک نوع میتونه داشته باشه، میتونه تلگرام باشه، بله باشه، یا وب‌هوک باشه.
هر رسانه در قالب یک داکیومنت جدا در مونگو ذخیره میشه. پس اگر یک کاربر دو نوع رسانه تلگرام و وب‌هوک رو خواست، ما براش دو تا درخواست توی دیبتابیس ذخیره میکنیم که یکیش نوع تلگرام هست (فیلد telegram_id داره) و یکی دیگه که نوع وب‌هوک هست (فیلد webhook_id داره).
حالا من میخوام چک کنم که کدوم کاربر ها دو نوع تلگرام و وب‌هوک رو دارن.
در نظر بگیر که هر نوع رسانه توی داکیومنت جدا ذخیره میشه.
*/
// MongoDB aggregations
[
{
$match: {
status: "ACTIVE",
deleted_at: null
}
},
{
$group: {
_id: "$user_id",
hasTelegram: {
$max: {
$cond: [
{
$and: [
{ $gt: [{ $type: "$telegram_id" }, "missing"] },
{ $ne: ["$telegram_id", null] }
]
},
1,
0
]
}
},
hasWebhook: {
$max: {
$cond: [
{
$and: [
{ $gt: [{ $type: "$webhook_id" }, "missing"] },
{ $ne: ["$webhook_id", null] }
]
},
1,
0
]
}
},
hasBale: {
$max: {
$cond: [
{
$and: [
{ $gt: [{ $type: "$bale_id" }, "missing"] },
{ $ne: ["$bale_id", null] }
]
},
1,
0
]
}
}
}
},
{
$match: {
hasTelegram: 1,
hasBale: 1,
hasWebhook: 1,
}
}
]
db.purchases.aggregate([
{
$lookup: {
from: 'users',
// Here, we cast 'user_id' in purchases collection into 'ObjectId' type to match with '_id' on users collection.
let: { userId: { $toObjectId: '$user_id' } }, // We declare a value names 'userId' to be used in the pipeline.
pipeline: [
{
$match: {
$expr: {
$eq: ["$_id", "$$userId"] // The values for both 'users:_id' and 'purchases:user_id' are evaluated to be equal.
}
}
}
],
as: 'user'
}
},
{
$unwind: { path: "$user", preserveNullAndEmptyArrays: true }
},
{
$match: {
price: {$gte: 50000}
}
},
{
$project: {
_id: 1, // We want to enable the '_id' of a purchase.
user_id: '$user._id',
username: '$user.username',
title: '$user.title',
user_active: '$user.active',
price: 1,
created_at: 1,
user_created_at: "$user.created_at"
}
},
{
$sort: {created_at: 1}
},
{
$limit: 10
}
])

MongoDB Aggregation Query Documentation

This aggregation query performs the following tasks:

  • Filters documents that are active and not deleted.
  • Groups documents by media_id and filters groups with more than 5 documents.
  • Looks up related documents from the new_media_tasks collection.
  • Groups the joined documents by filter_id within each media_id.
  • Returns nested grouped documents with counts, sorted by the number of filter_id groups per media_id.

Aggregation Pipeline

[
  // Step 1: Filter documents where `deleted_at` is null and `status` is "ACTIVE"
  {
    $match: {
      deleted_at: null,
      status: "ACTIVE"
    }
  },

  // Step 2: Group documents by `media_id` and count the number of documents per media
  {
    $group: {
      _id: "$media_id",
      count: { $sum: 1 }
    }
  },

  // Step 3: Keep only media groups with more than 5 documents
  {
    $match: {
      count: { $gt: 5 }
    }
  },

  // Step 4: Lookup related documents from `new_media_tasks` collection matching by `media_id`
  {
    $lookup: {
      from: "new_media_tasks",
      localField: "_id",
      foreignField: "media_id",
      as: "documents"
    }
  },

  // Step 5: Unwind the `documents` array to work with each document individually
  {
    $unwind: "$documents"
  },

  // Step 6: Group documents by both `media_id` and `filter_id`, accumulate documents and count per filter
  {
    $group: {
      _id: {
        media_id: "$_id",
        filter_id: "$documents.filter_id"
      },
      count: { $sum: 1 },
      documents: { $push: "$documents" }
    }
  },

  // Step 7: Group again by `media_id` to nest filters within each media document
  {
    $group: {
      _id: "$_id.media_id",
      filters: {
        $push: {
          filter_id: "$_id.filter_id",
          count: "$count",
          documents: "$documents"
        }
      }
    }
  },

  // Step 8: Add a field with the count of filters for sorting
  {
    $addFields: {
      filters_count: { $size: "$filters" }
    }
  },

  // Step 9: Sort documents descending by the number of filters
  {
    $sort: { filters_count: -1 }
  }
]

Result Structure

The output documents will be structured like this:

{
  "_id": "media_id_value",
  "filters": [
    {
      "filter_id": "filter_id_1",
      "count": number_of_documents_for_this_filter,
      "documents": [
        { /* document 1 */ },
        { /* document 2 */ },
        ...
      ]
    },
    ...
  ],
  "filters_count": total_number_of_filters
}
// Users collection
{
_id: ObjectId('id1'), // ObjectId
username: 'johndoe',
fullname: 'John Doe',
active: true,
created_at: 2014-01-01T00:00:01.000+00:00
},
{
_id: ObjectId('id2'),
username: 'robertjohnson',
fullname: 'Robert Johnson',
active: true,
created_at: 2014-01-01T00:00:01.000+00:00
},
{
_id: ObjectId('id3'),
username: 'williw',
fullname: 'William White',
active: false,
created_at: 2014-01-01T00:00:01.000+00:00
},
...
// Purchases collection
{
_id: ObjectId('pu_id1'),
user_id: 'id1', // string
title: 'Book cart',
price: 10000,
created_at: 2014-01-01T00:00:01.000+00:00
},
{
_id: ObjectId('pu_id2'),
user_id: 'id1', // string
title: 'Clothes cart',
price: 3500,
created_at: 2014-01-01T00:00:01.000+00:00
},
{
_id: ObjectId('pu_id3'),
user_id: 'id3', // string
title: 'Car stuff',
price: 87000,
created_at: 2014-01-01T00:00:01.000+00:00
},
{
_id: ObjectId('pu_id4'),
user_id: 'id2', // string
title: 'Kitchen redesigning',
price: 120000,
created_at: 2014-01-01T00:00:01.000+00:00
},
...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment