One-to-Many Relationship Example

In MongoDB, a one-to-many relationship can be modeled in two main ways:
  • Embedding: Embed many related documents in an array within the parent document.
  • Referencing: Store related documents in a separate collection and reference them via an identifier.

I'll provide an example using both approaches in the MongoDB shell.

Scenario: Modeling Authors and Books

We will model an author who writes multiple books, which is a classic one-to-many relationship.

1. Embedding Approach (One-to-Many)
  • In this approach, the books are embedded directly inside the author document as an array.
  • Step 1: Inserting Data (Embedding)

  // Switch to (or create) the database
  use libraryDB

  db.authors.insertOne({
    _id: 1,
    name: "George Orwell",
    age: 46,
    books: [
      {
        title: "1984",
        genre: "Dystopian",
        published_year: 1949
      },
      {
        title: "Animal Farm",
        genre: "Political Satire",
        published_year: 1945
      }
    ]
  })

  • In this case, the books field is an array, and each book is stored as an embedded document within the author document.
  • Step 2: Querying Data (Embedding)
  • To retrieve an author along with their books, you can simply query the authors collection:

  db.authors.findOne({ _id: 1 })

  // Output:
  {
    "_id": 1,
    "name": "George Orwell",
    "age": 46,
    "books": [
      {
        "title": "1984",
        "genre": "Dystopian",
        "published_year": 1949
      },
      {
        "title": "Animal Farm",
        "genre": "Political Satire",
        "published_year": 1945
      }
    ]
  }
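
Because the books live inside the author document, you can also filter and modify them through the authors collection. The sketch below is a minimal illustration, not part of the original example: the helper names findAuthorsByBookTitle and addEmbeddedBook are hypothetical, and db is assumed to be the MongoDB shell's database handle, where calls return results directly.

```javascript
// Hypothetical helpers; `db` is assumed to be the shell's database handle.

// Find authors that have an embedded book with the given title.
// Dot notation ("books.title") is matched against every element of the books array.
function findAuthorsByBookTitle(db, title) {
  return db.authors.find({ "books.title": title }).toArray();
}

// Append one more embedded book to an author's books array with $push.
function addEmbeddedBook(db, authorId, book) {
  return db.authors.updateOne({ _id: authorId }, { $push: { books: book } });
}

// Usage in the shell (assumes the author document inserted above):
// addEmbeddedBook(db, 1, { title: "Homage to Catalonia", genre: "Memoir", published_year: 1938 })
// findAuthorsByBookTitle(db, "1984")
```

Passing db as a parameter is only a convenience for this sketch; in the shell you can call db.authors.find(...) and db.authors.updateOne(...) directly.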


Explanation of Embedding:

Advantages:
  • Simpler data retrieval: Since the books are stored directly within the author document, you don’t need to perform additional queries.
  • Single write for updates: You can update the author and their books in one operation, and a write to a single document is atomic.
Disadvantages:
  • Document size limit: MongoDB caps a single document at 16 MB. If the books array keeps growing, the author document can approach that limit.
  • Data duplication: If other entities (e.g., publishers) also need the book data, it has to be copied into their documents, and the copies can drift out of sync.
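
To see how close embedded documents are to the size limit, you can ask the server for each document's BSON size. This is a rough sketch under stated assumptions: it relies on the $bsonSize aggregation operator (available in MongoDB 4.4 and later), and the names authorSizePipeline, MAX_BSON_BYTES, and findLargeAuthors are hypothetical.

```javascript
// Pipeline that reports each author's document size in bytes, largest first.
// "$$ROOT" refers to the whole document being processed.
const authorSizePipeline = [
  { $project: { name: 1, sizeBytes: { $bsonSize: "$$ROOT" } } },
  { $sort: { sizeBytes: -1 } }
];

// The 16 MB document limit, expressed in bytes.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: return authors whose document exceeds a fraction of the limit.
// `db` is assumed to be the shell's database handle.
function findLargeAuthors(db, fraction = 0.5) {
  return db.authors
    .aggregate(authorSizePipeline)
    .toArray()
    .filter((doc) => doc.sizeBytes > fraction * MAX_BSON_BYTES);
}

// Usage in the shell:
// findLargeAuthors(db, 0.5)
```

The 0.5 threshold is arbitrary; the point is to notice growth well before a write fails.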
2. Referencing Approach (One-to-Many)
  • In this approach, the books are stored in a separate collection, and the author document holds an array of book _id values (book_ids). This avoids embedding large amounts of data inside a single document.
  • Step 1: Inserting Data (Referencing)
  • Insert data into the books collection:

    db.books.insertMany([
      {
        _id: 101,
        title: "1984",
        genre: "Dystopian",
        published_year: 1949
      },
      {
        _id: 102,
        title: "Animal Farm",
        genre: "Political Satire",
        published_year: 1945
      }
    ])

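  • Insert data into the authors collection with references to book_ids. If you already ran the embedding example in the same database, remove that author first with db.authors.deleteOne({ _id: 1 }), because _id values must be unique within a collection: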

    db.authors.insertOne({
      _id: 1,
      name: "George Orwell",
      age: 46,
      book_ids: [101, 102]  // Array of references to the books
    })

  • Step 2: Querying Data (Referencing)
  • To get an author and their books, you need to:
    1. Query the authors collection to get the book_ids.
    2. Use the book_ids to query the books collection.
  • Step 2.1: Find the author:

    var author = db.authors.findOne({ _id: 1 })

  • Step 2.2: Find the books using the book_ids:

    db.books.find({ _id: { $in: author.book_ids } })

    // Output (from the books collection):
    [
      {
        "_id": 101,
        "title": "1984",
        "genre": "Dystopian",
        "published_year": 1949
      },
      {
        "_id": 102,
        "title": "Animal Farm",
        "genre": "Political Satire",
        "published_year": 1945
      }
    ]
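
The two steps above can be wrapped in one helper that also handles a missing author. This is a minimal sketch, assuming it runs in the MongoDB shell where findOne and find return results directly; the name findAuthorWithBooks is hypothetical.

```javascript
// Hypothetical helper combining Step 2.1 and Step 2.2.
// `db` is assumed to be the shell's database handle.
function findAuthorWithBooks(db, authorId) {
  // Step 2.1: fetch the author; findOne returns null when nothing matches.
  const author = db.authors.findOne({ _id: authorId });
  if (author === null) {
    return null;
  }

  // Step 2.2: fetch every book whose _id appears in the author's book_ids.
  const bookIds = author.book_ids || [];
  const books = db.books.find({ _id: { $in: bookIds } }).toArray();

  // Return a copy of the author with the resolved books attached.
  return { ...author, books };
}

// Usage in the shell:
// findAuthorWithBooks(db, 1)
```

Note that $in does not promise to return the books in the same order as book_ids, so sort the result if the order matters.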

  • Step 3: Using $lookup to Join Collections
  • Alternatively, you can use the $lookup aggregation stage, which performs a left outer join, to combine the authors and books collections in a single query:

    db.authors.aggregate([
      {
        $lookup: {
          from: "books",         // Collection to join with
          localField: "book_ids", // Field in the authors collection
          foreignField: "_id",    // Field in the books collection
          as: "books"             // Output array field name
        }
      }
    ])

    // Output:
    [
      {
        "_id": 1,
        "name": "George Orwell",
        "age": 46,
        "book_ids": [101, 102],
        "books": [
          {
            "_id": 101,
            "title": "1984",
            "genre": "Dystopian",
            "published_year": 1949
          },
          {
            "_id": 102,
            "title": "Animal Farm",
            "genre": "Political Satire",
            "published_year": 1945
          }
        ]
      }
    ]
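
The pipeline above joins every author. In practice you often want one author and a tidier result, so here is a possible variation: a $match stage placed before $lookup limits the join to a single author, and a $project stage drops the now-redundant book_ids field. The function name authorWithBooksPipeline is hypothetical.

```javascript
// Hypothetical helper that builds the pipeline for one author.
function authorWithBooksPipeline(authorId) {
  return [
    // Filter first so $lookup only runs for the matching author.
    { $match: { _id: authorId } },
    {
      $lookup: {
        from: "books",          // Collection to join with
        localField: "book_ids", // Array of book _id values on the author
        foreignField: "_id",    // Field in the books collection
        as: "books"             // Output array field name
      }
    },
    // Exclude the raw reference array from the output.
    { $project: { book_ids: 0 } }
  ];
}

// Usage in the shell:
// db.authors.aggregate(authorWithBooksPipeline(1))
```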


Explanation of Referencing:

Advantages:
  • Flexible and scalable: The books can grow in number without causing the author document to become too large.
  • No duplication: Since the books are stored in a separate collection, other entities (like publishers or libraries) can reference them without duplicating data.
Disadvantages:
  • More complex queries: You need to perform multiple queries or use $lookup to retrieve related data.
  • Data consistency: MongoDB does not enforce foreign-key constraints, so an author can reference a book that doesn’t exist. This introduces potential data integrity issues unless you enforce checks at the application level.
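
One way to do such an application-level check is to compare an author's book_ids with the _id values that actually exist in the books collection. The sketch below is an illustration only: findDanglingBookIds is a hypothetical name, db is assumed to be the shell's database handle, and the Set comparison assumes primitive _id values such as the numbers used in this example (ObjectId values would need to be compared as strings).

```javascript
// Hypothetical helper: return the book_ids of an author that point at no book.
function findDanglingBookIds(db, authorId) {
  // Only the book_ids field is needed, so project just that.
  const author = db.authors.findOne({ _id: authorId }, { book_ids: 1 });
  if (author === null) {
    return [];
  }

  const bookIds = author.book_ids || [];

  // Collect the _id of every referenced book that really exists.
  const existingIds = new Set(
    db.books
      .find({ _id: { $in: bookIds } }, { _id: 1 })
      .toArray()
      .map((book) => book._id)
  );

  // Anything referenced but not found is a dangling reference.
  return bookIds.filter((id) => !existingIds.has(id));
}

// Usage in the shell:
// findDanglingBookIds(db, 1)
```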
Conclusion: When to Use Embedding vs. Referencing in One-to-Many Relationships
  • Use Embedding when:
    • The related data (e.g., books) is always accessed with the parent document (e.g., author).
    • The size of the embedded data is small and won’t grow indefinitely.
    • You want simplicity in your data model with fewer collections to manage.
  • Use Referencing when:
    • The related data (e.g., books) might be accessed independently of the parent document (e.g., author).
    • The size of the related data is large or could grow over time.
    • You want to share related data between different entities (e.g., a book is written by an author but also published by a publisher).
    • You need to avoid hitting the document size limit (16 MB in MongoDB).
  • Either approach can model a one-to-many relationship. Choose based on how your application accesses the related data and how large that data can grow.
