NoSQL Database - MongoDB

This post introduces MongoDB, a NoSQL database. If you need a refresher on traditional SQL databases, check out our post on SQL first!

NoSQL

Before we dive into MongoDB specifically, let’s take a step back and understand what NoSQL actually means. The term NoSQL stands for “Not Only SQL” (not “No SQL”), and it represents a fundamental shift in how we think about storing and retrieving data.

Why NoSQL Emerged

In the 2000s, companies like Google, Amazon, and Facebook faced a problem: traditional SQL databases weren’t designed to handle the massive scale and varied data types of modern web applications.

They needed databases that could:

Scale horizontally across thousands of servers (not just vertically with bigger hardware)
Handle unstructured or semi-structured data (JSON, logs, user-generated content)
Provide high availability even when parts of the system fail
Iterate quickly without rigid schema migrations
Process massive volumes of data in real-time

This led to the development of NoSQL databases, a diverse family of database systems designed for specific modern use cases.

NoSQL Database Types

NoSQL isn’t a single technology; it’s a category encompassing different database types, each optimized for specific scenarios:

flowchart TB
    NoSQL["NoSQL Databases"]
    
    NoSQL --> Doc["Document Databases"]
    NoSQL --> KV["Key-Value Stores"]
    NoSQL --> Col["Column-Family Stores"]
    NoSQL --> Graph["Graph Databases"]
    
    Doc --> DocEx["MongoDB, CouchDB<br/>Use: CMS, catalogs, user profiles"]
    KV --> KVEx["Redis, DynamoDB<br/>Use: Caching, sessions, real-time"]
    Col --> ColEx["Cassandra, HBase<br/>Use: Time-series, analytics, IoT"]
    Graph --> GraphEx["Neo4j, Amazon Neptune<br/>Use: Social networks, recommendations"]
    
    style NoSQL fill:#4caf50,color:#fff
    style Doc fill:#2196f3,color:#fff
    style KV fill:#2196f3,color:#fff
    style Col fill:#2196f3,color:#fff
    style Graph fill:#2196f3,color:#fff

Document Databases

Store data as JSON-like documents (MongoDB, CouchDB)

Document databases store data in self-contained documents (typically JSON or BSON format). Each document can have a different structure, making them ideal for applications with evolving or varied data models.

Best for: Content management, product catalogs, user profiles, mobile applications

Example structure:

  
{
  "userId": "123",
  "name": "Sarah",
  "preferences": {
    "theme": "dark",
    "notifications": true
  },
  "orders": [...]
}

Key-Value Stores

Simple but fast, store data as key-value pairs (Redis, DynamoDB)

The simplest NoSQL model. Each item is stored as a key-value pair, similar to a hash table or dictionary. Extremely fast for simple lookups.

Best for: Caching, session management, real-time analytics, leaderboards

Example:

"session:abc123" → {"userId": "123", "loginTime": "2026-01-18T10:00:00Z"}
"user:123:cart" → ["item1", "item2", "item3"]

Column-Family Stores

Organize data by row key, with columns grouped into families (Cassandra, HBase)

Each row is identified by a key and stores only the columns it actually uses, which makes sparse data cheap to store and retrieve. These stores are excellent for write-heavy workloads and time-series data.

Best for: Time-series data, IoT sensor data, analytics, logging

Think of it as: Storing all “temperature” readings together, all “humidity” readings together, rather than storing each sensor’s complete record together.
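
One way to picture this layout is a sorted map of row keys, where each row holds its own sparse, sorted set of columns. The sketch below is plain in-memory Java, not the Cassandra or HBase API; `WideRowSketch`, `put`, and `slice` are hypothetical names, and the column naming scheme (`temp:<timestamp>`) is an assumption made for illustration.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model: row key -> (column name -> value), with sparse columns.
final class WideRowSketch {
    private final SortedMap<String, SortedMap<String, String>> rows = new TreeMap<>();

    // Write one cell; a row only stores the columns it actually has.
    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Read a contiguous range of columns from one row
    // (fromColumn inclusive, toColumn exclusive).
    SortedMap<String, String> slice(String rowKey, String fromColumn, String toColumn) {
        SortedMap<String, String> row = rows.getOrDefault(rowKey, new TreeMap<>());
        return row.subMap(fromColumn, toColumn);
    }

    public static void main(String[] args) {
        WideRowSketch table = new WideRowSketch();
        table.put("sensor-1", "temp:2026-01-18T10:00", "21.5");
        table.put("sensor-1", "temp:2026-01-18T10:05", "21.7");
        table.put("sensor-1", "humidity:2026-01-18T10:00", "40");

        // ';' sorts directly after ':', so this range covers every "temp:" column.
        System.out.println(table.slice("sensor-1", "temp:", "temp;"));
    }
}
```

Because columns are kept sorted, all temperature readings for one sensor sit next to each other and can be read as a single range.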

Graph Databases

Optimized for relationships between entities (Neo4j, Amazon Neptune)

Designed to store and query relationships between data points. Data is stored as nodes (entities) and edges (relationships).

Best for: Social networks, recommendation engines, fraud detection, knowledge graphs

Example: Finding friends-of-friends, “people who bought this also bought,” shortest path between entities.
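
The friends-of-friends query can be sketched with a plain adjacency map. This is an in-memory illustration only, not how Neo4j or Neptune are queried; `FriendGraphSketch` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory graph: each person is a node, each friendship an edge.
final class FriendGraphSketch {
    private final Map<String, Set<String>> edges = new HashMap<>();

    // Add an undirected "friend" edge between two people.
    void addFriendship(String a, String b) {
        edges.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        edges.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    // People exactly two hops away: friends of friends who are neither
    // the person themselves nor already a direct friend.
    Set<String> friendsOfFriends(String person) {
        Set<String> direct = edges.getOrDefault(person, Set.of());
        Set<String> result = new TreeSet<>();
        for (String friend : direct) {
            for (String candidate : edges.getOrDefault(friend, Set.of())) {
                if (!candidate.equals(person) && !direct.contains(candidate)) {
                    result.add(candidate);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        FriendGraphSketch graph = new FriendGraphSketch();
        graph.addFriendship("alice", "bob");
        graph.addFriendship("alice", "dave");
        graph.addFriendship("bob", "carol");
        graph.addFriendship("bob", "dave");
        graph.addFriendship("dave", "erin");
        System.out.println(graph.friendsOfFriends("alice")); // prints [carol, erin]
    }
}
```

A graph database performs this kind of traversal natively, which is why it tends to beat repeated JOINs once queries span several hops.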

NoSQL vs SQL: Core Differences

Aspect	SQL Databases	NoSQL Databases
Data Model	Tables with rows and columns	Documents, key-value, columns, or graphs
Schema	Fixed schema, defined upfront	Flexible or schema-less
Scaling	Vertical (bigger servers)	Horizontal (more servers)
Transactions	ACID (strong consistency)	Eventually consistent (some support ACID)
Query Language	SQL (standardized)	Database-specific APIs
Best For	Complex queries, relationships	Flexibility, scale, speed
Examples	PostgreSQL, MySQL, Oracle	MongoDB, Redis, Cassandra, Neo4j

The CAP Theorem: NoSQL Trade-offs

Understanding NoSQL requires knowing about the CAP theorem, which states that a distributed database cannot guarantee all three of these properties at the same time:

Consistency: Every read sees the most recent write, so all nodes appear to hold the same data
Availability: Every request to a working node receives a non-error response, even if the data is not the latest
Partition Tolerance: System continues operating despite network failures between nodes

flowchart TB
    CAP["CAP Theorem:<br/>Choose 2 of 3"]
    
    CAP --> CP["CP: Consistency + Partition Tolerance<br/><i>Sacrifice availability during network issues</i>"]
    CAP --> AP["AP: Availability + Partition Tolerance<br/><i>Sacrifice immediate consistency</i>"]
    CAP --> CA["CA: Consistency + Availability<br/><i>Cannot handle network partitions</i>"]
    
    CP --> CPEx["MongoDB (with write concern),<br/>HBase, Redis Cluster"]
    AP --> APEx["Cassandra, DynamoDB,<br/>CouchDB"]
    CA --> CAEx["Traditional SQL databases<br/>(single server only)"]
    
    style CAP fill:#4caf50,color:#fff
    style CP fill:#ff9800,color:#fff
    style AP fill:#ff9800,color:#fff
    style CA fill:#ff9800,color:#fff

SQL databases traditionally chose consistency and availability (CA), which works fine for single servers but struggles in distributed systems. NoSQL databases are designed for distributed environments, typically choosing either CP or AP.

When to Use NoSQL?

✅ Choose NoSQL when:

You need to scale horizontally across many servers
Your data structure changes frequently or varies between records
You’re building for high availability and can tolerate eventual consistency
You need to handle massive volumes of data
Your data is hierarchical or doesn’t fit well into tables
You need extremely fast writes or reads for simple queries

❌ Stick with SQL when:

You need complex transactions with rollback support
Your data has many complex relationships requiring JOINs
Data integrity and consistency are absolutely critical (banking, healthcare)
Your schema is stable and well-defined
Your team has deep SQL expertise and limited NoSQL experience

The best modern applications often use both SQL and NoSQL databases together, choosing the right tool for each specific use case. This approach is called polyglot persistence.

MongoDB

MongoDB is a document database, one of the four main NoSQL types. It stores data in flexible, JSON-like documents, making it one of the most popular NoSQL databases for web applications. MongoDB combines:

The flexibility of document storage
Rich query capabilities (similar to SQL’s expressiveness)
Strong consistency options when needed
Horizontal scalability for growth

Now that you understand the broader NoSQL context, let’s explore MongoDB specifically.

What is MongoDB?

As a junior developer, you’ve probably heard the term NoSQL thrown around. MongoDB is one of the most popular NoSQL databases, and for good reason. Unlike traditional SQL databases that store data in rigid tables with predefined schemas, MongoDB stores data in flexible, JSON-like documents. This makes it incredibly intuitive for developers who already work with JSON in their applications.

Think of it this way: in SQL, you’re organizing data like a spreadsheet with strict columns and rows. With MongoDB, you’re storing data more like a collection of related notes: each note can hold different information, and you can easily add new fields without restructuring everything.

Why MongoDB Matters

Intuitive data model: If you understand JSON, you already understand MongoDB’s data format
Flexibility: Schema-less design means you can iterate quickly during development
Scalability: Built with modern, distributed applications in mind
Developer-friendly: Rich query language and excellent documentation
Industry demand: Used by companies like Forbes, Adobe, Google, and eBay

MongoDB’s document-based approach can significantly reduce the complexity of data modeling for certain types of applications, especially those with rapidly evolving requirements.

Core Concepts: Documents, Collections, and Databases

Let’s break down MongoDB’s fundamental building blocks.

Documents

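A document is MongoDB’s basic unit of data, similar to a row in SQL, but far more flexible. Documents are stored in BSON (Binary JSON), a binary encoding of JSON-like data that adds types such as dates and ObjectIds. In the shell and in examples like the one below, documents are displayed as JSON.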

  
{
  "_id": "507f1f77bcf86cd799439011",
  "firstName": "Sarah",
  "lastName": "Johnson",
  "email": "sarah.johnson@example.com",
  "age": 28,
  "skills": ["Java", "Python", "MongoDB"],
  "address": {
    "street": "123 Developer Lane",
    "city": "Tech City",
    "country": "USA"
  },
  "joinDate": "2025-01-15T08:00:00Z"
}

Notice how we can have:

Simple fields (firstName, age)
Arrays (skills)
Nested objects (address)
Different data types in a single document
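
To make the nesting concrete, a document can be pictured as a map whose values may themselves be maps or lists. The sketch below is plain Java rather than the driver’s `Document` class, and the `getPath` helper is a hypothetical name; it resolves a dotted path such as `address.city` by walking through the nested maps.

```java
import java.util.List;
import java.util.Map;

final class NestedDocumentSketch {
    // Walk a dotted path ("address.city") through nested maps.
    // Returns null if any segment is missing or is not a nested map.
    static Object getPath(Map<String, Object> document, String dottedPath) {
        Object current = document;
        for (String segment : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> user = Map.of(
            "firstName", "Sarah",                              // simple field
            "age", 28,                                         // number
            "skills", List.of("Java", "Python", "MongoDB"),    // array
            "address", Map.of("city", "Tech City", "country", "USA")); // nested object

        System.out.println(getPath(user, "address.city")); // prints Tech City
        System.out.println(getPath(user, "address.zip"));  // prints null
    }
}
```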

The _id Field

Every document in MongoDB must have an _id field. This is MongoDB’s primary key that uniquely identifies each document. If you don’t provide one, MongoDB automatically generates a unique ObjectId. This is similar to auto-increment primary keys in SQL databases, but ObjectIds are more powerful:

Designed to be unique across collections, databases, and servers without central coordination
12-byte values that include timestamp information
Roughly sortable by creation time (the timestamp, with one-second resolution, is embedded in the first 4 bytes)

MongoDB automatically generates an ObjectId like this (shown in mongosh notation):

  
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "name": "John Doe"
}
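
Since the first 4 bytes of an ObjectId hold the creation time, the timestamp can be recovered from the hex string alone. The following is a minimal sketch in plain Java (the class and method names are hypothetical); the Java driver’s own `ObjectId` class offers accessors for the same information, so in real code you would normally use those.

```java
import java.time.Instant;

final class ObjectIdTimestampSketch {
    // Decode the creation time from a 24-character hex ObjectId string.
    // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
    static Instant creationTime(String hexObjectId) {
        if (hexObjectId.length() != 24) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        long epochSeconds = Long.parseLong(hexObjectId.substring(0, 8), 16);
        return Instant.ofEpochSecond(epochSeconds);
    }

    public static void main(String[] args) {
        // prints 2012-10-17T21:13:27Z
        System.out.println(creationTime("507f1f77bcf86cd799439011"));
    }
}
```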

You can also provide your own custom _id value using almost any BSON type (string, number, UUID, etc.); arrays are not allowed:

  
{
  "_id": "user_12345",
  "name": "John Doe"
}

When you provide your own _id, make sure it’s truly unique to avoid insertion errors. MongoDB will reject duplicate _id values.

Collections

A collection is a group of documents, similar to a table in SQL. However, unlike tables, collections don’t enforce a schema by default. Each document in a collection can have different fields.

flowchart TB
    subgraph Database["Database: company_db"]
        direction TB
        subgraph Users["Collection: users"]
            direction LR
            U1["Document 1
            {name, email, age}"]
            U2["Document 2
            {name, email, skills[]}"]
            U3["Document 3
            {name, email, address{}}"]
        end
        
        subgraph Projects["Collection: projects"]
            direction LR
            P1["Document 1
            {title, owner, status}"]
            P2["Document 2
            {title, team[], deadline}"]
        end
    end
    
    style Database fill:#e1f5ff
    style Users fill:#fff9e6
    style Projects fill:#fff9e6

Databases

A database is a container for collections. Each MongoDB server can host multiple databases, and each database is completely isolated from the others.

flowchart LR
    Server["MongoDB Server"] --> DB1["company_db"]
    Server --> DB2["blog_db"]
    Server --> DB3["analytics_db"]
    
    DB1 --> C1["users"]
    DB1 --> C2["projects"]
    DB1 --> C3["departments"]
    
    style Server fill:#4caf50,color:#fff
    style DB1 fill:#2196f3,color:#fff
    style DB2 fill:#2196f3,color:#fff
    style DB3 fill:#2196f3,color:#fff

MongoDB vs SQL

As a junior developer, one of the most important skills is choosing the right tool for the job. Let’s compare MongoDB with traditional SQL databases:

Aspect	MongoDB	SQL (PostgreSQL/MySQL)
Data Model	Document-based (JSON-like)	Table-based (rows & columns)
Schema	Flexible, schema-less	Fixed schema, requires migrations
Relationships	Embedded documents or references	Foreign keys and JOINs
Scalability	Horizontal (add more servers)	Primarily vertical (bigger servers)
Transactions	Multi-document ACID (replica sets since v4.0, sharded clusters since v4.2); single-document writes are always atomic	ACID transactions built-in
Best For	Rapid development, varied data	Complex relationships, consistency-critical
Query Language	MongoDB Query Language (MQL)	SQL (standardized)
Learning Curve	Easier for JSON-familiar developers	Steeper, requires understanding of relations

When to Choose MongoDB?

✅ Use MongoDB when:

Your data structure evolves frequently
You’re building a prototype or MVP
You need to store hierarchical or nested data
Horizontal scalability is important
You’re working with real-time analytics
Your application uses JSON extensively

Examples: Content management systems, real-time analytics, IoT applications, mobile app backends, catalogs with varied product attributes

When to Choose SQL?

✅ Use SQL when:

You have complex relationships between entities
Data integrity and consistency are critical (e.g., financial systems)
You need complex JOIN operations
Your schema is stable and well-defined
You require robust transaction support with rollbacks

Examples: Banking systems, e-commerce transactions, inventory management, traditional enterprise applications

Don’t fall into the trap of thinking NoSQL means “no relationships.” MongoDB can handle relationships; it just approaches them differently than SQL databases.

Polyglot Persistence

Many modern applications use SQL and NoSQL databases side by side in a single architecture. For example:

Use PostgreSQL for user authentication and financial transactions (where consistency is critical)
Use MongoDB for user profiles, activity logs, and content storage (where flexibility matters)
Use Redis for caching and session management (where speed is essential)

This is called polyglot persistence: using different data storage technologies for different parts of your application based on their specific needs.

The key is to choose the right tool for each job rather than forcing a single database to handle all use cases.

CRUD Operations

CRUD stands for Create, Read, Update, and Delete, the four fundamental operations you’ll perform on any database. Let’s see how these work in MongoDB using Java.

Setup with Java

First, add the MongoDB Java driver to your project. If you’re using Maven, add this dependency:

  
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver-sync</artifactId>
    <version>4.11.0</version>
</dependency>

Here’s a basic connection setup:

  
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

public class MongoDBExample {
    public static void main(String[] args) {
        // Create a client (default: localhost:27017); try-with-resources closes it on exit
        try (MongoClient mongoClient = MongoClients.create("mongodb://localhost:27017")) {
            // Access database
            MongoDatabase database = mongoClient.getDatabase("company_db");

            // Access collection
            MongoCollection<Document> users = database.getCollection("users");

            // The client connects lazily, so send a ping to verify the server is reachable
            database.runCommand(new Document("ping", 1));
            System.out.println("Connected to MongoDB successfully!");
        }
    }
}

Create (Insert)

Adding new documents to your collection:

  
import org.bson.Document;
import java.util.Arrays;

// Insert a single document
Document newUser = new Document("firstName", "Sarah")
    .append("lastName", "Johnson")
    .append("email", "sarah.johnson@example.com")
    .append("age", 28)
    .append("skills", Arrays.asList("Java", "Python", "MongoDB"))
    .append("address", new Document("city", "Tech City")
        .append("country", "USA"));

users.insertOne(newUser);
System.out.println("User inserted with ID: " + newUser.getObjectId("_id"));

// Insert multiple documents
Document user1 = new Document("firstName", "John")
    .append("lastName", "Doe")
    .append("email", "john.doe@example.com");

Document user2 = new Document("firstName", "Jane")
    .append("lastName", "Smith")
    .append("email", "jane.smith@example.com");

users.insertMany(Arrays.asList(user1, user2));
System.out.println("Multiple users inserted!");

Read (Query)

Retrieving documents from your collection:

  
import com.mongodb.client.FindIterable;
import static com.mongodb.client.model.Filters.*;

// Find all users
FindIterable<Document> allUsers = users.find();
for (Document user : allUsers) {
    System.out.println(user.toJson());
}

// Find one user by email (first() returns null when nothing matches)
Document foundUser = users.find(eq("email", "sarah.johnson@example.com")).first();
if (foundUser != null) {
    System.out.println("Found user: " + foundUser.getString("firstName"));
}

// Find users older than 25
FindIterable<Document> seniorUsers = users.find(gt("age", 25));

// Find users with specific skill
FindIterable<Document> javaDevs = users.find(eq("skills", "Java"));

// Complex query: users in Tech City with age > 25
FindIterable<Document> techCityDevs = users.find(
    and(
        eq("address.city", "Tech City"),
        gt("age", 25)
    )
);

MongoDB queries return cursors, not all documents at once. This is memory-efficient for large result sets. You can iterate through results or convert them to a list.
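
The difference between a cursor and a fully loaded list can be sketched with a plain Java iterator that fetches its results one batch at a time. This is a conceptual model only: `BatchCursorSketch`, its fields, and the batch size are hypothetical and do not reflect how the driver is actually implemented.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical cursor over the numbers 0..total-1, loaded one batch at a time.
final class BatchCursorSketch implements Iterator<Integer> {
    private final int total;
    private final int batchSize;
    private List<Integer> batch = new ArrayList<>();
    private int positionInBatch = 0;
    private int nextToFetch = 0;
    int batchesFetched = 0; // counts simulated round trips to the server

    BatchCursorSketch(int total, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.total = total;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        return positionInBatch < batch.size() || nextToFetch < total;
    }

    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        if (positionInBatch == batch.size()) {
            fetchNextBatch();
        }
        return batch.get(positionInBatch++);
    }

    // Simulates one round trip: at most batchSize items are held in memory.
    private void fetchNextBatch() {
        batch = new ArrayList<>();
        for (int i = 0; i < batchSize && nextToFetch < total; i++) {
            batch.add(nextToFetch++);
        }
        positionInBatch = 0;
        batchesFetched++;
    }
}
```

Iterating over three of five results with a batch size of two triggers only two fetches; the remaining results are never loaded unless the caller keeps iterating.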

Update

Modifying existing documents:

  
import static com.mongodb.client.model.Updates.*;
import com.mongodb.client.result.UpdateResult;

// Update a single document
UpdateResult result = users.updateOne(
    eq("email", "sarah.johnson@example.com"),
    set("age", 29)
);
System.out.println("Modified documents: " + result.getModifiedCount());

// Update multiple fields
users.updateOne(
    eq("email", "john.doe@example.com"),
    combine(
        set("age", 30),
        set("address.city", "New York"),
        currentDate("lastModified")
    )
);

// Add an item to an array
users.updateOne(
    eq("email", "sarah.johnson@example.com"),
    push("skills", "Docker")
);

// Update multiple documents
users.updateMany(
    lt("age", 25),
    set("status", "junior")
);

MongoDB Update Operators

MongoDB provides various operators for different update operations:

Operator	Purpose	Example Use Case
`$set`	Set field value	Update user’s email
`$unset`	Remove field	Delete deprecated field
`$inc`	Increment numeric value	Increase view count
`$push`	Add item to array	Add new skill to user
`$pull`	Remove item from array	Remove a tag
`$addToSet`	Add to array if not exists	Unique badge collection
`$currentDate`	Set to current date	Update lastModified timestamp

Example combining multiple operators:

  
users.updateOne(
    eq("email", "sarah@example.com"),
    combine(
        inc("loginCount", 1),
        currentDate("lastLogin"),
        addToSet("badges", "early-adopter")
    )
);

Using the right update operator improves both correctness and code clarity. For example, $inc is applied atomically on the server, so concurrent increments to the same counter are not lost, unlike a read-modify-write cycle in application code.
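
The array operators are easy to mix up, so here is a small in-memory sketch of their semantics in plain Java. It only mimics what the operators do to a single document; the class and method names are hypothetical, and the real operators run on the server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class UpdateOperatorSketch {
    // $inc-like: add delta to a numeric field, treating a missing field as 0.
    static void inc(Map<String, Object> doc, String field, int delta) {
        doc.merge(field, delta, (oldValue, d) -> (Integer) oldValue + (Integer) d);
    }

    // Fetch the array stored under a field, creating it if absent.
    @SuppressWarnings("unchecked")
    private static List<Object> array(Map<String, Object> doc, String field) {
        return (List<Object>) doc.computeIfAbsent(field, k -> new ArrayList<>());
    }

    // $push-like: always append, duplicates allowed.
    static void push(Map<String, Object> doc, String field, Object value) {
        array(doc, field).add(value);
    }

    // $addToSet-like: append only if the value is not already present.
    static void addToSet(Map<String, Object> doc, String field, Object value) {
        List<Object> values = array(doc, field);
        if (!values.contains(value)) {
            values.add(value);
        }
    }

    // $pull-like: remove every occurrence of the value, if the array exists.
    static void pull(Map<String, Object> doc, String field, Object value) {
        if (doc.get(field) instanceof List) {
            array(doc, field).removeIf(value::equals);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> user = new HashMap<>();
        inc(user, "loginCount", 1);
        inc(user, "loginCount", 1);
        push(user, "skills", "Docker");
        push(user, "skills", "Docker");
        addToSet(user, "badges", "early-adopter");
        addToSet(user, "badges", "early-adopter");
        System.out.println(user.get("loginCount")); // prints 2
        System.out.println(user.get("skills"));     // prints [Docker, Docker]
        System.out.println(user.get("badges"));     // prints [early-adopter]
    }
}
```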

Delete

Removing documents from your collection:

  
import com.mongodb.client.result.DeleteResult;

// Delete one document
DeleteResult result = users.deleteOne(eq("email", "john.doe@example.com"));
System.out.println("Deleted documents: " + result.getDeletedCount());

// Delete multiple documents
users.deleteMany(lt("age", 18));

// Delete all documents in collection (be careful!)
users.deleteMany(new Document());

Always be cautious with delete operations, especially deleteMany(). Consider using a “soft delete” pattern (marking documents as deleted rather than removing them) for important data.
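
The soft delete pattern can be sketched in memory as follows. The `deletedAt` field name and the `SoftDeleteSketch` class are assumptions made for illustration; with the driver, the same idea would be an update that sets the marker field plus a filter on every normal read.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory collection that marks documents instead of removing them.
final class SoftDeleteSketch {
    private final List<Map<String, Object>> documents = new ArrayList<>();

    void insert(Map<String, Object> doc) {
        documents.add(new HashMap<>(doc)); // copy so the caller's map stays untouched
    }

    // Mark matching documents as deleted; returns how many were newly marked.
    int softDeleteByEmail(String email, Instant now) {
        int marked = 0;
        for (Map<String, Object> doc : documents) {
            if (email.equals(doc.get("email")) && !doc.containsKey("deletedAt")) {
                doc.put("deletedAt", now);
                marked++;
            }
        }
        return marked;
    }

    // Normal reads skip anything that carries a deletedAt marker.
    List<Map<String, Object>> findActive() {
        List<Map<String, Object>> active = new ArrayList<>();
        for (Map<String, Object> doc : documents) {
            if (!doc.containsKey("deletedAt")) {
                active.add(doc);
            }
        }
        return active;
    }

    // The data is still stored, so it can be audited or restored later.
    int totalStored() {
        return documents.size();
    }
}
```

The trade-off is that every read path has to remember the filter, and storage grows until marked documents are eventually purged.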

Data Modeling in MongoDB

One of the biggest differences between MongoDB and SQL is how you model relationships between data. Let’s explore the two main approaches:

Embedded Documents (Denormalization)

Store related data within the same document:

  
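import java.util.Arrays;
import java.util.Date;

// Assumes a collection handle such as:
// MongoCollection<Document> posts = database.getCollection("posts");
Document blogPost = new Document("title", "Introduction to MongoDB")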
    .append("author", new Document()
        .append("name", "Sarah Johnson")
        .append("email", "sarah@example.com")
        .append("bio", "Senior Developer"))
    .append("content", "MongoDB is a NoSQL database...")
    .append("comments", Arrays.asList(
        new Document()
            .append("user", "John Doe")
            .append("text", "Great post!")
            .append("date", new Date()),
        new Document()
            .append("user", "Jane Smith")
            .append("text", "Very helpful!")
            .append("date", new Date())
    ))
    .append("tags", Arrays.asList("database", "nosql", "mongodb"));

posts.insertOne(blogPost);

classDiagram
    class BlogPost {
        +ObjectId _id
        +String title
        +String content
        +Author author
        +List~Comment~ comments
        +List~String~ tags
    }
    
    class Author {
        +String name
        +String email
        +String bio
    }
    
    class Comment {
        +String user
        +String text
        +Date date
    }
    
    BlogPost *-- Author : embeds
    BlogPost *-- Comment : embeds multiple
    
    note for BlogPost "All data in one document
    Fast reads, no JOINs needed"

✅ Use embedded documents when:

Data is accessed together
You have one-to-few relationships
Data doesn’t change frequently
You want fast read performance

References (Normalization)

Store related data in separate collections and reference by ID:

  
// Insert user
Document user = new Document("_id", "user_123")
    .append("name", "Sarah Johnson")
    .append("email", "sarah@example.com");
users.insertOne(user);

// Insert post with user reference
Document post = new Document("title", "Introduction to MongoDB")
    .append("authorId", "user_123")  // Reference to user
    .append("content", "MongoDB is a NoSQL database...");
posts.insertOne(post);

// To get post with author, perform two queries:
Document foundPost = posts.find(eq("title", "Introduction to MongoDB")).first();
String authorId = foundPost.getString("authorId");
Document author = users.find(eq("_id", authorId)).first();

System.out.println("Post by: " + author.getString("name"));

classDiagram
    class User {
        +String _id
        +String name
        +String email
    }
    
    class Post {
        +ObjectId _id
        +String title
        +String content
        +String authorId
    }
    
    Post --> User : references
    
    note for Post "Stores only author ID
    Requires separate queries
    Better for frequently changing data"

✅ Use references when:

Data is accessed independently
You have one-to-many or many-to-many relationships
Data changes frequently
You want to avoid data duplication

The golden rule: Embed for read performance, reference for write performance and data consistency.

Practical Applications

MongoDB shines in specific scenarios. Here are real-world applications where developers often encounter MongoDB.

1. Content Management Systems (CMS)

Blog posts, articles, and pages with varying structures:

  
Document article = new Document("type", "blog-post")
    .append("title", "My First MongoDB Post")
    .append("slug", "my-first-mongodb-post")
    .append("author", "Sarah Johnson")
    .append("content", "Article content here...")
    .append("metadata", new Document()
        .append("featured", true)
        .append("publishDate", new Date())
        .append("views", 0))
    .append("seo", new Document()
        .append("metaDescription", "Learn about MongoDB")
        .append("keywords", Arrays.asList("mongodb", "database")));

2. User Profiles and Preferences

Flexible user data with varying attributes:

  
Document userProfile = new Document("userId", "user_456")
    .append("preferences", new Document()
        .append("theme", "dark")
        .append("notifications", true)
        .append("language", "en"))
    .append("activityLog", Arrays.asList(
        new Document("action", "login").append("timestamp", new Date()),
        new Document("action", "profile_update").append("timestamp", new Date())
    ))
    .append("socialLinks", new Document()
        .append("github", "sarahj")
        .append("linkedin", "sarah-johnson"));

3. Product Catalogs with Varied Attributes

E-commerce products with different specifications:

  
// Electronics product
Document laptop = new Document("category", "electronics")
    .append("name", "Developer Laptop Pro")
    .append("specs", new Document()
        .append("cpu", "Intel i7")
        .append("ram", "16GB")
        .append("storage", "512GB SSD"))
    .append("price", 1299.99);

// Clothing product (completely different structure)
Document tshirt = new Document("category", "clothing")
    .append("name", "Developer T-Shirt")
    .append("sizes", Arrays.asList("S", "M", "L", "XL"))
    .append("colors", Arrays.asList("black", "navy", "gray"))
    .append("price", 24.99);

products.insertMany(Arrays.asList(laptop, tshirt));

4. Real-Time Analytics and Logging

Capturing events and metrics:

  
Document event = new Document("eventType", "page_view")
    .append("userId", "user_789")
    .append("page", "/products/mongodb-guide")
    .append("timestamp", new Date())
    .append("metadata", new Document()
        .append("browser", "Chrome")
        .append("device", "mobile")
        .append("referrer", "google.com"));

analytics.insertOne(event);

5. Mobile App Backends

Syncing data between mobile apps and servers:

  
Document syncData = new Document("deviceId", "device_abc123")
    .append("userId", "user_456")
    .append("lastSync", new Date())
    .append("pendingChanges", Arrays.asList(
        new Document("collection", "todos")
            .append("operation", "insert")
            .append("data", new Document("task", "Learn MongoDB")),
        new Document("collection", "notes")
            .append("operation", "update")
            .append("data", new Document("noteId", "note_1")
                .append("content", "Updated content"))
    ));

Best Practices

As you start working with MongoDB, keep these guidelines in mind.

1. Design Your Schema Based on Access Patterns

Think about how your application will query data, not just how to structure it logically:

  
// ❌ Bad: Storing user's order history as references
Document userWithOrderIds = new Document("name", "Sarah")
    .append("orderIds", Arrays.asList("order1", "order2", "order3"));
// Requires multiple queries to show order history

// ✅ Good: Embed summary of recent orders
Document userWithRecentOrders = new Document("name", "Sarah")
    .append("recentOrders", Arrays.asList(
        new Document("orderId", "order3")
            .append("date", new Date())
            .append("total", 99.99)
            .append("status", "delivered"),
        new Document("orderId", "order2")
            .append("date", new Date())
            .append("total", 49.99)
            .append("status", "delivered")
    ));
// One query gets user with recent order summaries

2. Use Indexes for Better Performance

Create indexes on fields you query frequently:

  
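import com.mongodb.client.model.IndexOptions;

// Create index on lastName field for faster lookups
users.createIndex(new Document("lastName", 1));

// Create compound index for common queries
users.createIndex(new Document("status", 1).append("createdAt", -1));

// Create unique index to prevent duplicate emails
// (a unique index also speeds up lookups by email, so no separate
// non-unique index on the same field is needed)
users.createIndex(
    new Document("email", 1),
    new IndexOptions().unique(true)
);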

// Check existing indexes
for (Document index : users.listIndexes()) {
    System.out.println(index.toJson());
}

Indexes speed up reads but slow down writes. Create indexes strategically based on your query patterns, not on every field.
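
The read/write trade-off can be seen in a toy model: an index is an extra data structure that every write must keep up to date, in exchange for reads that no longer scan the whole collection. The sketch below uses a hash map for simplicity (MongoDB’s indexes are B-tree based); `EmailIndexSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EmailIndexSketch {
    private final List<Map<String, Object>> documents = new ArrayList<>();
    // Secondary structure: email -> position of the document in the list.
    private final Map<String, Integer> emailIndex = new HashMap<>();

    // Every write pays extra work to keep the index up to date.
    void insert(Map<String, Object> doc) {
        documents.add(doc);
        emailIndex.put((String) doc.get("email"), documents.size() - 1);
    }

    // Without an index: scan documents one by one until a match is found.
    Map<String, Object> findByEmailScan(String email) {
        for (Map<String, Object> doc : documents) {
            if (email.equals(doc.get("email"))) {
                return doc;
            }
        }
        return null;
    }

    // With an index: jump straight to the matching document.
    Map<String, Object> findByEmailIndexed(String email) {
        Integer position = emailIndex.get(email);
        return position == null ? null : documents.get(position);
    }
}
```

Both lookups return the same document; the indexed one simply avoids the scan, at the cost of the extra `put` on every insert.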

3. Validate Your Data

While MongoDB is schema-less, you should still validate data:

  
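import com.mongodb.client.model.CreateCollectionOptions;
import com.mongodb.client.model.ValidationOptions;

// Using JSON Schema validation (text blocks require Java 15+)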
Document validator = Document.parse("""
    {
        $jsonSchema: {
            bsonType: "object",
            required: ["firstName", "lastName", "email"],
            properties: {
                firstName: {
                    bsonType: "string",
                    description: "must be a string and is required"
                },
                email: {
                    bsonType: "string",
                    pattern: "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\\\.[a-zA-Z]{2,}$",
                    description: "must be a valid email address"
                },
                age: {
                    bsonType: "int",
                    minimum: 0,
                    maximum: 150,
                    description: "must be an integer between 0 and 150"
                }
            }
        }
    }
    """);

// Apply validation to collection
database.createCollection("users",
    new CreateCollectionOptions().validationOptions(
        new ValidationOptions().validator(validator)
    ));
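
It can also help to run similar checks in application code before a write reaches the database, so users get friendly error messages. The sketch below loosely mirrors the rules above in plain Java; it is an illustration under that assumption, not a replacement for server-side validation, and `UserValidatorSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class UserValidatorSketch {
    private static final Pattern EMAIL =
        Pattern.compile("^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$");

    // Returns a list of problems; an empty list means the document passes.
    static List<String> validate(Map<String, Object> doc) {
        List<String> errors = new ArrayList<>();

        // Required fields must be present.
        for (String field : List.of("firstName", "lastName", "email")) {
            if (!doc.containsKey(field)) {
                errors.add(field + " is required");
            }
        }

        Object firstName = doc.get("firstName");
        if (firstName != null && !(firstName instanceof String)) {
            errors.add("firstName must be a string");
        }

        Object email = doc.get("email");
        if (email != null
                && !(email instanceof String && EMAIL.matcher((String) email).matches())) {
            errors.add("email must be a valid email address");
        }

        // age is optional, but when present it must be an int in [0, 150].
        Object age = doc.get("age");
        if (age != null
                && !(age instanceof Integer && (Integer) age >= 0 && (Integer) age <= 150)) {
            errors.add("age must be an integer between 0 and 150");
        }
        return errors;
    }
}
```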

4. Handle Connections Properly

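A MongoClient manages a pool of connections, so create one instance, reuse it across your application, and close it when the application shuts down: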

  
public class MongoDBService implements AutoCloseable {
    private final MongoClient mongoClient;
    private final MongoDatabase database;

    public MongoDBService(String connectionString, String databaseName) {
        this.mongoClient = MongoClients.create(connectionString);
        this.database = mongoClient.getDatabase(databaseName);
    }

    public MongoCollection<Document> getCollection(String collectionName) {
        return database.getCollection(collectionName);
    }

    // Important: Close the client (and its connection pool) when done
    @Override
    public void close() {
        mongoClient.close();
    }
}

// Usage: try-with-resources calls close() even if an operation throws
try (MongoDBService service = new MongoDBService("mongodb://localhost:27017", "company_db")) {
    MongoCollection<Document> users = service.getCollection("users");
    // Perform operations...
}

5. Use Projection to Limit Returned Fields

Don’t retrieve more data than you need:

  
import static com.mongodb.client.model.Projections.*;

// ❌ Bad: Return entire document when you only need name and email
Document fullUser = users.find(eq("_id", "user_123")).first();

// ✅ Good: Return only needed fields
Document userSummary = users.find(eq("_id", "user_123"))
    .projection(fields(include("firstName", "lastName", "email"), excludeId()))
    .first();

// Returns: {"firstName": "Sarah", "lastName": "Johnson", "email": "sarah@example.com"}
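
Conceptually, an inclusion projection copies only the requested fields out of each document. A minimal in-memory sketch, assuming a document modeled as a plain map (the `ProjectionSketch` class is hypothetical, and unlike the server it returns fields in the order they were requested):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ProjectionSketch {
    // Keep only the listed fields; fields missing from the document are skipped.
    static Map<String, Object> include(Map<String, Object> doc, List<String> fields) {
        Map<String, Object> projected = new LinkedHashMap<>();
        for (String field : fields) {
            if (doc.containsKey(field)) {
                projected.put(field, doc.get(field));
            }
        }
        return projected;
    }

    public static void main(String[] args) {
        Map<String, Object> user = Map.of(
            "firstName", "Sarah",
            "lastName", "Johnson",
            "email", "sarah@example.com",
            "age", 28);
        // prints {firstName=Sarah, email=sarah@example.com}
        System.out.println(include(user, List.of("firstName", "email")));
    }
}
```

The real benefit of projection comes from doing this on the server, so the unwanted fields never cross the network.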

6. Be Cautious with Embedded Arrays

Avoid embedding unbounded arrays that can grow indefinitely:

  
// ❌ Bad: Embedding all comments (could be thousands)
Document unboundedPost = new Document("title", "Popular Post")
    .append("comments", Arrays.asList(/* potentially thousands of comments */));

// ✅ Good: Store recent comments, reference rest
Document boundedPost = new Document("title", "Popular Post")
    .append("recentComments", Arrays.asList(/* last 10 comments */))
    .append("totalComments", 1523);

// Store full comments in separate collection

MongoDB documents have a 16 MB size limit. An insert or update that would push a document past this limit fails, so an array that keeps growing will eventually cause errors.
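
The "recent comments plus a counter" idea boils down to a bounded list: each new comment is appended, and the oldest one falls out once the cap is reached. Here is a plain Java sketch of that bookkeeping (`RecentCommentsSketch` is a hypothetical name; on the server, `$push` combined with the `$slice` modifier can serve the same purpose).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class RecentCommentsSketch {
    private final int capacity;
    private final Deque<String> recent = new ArrayDeque<>();
    private long totalComments = 0;

    RecentCommentsSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Record a comment: keep it in the bounded list and bump the counter.
    void add(String comment) {
        recent.addLast(comment);
        if (recent.size() > capacity) {
            recent.removeFirst(); // the oldest comment leaves the embedded list
        }
        totalComments++;
    }

    // Oldest-to-newest snapshot of the comments that are still embedded.
    List<String> recentComments() {
        return new ArrayList<>(recent);
    }

    long totalComments() {
        return totalComments;
    }
}
```

The embedded array never exceeds the cap, so the document size stays predictable no matter how popular the post becomes.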

7. Learn the Aggregation Framework

For complex queries and data transformations:

  
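import com.mongodb.client.AggregateIterable;
import java.util.Arrays;
import java.util.List;
import org.bson.conversions.Bson;
import static com.mongodb.client.model.Aggregates.*;
import static com.mongodb.client.model.Accumulators.*;

// Example: Get average age by city
// (pipeline stages are Bson values, so the list is typed List<Bson>)
List<Bson> pipeline = Arrays.asList(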
    // Group by city and calculate average age
    group("$address.city", avg("avgAge", "$age")),
    // Sort by average age descending
    sort(new Document("avgAge", -1)),
    // Limit to top 5 cities
    limit(5)
);

AggregateIterable<Document> results = users.aggregate(pipeline);
for (Document result : results) {
    System.out.println(result.toJson());
}

// Example output (illustrative values), one document per line:
//   {"_id": "Tech City", "avgAge": 32.5}
//   {"_id": "New York", "avgAge": 30.2}
//   ...
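
If you know Java streams, the pipeline above maps onto a familiar shape: group, aggregate, sort, limit. The sketch below runs the same computation in memory on a few made-up users; the `User` class and the sample data are assumptions for illustration. The key difference is that an aggregation pipeline runs inside the database, so only the final result travels to the application.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class AverageAgeByCitySketch {
    // Minimal stand-in for a user document with the two fields we need.
    static final class User {
        final String city;
        final int age;

        User(String city, int age) {
            this.city = city;
            this.age = age;
        }
    }

    static List<Map.Entry<String, Double>> topCitiesByAverageAge(List<User> users, int limit) {
        // "group" stage: city -> average age
        Map<String, Double> averageByCity = users.stream()
            .collect(Collectors.groupingBy(
                (User u) -> u.city,
                Collectors.averagingInt((User u) -> u.age)));

        // "sort" (descending by average) and "limit" stages
        return averageByCity.entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
            .limit(limit)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<User> users = List.of(
            new User("Tech City", 30), new User("Tech City", 35),
            new User("New York", 28), new User("New York", 32),
            new User("Austin", 22));
        // prints Tech City=32.5 then New York=30.0
        topCitiesByAverageAge(users, 2).forEach(System.out::println);
    }
}
```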

Getting Started: Your First Project

Ready to try MongoDB yourself? Here’s a quick checklist:

Install MongoDB
- Download from mongodb.com
- Or use MongoDB Atlas (free cloud-hosted MongoDB)

Add MongoDB Driver to Your Project

  
<!-- Maven -->
<dependency>
    <groupId>org.mongodb</groupId>
    <artifactId>mongodb-driver-sync</artifactId>
    <version>4.11.0</version>
</dependency>

Start Simple
- Connect to MongoDB
- Create a collection
- Insert a few documents
- Query them back

Practice CRUD Operations
- Build a simple todo app
- Create a blog backend
- Model a product catalog

Explore Advanced Features
- Try aggregation pipelines
- Experiment with indexes
- Learn about transactions

MongoDB Compass is a free GUI tool that lets you visualize your data, run queries, and understand performance. It’s invaluable for learning and debugging.

Conclusion

MongoDB opens up a new way of thinking about data storage. Its flexible document model, intuitive JSON-like format, and powerful query capabilities make it an excellent choice for modern applications, especially when you need to iterate quickly or handle varied data structures.

As a developer, understanding both SQL and NoSQL databases makes you more versatile. You’ll encounter MongoDB in many real-world projects, from startups building MVPs to large companies handling massive scale.

Key takeaways:

MongoDB stores data in flexible JSON-like documents
Choose MongoDB for flexibility and rapid development; choose SQL for complex relationships and strict consistency
Master CRUD operations first, then explore aggregations and indexing
Design your schema based on how you’ll query the data
Use embedded documents for data accessed together; use references for independent or frequently changing data

Start experimenting with MongoDB today. Build a small project, make mistakes, and learn from them. That’s the best way to truly understand when and how to use this powerful database.