MongoDB introduction


This is my guide to getting up to speed with MongoDB and the Mongoose library in JS.


Very high level overview of what MongoDB is:

  • Document-based database, designed to scale out
  • Built for easy sharding
  • Created in 2007
  • Data is stored as BSON (binary JSON) internally
  • Data is stored in collections
  • Schemas are optional. Database migrations are basically optional - you can send whatever shape JSON you like to the collection
  • Related data can be embedded in the same document, so it is stored together - resulting in fast reads
  • No SQL-style joins in regular queries (the aggregation pipeline has a $lookup stage for join-like lookups)

Installing MongoDB on your machine

There is a nice guide at https://www.mongodb.com/docs/manual/installation/ on installing the Community edition locally on your machine. On macOS it's easy to install via brew, but check out the installation instructions on their site.

You can install the MongoDB Compass app if you want a nice GUI to explore your db (locally or on production). You can download it from the link above.

Tools

  • Atlas is the very popular cloud hosting for MongoDB (they have a nice free tier). Nice UI, easy to scale

Command tools:

  • mongod is the database server
  • mongos is the query router that sits in front of a sharded cluster
  • mongosh is the CLI shell you can use to query your mongo db instance (it replaced the older mongo shell)

Using mongosh:

  • (type exit to quit)
  • show all dbs with show dbs. This will return the different databases (including admin, config and local which are always present)
  • Use a db with use the-db-name-here. You can then query on that db.
    • if you type in a db that does not exist, it will switch to it anyway; the DB is actually created when you first store data in it
  • When using a db, show all collections with show collections
  • When using a db, drop it (deleting all of its data) with db.dropDatabase()
  • When using a db, you can query with things like db.products.insertOne({name: "sony tv"}). (the db object is the current database)

Shape of the data

  • Documents (the equivalent of rows) always have a unique _id field (an ObjectId by default)
  • Other fields can be any name. They work like JSON
  • Multiple data types - standard ones like string, null, object, boolean. Also things like MaxKey, MinKey, BSONRegexp, ObjectId
  • We don't really have to use schemas but I cannot see many reasons why you would not use them. Maybe it's because I come from a SQL background, but using schemas seems like a no-brainer. You can define the shape of your documents before using them, and enforce them in your library code (e.g. mongoose). Using schemas means you can validate your data (the types/shape of object) and ensure everything is consistent.
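As a small aside on the _id field: an ObjectId is 12 bytes, usually shown as a 24-character hex string, and its first 4 bytes are a Unix timestamp in seconds. The sketch below reads that timestamp in plain JS. The function name objectIdToDate and the example id are just for illustration.

```javascript
// Sketch: read the creation time out of an ObjectId's hex string.
// The first 4 bytes (8 hex characters) are a Unix timestamp in seconds.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error("expected a 24-character hex string");
  }
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const exampleId = "507f1f77bcf86cd799439011"; // illustrative id
const createdAt = objectIdToDate(exampleId);
console.log(createdAt.toISOString()); // 2012-10-17T21:13:27.000Z
```

In mongosh (and in the drivers) you would not write this yourself, as ObjectId values have a getTimestamp() method that does the same thing.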

Indexes

  • You can add indexes, to speed up read queries

Inserts

db.products.insertOne({
  type: "t-shirt",
  size: "xl",
  name: "some t-shirt",
  price: 20
});

Note: we didn't have to predefine the schema. We can add data with any fields.

db.products.insertOne({
  type: "t-shirt",
  name: "another t-shirt",
  color: "black",
  qwerty: "anything here",
});

We can also nest object data:

db.products.insertOne({
  type: "t-shirt",
  name: "third t-shirt",
  colorOptions: ["black", "orange", "red"],
  shippingCosts: {
      nextDay: 10,
      slow: 2
  }
});

You can also insert multiple records at the same time with insertMany()

db.products.insertMany([
  {
    type: "t-shirt",
    name: "sports t-shirt",
  },
  {
    type: "t-shirt",
    name: "football t-shirt",
  },
]);

Reads (querying for data in mongodb)

Get all records in a collection by calling .find() with no arguments:

db.products.find()

Or you can pass in a filter to find specific record(s).

The following query will return all documents that have a name of 'football t-shirt'

// find all that match:
db.products.find({
  name: "football t-shirt"
});

// or find just one record
db.products.findOne({
  name: "football t-shirt"
});

This example finds all with a price less than 25:

db.products.find({
  price: { $lt: 25 }   
});

If you have a document with nested objects, you can query it with the dot notation:

db.people.find({ "contact.email": "test@example.com" })

If you only want to return a certain number of rows, you can use .limit()

db.products.find().limit(5)

Example of sorting (reverse order) and picking top 3, in JS:

// from https://www.mongodb.com/docs/drivers/node/current/fundamentals/crud/read-operations/limit/ 
// define an empty query document
const query = {};
// sort in descending (-1) order by length
const sort = { length: -1 };
const limit = 3;
const cursor = myColl.find(query).sort(sort).limit(limit);
await cursor.forEach(console.dir);

Or you can sort by specific fields

db.products.find().sort({ name: 1}) 
// or {name: -1} to reverse order sort by name field

If you want to select only certain fields:

// finds all, and returns just name (and _id, which is included by default)
db.products.find({}, {name: 1})

// finds all, returns just name (you cannot mix 1 and 0 in one projection, except to exclude _id)
db.products.find({}, {name: 1, _id: 0})

You can also do queries with criteria that includes AND ($and), OR ($or), IN ($in)

e.g.

await db.products.find({
 $or: [
     {name: "some t-shirt"},
     {name: "football t-shirt"},
 ]
})

You can also do comparisons between two or more fields, with $expr.

await db.products.find({
    $expr: {
        $lt: ["$orders", "$quantity"]
    }
})
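To get a feel for how the filter objects in this section are evaluated, here is a toy in-memory matcher written in plain JS. It is only a sketch to show the idea, not how MongoDB is implemented, and it handles just equality, dot notation, $lt, $gt, $in, $or and $and (no $expr). The names getPath, operators and matches are made up for this example.

```javascript
// Toy in-memory model of filter matching. NOT MongoDB's implementation;
// it only handles the few operators used in this post.
function getPath(doc, path) {
  // "contact.email" -> doc.contact.email (undefined if any step is missing)
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

const operators = {
  $lt: (value, arg) => value < arg,
  $gt: (value, arg) => value > arg,
  $in: (value, arg) => arg.includes(value),
};

function matches(doc, filter) {
  return Object.entries(filter).every(([key, condition]) => {
    if (key === "$or") return condition.some((sub) => matches(doc, sub));
    if (key === "$and") return condition.every((sub) => matches(doc, sub));
    const value = getPath(doc, key);
    const isOperatorObject =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    if (!isOperatorObject) return value === condition; // plain equality
    return Object.entries(condition).every(([op, arg]) => operators[op](value, arg));
  });
}

const products = [
  { name: "some t-shirt", price: 20 },
  { name: "football t-shirt", price: 30 },
  { name: "sports t-shirt", contact: { email: "test@example.com" } },
];

const cheap = products.filter((p) => matches(p, { price: { $lt: 25 } }));
const eitherName = products.filter((p) =>
  matches(p, { $or: [{ name: "some t-shirt" }, { name: "football t-shirt" }] })
);
const sameWithIn = products.filter((p) =>
  matches(p, { name: { $in: ["some t-shirt", "football t-shirt"] } })
);
const byNestedField = products.filter((p) =>
  matches(p, { "contact.email": "test@example.com" })
);
```

Note how the $or query and the $in query select the same two products: when every branch of an $or tests the same field for equality, $in is the shorter way to write it.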

Updating

Updating is easy - you pass in a filter (your WHERE in normal SQL), then an object describing what you want to change, using an update operator such as $set

Example updating all products from black to orange:

await db.products.updateMany(
{
    color: "black"
},
{
    $set: {
        color: "orange"
    }
}
)
// `.updateOne()` works in a similar way, but only updates the first matching document

You can also increment / decrement, like this:

await db.people.updateOne(
{
    name: "Simon",
},
{
    $inc: {
        age: 1
    }
})

That query will increase the age property by one. You can pass in something like age: 10 to increment by 10, or a negative number such as age: -1 to decrement.
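If it helps to see what the two update operators above do to a single document, here is a toy version on a plain JS object. It is a sketch for illustration only (applyUpdate is a made-up helper, not a MongoDB or Mongoose function), and it ignores nested paths and every other operator.

```javascript
// Toy model of the $set and $inc update operators on a plain object.
function applyUpdate(doc, update) {
  const result = { ...doc }; // shallow copy; leave the input untouched
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value; // $set overwrites (or adds) the field
  }
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    result[field] = (result[field] ?? 0) + amount; // a missing field starts from 0
  }
  return result;
}

const simon = { name: "Simon", age: 30, color: "black" };
const olderSimon = applyUpdate(simon, { $inc: { age: 1 } });
const youngerSimon = applyUpdate(simon, { $inc: { age: -5 } });
const repainted = applyUpdate(simon, { $set: { color: "orange" } });
```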

Data aggregation pipelines

  • Can take multiple documents, and combine them into one result document
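A pipeline is an array of stages, and each stage transforms the documents coming out of the previous one. The sketch below shows a pipeline you could pass to db.products.aggregate(), followed by a plain-JS equivalent over an in-memory array so you can see what each stage contributes. The sample data and the size/price fields are assumptions made for this example.

```javascript
// The pipeline you would pass to db.products.aggregate(pipeline).
const pipeline = [
  { $match: { type: "t-shirt" } },
  { $group: { _id: "$size", count: { $sum: 1 }, averagePrice: { $avg: "$price" } } },
  { $sort: { _id: 1 } },
];

// Plain-JS equivalent over an in-memory array, for illustration only.
const products = [
  { type: "t-shirt", size: "xl", price: 20 },
  { type: "t-shirt", size: "xl", price: 30 },
  { type: "t-shirt", size: "m", price: 10 },
  { type: "hat", size: "m", price: 5 },
];

const groups = new Map();
for (const product of products.filter((p) => p.type === "t-shirt")) { // $match
  const group = groups.get(product.size) ?? { _id: product.size, count: 0, total: 0 };
  group.count += 1;
  group.total += product.price;
  groups.set(product.size, group); // $group by size
}

const summary = [...groups.values()]
  .map(({ _id, count, total }) => ({ _id, count, averagePrice: total / count }))
  .sort((a, b) => a._id.localeCompare(b._id)); // $sort by _id ascending

console.log(summary);
```

Four input documents go in and two summary documents come out, one per t-shirt size.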

Mongoose

Although you can talk to MongoDB directly with the official Node.js driver, many apps in the JS world use Mongoose on top of it.

You pass in a connection string such as mongodb://localhost:27017/your-db-name. You can have anything as your db name - if it does not exist then it will be created when you first write to it.

const mongoose = require('mongoose')

mongoose.connect('mongodb://localhost:27017/your-db-name').then(() => {
  console.log("Connected!")
})
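The connection string is a URL, so you can pull it apart with the URL class that is built into Node.js to see which part is which. This is only a sketch for a simple single-host string like the one above. Strings with several hosts or the mongodb+srv scheme need the driver's own parsing.

```javascript
// Sketch: pull a simple connection string apart with the built-in URL class.
const connectionString = "mongodb://localhost:27017/your-db-name";
const parsed = new URL(connectionString);

const host = parsed.hostname;            // where mongod is listening
const port = Number(parsed.port);        // 27017 is MongoDB's default port
const dbName = parsed.pathname.slice(1); // drop the leading "/"

console.log({ host, port, dbName });
```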

You can create schemas like this in Mongoose:

// this does not make any changes in your db
// we will use this schema later on
const blogPostSchema = new mongoose.Schema({
  title: String,
  body: String,
});

const BlogPost = mongoose.model('blogpost', blogPostSchema);

You can automatically set timestamps (createdAt and updatedAt) by passing in options (2nd param):

const blogPostSchema = new mongoose.Schema({title: String}, {timestamps: true});

and you can insert some data easily:

function db() {
  return mongoose.connect('mongodb://localhost:27017/your-db-name')
}

db().then(async (connection) => {
  const newPost = await BlogPost.create({
    title: "New post",
    body: "Hello, world",
  })
  console.log("Created new post", newPost);
})

This will connect to your db, then create a new blog post.

The console.log should show you the mongo document that was created.

When it console.logs, it will include the title and body - but also __v and _id attributes. These are generated automatically.

The _id is the ObjectId (mentioned previously), and it is what you will most commonly use to look a document up. The __v is the version key, which Mongoose uses internally to track revisions of a document (mainly to guard against conflicting changes to arrays when you save()). You can rename it or turn it off with the versionKey schema option. You rarely will need to use it.

Adding validation to schemas in Mongoose

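You can define your schema with validation. Note: Mongoose validators run in the library (mongoose) and not in the db (mongodb) itself. The exception in the example below is unique, which is not a validator: it creates a unique index that MongoDB enforces.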

const blogPost = new mongoose.Schema({
  title: {
    type: String,
    required: true
  },
  isActive: {
    type: Boolean,
    default: false,
  },
  slug: {
    unique: true,
    type: String,
    required: true
  },
  body: {
    type: String,
    required: true
  }
})

Querying for data in mongoose

Given a model (like BlogPost above, created with mongoose.model()), you can call functions on it to query for data


const BlogPost = mongoose.model('blogpost', blogPostSchema) // see above

// find() returns an array of matching documents
const posts = await BlogPost.find({title: "some blog title"});
// findById() returns a single document (or null)
const post = await BlogPost.findById(yourId);

// by default findByIdAndUpdate() returns the document as it was BEFORE the update
// pass the {new: true} option to get the NEWLY updated document back
const newParams = {title: "Updated title"}
const original = await BlogPost.findByIdAndUpdate(yourId, newParams);
// original.title !== 'Updated title'
const updated = await BlogPost.findByIdAndUpdate(yourId, newParams, {new: true})
// updated.title === 'Updated title'

Note that these are promise-like. They kind of act like preparing SQL statements. They have a .then() but are not returning real Promises. If you need a real promise, you can call .exec() to execute the query.

These are basically the same:

await BlogPost.findOne();
await BlogPost.findOne().exec();

But you do get better stack traces when used with .exec(). It also means it returns a real promise, which other parts of your code may be expecting.
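The difference between "has a .then()" and "is a real Promise" can be shown without a database. The sketch below uses a made-up FakeQuery class (not a Mongoose class) that mimics the shape described above: await works on it because it has then(), but only exec() hands back an actual Promise.

```javascript
// Toy stand-in for a query object: it has then() and exec(), but it is not
// a Promise. FakeQuery is a hypothetical name, not a Mongoose class.
class FakeQuery {
  constructor(result) {
    this.result = result;
  }
  // exec() runs the "query" and returns a real Promise.
  exec() {
    return Promise.resolve(this.result);
  }
  // then() is what makes `await query` work on a non-Promise.
  then(onFulfilled, onRejected) {
    return this.exec().then(onFulfilled, onRejected);
  }
}

const query = new FakeQuery({ title: "New post" });
const isRealPromise = query instanceof Promise;            // false
const execIsRealPromise = query.exec() instanceof Promise; // true

async function main() {
  const viaAwait = await query;       // await calls query.then(...)
  const viaExec = await query.exec(); // same result through a real Promise
  return viaAwait.title === viaExec.title;
}

const sameResult = main();
```

Code that checks instanceof Promise, or that relies on Promise methods such as .finally(), is the kind of code that needs the .exec() form.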

Comparison criteria in MongoDB

Easy to filter:

const adults = await Person.find({ age: {$gte: 18} });
const fiveAdults = await Person.find({ age: {$gte: 18} }).limit(5);
// sorted by their name (string syntax, then the equivalent object syntax)
const fiveAdultsSorted = await Person.find({age: {$gte: 18}}).sort('name').limit(5)
const fiveAdultsSortedAlt = await Person.find({age: {$gte: 18}}).sort({name: 1}).limit(5)
// sorted by their name in reverse (string syntax, then the equivalent object syntax)
const fiveAdultsSortedReverse = await Person.find({age: {$gte: 18}}).sort('-name').limit(5)
const fiveAdultsSortedReverseAlt = await Person.find({age: {$gte: 18}}).sort({name: -1}).limit(5)
const middleAged = await Person.find({age: {$gt: 40, $lt: 60}})
const fewPets = await Person.find({ numPets: {$lt: 5} });
const manyPets = await Person.find({ numPets: {$gte: 5} });
const multipleFilters = await Person.find({numPets: 2, age: {$lt: 40}})

If there are arrays, it's also quite easy

const personSchema = new mongoose.Schema({
  age: Number,
  numPets: Number,
  petNames: [{type: String}]
})
const Person = mongoose.model('person', personSchema);

// pass in a string to find a document whose array contains that value:
const personWithAPetCalledBruce = await Person.findOne({petNames: 'Bruce'})

// or use $in to match any of several values:
const personWithAPetCalledBruceOrSimon = await Person.findOne({petNames: {$in: ['Bruce', 'Simon']}})

How to select certain fields

If you are used to SQL, this is the equivalent of SELECT col1, col2 FROM tbl.

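// 1 includes a field (_id is included too unless you exclude it)
const userNameOnly = await User.find({}).select({
  name: 1,
});
// 0 excludes a field and returns everything else
const userWithoutEmail = await User.find({}).select({
  email: 0
});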

Relations (associations)

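  • References 'associate' a document with document(s) in another collection: you store the other document's _id and let Mongoose resolve it with populate(). The alternative is embedded documents (also known as "subdocuments"), which are stored inside the parent document.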

Example:

const userRef = 'user'
const blogPostSchema = new mongoose.Schema({
  title: String,
  body: String,
  author: {
    type: mongoose.Schema.Types.ObjectId, // <<
    required: true,
    ref: userRef // << use same ref name as when you call mongoose.model(x)
  }
});
const BlogPost = mongoose.model('blogpost', blogPostSchema);

const userSchema = new mongoose.Schema({
  email: String,
  
})
const User = mongoose.model(userRef, userSchema);

You can use these like this - the first code snippet shows creating it with this association between 2 schemas:


const author = await User.create({email: "test@example.com"})
const blogPost = await BlogPost.create({
  title: "New Post",
  body: "A blog post",
  author: author.id, 
})

and this example shows how to query for it - use the populate() call to replace the stored author id with the full user document

const result = await BlogPost.findById(blogPost._id).populate('author').exec()
// result.author is now the full user document, not just its id
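Conceptually, populate() follows the stored id and swaps in the document it points to (Mongoose does this with an extra query behind the scenes). The toy sketch below shows that idea with plain objects, using a Map in place of the users collection. The populate function and the "u1" id here are made up for the example and are not Mongoose's API.

```javascript
// Toy model of what populate() does conceptually: follow the stored id and
// swap in the referenced document. Plain objects and a Map stand in for
// the two collections.
const users = new Map([
  ["u1", { _id: "u1", email: "test@example.com" }],
]);
const blogPost = { title: "New Post", body: "A blog post", author: "u1" };

function populate(doc, path, collection) {
  // A missing reference becomes null instead of throwing.
  return { ...doc, [path]: collection.get(doc[path]) ?? null };
}

const populated = populate(blogPost, "author", users);
console.log(populated.author.email); // test@example.com
```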

Compound indexes

MongoDB supports compound indexes, where a single index structure holds references to multiple fields [1] within a collection's documents.


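const PersonSchema = new mongoose.Schema({
    name: { type: String },
    email: { type: String, index: true } // single-field index on email
})
// compound index covering name and email (1 = ascending)
PersonSchema.index({ name: 1, email: 1 })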

Mongoose - middleware

Mongoose supports middleware. There are four types:

  • aggregate middleware
  • document middleware
  • model middleware
  • query middleware.

Aggregate middleware

Aggregate middleware is for YourModel.aggregate() calls. The middleware will run when you run exec() on an aggregate object.

Document middleware

This is one of the 2 main middleware types you will probably work with.

It works with a bunch of document (instance of a Model class) functions:

  • validate
  • save
  • remove
  • updateOne
  • deleteOne
  • init

Model middleware

This is different than document middleware. You can use model middleware only on the insertMany function.

Query middleware

This is the other commonly used middleware.

It runs your middleware functionality when you run exec() or then() on a query object (or if you await it).

  • count
  • countDocuments
  • deleteMany
  • deleteOne
  • estimatedDocumentCount
  • find
  • findOne
  • findOneAndDelete
  • findOneAndRemove
  • findOneAndReplace
  • findOneAndUpdate
  • remove
  • replaceOne
  • update
  • updateOne
  • updateMany
  • validate

Note: Query middleware is not executed on subdocuments.

How to use them

You will normally run them with yourSchema.pre('type', yourFunction) or .post()

For example:

  • personSchema.pre('save', yourFunction) will run yourFunction before saving.
  • personSchema.post('save', yourFunction) will run yourFunction after saving.
  • personSchema.pre('validate', yourFunction) will run yourFunction before validation.
  • personSchema.post('validate', yourFunction) will run yourFunction after validation.

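Synchronous vs asynchronous

For post hooks, this is based on whether there is a 2nd arg in the function callback (which is next()). With two or more parameters, Mongoose assumes the second one is next() and waits for you to call it. A pre hook can be made asynchronous by using an async function (or returning a promise).

init hook is sync only.


personSchema.post('save', function() {
  // sync
});

personSchema.post('save', function(doc) {
  // sync: receives the saved document
});

personSchema.post('save', function(doc, next) {
  // async: the next hook waits until next() is called
  setTimeout(next, 1000)
});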


Example of using middleware

const employeeSchema = new mongoose.Schema({
  name: String
});

const companySchema = new mongoose.Schema({
  employees: [employeeSchema]
});

companySchema.pre('findOneAndUpdate', function() {
  console.log('Middleware on parent company doc'); // Will be executed
});

employeeSchema.pre('findOneAndUpdate', function() {
  console.log('Middleware on employee doc'); // Will not be executed (as is a child document)
});

Use cases for middleware in Mongoose

  • data validation
  • custom logging
  • automatic clean up of other documents when another document is updated/deleted
  • combining updates with sending that change to clients via websockets

Mongoose - virtuals

Virtuals are a feature of the Mongoose JS package (and not of core MongoDB itself).

In Mongoose, a virtual is a property that is not stored in MongoDB. Virtuals are typically used for computed properties on documents.

They work a bit like computed properties in Vue.

Example code from their docs which explains it well:

const userSchema = mongoose.Schema({
  email: String
});

// Create a virtual property `domain` that's computed from `email`.
userSchema.virtual('domain').get(function() {
  return this.email.slice(
    this.email.indexOf('@') + 1
  );
});
const User = mongoose.model('User', userSchema);

const doc = await User.create({ email: 'test@gmail.com' });

// `domain` is now a property on User documents.

doc.domain; // 'gmail.com'

The .id property is actually a virtual (for the ._id property) that is always set up by mongoose.

As well as the 'getter' style of virtuals, you can also have 'setter' style of virtuals. So you can 'set' a value, and it maps the change to different field(s) on the real db call

Another example that makes it clear, from the docs

const userSchema = mongoose.Schema({
  firstName: String,
  lastName: String
});

// Create a virtual property `fullName` with a getter and setter.
userSchema.virtual('fullName').
  get(function() { return `${this.firstName} ${this.lastName}`; }).
  set(function(v) {
    // `v` is the value being set, so use the value to set
    // `firstName` and `lastName`.
    const firstName = v.substring(0, v.indexOf(' '));
    const lastName = v.substring(v.indexOf(' ') + 1);
    this.set({ firstName, lastName });
  });
const User = mongoose.model('User', userSchema);

const doc = new User();
// Vanilla JavaScript assignment triggers the setter
doc.fullName = 'Jean-Luc Picard';

doc.fullName; // 'Jean-Luc Picard'
doc.firstName; // 'Jean-Luc'
doc.lastName; // 'Picard'

Note: for setters/getters, you have to use normal functions (not arrow functions) as they need access to this.