MongoDB Querying Tips & Tricks

| Comments

This post contains a bunch of tricks I picked up over the years working with MongoDB. They are also implemented in MQL as helper functions.

Filter on creation time

Did you know you can filter on creation time without adding a timestamp field?

By default, a document’s _id field contains an ObjectId. This ObjectId is composed from several values: number of seconds from the epoch, machine identifier, process identifier and a counter.

Since the ObjectId starts with a timestamp you can order by it or use it for filtering. The following function generates an ObjectId from a python datetime object:

1
2
3
4
5
6
7
8
import datetime, time, bson

def oid_from_date(dt):
    seconds_from_epoch = int(time.mktime(dt.timetuple()))
    return bson.ObjectId(hex(seconds_from_epoch)[2:] + '0' * 16)

>>> oid_from_date(datetime.datetime(2015, 7, 4, 12, 0, 0))
ObjectId('5597a0900000000000000000')

We can now use this for querying documents in a specific range:

1
>>> connection.test.test.find({_id: {'$gt': oid_from_date(datetime.datetime(2015, 7, 3)), '$lt': oid_from_date(datetime.datetime(2015, 7, 4))}})

Here’s how you would do it with MQL:

1
select * from test.test where _id between oidFromDate('2015-07-03') and oidFromDate('2015-07-04')

Convert datetime to date

It’s often useful to aggregate data in different time resolutions: yearly, monthly, daily, etc’.

MongoDB lets us group by several fields but having a seperate field for year, month and day isn’t natural.

As of MongoDB 3.0 there’s a builtin function called dateToString that does exactly that. The following solution is useful if you’re still on 2.6.

In order to generate a date from a datetime we can construct a number that contains the year, month and day. For example, the datetime 2015-07-04 12:00:00 can be represented as a date using a number: 20150704. Here’s how you would do it with MQL:

1
select year(timestamp) * 10000 + month(timestamp) * 100 + dayOfMonth(timestamp) as date from test.test

There’s a builtin helper function that generates the same expression:

1
select extractDate(timestamp) as date from test.test

Here’s how you can do it with MongoDB’s aggregation framework:

1
2
3
4
5
6
7
connection.test.test.aggregate([
  {'$project':
    {'date':
      {'$add': [{'$multiply': [{'$year': '$timestamp'}, 10000]},
                {'$multiply': [{'$month': '$timestamp'}, 100]},
                {'$dayOfMonth': '$timestamp']}}}
])

You can use the same concept for aggregating on other resolutions using additional functions like $hour and $minute.

Floating point manipulation

floor has been introduced in MongoDB 3.2 but since it’s not out yet, you can do the following:

1
2
3
4
connection.test.test.aggregate([
  {'$project':
    {'number': {'$subtract': ['$number', {'$mod': ['$number', 1]}]}}}
])

Here’s how you would do it with MQL:

1
select number - number % 1 from test.test

There’s a builtin helper function that generates the same expression:

1
select toInt(number) from test.test

Compare fields

MongoDB’s find() doesn’t support comparing fields but the aggregation framework gives us the building blocks to implement it:

1
2
3
4
5
6
connection.test.test.aggregate([
  {'$project':
    {'a = b': {'$eq': ['$a', '$b']}}},
  {'$match':
    {'a = b': True}}
])

Here’s how you would do it with MQL:

1
select a from test.test where a=b

Comments