insert_many with upsert - PyMongo

Question

I have some data like this:

data = [{'_id': 1, 'val': 5},
        {'_id': 2, 'val': 1}]

current data in db:

>>> db.collection.find_one()
    {'_id': 1, 'val': 3}

I always receive unique rows, but I am not sure whether any of them already exist in the DB (as in the case above). I want to handle them according to two kinds of requirements.

Requirement 1:

Do NOT update the rows if _id already exists. This is kinda easy in a way:

from pymongo.errors import BulkWriteError
try:
    # ordered=False keeps inserting the remaining rows after a duplicate-key error
    db.collection.insert_many(data, ordered=False)
except BulkWriteError:
    pass

Executing the above inserts the 2nd row and leaves the first one untouched, but it also raises an exception.
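
One caveat with the snippet above: a bare `pass` also hides failures that have nothing to do with duplicates. Below is a minimal sketch of a narrower handler. It assumes that `BulkWriteError.details` carries a `writeErrors` list whose entries have a `code` field, and that 11000 is the duplicate-key code. The names `unexpected_write_errors` and `insert_new_only` are hypothetical helpers, not part of PyMongo.

```python
DUPLICATE_KEY = 11000  # MongoDB error code for a duplicate-key violation


def unexpected_write_errors(details):
    """Return the write errors that are NOT duplicate-key errors."""
    return [err for err in details.get('writeErrors', [])
            if err.get('code') != DUPLICATE_KEY]


def insert_new_only(collection, rows):
    """Insert rows, silently skipping those whose _id already exists."""
    # Imported here so the helper above stays free of the pymongo dependency.
    from pymongo.errors import BulkWriteError
    try:
        collection.insert_many(rows, ordered=False)
    except BulkWriteError as exc:
        # Re-raise only if something other than a duplicate key went wrong.
        if unexpected_write_errors(exc.details):
            raise
```

With this, `insert_new_only(db.collection, data)` would behave like the try/except above for duplicates, while still surfacing other write errors.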

1. Is there any better way of doing the above operation (for bulk inserts)?

Requirement 2

This is similar to update_if_exists & insert if not exists combined. So the following data:

data2 = [{'_id': 1, 'val': 9},
         {'_id': 3, 'val': 4}]

should update the row with _id=1 and insert the 2nd row in DB.

The problem is I get thousands of rows at one time and am not sure if checking and updating one-by-one is efficient.

2. Is this requirement possible in MongoDB without iterating over each row and with as few operations as possible?

score 11 · Answer 1 · answered Jan 22 '19 at 18:20

You can generate a list of updates and pass it to the bulk write API. That sends all the operations to the server together; they are still executed one by one on the server, but an existing _id no longer causes an error.

from pymongo import UpdateOne
data2 = [{'_id': 1, 'val': 9}, {'_id': 3, 'val': 4}]
# One upsert per row; $setOnInsert writes the fields only when a new document is created
upserts = [UpdateOne({'_id': x['_id']}, {'$setOnInsert': x}, upsert=True) for x in data2]
result = db.test.bulk_write(upserts)

You can see in the result that when _id is found the operation is a no-op, but when it's not found, it's an insert.
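
The `$setOnInsert` operations above leave existing documents alone, which matches Requirement 1. For Requirement 2 (overwrite when the _id exists, insert otherwise), one possible variation is to use `$set` in place of `$setOnInsert`. The sketch below is an assumption about how that could look, not code from the answer: `build_upsert_specs` and `bulk_upsert` are hypothetical helper names, and every row is assumed to have at least one field besides `_id`.

```python
def build_upsert_specs(rows, overwrite):
    """Return one (filter, update) pair per row.

    overwrite=False uses $setOnInsert (existing documents stay untouched);
    overwrite=True uses $set (existing documents get the new values).
    """
    operator = '$set' if overwrite else '$setOnInsert'
    specs = []
    for row in rows:
        # _id comes from the filter on insert, so it is left out of the update.
        fields = {key: value for key, value in row.items() if key != '_id'}
        specs.append(({'_id': row['_id']}, {operator: fields}))
    return specs


def bulk_upsert(collection, rows, overwrite):
    """Send all upserts to the server in a single bulk_write call."""
    # Imported here so build_upsert_specs can be used without pymongo installed.
    from pymongo import UpdateOne
    operations = [UpdateOne(flt, update, upsert=True)
                  for flt, update in build_upsert_specs(rows, overwrite)]
    return collection.bulk_write(operations)
```

Under these assumptions, `bulk_upsert(db.test, data2, overwrite=True)` would update the document with _id=1 and insert the one with _id=3 in a single round trip.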

answered Jan 22 '19 at 18:20

Asya Kamsky

Shouldn't this be [unordered](http://api.mongodb.com/python/current/examples/bulk.html#unordered-bulk-write-operations)? – ocean800 Feb 07 '19 at 22:48
No, that wouldn't achieve what the OP described. – Asya Kamsky Feb 08 '19 at 05:57
