I have some data like this:
data = [{'_id': 1, 'val': 5},
        {'_id': 2, 'val': 1}]
Current data in the DB:
>>> db.collection.find_one()
{'_id': 1, 'val': 3}
I always receive unique rows, but I am not sure whether any of them already exist in the DB (as in the case above). I want to handle them according to two different requirements.
Requirement 1:
Do NOT update a row if its _id already exists. This is fairly easy to do:
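from pymongo.errors import BulkWriteError

try:
    # ordered=False keeps inserting the remaining rows after a duplicate-key error
    db.collection.insert_many(data, ordered=False)
except BulkWriteError:
    # raised for the rows whose _id already exists
    pass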
Executing the above inserts the 2nd row and does not update the first, but it also raises an exception.
1. Is there a better way of doing the above operation (for bulk inserts)?
Requirement 2:
This is similar to update_if_exists and insert_if_not_exists combined. So the following data:
data2 = [{'_id': 1, 'val': 9},
         {'_id': 3, 'val': 4}]
should update the row with _id=1 and insert the 2nd row in the DB.
The problem is that I get thousands of rows at a time, and I am not sure whether checking and updating them one by one is efficient.
2. Is this possible in MongoDB without iterating over each row, and with as few operations as possible?