As already mentioned here, the behavior is not specific to PyMongo.
The reason is because the count_documents
method in PyMongo performs an aggregation query and does not use any metadata. see collection.py#L1670-L1688
pipeline = [{'$match': filter}]
if 'skip' in kwargs:
pipeline.append({'$skip': kwargs.pop('skip')})
if 'limit' in kwargs:
pipeline.append({'$limit': kwargs.pop('limit')})
pipeline.append({'$group': {'_id': None, 'n': {'$sum': 1}}})
cmd = SON([('aggregate', self.__name),
('pipeline', pipeline),
('cursor', {})])
if "hint" in kwargs and not isinstance(kwargs["hint"], string_type):
kwargs["hint"] = helpers._index_document(kwargs["hint"])
collation = validate_collation_or_none(kwargs.pop('collation', None))
cmd.update(kwargs)
with self._socket_for_reads(session) as (sock_info, slave_ok):
result = self._aggregate_one_result(
sock_info, slave_ok, cmd, collation, session)
if not result:
return 0
return result['n']
This command has the same behavior as the collection.countDocuments
method.
That being said, if you willing to trade accuracy for performance, you can use the estimated_document_count
method which on the other hand, send a count
command to the database with the same behavior as collection.estimatedDocumentCount
See collection.py#L1609-L1614
if 'session' in kwargs:
raise ConfigurationError(
'estimated_document_count does not support sessions')
cmd = SON([('count', self.__name)])
cmd.update(kwargs)
return self._count(cmd)
Where self._count
is a helper sending the command.