I am trying to implement an ndb model audit so that all changes to properties are stored within each model instance. Here is the code of the _pre_put_hook I chose to implement that.

def _pre_put_hook(self):
    # save a history record for updates
    if not (self.key is None or self.key.id() is None):
        old_object = self.key.get(use_cache=True)
        for attr in dir(self):
            if not callable(getattr(self, attr)) and not attr.startswith("_"):
                if getattr(self, attr) != getattr(old_object, attr):
                    logging.debug('UPDATE: {0}'.format(attr))
                    logging.debug('OLD: {0} NEW: {1}'.format(getattr(old_object, attr), getattr(self, attr)))

The problem is that old_object is always populated with the same values as self, the object being updated. How can I access the property values of the old object BEFORE the put() is actually made (that is, inside _pre_put_hook)?

Cato
  • Could you add a little more context? Maybe the complete model? Why get() the entity from inside the hook (again)? If you want to put() a changed entity, I presume you've already fetched it. – Matthias Eisen Jan 22 '14 at 23:03
  • Try self.key.get(use_cache=False), since the context cache will have a reference to the same entity (self). – David Bennett Jan 23 '14 at 04:01
  • Use the `_post_get_hook` to squirrel away the original values, so that you have them available to you when `_pre_put_hook` is run. – Tim Hoffman Jan 23 '14 at 04:33
  • Thanks @DavidBennett: your `self.key.get(use_cache=False)` suggestion worked just right! I get the old values in the object as it is fetched from the datastore before applying the put(). – Cato Jan 23 '14 at 12:12
  • @Cato How did you end up solving this? – Lee Jun 11 '23 at 15:31
  • @Lee I used David Bennett's solution: I worked on this almost a decade ago, so I don't clearly remember all the context. But reading the comments here, it is clear David's solution worked just fine. – Cato Jun 17 '23 at 13:40
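
The behaviour described in these comments can be illustrated without App Engine at all. What follows is a toy model, not NDB itself: `ToyStore`, `ToyEntity`, and their methods are hypothetical names invented for this sketch. It assumes, as David Bennett's comment states, that the context cache hands back the same in-memory object that is about to be written, while a read that bypasses the cache returns a fresh copy of the stored values.

```python
import copy


class ToyEntity:
    """Hypothetical stand-in for an ndb.Model instance."""

    def __init__(self, key, **props):
        self.key = key
        self.props = dict(props)


class ToyStore:
    """Hypothetical stand-in for the datastore plus an in-context cache."""

    def __init__(self):
        self._stored = {}  # what is actually persisted
        self._cache = {}   # in-memory cache of live entity objects

    def put(self, entity):
        self._stored[entity.key] = copy.deepcopy(entity.props)
        self._cache[entity.key] = entity

    def get(self, key, use_cache=True):
        if use_cache and key in self._cache:
            # Returns the very same object the caller may already be mutating.
            return self._cache[key]
        # Bypassing the cache rebuilds the entity from the persisted values.
        return ToyEntity(key, **copy.deepcopy(self._stored[key]))


store = ToyStore()
entity = ToyEntity("user-1", name="old name")
store.put(entity)

entity.props["name"] = "new name"  # modified in memory, not yet written

cached = store.get("user-1", use_cache=True)
fresh = store.get("user-1", use_cache=False)

print(cached is entity, cached.props["name"])  # same object, so it shows the new value
print(fresh is entity, fresh.props["name"])    # separate object holding the stored value
```

In this model, comparing `entity` against `cached` can never show a difference because they are one object, which matches the symptom in the question. Comparing against `fresh` costs an extra read but sees the stored values.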

1 Answer

EDIT:

I realized over time that I was doing a lot of work that didn't need to be done (a lot of CPU and memory went into copying entire entities and passing them around when the copy might never be needed). Here's the updated version, which stores a reference to the original protobuf and only deserializes it if you need it:

  __original = None    # a shadow-copy of this object so we can see what changed... lazily inflated
  _original_pb = None  # the original encoded Protobuf representation of this entity

  @property
  def _original(self):
    """
    Lazily deserialize the saved protobuf into a new entity that looks like the original from the database;
    the result is cached, so deserialization happens at most once
    """
    if not self.__original and self._original_pb:
      self.__original = self.__class__._from_pb(self._original_pb)
    return self.__original

  @classmethod
  def _from_pb(cls, pb, set_key=True, ent=None, key=None):
    """
    save copy of original pb so we can track if anything changes between puts
    """
    entity = super(ChangesetMixin, cls)._from_pb(pb, set_key=set_key, ent=ent, key=key)
    if entity._original_pb is None and not entity._projection:
      # _from_pb will get called if we unpickle a new object (like when passing through deferred library)
      #   so only record the pb the first time, and skip projection results, which hold only some properties
      entity.__original = None
      entity._original_pb = pb
    return entity
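
The lazy-inflation idea in the snippet above can be seen in isolation. This is a minimal sketch, assuming `json` as a stand-in for the protobuf encoding; `LazyOriginal`, `load`, and `decode_count` are hypothetical names for illustration and are not part of NDB.

```python
import json


class LazyOriginal:
    """Keeps the serialized original and decodes it only on first access."""

    def __init__(self, **props):
        self.props = dict(props)
        self._original_bytes = None  # serialized snapshot (stand-in for the protobuf)
        self._original_cache = None  # decoded snapshot, filled in lazily
        self.decode_count = 0        # only here to show how often decoding happens

    @classmethod
    def load(cls, raw):
        """Build an instance from serialized bytes and remember those bytes."""
        obj = cls(**json.loads(raw))
        obj._original_bytes = raw
        return obj

    @property
    def original(self):
        # Decode on first use only; later accesses reuse the cached result.
        if self._original_cache is None and self._original_bytes is not None:
            self._original_cache = json.loads(self._original_bytes)
            self.decode_count += 1
        return self._original_cache


raw = json.dumps({"name": "old name", "age": 30}).encode("utf-8")
user = LazyOriginal.load(raw)
user.props["name"] = "new name"

decodes_before = user.decode_count       # no snapshot has been decoded yet
original_name = user.original["name"]    # first access decodes the snapshot
original_age = user.original["age"]      # second access reuses the decoded copy
decodes_after = user.decode_count

print(decodes_before, original_name, original_age, decodes_after)
```

Holding only the raw bytes is cheap, so entities that are read but never compared pay nothing for the snapshot.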

Make a clone of the entity when you first read it (see "Copy an entity in Google App Engine datastore in Python without knowing property names at 'compile' time") and put it on the entity itself so it can be referenced later when desired. That way you don't have to do a second datastore read just to make the comparison.

We override two different Model methods to make this happen:

@classmethod
def _post_get_hook(cls, key, future):
    """
    clone this entity so we can track if anything changes between puts

    NOTE: this only gets called after a ndb.Key.get() ... NOT when loaded from a Query
      see _from_pb override below to understand the full picture

    also note: this gets called after EVERY key.get()... regardless if NDB had cached it already
      so that's why we're only doing the clone() if _original is not set...
    """
    entity = future.get_result()
    if entity is not None and entity._original is None:
        entity._original = clone(entity)

@classmethod
def _from_pb(cls, pb, set_key=True, ent=None, key=None):
    """
    clone this entity so we can track if anything changes between puts

    this is one way to know when an object loads from a datastore QUERY
      _post_get_hook only gets called on direct Key.get()
      none of the documented hooks are called after query results

    SEE: https://code.google.com/p/appengine-ndb-experiment/issues/detail?id=211
    """
    entity = super(BaseModel, cls)._from_pb(pb, set_key=set_key, ent=ent, key=key)
    if entity.key and entity._original is None:
        # _from_pb will get called if we unpickle a new object (like when passing through deferred library)
        #   so if we are being materialized from pb and we don't have a key, then we don't have _original
        entity._original = clone(entity)
    return entity
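
Once a snapshot of the original is available, the comparison the question asks for reduces to diffing two property mappings. A minimal sketch, assuming both the snapshot and the current entity have been converted to plain dicts (with NDB, `Model.to_dict()` would be one way to do that); `diff_properties` is a hypothetical helper name, and a property missing from one side is treated the same as `None`.

```python
def diff_properties(old, new):
    """Return {name: (old_value, new_value)} for every property that differs."""
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


original = {"name": "old name", "age": 30, "email": "a@example.com"}
current = {"name": "new name", "age": 30, "email": "a@example.com"}

for name, (before, after) in diff_properties(original, current).items():
    print("UPDATE: {0} OLD: {1} NEW: {2}".format(name, before, after))
```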
Nicholas Franceschina