Re: Caching query results

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Mon Sep 25 2006 - 15:44:26 EDT

    Hi Francesco,

    On Sep 25, 2006, at 10:56 AM, Francesco Fuzio wrote:
    > Thank you for the answers: I'm definitely looking forward to trying
    > the 3.0 cool features you mentioned.
    >
    > As for 2.1 (since it is important for us to keep data updated without
    > relying on expiration timing), I was thinking about this approach
    > (for a clustered environment):

    That would be version 1.2.*, right?

    > 1) Enable Cayenne Replicated Shared Object Cache
    > 2) Disable Cayenne Query (i.e. list) Cache
    > 3) Use a caching framework supporting an automatic distributed
    > refresh/invalidation policy (e.g. OSCache or Ehcache) to save query
    > results as lists of ObjectId's.
    > 4) In case of Query "Cache Hit" use the cached ObjectId's to
    > retrieve the associated DataObjects via the DataContext
    > [public Persistent localObject(ObjectId id, Persistent prototype)]
    > Persistent: <http://incubator.apache.org/cayenne/1_2/api/cayenne/org/objectstyle/cayenne/Persistent.html>
    > ObjectId: <http://incubator.apache.org/cayenne/1_2/api/cayenne/org/objectstyle/cayenne/ObjectId.html>
    >
    > What do you think, is this approach reasonable? Will it work?
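
    Steps 3 and 4 of the proposal above can be sketched roughly as
    follows. This is only an illustration in plain Java, not Cayenne
    code: IdListCache and its two function parameters are hypothetical
    names, the in-memory map stands in for a distributed cache such as
    OSCache or Ehcache, and the resolver is assumed to wrap a call like
    DataContext.localObject(...).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical front-end cache that stores query results as lists of ids. */
final class IdListCache<ID, T> {
    // Stand-in for the distributed cache; maps a cache key to a list of ids.
    private final Map<String, List<ID>> idsByKey = new ConcurrentHashMap<>();
    private final Function<T, ID> idOf;        // assumed: object -> its id
    private final Function<ID, T> resolveById; // assumed: id -> object in the current context

    IdListCache(Function<T, ID> idOf, Function<ID, T> resolveById) {
        this.idOf = idOf;
        this.resolveById = resolveById;
    }

    List<T> get(String cacheKey, Supplier<List<T>> runQuery) {
        List<ID> ids = idsByKey.get(cacheKey);
        if (ids == null) {
            // Cache miss: run the real query and remember only the ids.
            List<T> fresh = runQuery.get();
            List<ID> freshIds = new ArrayList<>(fresh.size());
            for (T object : fresh) {
                freshIds.add(idOf.apply(object));
            }
            idsByKey.put(cacheKey, freshIds);
            return fresh;
        }
        // Cache hit: turn the cached ids back into objects.
        List<T> result = new ArrayList<>(ids.size());
        for (ID id : ids) {
            result.add(resolveById.apply(id));
        }
        return result;
    }

    void invalidate(String cacheKey) {
        idsByKey.remove(cacheKey);
    }
}
```

    Storing ids rather than objects keeps the cached lists small and
    leaves object state to the shared object cache, so a hit only costs
    one lookup per id.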

    This should work (you'll just use your own cache as a front end to
    the DataContext query API), and should provide a clean path to the
    future 3.0 migration. You'll need to consider a few things though:

    A. Query cache key generation. In 1.2 this is based on Query name
    which is pretty dumb and barely usable; in 3.0 SelectQuery and
    SQLTemplate are smart enough to build the cache key based on their
    state. You may copy some of that code.
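
    As a rough illustration of a state-based key (not the actual 3.0
    code, which is not reproduced here), a key can be assembled from
    whatever distinguishes one query from another. QueryCacheKeys, its
    parameters, and the key layout are all hypothetical:

```java
import java.util.List;

/** Hypothetical helper that derives a cache key from a query's state. */
final class QueryCacheKeys {
    private QueryCacheKeys() {
    }

    static String keyFor(String entityName, String qualifier,
                         List<String> orderings, int fetchLimit) {
        StringBuilder key = new StringBuilder(entityName);
        // The qualifier decides which rows match, so it must be part of the key.
        key.append("|where=").append(qualifier == null ? "" : qualifier.trim());
        // Ordering changes the list contents' order, so it is included as well.
        for (String ordering : orderings) {
            key.append("|order=").append(ordering);
        }
        if (fetchLimit > 0) {
            key.append("|limit=").append(fetchLimit);
        }
        return key.toString();
    }
}
```

    Two queries with the same entity, qualifier, orderings and limit then
    share a key, while changing any of them yields a different key. A
    real implementation would also need to fold in bound parameter
    values.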

    B. Invalidation Strategies. That's a tricky one....

    I couldn't come up with a well-performing generic solution (I tried,
    see CAY-577). Consider that events that may cause automatic
    invalidation are object deletion, insertion and updating (update can
    affect the ordering and also whether an object still matches the
    query condition). So *every* commit can potentially invalidate any
    number of cached lists for a given entity.

    The trick is to create an efficient algorithm to invalidate just the
    right cache entries and avoid invalidating the entire entity cache.
    Manually scanning and rearranging all lists on every commit is of
    course very inefficient.

    So in 3.0 we added a "cache group" notion so that users can
    categorize queries based on some criteria and then invalidate the
    whole category of cache entries. (Cache groups are also supported by
    OSCache, by the way.) Here is an example. Consider a "BlogPost"
    entity. All queries that fetch a date range of BlogPosts can be
    arbitrarily divided into "old_posts" and "new_posts" categories. So
    once a user inserts, updates or deletes a BlogPost, the code can
    check the date of this post and invalidate either "old_posts" or
    "new_posts".
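
    The BlogPost example might look something like the sketch below. It
    is plain Java with hypothetical names (GroupedCache, BlogPostGroups,
    the cutoff date); it uses neither the Cayenne 3.0 nor the OSCache
    API and only shows the bookkeeping behind group invalidation.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical cache whose entries can be dropped one group at a time. */
final class GroupedCache<V> {
    private final Map<String, V> entries = new HashMap<>();
    private final Map<String, Set<String>> keysByGroup = new HashMap<>();

    synchronized void put(String key, String group, V value) {
        entries.put(key, value);
        keysByGroup.computeIfAbsent(group, g -> new HashSet<>()).add(key);
    }

    synchronized V get(String key) {
        return entries.get(key);
    }

    synchronized void invalidateGroup(String group) {
        // Drop every entry registered under this group, leave the rest alone.
        Set<String> keys = keysByGroup.remove(group);
        if (keys != null) {
            entries.keySet().removeAll(keys);
        }
    }
}

/** Hypothetical rule that assigns a post to a group by its date. */
final class BlogPostGroups {
    private BlogPostGroups() {
    }

    static String groupFor(LocalDate postDate, LocalDate cutoff) {
        return postDate.isBefore(cutoff) ? "old_posts" : "new_posts";
    }
}
```

    On commit, the application would call something like
    cache.invalidateGroup(BlogPostGroups.groupFor(post.getDate(), cutoff)),
    where post.getDate() is likewise an assumed accessor. Only the lists
    in the affected group are refetched.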

    This is just one solution that we came up with. Not automatic, but
    fairly simple and efficient. You can come up with your own
    strategies. If you can think of a better generic algorithm for
    invalidation, please share.

    Andrus


