Re: caching documentation

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Wed Mar 05 2008 - 06:20:07 EST

    On Mar 4, 2008, at 1:26 AM, Marcin Skladaniec wrote:

    > Hi
    >
    > The documentation on caching (http://cayenne.apache.org/doc/caching-and-fresh-data.html
    > and http://cayenne.apache.org/doc/object-caching.html) isn't very
    > comprehensive,

    Agreed. There are lots of new features related to caching in 3.0, and
    we have not yet documented them well for users.

    > it does not answer questions like:
    >
    > - what is actually stored in cache pks? datarows ? objectIds ?

    There are two types of cache: object cache [1] and query cache.

    * Object cache (stored at ObjectContext): Map<ObjectId, Persistent>
    (it may not be declared as such, but this is what it is).
    * Object cache (stored at DataDomain, so really a snapshot cache):
    Map<ObjectId, DataRow>
    * Query cache (stored at ObjectContext, aka LOCAL_CACHE):
    Map<String, List<Persistent|DataRow>>
    * Query cache (stored at DataDomain, aka SHARED_CACHE):
    Map<String, List<DataRow>>
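
    The four shapes listed above can be written down in plain Java as a
    rough sketch. The ObjectId, DataRow and Persistent classes here are
    simplified stand-ins invented for illustration, not the real Cayenne
    classes, and CacheShapes is a hypothetical holder. Since Java has no
    union types, "Persistent|DataRow" is approximated with List<Object>.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-ins for illustration; NOT the real Cayenne classes.
final class ObjectId {
    final String entity;
    final int pk;
    ObjectId(String entity, int pk) { this.entity = entity; this.pk = pk; }
    @Override public boolean equals(Object o) {
        return o instanceof ObjectId
                && ((ObjectId) o).entity.equals(entity)
                && ((ObjectId) o).pk == pk;
    }
    @Override public int hashCode() { return Objects.hash(entity, pk); }
}

// A raw database row: column name -> value.
final class DataRow extends HashMap<String, Object> {}

// A live object registered in a context.
final class Persistent {
    final ObjectId id;
    String name;
    Persistent(ObjectId id, String name) { this.id = id; this.name = name; }
}

final class CacheShapes {
    // Context-level object cache: one live object per ObjectId.
    final Map<ObjectId, Persistent> objectCache = new HashMap<>();
    // Domain-level snapshot cache: last known row per ObjectId.
    final Map<ObjectId, DataRow> snapshotCache = new HashMap<>();
    // Local query cache: query key -> list of objects or data rows.
    final Map<String, List<Object>> localQueryCache = new HashMap<>();
    // Shared query cache: query key -> list of data rows only.
    final Map<String, List<DataRow>> sharedQueryCache = new HashMap<>();

    public static void main(String[] args) {
        CacheShapes caches = new CacheShapes();
        ObjectId id = new ObjectId("Artist", 1);

        DataRow row = new DataRow();
        row.put("NAME", "Dali");
        caches.snapshotCache.put(id, row);
        caches.sharedQueryCache.put("all-artists", List.of(row));

        Persistent artist = new Persistent(id, "Dali");
        caches.objectCache.put(id, artist);
        caches.localQueryCache.put("all-artists", List.of(artist));

        // Changing the live object does not touch the stored snapshot.
        artist.name = "Salvador Dali";
        System.out.println(caches.objectCache.get(id).name);
        System.out.println(caches.snapshotCache.get(id).get("NAME"));
    }
}
```

    The point of the sketch is only the difference in value types: the
    context-level maps can hold live objects, while the domain-level maps
    hold rows.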

    > - does caching change when paging is on ?

    Yes, there are some caveats, and a few things were tweaked recently.
    LOCAL_CACHE works (both ROP and two tier). SHARED_CACHE is not
    supported, and I want to make this formal by throwing an
    IllegalStateException if pagination and SHARED_CACHE are used
    together. One reason is that under ROP it appeared as if SHARED_CACHE
    worked, when in fact things worked differently, as a side effect of
    the special handling of paginated lists on the ROP server (see below).

    > - does caching require special measures when used with ROP ?
    > (meaning the propagation of changes between contexts)

    Not really, though it helps to understand how it is implemented. A
    paginated list is always cached in the *server* local cache, regardless
    of the query cache settings. I.e. "LOCAL_CACHE + paginated list + ROP"
    means caching on both server and client; "NO_CACHE + paginated list +
    ROP" still means caching on the server. This is done in order to avoid
    transferring unresolved IDs to the client.
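
    The combinations described above for paginated lists under ROP can be
    summarized as a small decision function. This is a sketch only: the
    CachePolicy and Tier enums and the PaginatedRopCaching class are
    hypothetical names, not Cayenne API, and the exception for SHARED_CACHE
    reflects the behavior proposed in this message rather than anything
    confirmed to be implemented.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names for illustration; not Cayenne API.
enum CachePolicy { NO_CACHE, LOCAL_CACHE, SHARED_CACHE }
enum Tier { CLIENT, SERVER }

final class PaginatedRopCaching {

    // Where a paginated list ends up cached under ROP, per the description above.
    static Set<Tier> cachedOn(CachePolicy policy) {
        switch (policy) {
            case NO_CACHE:
                // The server still caches the list so that unresolved IDs
                // are not transferred to the client.
                return EnumSet.of(Tier.SERVER);
            case LOCAL_CACHE:
                return EnumSet.of(Tier.SERVER, Tier.CLIENT);
            default:
                // Assumption: models the proposed (not confirmed) behavior.
                throw new IllegalStateException(
                        "pagination and SHARED_CACHE cannot be used together");
        }
    }

    public static void main(String[] args) {
        System.out.println(cachedOn(CachePolicy.NO_CACHE));
        System.out.println(cachedOn(CachePolicy.LOCAL_CACHE));
    }
}
```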

    > - how to properly use SelectQuery.setCacheGroups()?

    Cache groups are ignored unless you use advanced implementations of
    QueryCache on the server (e.g. OSCache). RefreshQuery can also target
    cache groups (see below). A "cache group" is a mechanism that lets
    backend code perform smart cache invalidation without knowing
    anything about the nature of the queries. E.g. you can have two groups
    "objects_that_change_often" and "objects_that_rarely_change",
    corresponding to two OSCache invalidation rules, "once per minute" vs.
    "once per day". Now when you add new queries, you do not need to
    change configuration if they fall into one of the existing groups.

    So the trick with cache groups is to find common data invalidation
    patterns in your app. Each repeating pattern becomes a group. This is
    a logical task, with very little code involved.

    > what happens when a query has more than one cache group specified?

    Invalidation rules for all groups are combined. I rarely used that in
    practice, but still think this allows some extra flexibility, e.g. if
    the same query falls in a broad category and also in a very specific
    one. E.g. "objects_that_rarely_change" and
    "objects_that_change_when_event_X_occurs".
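
    To make the cache group idea concrete, here is a minimal plain-Java
    model of group-based invalidation. GroupedQueryCache and its methods
    are hypothetical and are not Cayenne's QueryCache or OSCache API. The
    sketch also assumes that "combined" means an entry is dropped as soon
    as any one of its groups is invalidated; the message does not spell
    out the exact semantics.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of group-based invalidation; not Cayenne's QueryCache API.
final class GroupedQueryCache {

    private static final class Entry {
        final List<String> result;
        final Set<String> groups;
        Entry(List<String> result, Set<String> groups) {
            this.result = result;
            this.groups = groups;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // Cache a query result and tag it with zero or more group names.
    void put(String queryKey, List<String> result, String... groups) {
        entries.put(queryKey,
                new Entry(result, new HashSet<>(Arrays.asList(groups))));
    }

    // Returns the cached result, or null on a miss.
    List<String> get(String queryKey) {
        Entry e = entries.get(queryKey);
        return e == null ? null : e.result;
    }

    // Drops every entry tagged with the group; returns how many were dropped.
    // Assumption: membership in ANY invalidated group is enough to drop an entry.
    int invalidateGroup(String group) {
        int before = entries.size();
        entries.values().removeIf(e -> e.groups.contains(group));
        return before - entries.size();
    }

    public static void main(String[] args) {
        GroupedQueryCache cache = new GroupedQueryCache();
        cache.put("all-countries", List.of("EE", "AU"),
                "objects_that_rarely_change");
        cache.put("price-list", List.of("item-1"),
                "objects_that_rarely_change",
                "objects_that_change_when_event_X_occurs");

        // The invalidating code names a group and knows nothing about queries.
        cache.invalidateGroup("objects_that_change_when_event_X_occurs");

        System.out.println(cache.get("price-list"));    // prints null
        System.out.println(cache.get("all-countries")); // prints [EE, AU]
    }
}
```

    Adding a new query to this model means picking existing group names
    for it; the invalidation side does not change.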

    > - how long the cache entries sit in the memory, is there a way to
    > invalidate all cache from time to time ?

    Query cache (both shared and local): the default mechanism is LRU with
    no expiration. OSCache lets you configure size and advanced expiration
    rules per cache group.

    Snapshot cache: LRU. Size configurable in the Modeler.

    Object cache (server): Unlimited size map with weak references.
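
    For readers unfamiliar with LRU (the policy mentioned above for the
    default query cache and the snapshot cache), here is a minimal sketch
    built on the JDK's LinkedHashMap. The LruCache class is hypothetical
    and only illustrates the eviction policy; it is not how Cayenne
    implements its caches.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU map for illustration; not Cayenne's implementation.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    LruCache(int maxSize) {
        // accessOrder = true: iteration order runs from least to most
        // recently accessed entry.
        super(16, 0.75f, true);
        this.maxSize = maxSize;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry. Entries never expire by time.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" becomes the most recently used entry
        cache.put("c", "3"); // over capacity: evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```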

    > - how to invalidate cache using RefreshQuery, the http://cayenne.apache.org/doc/refreshquery.html
    > is just a list of suggestions on how it might work in the future.

    Yeah, this is not documented properly. I need to poke around a bit
    more to provide accurate information on RefreshQuery behavior. It was
    an early idea of cache handling, but I stopped using it in my own
    apps, as OSCache works beautifully, supports clustering, etc.
    And rather importantly, it removes cache management logic from the
    code (i.e. configuration-based invalidation instead of explicit
    invalidation).

    > Me and Ari are willing to document the caching feature, but we would
    > need some help.

    Awesome! I'd imagine the trick here is to separate everything
    discussed here into "internal-design-not-relevant-to-the-user" part
    and "cache-user-guide" part to avoid confusing people and exposing too
    many implementation details that will likely change over time.

    Andrus

    [1] http://cayenne.apache.org/doc/object-caching.html


