Re: caching documentation

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Wed Mar 05 2008 - 06:20:07 EST

Next message: Marcin Skladaniec: "Re: caching documentation"

Previous message: Andrus Adamchik: "Re: Expressions & objects state"
In reply to: Marcin Skladaniec: "caching documentation"
Next in thread: Marcin Skladaniec: "Re: caching documentation"
Reply: Marcin Skladaniec: "Re: caching documentation"

On Mar 4, 2008, at 1:26 AM, Marcin Skladaniec wrote:

> Hi
>
> The documentation on caching (http://cayenne.apache.org/doc/caching-and-fresh-data.html
> and http://cayenne.apache.org/doc/object-caching.html) isn't very
> comprehensive,

Agreed. There are lots of new caching features in 3.0, and we do not
yet communicate them well to users.

> it does not answer questions like:
>
> - what is actually stored in cache pks? datarows ? objectIds ?

There are two types of cache: object cache [1] and query cache.

* Object cache (stored at ObjectContext): Map<ObjectId, Persistent>
(it may not be declared as such, but this is what it is).
* Object cache (stored at DataDomain... so really a snapshot cache):
Map<ObjectId, DataRow>
* Query cache (stored at ObjectContext, aka LOCAL_CACHE): Map<String,
List<Persistent|DataRow>>
* Query cache (stored at DataDomain, aka SHARED_CACHE): Map<String,
List<DataRow>>

> - does caching change when paging is on ?

Yes, there are some caveats, and a few things were tweaked recently.
LOCAL_CACHE works (both ROP and two tier). There is no SHARED_CACHE
support (and I want to make this more formal - throw an
IllegalStateException if pagination and SHARED_CACHE are used
together). One reason why I want to do that is that it appeared under
ROP as if SHARED_CACHE worked, when in fact things worked differently,
as a side effect of the special handling of paginated lists on the ROP
server (see below).

> - does caching require special measures when used with ROP ?
> (meaning the propagation of changes between contexts)

Not really, though it helps to understand how it is implemented. A
paginated list is always cached in the *server* local cache, regardless
of the query cache settings. I.e. "LOCAL_CACHE + paginated list + ROP"
means caching on both server and client; "NO_CACHE + paginated list +
ROP" still means caching on the server. This is done in order to avoid
transferring unresolved IDs to the client.
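
To restate the two combinations above as a lookup, here is a tiny
sketch. The class, enums and method are hypothetical names made up for
illustration; none of this is Cayenne API, and it only covers paginated
lists under ROP.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration only; these types do not exist in Cayenne.
class RopPaginatedCaching {
    enum Policy { NO_CACHE, LOCAL_CACHE }
    enum Tier { CLIENT, SERVER }

    // Where a paginated list ends up cached under ROP for a given policy.
    static Set<Tier> cachedOn(Policy policy) {
        // The server always caches the paginated list, whatever the policy.
        Set<Tier> tiers = EnumSet.of(Tier.SERVER);
        if (policy == Policy.LOCAL_CACHE) {
            // LOCAL_CACHE additionally caches on the client.
            tiers.add(Tier.CLIENT);
        }
        return tiers;
    }

    public static void main(String[] args) {
        for (Policy policy : Policy.values()) {
            System.out.println(policy + " + paginated list + ROP -> " + cachedOn(policy));
        }
    }
}
```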

> - how to properly use SelectQuery.setCacheGroups()?

Cache groups are ignored unless you use advanced implementations of
QueryCache on the server (e.g. OSCache). RefreshQuery can also target
cache groups (see below). "cache group" is a mechanism to allow
backend code to perform smart cache invalidation without knowing
anything about the nature of the queries. E.g. you can have two groups
"objects_that_change_often" and "objects_that_rarely_change",
corresponding to 2 OSCache invalidation rules, "once per minute" vs.
"once per day"... Now when you add new queries, you do not need to
change configuration, if they fall into one of the existing "groups"...

So the trick with cache groups is to find common data invalidation
patterns in your app. Each repeating pattern becomes a group. This is
a logical task, with very little code involved.
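
To make the group idea concrete, here is a rough sketch of a cache
where entries are registered with named groups and invalidated by group
name. It is a toy model, not how Cayenne's QueryCache or OSCache work
internally; the class and method names are made up for illustration. A
key may be registered with more than one group, in which case
invalidating any of its groups drops it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of group-based invalidation; not Cayenne API.
class GroupedQueryCache {
    // Cached result lists, keyed by a query cache key.
    private final Map<String, List<String>> entries = new HashMap<>();
    // For each group name, the cache keys registered with it.
    private final Map<String, Set<String>> keysByGroup = new HashMap<>();

    // Stores a result and registers its key with every listed group.
    void put(String key, List<String> result, String... groups) {
        entries.put(key, new ArrayList<>(result));
        for (String group : groups) {
            keysByGroup.computeIfAbsent(group, g -> new HashSet<>()).add(key);
        }
    }

    // Returns the cached result, or null if it is missing or was invalidated.
    List<String> get(String key) {
        return entries.get(key);
    }

    // Drops every entry registered with the group. The caller needs to
    // know only the group name, nothing about the queries themselves.
    void invalidateGroup(String group) {
        Set<String> keys = keysByGroup.remove(group);
        if (keys != null) {
            entries.keySet().removeAll(keys);
        }
    }

    public static void main(String[] args) {
        GroupedQueryCache cache = new GroupedQueryCache();
        cache.put("all-countries", Arrays.asList("EE", "PL"), "objects_that_rarely_change");
        cache.put("open-orders", Arrays.asList("o1", "o2"), "objects_that_change_often");

        // Stands in for a "once per minute" rule firing.
        cache.invalidateGroup("objects_that_change_often");

        System.out.println(cache.get("all-countries")); // still cached
        System.out.println(cache.get("open-orders"));   // invalidated, so null
    }
}
```

Adding a new query then means picking a group for it at put() time; the
invalidation side does not change.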

> what happens when a query has more than one cache group specified?

Invalidation rules for all groups are combined. I rarely used that in
practice, but still think this allows some extra flexibility, e.g. if
the same query falls in a broad category and also in a very specific
one. E.g. "objects_that_rarely_change" and
"objects_that_change_when_event_X_occurs".

> - how long the cache entries sit in the memory, is there a way to
> invalidate all cache from time to time ?

Query cache (both shared and local): the default mechanism is LRU with
no expiration. OSCache lets you configure size and advanced expiration
rules per cache group.

Snapshot cache: LRU. Size configurable in the Modeler.
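
For anyone unfamiliar with LRU, here is a minimal sketch of a
size-bounded LRU map with no time-based expiration, built on the JDK's
LinkedHashMap. This is just the general technique; I am not claiming
Cayenne implements its caches this way, and the class name is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic LRU map sketch; an illustration, not Cayenne's implementation.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("q1", "result 1");
        cache.put("q2", "result 2");
        cache.get("q1");             // q1 becomes the most recently used entry
        cache.put("q3", "result 3"); // over the limit, so q2 is evicted
        System.out.println(cache.keySet()); // prints [q1, q3]
    }
}
```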

Object cache (server): Unlimited size map with weak references.
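
As a rough picture of what "map with weak references" means, here is a
sketch that holds its values through WeakReference, so an entry lasts
only as long as something else keeps the object strongly reachable.
That the values (rather than the keys) are the weakly held part is my
assumption for this sketch, and the class name is hypothetical; it is
not Cayenne code.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Illustrative weak-value map; an assumption-laden sketch, not Cayenne's object cache.
class WeakValueCache<K, V> {
    private final Map<K, WeakReference<V>> refs = new HashMap<>();

    // Registers a value without preventing it from being garbage collected.
    void put(K key, V value) {
        refs.put(key, new WeakReference<>(value));
    }

    // Returns the value, or null if it was never stored or has been collected.
    V get(K key) {
        WeakReference<V> ref = refs.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The referent was collected; drop the stale entry.
            refs.remove(key);
        }
        return value;
    }

    public static void main(String[] args) {
        WeakValueCache<String, Object> cache = new WeakValueCache<>();
        Object artist = new Object(); // strong reference held by the caller
        cache.put("Artist:1", artist);
        // While 'artist' is strongly reachable the cache returns the same instance.
        System.out.println(cache.get("Artist:1") == artist);
    }
}
```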

> - how to invalidate cache using RefreshQuery, the http://cayenne.apache.org/doc/refreshquery.html
> is just a list of suggestions on how it might work in the future.

Yeah, this is not documented properly. I need to poke around a bit
more to provide accurate information on RefreshQuery behavior. It was
an early idea of cache handling, but I stopped using it in my own
apps, as OSCache works beautifully, supports clustering, etc., etc.
And rather importantly - it removes cache management logic from the
code (i.e. explicit invalidation vs. configuration-based one).

> Me and Ari are willing to document the caching feature, but we would
> need some help.

Awesome! I'd imagine the trick here is to separate everything
discussed here into "internal-design-not-relevant-to-the-user" part
and "cache-user-guide" part to avoid confusing people and exposing too
many implementation details that will likely change over time.

Andrus

[1] http://cayenne.apache.org/doc/object-caching.html

