Re: Object Caching

From: Hans Pikkemaat (h.pikkemaa..si-solutions.nl)
Date: Thu Nov 12 2009 - 06:46:30 EST

    Hi,

    What I can see when I use paging in combination with SQLTemplate is this:

    Cayenne first runs the main SQLTemplate query, and the result is kept
    in memory.

    When I get the first page, it determines the key values from the main
    query and uses them in a new query that returns the main table plus the
    detail table data. This produces the main table object, through which
    the detail table is accessible.

    The problem here is that only the key of the main table is used. The
    SQLTemplate query was constructed manually: it queries the main table
    with a left join to the detail table, so it produces a duplicate key
    value wherever a main table record has 2 related detail table records.

    This doesn't have to be a problem; the query does in fact return as many
    records as the page size. But internally in Cayenne something weird
    happens. Somehow the duplicate records are removed, and the
    IncrementalFaultList.checkPageResultConsistency method throws an
    exception because of this.
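
    A minimal sketch of the failure described above, in plain Java with no
    Cayenne classes. It assumes that a page holds one slot per key returned
    by the main query and that fetching by key yields each main-table object
    only once. The names fetchMainObjects and checkPage are hypothetical
    stand-ins, not Cayenne's actual implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PageConsistencyDemo {

    // Hypothetical stand-in for the page query: fetching by key returns each
    // matching main-table object once, however often the key was listed.
    static List<Integer> fetchMainObjects(List<Integer> pageKeys) {
        Set<Integer> distinct = new LinkedHashSet<>(pageKeys);
        return new ArrayList<>(distinct);
    }

    // Hypothetical stand-in for the consistency check: one object per key slot.
    static void checkPage(List<Integer> pageKeys, List<Integer> resolved) {
        if (resolved.size() != pageKeys.size()) {
            throw new IllegalStateException("expected " + pageKeys.size()
                    + " objects, resolved " + resolved.size());
        }
    }

    public static void main(String[] args) {
        // Main-table keys as produced by the left join: key 1 has two detail rows.
        List<Integer> keysFromMainQuery = Arrays.asList(1, 1, 2, 3);
        int pageSize = 3;
        List<Integer> firstPage = keysFromMainQuery.subList(0, pageSize);

        List<Integer> resolved = fetchMainObjects(firstPage);
        try {
            checkPage(firstPage, resolved);
            System.out.println("page is consistent");
        } catch (IllegalStateException e) {
            System.out.println("inconsistent page: " + e.getMessage());
        }
    }
}
```

    With the keys above, the first page has three slots (1, 1, 2) but only
    two distinct objects come back, so the check fails.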

    Because the main query returns the main object but also the detail
    object, I find it strange that the query generated for the page only
    uses the main table key. I would expect it to use the key of the detail
    table as well.

    An example: say I have a main table record with key 1 and related
    detail records with keys 1, 2 and 3, and I run an SQLTemplate that
    returns main key 1 but only detail keys 1 and 2.

    The page query will now select all detail records for that main key,
    so it also returns detail record 3, which I did not request.
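
    The same example as a small plain-Java model. It assumes the page query
    selects detail rows by main key alone. DETAIL_TABLE and
    detailsForMainKey are hypothetical names used only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DetailRefetchDemo {

    // Hypothetical detail table: main key -> keys of all related detail rows.
    static final Map<Integer, List<Integer>> DETAIL_TABLE = new LinkedHashMap<>();
    static {
        DETAIL_TABLE.put(1, Arrays.asList(1, 2, 3));
    }

    // Models a page query that selects detail rows by main key only, so any
    // detail filter applied by the original SQLTemplate is lost.
    static List<Integer> detailsForMainKey(int mainKey) {
        List<Integer> rows = DETAIL_TABLE.get(mainKey);
        return rows != null ? rows : Collections.<Integer>emptyList();
    }

    public static void main(String[] args) {
        // Detail keys the hand-written SQLTemplate returned for main key 1.
        List<Integer> requested = Arrays.asList(1, 2);
        List<Integer> returned = detailsForMainKey(1);
        System.out.println("requested details " + requested
                + ", page query returned " + returned);
    }
}
```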

    From this I conclude that when an SQLTemplate is used it is not useful
    (read: faulty) to include the detail table in the query. When paging is
    used, all the detail tables are queried automatically.

    If I write the main SQLTemplate query such that it only returns the main
    object, then the exception does not occur.

    My conclusion is then that if you want to use paging with SQLTemplate
    the main
    query should only return the main table. Prefetching will then return
    ALL related
    table records.

    tx

    HPI

    Andrus Adamchik wrote:
    > Yeah, still need to check that one.
    >
    > On Nov 12, 2009, at 10:43 AM, Hans Pikkemaat wrote:
    >
    >
    >> Hi,
    >>
    >> Yes, the paginated query would indeed be the only way for me to go
    >> forward.
    >> The problem however is that I get the exception I posted earlier.
    >>
    >> tx
    >>
    >> Hans
    >>
    >> Andrus Adamchik wrote:
    >>
    >>> For paginated queries we contemplated a strategy of a list with
    >>> constant size of fully resolved objects. I.e. when a page is
    >>> swapped in, some other (LRU?) page is swapped out. We decided
    >>> against it, as in a general case it is hard to consistently
    >>> predict which page should be swapped out.
    >>>
    >>> However it should be rather easy to write such a list for a
    >>> specific case with a known access order (e.g. a standard iteration
    >>> order). In fact I would vote to even include such implementation
    >>> in Cayenne going forward.
    >>>
    >>> More specifically, you can extend IncrementalFaultList [1],
    >>> overriding 'resolveInterval' to swap out previously read pages,
    >>> turning them back into ids. And the good part is that you can use
    >>> your extension directly without any need to modify the rest of
    >>> Cayenne.
    >>>
    >>> Andrus
    >>>
    >>>
    >>> [1] http://cayenne.apache.org/doc/api/org/apache/cayenne/access/IncrementalFaultList.html
    >>>
    >>>
    >>> On Nov 12, 2009, at 10:07 AM, Hans Pikkemaat wrote:
    >>>
    >>>
    >>>> Hi,
    >>>>
    >>>> So this means that if I use a generic query, the query results are
    >>>> always stored completely in the object store (or in the query cache
    >>>> if I configure it).
    >>>>
    >>>> Objects are returned in a list, so as long as I have a reference to
    >>>> this list (because I'm traversing it) these objects are not garbage
    >>>> collected.
    >>>>
    >>>> If I use the query cache the full query results are cached. This
    >>>> means that I can only
    >>>> tell it to remove the whole query.
    >>>>
    >>>> Effectively this means I'm unable to run a big query and process
    >>>> the results as a stream.
    >>>> So I cannot process the first results and then somehow make them
    >>>> available for
    >>>> garbage collection.
    >>>>
    >>>> The only option I have would be the iterated query, but this is
    >>>> only useful for queries on 1 table without any relations, because
    >>>> it is not possible to use prefetching, nor is it possible to
    >>>> manually construct relations between objects.
    >>>>
    >>>> My conclusion here is that Cayenne is simply not suitable for
    >>>> large batch-wise query processing because of the memory
    >>>> implications.
    >>>>
    >>>> tx
    >>>>
    >>>> HPI
    >>>>
    >>>> Andrus Adamchik wrote:
    >>>>
    >>>>
    >>>>> As mentioned in the docs, individual objects and query lists are
    >>>>> cached independently. Of course query lists contain a subset of
    >>>>> cached
    >>>>> object store objects inside the lists. An object won't get gc'd
    >>>>> if it
    >>>>> is also stored in the query list.
    >>>>>
    >>>>> Now list cache expiration is controlled via query cache factory. By
    >>>>> default this is an LRU map, so as long as the map has enough
    >>>>> space to
    >>>>> hold lists (its capacity == # of lists, not # of objects), the
    >>>>> objects
    >>>>> won't get gc'd.
    >>>>>
    >>>>> You can explicitly remove entries from the cache via QueryCache
    >>>>> remove
    >>>>> and removeGroup methods. Or you can use a different
    >>>>> QueryCacheFactory
    >>>>> that implements some custom expiration/cleanup mechanism.
    >>>>>
    >>>>> Andrus
    >>>>>
    >>>>> On Nov 11, 2009, at 3:43 PM, Hans Pikkemaat wrote:
    >>>>>
    >>>>>
    >>>>>
    >>>>>
    >>>>>> Hi,
    >>>>>>
    >>>>>> I use the latest version of cayenne, 3.0b and am experimenting
    >>>>>> with
    >>>>>> the object caching features.
    >>>>>>
    >>>>>> The documentation states that committed objects are purged from
    >>>>>> the
    >>>>>> cache because it uses weak references.
    >>>>>> (http://cayenne.apache.org/doc/individual-object-caching.html)
    >>>>>>
    >>>>>> If I however run a query using SQLTemplate which caches the
    >>>>>> objects into the dataContext local cache (objectstore), the
    >>>>>> objects don't seem to be purged at all. If I simply run the query
    >>>>>> and dump the contents using an iterator on the resulting List,
    >>>>>> then the number of registered objects in the objectstore stays the
    >>>>>> same (dataContext.getObjectStore().registeredObjectsCount()).
    >>>>>> Even if I manually run System.gc() I don't see any changes (I know
    >>>>>> this can be normal as gc() doesn't guarantee anything).
    >>>>>>
    >>>>>> What am I doing wrong? Under which circumstances will cayenne
    >>>>>> purge
    >>>>>> the cache?
    >>>>>>
    >>>>>> tx
    >>>>>>
    >>>>>> Hans
    >>>>>>
    >>>>>>
    >>>>>>
    >>>>>>
    >>
    >>
    >
    >
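
    The page-swapping list suggested in the quoted reply above (keep a
    bounded number of resolved pages and turn older ones back into ids) can
    be sketched in plain Java. This is a standalone model under stated
    assumptions, not an extension of IncrementalFaultList: the class
    WindowedPageList, its constructor parameters and the pageLoader function
    are all hypothetical, and the eviction policy (least recently used) is
    only one possible choice.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class WindowedPageList<K, V> {

    private final List<K> ids;                            // every id stays in memory
    private final int pageSize;
    private final Function<List<K>, List<V>> pageLoader;  // hypothetical fetch-by-ids
    private final Map<Integer, List<V>> resolvedPages;    // page number -> objects

    public WindowedPageList(List<K> ids, int pageSize, final int maxResolvedPages,
                            Function<List<K>, List<V>> pageLoader) {
        this.ids = new ArrayList<>(ids);
        this.pageSize = pageSize;
        this.pageLoader = pageLoader;
        // Access-ordered map that drops the least recently used page once the
        // limit is exceeded; a dropped page is represented by its ids again.
        this.resolvedPages = new LinkedHashMap<Integer, List<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, List<V>> eldest) {
                return size() > maxResolvedPages;
            }
        };
    }

    public V get(int index) {
        int page = index / pageSize;
        List<V> objects = resolvedPages.get(page);
        if (objects == null) {
            // Page fault: resolve only the ids that belong to this page.
            int from = page * pageSize;
            int to = Math.min(from + pageSize, ids.size());
            objects = pageLoader.apply(ids.subList(from, to));
            resolvedPages.put(page, objects);
        }
        return objects.get(index % pageSize);
    }

    public int size() {
        return ids.size();
    }

    public int resolvedPageCount() {
        return resolvedPages.size();
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            ids.add(i);
        }
        final int[] loads = {0};
        WindowedPageList<Integer, String> list =
                new WindowedPageList<>(ids, 3, 2, pageIds -> {
                    loads[0]++;
                    List<String> out = new ArrayList<>();
                    for (Integer id : pageIds) {
                        out.add("obj" + id);
                    }
                    return out;
                });
        for (int i = 0; i < list.size(); i++) {
            list.get(i);
        }
        System.out.println("pages loaded: " + loads[0]
                + ", still resolved: " + list.resolvedPageCount());
    }
}
```

    Iterating once over 10 ids with a page size of 3 and a limit of 2 loads
    four pages and leaves two resolved at the end. Objects from swapped-out
    pages become eligible for garbage collection only if nothing else (such
    as the object store or a query cache) still references them.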



