Hi,
I ran some tests using 3.0b with SQLTemplate in combination with
prefetching and found a possible new problem.
It seems that when the query itself runs in, e.g., 1 minute, it takes about
a further 2 minutes before Cayenne has constructed the prefetched objects.
My query produces 2.5 million records. The query will take about 30
minutes. Construction of the objects will then take an extra hour.
This is not really workable.
Hans
Hans Pikkemaat wrote:
> Hi,
>
> What I can see when I use paging in combination with SQLTemplate is this:
>
> Cayenne first runs the main SQLTemplate query, whose result is stored in memory.
>
> When I get the first page it determines the key values of the main
> query which it then
> uses in a new query which will return the main table plus the detail
> table data.
> This will produce the main table object through which the detail table
> is accessible.
>
> The problem here is that only the key of the main table is used. The
> SQLTemplate query was manually constructed and does a query on the main
> table with a left join to the detail table, so it produces a duplicate
> key value wherever a main table record has 2 related detail table
> records.
>
> This doesn't have to be a problem; the query does return the number of
> records used as the page size. But internally in Cayenne something weird
> happens. Somehow the duplicate records are removed, and the
> IncrementalFaultList.checkPageResultConsistency method throws an
> exception for this.
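
[Editorial sketch] The duplicate-key effect described above can be modelled
in plain Java. This is only an illustration under stated assumptions: the
class and method names are hypothetical, the rows are made-up sample data,
and nothing here calls Cayenne or reproduces its internal consistency check.
It only shows that a page cut from left-joined rows can contain fewer
distinct main-table keys than page slots.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical model, not Cayenne code: a left join repeats the main-table
// key once per detail row, so a page of keys may contain duplicates.
final class DuplicateKeyPageDemo {

    // Each int[] is one joined row: {mainKey, detailKey}.
    static List<Integer> mainKeysOf(List<int[]> joinedRows) {
        List<Integer> keys = new ArrayList<>();
        for (int[] row : joinedRows) {
            keys.add(row[0]);
        }
        return keys;
    }

    // Resolving a page by main key yields at most one object per distinct key.
    static int distinctObjectsInPage(List<Integer> keys, int from, int pageSize) {
        int to = Math.min(from + pageSize, keys.size());
        return new LinkedHashSet<>(keys.subList(from, to)).size();
    }

    public static void main(String[] args) {
        List<int[]> rows = new ArrayList<>();
        rows.add(new int[] {1, 10}); // main 1, first detail
        rows.add(new int[] {1, 11}); // main 1 again, second detail
        rows.add(new int[] {2, 20});
        rows.add(new int[] {3, 30});

        int pageSize = 3;
        int resolved = distinctObjectsInPage(mainKeysOf(rows), 0, pageSize);
        // prints "page slots: 3, distinct objects: 2"
        System.out.println("page slots: " + pageSize + ", distinct objects: " + resolved);
    }
}
```

A check that expects one resolved object per page slot would see 2 objects
for 3 slots here, which matches the kind of mismatch reported above.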
>
> Because the main query returns the main object but also the detail
> object I find it strange
> that the query generated for the page only uses the main table key. I
> would expect that
> it also would use the key of the detail table.
>
> An example: say I have a main table key 1 and related detail records
> with keys 1, 2 and 3, and I run an SQLTemplate which returns main key 1
> but only detail keys 1 and 2.
>
> The page query will now fetch all detail records for that main key,
> including detail record 3, which I did not request.
>
> From this I conclude that if an SQLTemplate is used, it is not useful
> (read: faulty) to include the detail table in this query. When paging is
> used, all the detail tables are queried automatically.
>
> If I write the main SQLTemplate query such that it only returns the main
> object, then the exception does not occur.
>
> My conclusion is then that if you want to use paging with SQLTemplate
> the main
> query should only return the main table. Prefetching will then return
> ALL related
> table records.
>
> tx
>
> HPI
>
> Andrus Adamchik wrote:
>> Yeah, still need to check that one.
>>
>> On Nov 12, 2009, at 10:43 AM, Hans Pikkemaat wrote:
>>
>>
>>> Hi,
>>>
>>> Yes, the paginated query would indeed be the only way for me to go
>>> forward.
>>> The problem however is that I get the exception I posted earlier.
>>>
>>> tx
>>>
>>> Hans
>>>
>>> Andrus Adamchik wrote:
>>>
>>>> For paginated queries we contemplated a strategy of a list with
>>>> constant size of fully resolved objects. I.e. when a page is
>>>> swapped in, some other (LRU?) page is swapped out. We decided
>>>> against it, as in a general case it is hard to consistently
>>>> predict which page should be swapped out.
>>>>
>>>> However it should be rather easy to write such a list for a
>>>> specific case with a known access order (e.g. a standard iteration
>>>> order). In fact I would vote to even include such implementation
>>>> in Cayenne going forward.
>>>>
>>>> More specifically, you can extend IncrementalFaultList [1],
>>>> overriding 'resolveInterval' to swap out previously read pages,
>>>> turning them back into ids. And the good part is that you can use
>>>> your extension directly without any need to modify the rest of
>>>> Cayenne.
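
[Editorial sketch] The swap-out idea suggested above can be illustrated
without Cayenne. The class below is a hypothetical stand-in: it does not
extend IncrementalFaultList or override resolveInterval, and the pageLoader
function is an assumed placeholder for whatever query would resolve a page
of ids. It keeps every id but only one resolved page at a time, which suits
a standard front-to-back iteration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical paged list: all ids stay in memory, but only the page that
// was accessed last holds resolved objects.
final class SinglePageWindowList<K, V> {
    private final List<K> ids;
    private final int pageSize;
    private final Function<List<K>, Map<K, V>> pageLoader; // assumed: one fetch per page
    private Map<K, V> currentPage = new HashMap<>();
    private int currentPageIndex = -1;

    SinglePageWindowList(List<K> ids, int pageSize, Function<List<K>, Map<K, V>> pageLoader) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.ids = new ArrayList<>(ids);
        this.pageSize = pageSize;
        this.pageLoader = pageLoader;
    }

    V get(int index) {
        K id = ids.get(index); // throws IndexOutOfBoundsException on a bad index
        int pageIndex = index / pageSize;
        if (pageIndex != currentPageIndex) {
            int from = pageIndex * pageSize;
            int to = Math.min(from + pageSize, ids.size());
            // Replacing the map drops the previous page; only its ids remain,
            // so its objects become eligible for garbage collection.
            currentPage = pageLoader.apply(new ArrayList<>(ids.subList(from, to)));
            currentPageIndex = pageIndex;
        }
        return currentPage.get(id);
    }

    int size() {
        return ids.size();
    }

    // Number of objects currently held in resolved form.
    int resolvedCount() {
        return currentPage.size();
    }
}
```

Random access across pages would reload on every page switch, so this
variant only pays off when the access order is known, as noted above.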
>>>>
>>>> Andrus
>>>>
>>>>
>>>> [1] http://cayenne.apache.org/doc/api/org/apache/cayenne/access/IncrementalFaultList.html
>>>>
>>>>
>>>> On Nov 12, 2009, at 10:07 AM, Hans Pikkemaat wrote:
>>>>
>>>>
>>>>> Hi,
>>>>>
>>>>> So this means that if I use a generic query, the query
>>>>> results are always stored completely in the object store (or the
>>>>> query cache if I configure it).
>>>>>
>>>>> Objects are returned in a list, so as long as I hold a reference to
>>>>> this list (because I'm traversing it) these objects are not garbage
>>>>> collected.
>>>>>
>>>>> If I use the query cache, the full query results are cached. This
>>>>> means that I can only tell it to remove the whole query.
>>>>>
>>>>> Effectively this means I'm unable to run a big query and process
>>>>> the results as a stream: I cannot process the first results and then
>>>>> somehow make them available for garbage collection.
>>>>>
>>>>> The only option I have would be the iterated query, but this is
>>>>> only useful for queries on 1 table without any relations, because it
>>>>> is not possible to use prefetching, nor is it possible to manually
>>>>> construct relations between objects.
>>>>>
>>>>> My conclusion here is that Cayenne is simply not suitable for
>>>>> large batch-wise query processing because of the memory implications.
>>>>>
>>>>> tx
>>>>>
>>>>> HPI
>>>>>
>>>>> Andrus Adamchik wrote:
>>>>>
>>>>>
>>>>>> As mentioned in the docs, individual objects and query lists are
>>>>>> cached independently. Of course query lists contain a subset of
>>>>>> cached object store objects inside the lists. An object won't get
>>>>>> gc'd if it is also stored in the query list.
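
[Editorial sketch] The interaction between a weakly referenced cache entry
and a list that still holds the same object can be shown with plain
java.lang.ref.WeakReference. The names below are hypothetical and the code
does not use Cayenne's ObjectStore; it only demonstrates that a strong
reference from a list keeps a weakly cached object alive.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a weak "cache entry" cannot be cleared while a result
// list still references the same object strongly.
final class WeakCacheDemo {

    static boolean reachableWhileListed() {
        List<Object> queryResult = new ArrayList<>();
        queryResult.add(new Object());
        WeakReference<Object> cacheEntry = new WeakReference<>(queryResult.get(0));

        System.gc(); // only a hint, but it cannot clear a strongly reachable object
        boolean alive = cacheEntry.get() != null;

        // Use the list after the check so it stays strongly reachable above.
        return alive && queryResult.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println("alive while listed: " + reachableWhileListed());

        List<Object> queryResult = new ArrayList<>();
        queryResult.add(new Object());
        WeakReference<Object> cacheEntry = new WeakReference<>(queryResult.get(0));
        queryResult.clear(); // drop the only strong reference
        System.gc();         // the entry may now be cleared, but this is not guaranteed
        System.out.println("alive after list cleared: " + (cacheEntry.get() != null));
    }
}
```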
>>>>>>
>>>>>> Now list cache expiration is controlled via query cache factory. By
>>>>>> default this is an LRU map, so as long as the map has enough space
>>>>>> to hold lists (its capacity == # of lists, not # of objects), the
>>>>>> objects won't get gc'd.
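
[Editorial sketch] To make the "capacity == # of lists" point concrete, here
is a generic LRU map built on java.util.LinkedHashMap. It is an assumption
for illustration only, not Cayenne's default query cache implementation;
the class name and the cache keys are hypothetical. Eviction counts cached
lists, regardless of how many objects each list holds.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical LRU cache of result lists: the limit applies to the number
// of lists, not to the number of objects inside them.
final class LruListCache<K, V> extends LinkedHashMap<K, List<V>> {
    private static final long serialVersionUID = 1L;
    private final int maxLists;

    LruListCache(int maxLists) {
        super(16, 0.75f, true); // true = order entries by access, oldest first
        this.maxLists = maxLists;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, List<V>> eldest) {
        // Called after each put; evicts the least recently used list.
        return size() > maxLists;
    }
}
```

With a capacity of 2, caching a third list evicts the least recently
accessed one; a single huge list still occupies just one slot.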
>>>>>>
>>>>>> You can explicitly remove entries from the cache via QueryCache
>>>>>> remove and removeGroup methods. Or you can use a different
>>>>>> QueryCacheFactory that implements some custom expiration/cleanup
>>>>>> mechanism.
>>>>>>
>>>>>> Andrus
>>>>>>
>>>>>> On Nov 11, 2009, at 3:43 PM, Hans Pikkemaat wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I use the latest version of Cayenne, 3.0b, and am experimenting
>>>>>>> with the object caching features.
>>>>>>>
>>>>>>> The documentation states that committed objects are purged from
>>>>>>> the cache because it uses weak references.
>>>>>>> (http://cayenne.apache.org/doc/individual-object-caching.html)
>>>>>>>
>>>>>>> However, if I run a query using SQLTemplate, which caches the
>>>>>>> objects in the dataContext local cache (ObjectStore), the objects
>>>>>>> don't seem to be purged at all. If I simply run the query and dump
>>>>>>> the contents using an iterator on the resulting List, the number of
>>>>>>> registered objects in the ObjectStore stays the same
>>>>>>> (dataContext.getObjectStore().registeredObjectsCount()).
>>>>>>> Even if I manually run System.gc() I don't see any changes (I know
>>>>>>> this can be normal, as gc() doesn't guarantee anything).
>>>>>>>
>>>>>>> What am I doing wrong? Under which circumstances will Cayenne
>>>>>>> purge the cache?
>>>>>>>
>>>>>>> tx
>>>>>>>
>>>>>>> Hans
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>
>>
>>
>
> --
> TSi Solutions
> Neptunusstraat 25
> 7521 WC Enschede
>
> Tel. +31 (0)88 - 25 00 000
> Fax. +31 (0)88 - 25 00 122
> Hans Pikkemaat
> Java Developer (Services Team)
> E-mail: h.pikkemaa..si-solutions.nl
> <mailto:h.pikkemaa..si-solutions.nl>
> www.tsi-solutions.nl <http://www.tsi-solutions.nl/>
> www.toeristiek.nl <http://www.toeristiek.nl/>
>
This archive was generated by hypermail 2.0.0 : Fri Nov 13 2009 - 06:05:07 EST