Re: New prefetching algorithms

From: Andrus Adamchik (andru..bjectstyle.org)
Date: Mon Sep 07 2009 - 16:55:34 EDT

Next message: Apache Hudson Server: "Cayenne-trunk - Build # 448 - Failure"

Previous message: Andrus Adamchik: "Re: New prefetching algorithms"
In reply to: Andrus Adamchik: "Re: New prefetching algorithms"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Finally some good news on performance. After tweaking the prefetch
strategies, I got the following test numbers on PostgreSQL, fetching and
prefetching a few thousand objects (fewer milliseconds means faster
processing):

(disjoint)
n:1 ... M6 ...... 51 ms
n:1 ... trunk ... 45 ms

(joint)
n:1 ... M6 ...... 100 ms
n:1 ... trunk ... 45 ms

(disjoint)
1:n ... M6 ...... 100 ms
1:n ... trunk ... 54 ms

(disjoint)
n:m ... M6 ...... 54 ms
n:m ... trunk ... 51 ms

So the trunk code significantly improves on 3.0M6 when prefetching
to-many and joint to-one relationships, and somewhat improves on the
other cases (within the margin of error, I guess).
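
To make the joint to-one case concrete, here is a minimal sketch in plain
Java of resolving joined rows so that each Artist is built only once per
ID. It does not use the Cayenne API; Artist, Painting, JoinedRow and
ToOnePrefetchSketch are hypothetical names made up for illustration, and
the actual trunk code may work differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for mapped entities; these are not Cayenne classes.
final class Artist {
    final int id;
    final String name;

    Artist(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class Painting {
    final int id;
    final String title;
    final Artist artist;

    Painting(int id, String title, Artist artist) {
        this.id = id;
        this.title = title;
        this.artist = artist;
    }
}

// One row of a joined result: painting columns plus repeated artist columns.
final class JoinedRow {
    final int paintingId;
    final String paintingTitle;
    final int artistId;
    final String artistName;

    JoinedRow(int paintingId, String paintingTitle, int artistId, String artistName) {
        this.paintingId = paintingId;
        this.paintingTitle = paintingTitle;
        this.artistId = artistId;
        this.artistName = artistName;
    }
}

final class ToOnePrefetchSketch {

    // Builds one Painting per row, but only one Artist per distinct artist ID.
    // Every row still carries the artist columns, which is the redundant
    // data the joined result has to transfer and scan.
    static List<Painting> resolve(List<JoinedRow> rows) {
        Map<Integer, Artist> artistsById = new HashMap<>();
        List<Painting> paintings = new ArrayList<>();
        for (JoinedRow row : rows) {
            Artist artist = artistsById.get(row.artistId);
            if (artist == null) {
                artist = new Artist(row.artistId, row.artistName);
                artistsById.put(row.artistId, artist);
            }
            paintings.add(new Painting(row.paintingId, row.paintingTitle, artist));
        }
        return paintings;
    }
}
```

With three rows for three Paintings of the same Artist, resolve() returns
three Painting objects that all point to one shared Artist instance.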

Andrus

On Sep 7, 2009, at 8:53 AM, Andrus Adamchik wrote:

> Been thinking about the new prefetching model some more and found a
> glaring performance hole: the most common N:1 prefetch case will
> result in Cartesian product processing in memory. E.g. if one
> Artist has 3 Paintings, and the Paintings are fetched with an Artist
> prefetch, the Artist DB data will be read 3 times. The
> result will be correct (3 Paintings all pointing to a single Artist
> object), but processing will be much slower.
>
> I will now make another pass over the code to restore the old
> prefetch strategy for N:1 relationships. Hopefully the resulting
> code will be tighter than it used to be.
>
> Andrus
>
>
> On Sep 6, 2009, at 9:43 PM, Andrus Adamchik wrote:
>
>> Good to have a little time again to hack Cayenne internals.
>>
>> Just committed a pretty big change to the prefetching algorithm,
>> motivated by the CAY-1250 bug report. So combining prefetching and
>> inheritance now works 100%.
>>
>> One visible effect of this change is that all disjoint prefetch
>> queries will now include the IDs of the source side of the
>> prefetch relationship and a mandatory join to the source entity. In
>> return for this small inefficiency (increased result set size...
>> hopefully most IDs are small), we get a bunch of benefits, the main
>> one being the ability to process related fetched objects in a
>> consistent manner regardless of the relationship semantics (1..1,
>> 1..N, N..M). This strategy was used before for flattened
>> relationships, now it is used for everything. On the other hand,
>> this change made it possible to optimize some related cases, so all
>> in all, there may be no performance penalty.
>>
>> It is still possible to go back and optimize it further to prevent
>> the addition of the extra columns to the result set in some cases
>> (e.g. if both joined FK and PK are present in the result, only
>> fetch one of them). I wish we could do that in some central
>> location (like SelectTranslator) instead of writing endless if/else
>> in the prefetch processing code.
>>
>> Now the prefetch code is easier to make sense of, with fewer
>> if/else branches. And I am planning to refactor it further.
>>
>> Also I came very close to fixing the biggest remaining limitation
>> of disjoint prefetching:
>>
>> https://issues.apache.org/jira/browse/CAY-1025
>>
>> Andrus
>>
>>
>
>
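
As an illustration of the disjoint prefetch strategy described in the
oldest quoted message, here is a minimal sketch in plain Java of attaching
prefetched rows by the source ID that each row carries. It does not use
the Cayenne API; PrefetchRow and DisjointPrefetchSketch are hypothetical
names made up for this example, and the target side is reduced to a
single String for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One row of a hypothetical disjoint prefetch result: the ID of the source
// object (added through the join to the source entity) plus target data.
final class PrefetchRow {
    final int sourceId;
    final String target;

    PrefetchRow(int sourceId, String target) {
        this.sourceId = sourceId;
        this.target = target;
    }
}

final class DisjointPrefetchSketch {

    // Groups target data by source ID. The same loop handles 1..1, 1..N
    // and N..M results, because each row states which source it belongs to.
    static Map<Integer, List<String>> groupBySourceId(List<PrefetchRow> rows) {
        Map<Integer, List<String>> targetsBySource = new LinkedHashMap<>();
        for (PrefetchRow row : rows) {
            targetsBySource
                    .computeIfAbsent(row.sourceId, id -> new ArrayList<>())
                    .add(row.target);
        }
        return targetsBySource;
    }
}
```

The extra source ID column is what makes the grouping independent of the
relationship type, at the cost of a somewhat wider result set.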

