In fact it will be fairly easy to extend Cayenne to do that... Just
swap ToManyList (current List implementation in a relationship) with
IncrementalFaultList. This is 100% db-independent. Of course this
would require a hint in the mapping.
Andrus
On Sep 6, 2005, at 10:59 AM, Gili wrote:
>
> Michael, to the best of my knowledge Hibernate provides index/
> limit support in a database-agnostic way without fetching all the
> results, so this *is* possible. Even if not all databases support
> this feature we should support it for those who do (which are the
> majority). The performance benefit is huge.
>
> If Cayenne wants to know how many results there are, issuing a
> "select count(*)" query should be enough and far more efficient
> than actually retrieving a list of all PKs.
>
> Lastly, I don't think 100,000 records is a lot. I mean, this is
> what databases are all about: massive amounts of data. We should be
> designing for this kind of thing, as it is not the exceptional
> case. Currently I've got 2,464 records in one of my tables and my
> application is just at its birth stages. In another year I wouldn't
> be surprised if I reach or surpass 100,000 records.
>
> Gili
>
> Gentry, Michael (Contractor) wrote:
>
>> Cayenne (or another ORM) will always need to know how many records
>> are
>> in the relationship. It doesn't matter if it fetches the entire
>> object
>> or the primary key, it still needs to fetch all of them. There is no
>> database-agnostic way of fetching N records starting at record
>> 300, so
>> Cayenne/etc needs to have them all (or at least the PKs) in order to
>> function. If it only fetched the PKs, it can still do 1 SELECT
>> passing
>> the DB N PKs (I'd like to buy a vowel there), which is at least
>> fairly
>> efficient. If you have 100,000 records in a relationship, you might
>> have bigger issues, too ... :-)
>> /dev/mrg
>> -----Original Message-----
>> From: Gili [mailto:cowwo..bs.darktech.org] Sent: Tuesday,
>> September 06, 2005 10:41 AM
>> To: cayenne-deve..bjectstyle.org
>> Subject: Re: Idea for index/limit support
>> Right, this is related to pagination. All I'm saying is that if
>> I have 100,000 results (not necessarily blobs) and I only want to
>> pick up the first five results, I should be able to do just that.
>> Hitting all 100,000 results when I know in advance I will not read
>> them is a huge waste of both memory and time.
>> Regarding my suggestion, the default would be the current
>> behavior (i.e. retrieve all values) but I'm saying it would be
>> nice to have the option to tweak this further. To my
>> understanding, even with Pagination,
>> Cayenne will retrieve all primary keys from a result. I saw an
>> open JIRA
>> issue for this I think.
>> Gili
>> Gentry, Michael (Contractor) wrote:
>>
>>> I really can't imagine my to-many faults not returning everything in
>>>
>> the
>>
>>> relationship. I certainly wouldn't want it to only fault in what I
>>>
>> try
>>
>>> to read (that would require far too many SELECTs and hurt
>>>
>> performance).
>>
>>> And it's probably too arbitrary to pick N records for it to read
>>> in at
>>>
>> a
>>
>>> time.
>>>
>>> Given your example, it still seems like you are still having your
>>> BLOB
>>> data inside a table with other data instead of isolated. So,
>>> when you
>>> fault your to-many relationship, you are getting all of the BLOB
>>> data
>>>
>> (N
>>
>>> rows) in addition to the other attributes. This is almost always
>>>
>> going
>>
>>> to be a lose situation using ORM.
>>>
>>> All that being said, perhaps a paginated query could help you out?
>>>
>>>
>>>
>> http://www.objectstyle.org/cayenne/userguide/perform/paged-
>> queries.html
>>
>>> /dev/mrg
>>>
>>> -----Original Message-----
>>> From: Gili [mailto:cowwo..bs.darktech.org] Sent: Monday,
>>> September 05, 2005 3:29 AM
>>> To: cayenne-deve..bjectstyle.org
>>> Subject: Idea for index/limit support
>>>
>>>
>>>
>>> Aside from the obvious idea that queries should be able to
>>> set index/limit, I'm thinking we should modify the behavior of
>>>
>> relationship
>>
>>> getting methods. Currently they return a List from xxxArray
>>> relationships. I'm thinking we should implement the List interface,
>>>
>> and
>>
>>> provide special index/limit/fetchSize methods in our
>>> implementation. This way old code continues working as is and new
>>> code that wishes to optimize performance does something like this:
>>>
>>> List images = dataObject.getImageArray();
>>> CayenneList typedImages = (CayenneList) images;
>>>
>>> // (optional) configure List before using it
>>> typedImages.setFetchSize(10);
>>> typedImages.setMinIndex(15);
>>> typedImages.setMaxIndex(30);
>>>
>>> // use list like we always did
>>> typedImages.get(20);
>>>
>>> now... min/max indexes are hints to the List implementation as
>>> to what index/limit values to issue the query with. If a user
>>> tries to get()
>>>
>> an
>>
>>> index outside this range we will issue a second query but this is
>>> unlikely to happen. The typical usage scenerio would be if the above
>>>
>> had
>>
>>> 100,000 rows in the result we'd see a noticable performance
>>> benefit of
>>> only retrieving 15 rows (index 15 - 30 inclusive). And best of all,
>>>
>> this
>>
>>> change is completely backwards compatible with older releases.
>>> What do
>>> you think?
>>>
>>> Gili
>>>
>
> --
> http://www.desktopbeautifier.com/
>
>
This archive was generated by hypermail 2.0.0 : Tue Sep 06 2005 - 11:06:13 EDT