Hi Marcin,
> * fetch only PK columns and create all ObjectIds at once, get rid
> of the iterating process if possible
> * use the already existing resolveInterval() method to fault the
> required range of records
This strategy was discussed in the May thread with Ari (the one that
Michael Gentry mentioned). My vote is +0, meaning that before we make
this change, I'd like to confirm that it has a visible impact on
performance. Could you make such a change locally and see if it
helps? (Look at SelectQuery.addCustomDbAttribute() to include only the
PK; if you have trouble making this change, ping me via the dev list
- I'll do my best to help.)
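
To illustrate the kind of change I mean, here is a rough, untested
sketch of a PK-only fetch. "Artist" and "ARTIST_ID" are placeholder
names, the imports assume the org.apache.cayenne packages of 2.0+,
and the real change would of course live inside
IncrementalFaultList.fillIn() rather than a standalone class:

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import org.apache.cayenne.ObjectId;
import org.apache.cayenne.access.DataContext;
import org.apache.cayenne.query.SelectQuery;

public class PkOnlyFetchSketch {

    // Rough sketch: fetch just the PK column as data rows and build
    // ObjectIds from them, instead of reading every column of every row.
    List fetchIdsOnly(DataContext context) {
        SelectQuery query = new SelectQuery("Artist");
        query.addCustomDbAttribute("ARTIST_ID"); // PK column only

        List rows = context.performQuery(query); // List of Maps (DataRows)
        List ids = new ArrayList(rows.size());
        for (Iterator it = rows.iterator(); it.hasNext();) {
            Map row = (Map) it.next();
            ids.add(new ObjectId("Artist", "ARTIST_ID",
                    row.get("ARTIST_ID")));
        }
        return ids;
    }
}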
> If the creation of ObjectIds and the reading of the ResultSet
> cannot be sped up (because it simply has to happen, regardless of
> how it is done), the only choice will be to implement a more
> complex solution using an SQL LIMIT clause.
I'd love to avoid that, as the data you get may well be different
the next time you resolve a page, so you may end up with duplicates
or skipped records. If we ever go this way, we'll probably need to
make it a user choice (LIMIT vs. IncrementalFaultList).
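
Just to make that concern concrete, a LIMIT-based page fetch would
look roughly like the untested sketch below ("ARTIST", the ordering
column and the LIMIT/OFFSET syntax are all placeholders, since the
exact SQL is database-specific). Because every page is a separate
query against live data, rows inserted or deleted between two such
queries shift the offsets, which is where the duplicates and skips
would come from:

// Untested sketch of per-page fetching via raw SQL with LIMIT/OFFSET.
// Uses org.apache.cayenne.query.SQLTemplate; table and column names
// and the LIMIT/OFFSET syntax are placeholders.
List fetchPage(DataContext context, int pageSize, int pageIndex) {
    SQLTemplate query = new SQLTemplate(
            "Artist",
            "SELECT * FROM ARTIST ORDER BY ARTIST_ID"
                    + " LIMIT " + pageSize
                    + " OFFSET " + (pageIndex * pageSize));
    // Each call sees the table as it is *now*, so concurrent inserts
    // or deletes shift rows between pages - hence duplicates or skips.
    return context.performQuery(query);
}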
Andrus
On Jun 22, 2007, at 2:35 AM, Marcin Skladaniec wrote:
> Hi
> Recently we found that fetching a list of 100,000 records
> using ROP with paging and no cache takes a long time, about 50
> seconds in our case. We profiled the CPU usage, and the result
> shows that 99% of the time is spent in IncrementalFaultList, within
> the fillIn() method.
>
> The fillIn() method works (in my opinion) in a somewhat strange
> fashion: it executes the query at once, stores the result in a
> java.sql.ResultSet, and then iterates through it, creating either a
> whole DataRow or just an ObjectId for each row. If needed, the
> DataRows are faulted at the end of the method.
> Our testing showed that this bit of code:
>
> while (it.hasNextRow()) {
>     elements.add(it.nextObjectId(entity));
> }
>
> is where all the time is spent. Each iteration in this loop takes
> about 0.5ms, which multiplied by 100,000 takes almost 50 seconds.
> nextObjectId method consists of two parts: fetching the next result
> from ResultSet and creating a ObjectId, but I was unable to check
> which one takes the most time, anyway I think that this approach is
> somewhat wrong, since always 99% of the records will be fetched as
> ObjectId and never faulted, so my ideas to enhance this are:
> * fetch only PK columns and create all ObjectIds at once, get rid
> of the iterating process if possible
> * use the already existing resolveInterval() method to fault the
> required range of records
> If the creation of ObjectIds and the reading of the ResultSet
> cannot be sped up (because it simply has to happen, regardless of
> how it is done), the only choice will be to implement a more
> complex solution using an SQL LIMIT clause.
>
> I would like to mention that we are using some DataContext
> decorators and life-cycle callbacks, but I don't believe those are
> important factors in this case.
>
> Whatever the solution is, I think it is pretty crucial that it be
> implemented soon, since the usability of ROP without fast paging
> is rather low.
>
> With regards
> Marcin
> -------------------------->
> ish
> http://www.ish.com.au
> Level 1, 30 Wilson Street Newtown 2042 Australia
> phone +61 2 9550 5001 fax +61 2 9550 4001