Re: Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened relationships...]

From: Laszlo Spoor (lspoor_cayenn..otmail.com)
Date: Tue May 13 2003 - 15:54:06 EDT

Next message: martin ruff: "strange behaviour"

Previous message: Andrus Adamchik: "Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened relationships...]"
Maybe in reply to: Andrus Adamchik: "Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened relationships...]"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Andrus,

You are right. Still, I would like to plead for a small note in the JavaDoc,
explaining this. :-)

Thanks, Laszlo

>From: "Andrus Adamchik" <andru..bjectstyle.org>
>Reply-To: andru..bjectstyle.org
>To: <cayenne-use..bjectstyle.org>
>Subject: Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened
>relationships...]
>Date: Tue, 13 May 2003 12:28:03 -0400 (EDT)
>
>Laszlo,
>
>I kind of expected such result, but your message prompted me to look for
>an alternative solution. I did some research on various queries
>performance on Oracle. In my case there are 2 tables with to-many from one
>to another, both having around 100000 to 200000 rows. I tried running 3
>flavors of the same query, directly on Oracle, with no JDBC involved:
>
>1. joining on to-many with no DISTINCT
>2. joining on to-many with DISTINCT
>3. Using correlated subquery with EXISTS
>
>(1) was the fastest - 2 seconds in my case, but produced incorrect results
>(duplicate rows), (2) was the slowest - 4 seconds. (3) was in between (1)
>and (2) - 3 to 4 seconds. So using correlated subquery with EXISTS may
>yield to faster queries in some cases. Though I believe that playing with
>indexes may alter the results dramatically either way.
>
>We may look into adding this optimization in the future. It will have to
>be optional though, since many databases do not support correlated
>subqueries and EXISTS (MySQL prior to 4.*).
>
>Andrus
>
>
>
> > Hi Andrus,
> >
> > About the 'DISTINCT' part that I would write you back about.
> >
> > To be sure I formulated my case well, I had to look it up in a couple of
> > guides. As you will notice I will use the Oracle database as the
> > datastore, simply because I have the most experience with this database.
> >
> > I'd got confirmed that the most important reason not to use DISTINCT is
> > the fact that is forces the database to do a 'SORT UNIQUE' which can
> > cost more disk reads and more processing time when reading the results.
> >
> > This information comes from the 'Oracle 9i Performance and Tuning Guide,
> > p2-14):
> > The distinct part results in a SORT-UNIQUE action... SORT UNIQUE occurs
> > if a user specifies a DISTINCT clause (Example 2û15)or if an operation
> > requires unique values for the next step (Example 2û16). Example 2û15
> > DISTINCT Clause Causing SORT UNIQUE
> >
> > SELECT DISTINCT last_name, first_name
> > FROM per_all_people_f
> > WHERE full_name LIKE :b1;
> >
> > Plan
> > --------------------------------------------------
> > SELECT STATEMENT
> > SORT UNIQUE
> > TABLE ACCESS BY INDEX ROWID PER_ALL_PEOPLE_F
> > INDEX RANGE SCAN PER_PEOPLE_F_N54
> >
> > Another, not technical, reason is that in general it is a bad practice
> > to use
> > DISTINCT, because the query writer 'admits' that his 'where-logic' is
> > not strong
> > enough to return the results that are required. However, I don't quite
> > think this is 100% valid for Cayenne, because Cayenne can not always
> > tell if that is
> > the case.
> >
> > Still I do feel that the developer can have the option leaving the
> > distinct out
> > by using query.setDistinct(false), because it gives somewhat more space
> > to tuning.
> >
> > Regards, Laszlo
> >
> > _________________________________________________________________
> > MSN 8 with e-mail virus protection service: 2 months FREE*
> > http://join.msn.com/?page=features/virus
>
>
>

_________________________________________________________________
Help STOP SPAM with the new MSN 8 and get 2 months FREE*
http://join.msn.com/?page=features/junkmail

Next message: martin ruff: "strange behaviour"
Previous message: Andrus Adamchik: "Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened relationships...]"
Maybe in reply to: Andrus Adamchik: "Use of DISTINCT in "to-many" qualifiers [Was: Multiple flattened relationships...]"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2.0.0 : Tue May 13 2003 - 15:53:41 EDT