Andrus,
Here is the method I use to get the same behaviour. It uses two queries
but is still faster: 0.3 seconds for the query on primary keys and 1.4
seconds to fetch the full rows from the PKs. I don't really understand
why the second step is that slow, but it is skipped for objects that are
already in the cache. I think it needs some work before it can be used in
the real Cayenne, because I haven't checked strict equivalence in every
case (in particular, when the original query already has custom
DbAttributes, or uses attributes I don't copy; a clone method on the
query could be useful).
Note that the QueryFactory used here is home-made. Its function names
are self-describing.
public static <T extends DataObject> List<T> query(DataContext context,
        Class<T> clazz, SelectQuery query, boolean fetchByPk) {
    if (!fetchByPk) {
        // Use the normal method
        return query(context, clazz, query);
    }
    DbEntity entity = context.getEntityResolver().lookupDbEntity(clazz);
    // Copy the original query
    SelectQuery query2 = new SelectQuery(clazz);
    query2.setQualifier(query.getQualifier());
    query2.addOrderings(query.getOrderings());
    query2.setFetchLimit(query.getFetchLimit());
    // But modify the copy: fetch only the distinct primary key column
    query2.setDistinct(true);
    List<DbAttribute> attrs = entity.getPrimaryKey();
    if (attrs.size() == 1) {
        query2.addCustomDbAttribute(attrs.get(0).getName());
    } else {
        throw new UnsupportedOperationException(
                "Unable to handle multi-column (" + attrs.size()
                + ") primary keys");
    }
    // Perform the modified query
    List results = context.performQuery(query2);
    if (results.isEmpty()) {
        // Nothing matched, so skip the second query
        return new LinkedList<T>();
    }
    List<T> cached = new LinkedList<T>();
    Expression e;
    if (attrs.size() == 1) {
        List<Object> pks = new LinkedList<Object>();
        final String name = attrs.get(0).getName();
        for (Object o : results) {
            Object pk = ((DataRow) o).get(name);
            // Reuse the object if it is already registered in the context
            T cachedObj = clazz.cast(context.getObjectStore().getObject(
                    new ObjectId(clazz, name, pk)));
            if (cachedObj == null) {
                pks.add(pk);
            } else {
                cached.add(cachedObj);
            }
        }
        e = QueryFactory.createIn("db:" + attrs.get(0).getName(), pks);
    } else {
        throw new UnsupportedOperationException(
                "Unable to handle multi-column (" + attrs.size()
                + ") primary keys");
    }
    if (e == null) {
        // No expression => nothing to get
        return cached;
    }
    // Get full and real objects
    List<T> retval = query(context, clazz, new SelectQuery(clazz, e));
    // Add the cached objects
    if (!cached.isEmpty()) {
        retval.addAll(cached);
    }
    // Sort the results
    List<Ordering> orderings = query.getOrderings();
    if (orderings != null && !orderings.isEmpty()) {
        Collections.sort(retval, new CompositeComparator(orderings));
    }
    return retval;
}
I did it in my spare time, so you are free to use it however you want.
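The CompositeComparator used for the final sort is another home-made helper that is not shown here. As a rough sketch only, assuming it simply applies a list of comparators in order and returns the first non-zero result (the class body below is a hypothetical reconstruction, not the original code), it could look like this:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the home-made CompositeComparator:
// it tries each comparator in turn and returns the first non-zero result.
class CompositeComparator<T> implements Comparator<T> {

    private final List<Comparator<? super T>> comparators;

    CompositeComparator(List<? extends Comparator<? super T>> comparators) {
        // Defensive copy so later changes to the caller's list have no effect
        this.comparators = new ArrayList<Comparator<? super T>>(comparators);
    }

    @Override
    public int compare(T a, T b) {
        for (Comparator<? super T> c : comparators) {
            int result = c.compare(a, b);
            if (result != 0) {
                return result;
            }
        }
        // All comparators consider the two objects equal
        return 0;
    }
}
```

The in-memory sort is needed because the second query fetches by PK without the original orderings, and the cached objects are appended at the end of the list.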
**** some more comments below ****
On Sunday, 17 April 2005 at 09:58 -0400, Andrus Adamchik wrote:
> Mikaël,
>
> I applied the patch - thanks!
>
> Re: subselects. You are talking about queries with qualifiers over
> to-many relationships, right?
Yes
> As this is the case when DISTINCT is
> added behind the scenes. I've done some research in this area before -
> http://www.objectstyle.org/cayenne/lists/cayenne-user/2003/05/0031.html
> I've been thinking of adding this as an alternative translation
> strategy configurable either per query or per adapter.
>
> Can't say when. Patches are always welcome ;-)
The previous method is an attempt ;-)
> However I just realized that Cayenne already supports another strategy
> described above - fetching duplicate rows and then internally applying
> "distinct" logic, returning rows with unique PK. Can't say if this is
> faster than PostgreSQL distinct ... somebody needs to try.
For me it won't be: the join multiplies the result set's size by
approximately 300-700. It also disables some optimizations, if I
understand the EXPLAIN results right.
> The actual worker class is
> "org.objectstyle.cayenne.access.util.DistinctResultIterator". It is
> used behind the scenes if SelectTranslator returns true from
> "isSuppressingDistinct" method.
>
> Setting this up is not very user-friendly right now, as it wasn't
> intended for public use, but if we have proof that it actually improves
> performance, we can make it one of SelectQuery flags.
I answered that just above. You can't know how much more network load
there will be, since it depends on the data.
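To make the trade-off concrete, here is a minimal sketch of the client-side "distinct" strategy in plain Java. It is not the DistinctResultIterator code: rows are modelled as plain maps, and the class name, method name and the "id" key column are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: rows are plain maps, and the primary key column name is
// passed in by the caller. This is not how Cayenne implements it internally.
class DistinctByPk {

    // Keeps the first row seen for each primary key value, in arrival order.
    static List<Map<String, Object>> distinctByPk(
            Iterable<Map<String, Object>> rows, String pkColumn) {
        Map<Object, Map<String, Object>> unique =
                new LinkedHashMap<Object, Map<String, Object>>();
        for (Map<String, Object> row : rows) {
            Object pk = row.get(pkColumn);
            if (!unique.containsKey(pk)) {
                unique.put(pk, row);
            }
        }
        return new ArrayList<Map<String, Object>>(unique.values());
    }
}
```

Every duplicate row still has to be read from the database before it can be discarded, so with a join that multiplies the result set by 300-700 the client transfers that many times more rows than it keeps.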
This archive was generated by hypermail 2.0.0 : Sun Apr 17 2005 - 19:20:45 EDT