Re: Optimize cascade deletes

From: Giulio Cesare Solaroli (giulio.cesar..mail.com)
Date: Mon Oct 01 2007 - 11:30:51 EDT

Next message: Michael Gentry: "Re: Optimize cascade deletes"

Next in thread: Michael Gentry: "Re: Optimize cascade deletes"
Reply: Michael Gentry: "Re: Optimize cascade deletes"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hello everybody,

today I have being able to do some more tests and I have found out
that the problem seems to be caused by a "fragmentation" of the data
on the Postgresql table space.

I have created a copy of the table where delete statements where very
slow (using the "create table .... as select from ..." syntax) and
matching the structure of the new copy to the original one (index and
constraint wise); in this new table the performance of the delete
statements where from 20 to 100 times faster that in the original
table. :-(

This rules out Cayenne as the culprit, but leave me wondering how to
avoid this degradation of performances on Postgresql; but this is
probably not the right place where to start.

Thank you everybody for your attention and for the very useful
suggestions or pointers that greatly helped me in understanding this
problem.

Best regards,

Giulio Cesare

On 9/23/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
> Hello,
>
> I have collected a few more data, as suggested by Aristedes.
>
> On 9/23/07, Giulio Cesare Solaroli <giulio.cesar..mail.com> wrote:
> > On 9/23/07, Aristedes Maniatis <ar..aniatis.org> wrote:
> > >
> > > On 23/09/2007, at 5:38 PM, Giulio Cesare Solaroli wrote:
> > >
> > > > log4j.logger.org.apache.cayenne=ERROR
> > > > log4j.logger.org.apache.cayenne.access.QueryLogger=DEBUG
> > > >
> > > >
> > > > Is there anything going on on the Cayenne code between the last [batch
> > > > bind:xxx] log and the "updated" log?
> > >
> > > Well, you could always try:
> > >
> > > log4j.logger.org.apache.cayenne=DEBUG
> >
> > Argh. I should have thought about this myself. :-(
>
> Even with full log enabled I got the same exact log, so it looks like
> all the time is really spent on the DB. And the next logs confirm
> this.
>
>
> > > > Is all this time spent on the DB only?
> > >
> > > What does turning on log_min_duration_statement tell you?
>
> I have being able to run PostgreSQL with logging enabled only on my
> Parallels instance, as I can not start/stop the db on the deployment
> server; this will make the absolute times not that relevant, but I
> hope we can understand something more anyway.
>
> First of all, there is a huge difference between the deletion time of
> the "cascade" records in different tables.
>
> [...]
> LOG: duration: 0.020 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
> LOG: duration: 0.021 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
> LOG: duration: 0.020 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM connection.USRCNNRQS WHERE ID_USRCNNRQS = $1]
> [...]
>
> To delete rows on the USRCNNRQS (aka UserConnectionRequest) table, it
> averages at bout 0.020 ms.
>
>
> For another table (USRCNN aka UserConnection) the times range from
> 0.145 to 0.060, for later settling for 0.040 ms:
>
> [...]
> LOG: duration: 0.090 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
> LOG: duration: 0.071 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
> LOG: duration: 0.067 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM connection.USRCNN WHERE ID_USRCNN = $1]
> [...]
>
>
> It later took 43 seconds (!!) to parse and execute the deletion of a
> single record, ...
> [...]
> LOG: statement: PREPARE S_5 AS DELETE FROM clipperz.RCR WHERE ID_RCR = $1
> LOG: statement: <BIND>
> LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
> clipperz.RCR WHERE ID_RCR = $1]
> LOG: duration: 42.998 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM clipperz.RCR WHERE ID_RCR = $1]
> LOG: statement: <BIND>
> LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
> clipperz.RCR WHERE ID_RCR = $1]
> [...]
>
> ...later spotting time as low as 0.042 for the execution of the same query:
> [...]
> LOG: duration: 0.042 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM clipperz.RCR WHERE ID_RCR = $1]
> LOG: statement: <BIND>
> LOG: statement: EXECUTE <unnamed> [PREPARE: DELETE FROM
> clipperz.RCR WHERE ID_RCR = $1]
> [...]
>
> But as I may understand this different timing may be due to the
> interaction of the Virtual machine with the real server, what is
> puzzling me is that the deletion from the latest table had a
> "constant" time of about 2 seconds each:
> [...]
> LOG: duration: 1929.043 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
> LOG: duration: 1991.311 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
> LOG: duration: 2078.249 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
> LOG: duration: 2015.047 ms statement: EXECUTE <unnamed> [PREPARE:
> DELETE FROM clipperz.RCRVRS WHERE ID_RCRVRS = $1]
> [...]
>
>
> If nobody as something else to suggest, I will try to get back the the
> DB schema to see if there is some constraint/index that is slowing
> down the deletion of these records.
>
> Thank you very much for your attention.
>
> Best regards,
>
> Giulio Cesare
>

Next message: Michael Gentry: "Re: Optimize cascade deletes"
Next in thread: Michael Gentry: "Re: Optimize cascade deletes"
Reply: Michael Gentry: "Re: Optimize cascade deletes"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2.0.0 : Mon Oct 01 2007 - 11:31:25 EDT