On 8/26/06, Mike Kienenberger <mkienen..mail.com> wrote:
> On 8/26/06, Eric Lazarus <ericllazaru..ahoo.com> wrote:
> > I guess I was thinking about it done in Tomi's way but if this way can ensure that no data is ever lost, that is pretty powerful.
> >
> > What are the advantages of doing it Tomi's way?
> >
> > He can do efficient queries to see what the state of the database was at any time in the past? Is that right?
> >
> > He could efficiently materialize objects as they existed at any time in the past.
>
> Yeah, the only real difficulty in doing this is handling join queries.
> Each join also has to eliminate non-relevant data.
>
> I actually did a lot of this "by hand" for a WebObjects project
> (Imperial Wars) a long time ago. Time was measured in discrete units
> (turns), but it worked pretty much the same way.
The two comments Eric made above describe exactly what I'd like to
have at my disposal when I start designing a new system. I'd like to
make my reasons clear.
Experience has shown me that a significant percentage of the systems my
company deploys have to keep some sort of history. Furthermore, in
almost all systems, we need a way to find out what happened to the
system in a specific period of time.
Having the possibility to set the "system time" at runtime (I even
considered the possibility of making it a user-level setting) would
allow me the following:
1) do a detailed "playback" of all system events
2) run comparative analyses of system objects at different moments, i.e. have
an instant time dimension embedded in my system, to build data
warehouses upon
3) troubleshoot: when something fails (due to one reason or another),
it would be a lot easier to find the cause if the data wasn't somehow
overwritten. Obviously, if the problem was with the temporal
mechanisms themselves, this wouldn't work, but that's one of the
reasons I'd like to automate construction of such a database.
4) security: no one can do anything within the system without there
being a clear trace of his/her actions in the system. Sure, log files
are fine, but they're much harder to automatically analyze in order to
find e.g. patterns of behaviour of the offending user etc.
5) inherently temporal applications: I'm working on an app right now
whose key interface element will be a time slider, positioning the
user into a given moment in time so that he can access data from that
exact moment - what he does with that data is not important, but it is
important that he be able to get at it, at runtime and without having
to call me to retrieve the state of his database from 5 years ago
6) meeting regulatory requirements: sometimes, for some government
systems and the like, the application *has* to work in a non-overwrite
mode. There simply is no overwriting anything.
Now, an audit trail is a useful tool that addresses point 3 and possibly point 4.
Points 1, 2, 5 and 6, however, (I feel) need a more robust approach: making
"time" an integral part of all stored data.
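To make the idea a bit more concrete, here is a minimal sketch in plain Java of what "time as an integral part of stored data" could look like for a single value. It is only an illustration under my own assumptions: the class TemporalRecord and its methods write, asOf and versionCount are hypothetical names, not part of Cayenne or any other library, and time is modelled as a plain long (discrete units, much like the turns mentioned above). Writes only ever append a version, and reads take the "system time" as a parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical illustration only: nothing here is Cayenne API.
// A value whose history is append-only; nothing is ever overwritten.
final class TemporalRecord<T> {

    // One immutable version: the value and the moment it became valid.
    private static final class Version<V> {
        final long validFrom;
        final V value;

        Version(long validFrom, V value) {
            this.validFrom = validFrom;
            this.value = value;
        }
    }

    // Versions are kept in strictly increasing validFrom order.
    private final List<Version<T>> versions = new ArrayList<>();

    // Record a new value as of time 'at'. Earlier versions stay untouched.
    void write(long at, T value) {
        Objects.requireNonNull(value, "value");
        if (!versions.isEmpty()
                && at <= versions.get(versions.size() - 1).validFrom) {
            throw new IllegalArgumentException("writes must move forward in time");
        }
        versions.add(new Version<>(at, value));
    }

    // Return the value as it was at 'systemTime': the latest version whose
    // validFrom is not after that moment, or empty if none existed yet.
    Optional<T> asOf(long systemTime) {
        Optional<T> result = Optional.empty();
        for (Version<T> v : versions) {
            if (v.validFrom > systemTime) {
                break;
            }
            result = Optional.of(v.value);
        }
        return result;
    }

    // Number of versions recorded so far.
    int versionCount() {
        return versions.size();
    }
}
```

With this sketch, after write(1, "draft") and write(5, "final"), asOf(3) yields "draft", asOf(5) yields "final", and asOf(0) is empty because nothing existed yet. Deletes and relationships between objects are deliberately left out; the join filtering discussed above is the hard part and is not covered here.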
As a side note, when I first heard about PostgreSQL's capability to do
PITR (Point-In-Time-Recovery), I was fascinated, but then I learned
you have to take the database offline, initialize it in a certain way
and so on, so it's not nearly as useful as I hoped it would be. The
server itself is probably too low a level to implement meaningful time
dimensions without severely limiting the types of applications you
could run on it, but still, the idea fascinated me. It's basically a
niche in the ORM problem space (as I see it) that hasn't been
addressed yet: I feel it'd be a nice addition to the already existing
arsenal of features Cayenne has to offer.
As an alternative, I was thinking about generating a trigger system in
the database that would perform most of what I have in mind here, but
I'm not sure how it'd get along with Cayenne, and I'd still have to
design the hairy temporal database model by hand, instead of being
able to focus on the nature of the data, so I'm not really sure where
to go from here.