GSoC2008, a project proposal

From: Marcin Skladaniec (marci..sh.com.au)
Date: Thu Mar 27 2008 - 10:33:55 EDT


    Hello

    I've been using open source software, and Cayenne in particular, for
    quite some time now. Now finally comes a great opportunity for me to
    contribute something back to the open source world.
    It is sad that I could not do much until the incentive from Google
    came along, but let me just blame the fact that I'm a busy student...

    I have already talked to Andrus about the possible projects I could
    undertake, and to my great contentment he liked the idea of
    implementing inheritance in Cayenne and agreed to be my mentor for
    SoC2008.
    I put together a proposal, and at the end of this email you can find
    the most important fragment of it (I skipped all the 'why me' bits).

    Any feedback is most welcome!

    Best regards
    Marcin

    "There are several object-relational mapping (ORM) Java frameworks in
    existence, and in my opinion Cayenne (http://cayenne.apache.org/) is
    one of the best. Cayenne has a clear and consistent API, and a great
    community, the most vibrant I have met in the open source world.
    Cayenne has many of the features one would expect an ORM to have, but
    it lacks one important feature that is present in competing projects
    like Hibernate: inheritance. I would like to propose a Summer of Code
    project bringing inheritance to Cayenne.

    At the moment Cayenne supports only a form of flat inheritance
    (called 'derived entities'), but it is already marked as deprecated
    and will be excluded from future versions.
    I believe (re)adding this feature will greatly improve Cayenne as an
    ORM; I have already found myself missing it a few times when working
    on various database designs.

    There are a few ways of designing inheritance: the flat inheritance
    mentioned above, vertical inheritance and horizontal inheritance. I
    would like to explain the differences between them to show what I
    would like to do. I'll use an example to illustrate each type. In the
    examples I'll use the notation Table[field1, field2, ...] to describe
    the database structure and Entity<field1, field2, ...> to describe
    entities.

    Without inheritance a database table is simply mapped to a Java class
    (which is often called an entity).
    Example: If an application has to represent different kinds of
    payments, each type of payment needs to be stored in a separate table
    and will be mapped to a separate entity, i.e.
    - CreditCardPayment[amount, bankedDate, ccNumber, ccExpiry, ...] <=>
    CreditCardPayment<amount, bankedDate, ccNumber, ccExpiry, ...>
    - ChequePayment[amount, bankedDate, chequeBank, ...] <=>
    ChequePayment<amount, bankedDate, chequeBank, ...>
    - CashPayment[amount, bankedDate, ...] <=> CashPayment<amount,
    bankedDate, ...>
    As can be seen, some fields get duplicated, and as there is no
    inheritance between the Java classes the application ends up with
    plenty of duplicate code. Also, the relationships between the entities
    might get complicated; for example, an Invoice would need to have a
    separate relationship to each payment table.
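
    The duplication described above can be sketched in plain Java. This
    is only an illustration, not Cayenne-generated code: the field types,
    the Invoice class layout and the totalPaid helper are assumptions.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Sketch only: plain Java classes standing in for mapped entities.
final class NoInheritanceSketch {

    static final class CreditCardPayment {
        final BigDecimal amount;   // repeated in every payment class
        final String ccNumber;
        CreditCardPayment(BigDecimal amount, String ccNumber) {
            this.amount = amount;
            this.ccNumber = ccNumber;
        }
    }

    static final class ChequePayment {
        final BigDecimal amount;
        final String chequeBank;
        ChequePayment(BigDecimal amount, String chequeBank) {
            this.amount = amount;
            this.chequeBank = chequeBank;
        }
    }

    static final class CashPayment {
        final BigDecimal amount;
        CashPayment(BigDecimal amount) {
            this.amount = amount;
        }
    }

    // One relationship per payment table, because the classes share no supertype.
    static final class Invoice {
        final List<CreditCardPayment> creditCardPayments = new ArrayList<>();
        final List<ChequePayment> chequePayments = new ArrayList<>();
        final List<CashPayment> cashPayments = new ArrayList<>();
    }

    // The same summing loop has to be written once per payment class.
    static BigDecimal totalPaid(Invoice invoice) {
        BigDecimal total = BigDecimal.ZERO;
        for (CreditCardPayment p : invoice.creditCardPayments) {
            total = total.add(p.amount);
        }
        for (ChequePayment p : invoice.chequePayments) {
            total = total.add(p.amount);
        }
        for (CashPayment p : invoice.cashPayments) {
            total = total.add(p.amount);
        }
        return total;
    }
}
```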

    Flat inheritance uses a single table, but allows mapping this table
    to several entities. This type of inheritance has some benefits: it
    does not require any joins, and inserting data affects only one
    table. Flat inheritance is of limited use, though, and if used too
    extensively it leads to tables with too many columns to be readable
    and understandable.
    Example: In the same situation as before there would be only one table
    defining all the fields:
    - Payment[amount, bankedDate, ccNumber, ccExpiry, ...,
    chequeBank, ...] <=> CreditCardPayment<amount, bankedDate, ccNumber,
    ccExpiry, ...>, ChequePayment<amount, bankedDate, chequeBank, ...>,
    CashPayment<amount, bankedDate, ...>
    Modelling the relationships becomes much easier, but as each entity
    is still separate there is a high possibility of duplicate code.
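
    As a rough sketch of how one wide table could be split into several
    entities: the code below assumes a discriminator column named "type"
    with values "CC", "CHEQUE" and "CASH". Neither the column nor the
    values appear in the example above; they are hypothetical, as are the
    helper names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlatInheritanceSketch {

    // Columns of the single Payment table from the example, plus a
    // hypothetical "type" discriminator column (an assumption).
    static final List<String> COLUMNS = Arrays.asList(
            "type", "amount", "bankedDate", "ccNumber", "ccExpiry", "chequeBank");

    // Builds a row with every column present; columns not supplied stay null.
    static Map<String, Object> row(Object... keyValuePairs) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String column : COLUMNS) {
            row.put(column, null);
        }
        for (int i = 0; i < keyValuePairs.length; i += 2) {
            row.put((String) keyValuePairs[i], keyValuePairs[i + 1]);
        }
        return row;
    }

    // Picks the entity a row maps to, based on the discriminator value.
    static String entityFor(Map<String, Object> row) {
        String type = String.valueOf(row.get("type"));
        switch (type) {
            case "CC":
                return "CreditCardPayment";
            case "CHEQUE":
                return "ChequePayment";
            case "CASH":
                return "CashPayment";
            default:
                throw new IllegalArgumentException("Unknown payment type: " + type);
        }
    }

    // Columns that carry no data for this row: the price of one wide table.
    static List<String> unusedColumns(Map<String, Object> row) {
        List<String> unused = new ArrayList<>();
        for (String column : COLUMNS) {
            if (row.get(column) == null) {
                unused.add(column);
            }
        }
        return unused;
    }
}
```

    A cash payment row leaves ccNumber, ccExpiry and chequeBank empty,
    which is how such tables grow hard to read as more entities are added.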

    Horizontal inheritance can be considered a small improvement over the
    no-inheritance model: it keeps the duplication of the fields at the
    database and entity level, but allows the common code to be gathered
    in one place. I find this approach a little confusing and
    counter-intuitive, but there is certainly a reason behind it: it
    avoids the slowness associated with vertical inheritance.
    Example:
    - (no database table) <=> Payment<amount, bankedDate, ...>, extended
    by CreditCardPayment, ChequePayment, CashPayment
    - CreditCardPayment[amount, bankedDate, ccNumber, ccExpiry, ...] <=>
    CreditCardPayment<ccNumber, ccExpiry, ...> extends Payment
    - ChequePayment[amount, bankedDate, chequeBank, ...] <=>
    ChequePayment<chequeBank, ...> extends Payment
    - CashPayment[amount, bankedDate, ...] <=> CashPayment<...> extends
    Payment
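
    On the Java side this could look roughly like the sketch below: an
    abstract Payment class with no table of its own holds the shared
    fields and code, and each concrete class names the one table that is
    assumed to store all of its columns. The isBanked and tableName
    methods and the field types are assumptions made for illustration.

```java
import java.math.BigDecimal;

final class HorizontalInheritanceSketch {

    // No table of its own; holds the fields and code common to all payments.
    abstract static class Payment {
        final BigDecimal amount;
        final String bankedDate;   // null until banked; String type is an assumption

        Payment(BigDecimal amount, String bankedDate) {
            this.amount = amount;
            this.bankedDate = bankedDate;
        }

        // Shared logic written once instead of once per payment class.
        boolean isBanked() {
            return bankedDate != null;
        }

        // Each concrete entity names the single table holding all its columns.
        abstract String tableName();
    }

    static final class CreditCardPayment extends Payment {
        final String ccNumber;

        CreditCardPayment(BigDecimal amount, String bankedDate, String ccNumber) {
            super(amount, bankedDate);
            this.ccNumber = ccNumber;
        }

        @Override
        String tableName() {
            return "CreditCardPayment";
        }
    }

    static final class CashPayment extends Payment {
        CashPayment(BigDecimal amount, String bankedDate) {
            super(amount, bankedDate);
        }

        @Override
        String tableName() {
            return "CashPayment";
        }
    }
}
```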

    Vertical inheritance escapes the problems of flat inheritance by
    storing the data specific to an entity in a dedicated table. The
    resulting database schema is cleaner and easier to maintain when
    adding new entities and tables. There is a cost linked with this type
    of inheritance: database operations are slower, since each fetching
    query must use a join and storing data requires a statement against
    each table.
    Example: Again the same situation as in the previous cases. In
    vertical inheritance there is a common table and a common entity
    class, finally creating room for the code common to all Payments:
    - Payment[amount, bankedDate, ...] <=> Payment<amount, bankedDate, ...>
    - CreditCardPayment[ccNumber, ccExpiry, ...] <=>
    CreditCardPayment<ccNumber, ccExpiry, ...> extends Payment
    - ChequePayment[chequeBank, ...] <=> ChequePayment<chequeBank, ...>
    extends Payment
    - CashPayment[...] <=> CashPayment<...> extends Payment
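
    The cost mentioned above (a join on every fetch, one statement per
    table on every store) can be made concrete with a small sketch that
    only builds SQL strings. It assumes that both tables share a primary
    key column named "id", which is not part of the example; the method
    names are hypothetical and this is not how Cayenne generates SQL.

```java
import java.util.Arrays;
import java.util.List;

final class VerticalInheritanceSketch {

    // Fetching a sub-entity joins its table to the common Payment table.
    // The shared primary key column "id" is an assumption.
    static String selectSql(String subTable, List<String> subColumns) {
        StringBuilder sql = new StringBuilder("SELECT p.amount, p.bankedDate");
        for (String column : subColumns) {
            sql.append(", s.").append(column);
        }
        sql.append(" FROM Payment p JOIN ").append(subTable)
           .append(" s ON s.id = p.id");
        return sql.toString();
    }

    // Storing one object takes one INSERT per table in the hierarchy.
    static List<String> insertSql(String subTable, List<String> subColumns) {
        StringBuilder columns = new StringBuilder("id");
        StringBuilder params = new StringBuilder("?");
        for (String column : subColumns) {
            columns.append(", ").append(column);
            params.append(", ?");
        }
        return Arrays.asList(
                "INSERT INTO Payment (id, amount, bankedDate) VALUES (?, ?, ?)",
                "INSERT INTO " + subTable + " (" + columns + ") VALUES (" + params + ")");
    }
}
```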

    My Google Summer of Code project is going to bring inheritance to
    Cayenne. I would certainly like to implement vertical inheritance:
    although it might have a performance impact, it also seems to be the
    most advanced model. If time allows, I'll also put effort into
    implementing either flat or horizontal inheritance, consulting the
    Cayenne community to find out which one is more in demand."




