Agile Database Framework

Download Framework

Prior to launching my last project, I had difficulty finding a database patching release framework that would allow a team of developers to continue taking the codebase forward, whilst at the same time enabling simple release patching to production by a DBA, all the time maintaining a clean version of the database schema in source control.

There are numerous good articles out there, the one that most seems to follow the guidelines I had adhered to being K. Scott Allen’s Versioning Databases, and I think it’s worth restating the database principles I had in place:

  • All developers have their own database schema – in fact, we had 2 each because of the nature of the application, and I had an additional 2 having lots of complex PL/SQL running at times. There is very little overhead in this, especially once you are able to give your DBA a create schema script (granting privileges etc.) – automate as much as you can!
  • A baseline for the schema must be maintained in source control – even though we had a small team, I did not believe it possible to move forward efficiently without being able to accurately predict which version of any object was the latest
  • All objects to be maintained in separate source files – so tables, views, triggers, procedures etc. have their own .sql, .pkg file with proper naming conventions (you’ll need to agree your own standards beforehand)

But nowhere could I find anyone willing to either a) propose a theoretical release framework structure and what it should actually look like, or indeed b) an implementation! So after a little bit of work on the whiteboard I came up with my own…

But first, some background you might like to know:

  • I was running a team of 6 development environments, each of us having our own schema (not least to allow playing around with Oracle for instance, without impacting the rest of the team), each developer checking in small database patches on an almost daily basis
  • Development cycles followed at most a monthly iteration cycle (Agile XP methodology)
  • We had a continuous integration server (employing CruiseControl) that would rebuild modules immediately upon code check-in. This same server would also rebuild the entire application (including the database schema) from scratch each night and run all test cases
  • A QA environment that would be updated towards the end of each iteration by the development team
  • A pre-production environment that sat outside the development team (in the hands of a DBA) that would be updated prior to a release
  • A production environment (again outside the remit of the development team) where releases were scheduled approximately once every 5 weeks

So, that’s a total of 10 environments, some under the control of a 3rd party who is likely not wholly aware of the applications development, some being rebuilt anew, some patched daily and others patched much less frequently.

After consultation with my DBA as to how the production environment releases would be best organised, I had the following requirements that I wanted my framework to tackle:

  • Seamless integration of developer patches across all development / integration environments – no need for e-mails or any kind of discussions to occur, patches would be applied by simply checking out the latest code and rebuilding the core module
  • Maintain the ability to check out a completely clean version of the database schema from scratch – important in cases of a hard-drive failure, building a new staging environment for instance, or adding a new developer to the team
  • Avoid the need for branching or merging – we were a relatively small team of 6 developers with short turnarounds of at max 5 weeks. I have only ever observed branching and merging leading to complications, delays or painful integration experiences
  • Command line and Ant build tool execution – the DBA rightly preferred that Ant was not installed on the production database server
  • Rollback scripts that revert any applied patch script. These were required in all environments, as once a patch had been checked in a developer must assume that this patch has already been applied to numerous other database schema
  • Framework to run on both Sun and Linux OS

So, on to the actual implementation… Don’t worry, you’ll get to the distributable soon!

It’s important to note here that there is an overhead and responsibility on each development team member to not only create patch and rollback scripts, but also to ensure that they maintain original ’setup’ scripts so that a clean installation can occur at any particular moment in time. These are the ‘baseline version’ scripts if you like.

After discussion with the team, it was agreed that we would try this approach, but move to an alternative (such as utilizing Oracle’s import/export utilities) if this proved too cumbersome in practice.

Some more Agile development techniques creeping in here:

High level framework components

High level framework components

The diagram above shows the build file targets made available, and the RELEASE_PATCH database table that is used to track which scripts have been run.

The RELEASE_PATCH primary key is a compound primary key consisting of the particular release number, the script name and script type (either ‘CLEAN’, ‘PATCH’ or ‘ROLLBACK’). The SCRIPT_CHECKSUM column is there to ensure, should it ever be needed, that no script has been changed in version control after it has been applied to the schema.

The build file tasks are fairly straightforward to discerne. The following is an explanation of each in turn:

  • install-schema is intended for first use against a new or empty schema to create a clean database installation. It achieves this by executing the master setup script setup/init_schema.sql which contains the baseline version of all objects (a TEST table is provided for setup in the distributable by default, alongside the RELEASE_PATCH table and a routine to clean all objects). A ‘CLEAN’ entry is then inserted into RELEASE_PATCH for each script listed in patches/<release_number>/install.sql by the clean.sh shell script. Note that none of the SQL patch scripts listed are actually run – the master setup scripts have already created your baseline database objects in the first step – but an entry is made into RELEASE_PATCH to ensure that upon subsequent builds associated patches do not need to be applied
  • apply-patch is then run on each subsequent application build. For each script listed in patches/<release_number>/install.sql the patch.sh shell script checks with RELEASE_PATCH to see if the patch has already been applied, if it has it is ignored, if not the patch is first executed and then a new entry is made in RELEASE_PATCH recording the fact (type ‘PATCH’). Finally a utility script checks and reports (to the console) on any currently invalid database objects
  • apply-rollback can be run should any of the applied patches create invalid database objects, SQL exceptions or cause application instability. For each script listed in patches/<release_number>/uninstall.sql the rollback.sh shell script checks with RELEASE_PATCH to see if the rollback has already been applied, if it has it is ignored, if not the rollback is first executed and then a new entry is made in RELEASE_PATCH recording the fact (type ‘ROLLBACK’).

It may be that it is not worth running all rollback scripts if only one has failed, either way some hands-on intervention will be required to ascertain what went wrong. The entire point of the rollback scripts was to give a DBA a headstart should a problem occur during a release – chances are it will be too early for the developer to be at his or her desk. Another benefit is if a problem occured with someone else’s patch on my development environment I don’t have to hang about whilst they work out how to fix it – I simply run the associated rollback and carry on without further disruption.

It is very important to note again, that it is really the development team’s responsibility to ensure a sufficient level of thoroughness is applied to all their SQL scripts. If the standard is not high, no framework is going to save you! ;o)

Low-level directory structure

Low-level directory structure

The above diagram shows the low-level directory structure and where to add files for subsequent releases – simply create a r2.0 directory and copy empty install.sql and uninstall.sql scripts to it. One of the first things you might want to do upon subsequent releases is DROP any backup tables from previous releases, something like this:

DECLARE
    CURSOR table_cursor IS SELECT table_name FROM user_tables
        WHERE table_name IN ('BKP_AE_USER_ACCOUNT_GENDER_DOB', 'BKP_AE_PRE_REG_GENDER_DOB');
    BEGIN
        FOR table_rec IN table_cursor
        LOOP
            EXECUTE IMMEDIATE 'DROP TABLE ' || table_rec.table_name;
        END LOOP;
    END;
/

It’s nowhere near as complicated as all the above might make out. If you have any comments, or improvements please feel free to let me know!

Download Framework