Reference Data Framework

 

Managing reference data and lookups with JDO

 

 

Introduction

 

This paper describes one of the packages developed during a consultancy engagement between Ogilvie Partners Ltd and Eclectic Consulting in Arlington, VA, during July 2002.  Our joint aim in publishing our results is to add to the evolving body of knowledge about the application of JDO to real-world projects.

 

We hope that you enjoy reading it, and look forward to your comments. 

 

Simplicity vs. Complexity

 

"I would give my right arm for the simplicity on the far side of complexity."

Oliver Wendell Holmes

 

A comment from David Medinets:

 

Mr. Holmes is a far more accomplished wordsmith than I and his words echo my own sentiment. I seem to spend much of my time as an application developer managing complexity - the interplay of software modules; the juggling of development, staging, and production environments; and the balancing act of where to locate business logic for best effect.

 

Whenever possible I develop methodologies or frameworks that survive from project to project.

 

This paper attempts to encapsulate the idea of Reference Data. Hopefully, you'll agree with my ideas. And if not, please let me (medined@mtolive.com) or Robin (Robin@OgilviePartners.com) know.

 

I hope to use the techniques shown in this paper as the basis for several future projects:

·         Reference Data Editor for the Eclipse IDE

·         ColdFusion Interface

·         Checkpointing (crude versioning) of Reference Data

·         XML export and import

·         Validation expressions

 

The problem

 

Every application requires some form of reference data to be persisted and managed.  Projects tend to treat the topic in different ways, which results in duplicated design and implementation effort.  As part of a much larger design effort we faced this issue and attempted to write a generic framework that could be reused across different projects.

 

What is reference data?

 

Our working definition of reference data is as follows. 

 

Reference data:

·         is not time sensitive

·         is identified by a publicly known “code”

·         can exist in the system without being referenced

 

The reference data we had to manage included currencies, countries and airports, identified by currency codes, country codes and airport codes.

 

In the simplest form, reference data is merely the mapping of a code to a displayable name, although these usually evolve into more complex scenarios and object designs.  For instance, some applications may merely require that an airport code resolves to an airport name of String type.  However, more flexible applications will resolve an airport code to an instance of some Airport class, which can encapsulate data (beyond merely the displayable name) and specific behaviour.

 

The Reference Data Framework is designed with extensibility in mind, so that it can cater for String, Object or arbitrary persistence-capable types.

Assumptions

 

In putting the framework together we made several assumptions.  These are detailed here.  We believe that they describe the domain adequately, and also provide for a high level of cross-project applicability and JDO implementation portability.

 

Types of reference data:

·         Reference data can be primitive types, common wrapper and String types, or arbitrary persistence-capable types

 

Usage of reference data:

·         Applications need to iterate reference data (e.g. to populate combo boxes and other GUI components)

·         Applications need to do “contains” tests to see if codes provided, perhaps by a user, are valid

·         Applications need to do “lookups”, which actually return the data that corresponds to the code

·         Reference data exists in classifications; uniqueness of a specific code is only constrained within that classification
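The usage patterns listed above, and the rule that a code need only be unique within its own classification, can be illustrated with plain java.util maps before any persistence is involved.  This is only a sketch: the class ReferenceUsageSketch, its method names and the “Region” classification are hypothetical and are not part of the framework.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the Reference Data Framework.
class ReferenceUsageSketch {
    // classification name -> (code -> displayable value)
    private final Map<String, Map<String, String>> classifications = new LinkedHashMap<>();

    void put(String classification, String code, String value) {
        classifications.computeIfAbsent(classification, k -> new LinkedHashMap<>())
                       .put(code, value);
    }

    // "contains" test: is the code valid within the given classification?
    boolean isValid(String classification, String code) {
        Map<String, String> c = classifications.get(classification);
        return c != null && c.containsKey(code);
    }

    // "lookup": return the data that corresponds to the code, or null
    String lookup(String classification, String code) {
        Map<String, String> c = classifications.get(classification);
        return c == null ? null : c.get(code);
    }

    // iteration, e.g. to populate a combo box
    Collection<String> values(String classification) {
        Map<String, String> c = classifications.get(classification);
        return c == null ? Collections.<String>emptyList() : c.values();
    }

    public static void main(String[] args) {
        ReferenceUsageSketch ref = new ReferenceUsageSketch();
        ref.put("Country", "SA", "Saudi Arabia");
        ref.put("Country", "ZA", "South Africa");
        // The same code in a different classification does not clash.
        ref.put("Region", "SA", "South America");

        for (String name : ref.values("Country")) {
            System.out.println(name);
        }
        System.out.println(ref.isValid("Country", "XX"));
        System.out.println(ref.lookup("Region", "SA"));
    }
}
```

The framework described in the rest of this paper provides the same three operations, but against persistent, type-aware classifications.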

 

Usage of JDO:

·         This framework is designed to persist reference data through JDO

·         JDO Application Identity may not be supported by the data store; by designing around Datastore Identity we cater for a wider selection of JDO implementations, employing both Object and Relational data stores.

The design

 

Packages

 

The reference data framework comprises the following packages:

 

com.affy.domain.reference

Persistence-capable framework classes.

com.affy.app.reference

Sample applications illustrating use of the framework.

 

Framework Classes

 

The UML class diagram for package com.affy.domain.reference is shown in Figure 1.

 

Reference data exists in named classifications whose behaviour is specified in the iClassify interface.  A persistent singleton, the ReferenceRoot, manages a group of named classifications (i.e. a Map that contains iClassify instances).

 

Since reference data can be considered as “values” that are looked up with “keys”, the iClassify interface extends java.util.Map and adds a few framework-specific methods.

 

In order to provide for extensibility, concrete classification classes must extend the ClassificationAdapter class.  The adapter implements the iClassify interface and provides concrete method implementations where appropriate.  These are mostly implemented as delegation to an instantiated Map object (actually a HashMap). 

 

With most of the work handled by the adapter, concrete classification classes become extremely easy to write.  The primary purpose of having type-specific classifications is to enforce type-safety in the Map, and to provide get() methods that return the appropriate type instead of returning Object.

 

Two concrete classification classes are provided as part of the framework:

StringClassification manages reference data where the String key resolves to a String value.

 

ObjectClassification manages reference data where the String key resolves to an Object (presumed to be persistence-capable but otherwise unrestricted).

 

Figure 1 – UML for package com.affy.domain.reference

 

 

 

Further “standard” concrete classification classes are envisaged to cater for each of Java’s primitive types, by storing the data in corresponding wrapper objects and facilitating its conversion back to the appropriate primitive.
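As a sketch of how such a primitive-oriented classification might look, the IntegerClassification below stores its values as java.lang.Integer and converts them back to int on request.  The class and its putInt(String, int) and getInt(String) methods are hypothetical.  So that the example compiles on its own, it extends a cut-down stand-in (AdapterStandIn) rather than the real ClassificationAdapter, and the classification name is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Cut-down stand-in for ClassificationAdapter, so that this sketch is self-contained.
abstract class AdapterStandIn {
    private final Map<String, Object> map = new HashMap<>();

    public abstract boolean testObjectType(Object o);
    public abstract String getObjectTypeName();

    public Object put(Object key, Object value) {
        if (!(key instanceof String))
            throw new ClassCastException("Key must be a String");
        if (!testObjectType(value))
            throw new ClassCastException("Object is of invalid type.  Expected: "
                + getObjectTypeName());
        return map.put((String) key, value);
    }

    public Object get(Object key) {
        return map.get(key);
    }
}

// Hypothetical classification for int values held in Integer wrappers.
class IntegerClassification extends AdapterStandIn {
    public String getObjectTypeName() {
        return Integer.class.getName();
    }

    public boolean testObjectType(Object o) {
        return o instanceof Integer;
    }

    // Wraps the primitive before storing it.
    public Object putInt(String key, int value) {
        return put(key, Integer.valueOf(value));
    }

    // Converts the stored wrapper back to the primitive.
    public int getInt(String key) {
        Integer value = (Integer) get(key);
        if (value == null)
            throw new IllegalArgumentException("No value for code: " + key);
        return value.intValue();
    }

    public static void main(String[] args) {
        // Illustrative data: number of decimal places per currency code.
        IntegerClassification decimals = new IntegerClassification();
        decimals.putInt("USD", 2);
        decimals.putInt("JPY", 0);
        System.out.println(decimals.getInt("USD") + decimals.getInt("JPY"));
    }
}
```

Throwing an exception for an unknown code is one possible design; a primitive return type leaves no room for null, so the alternative would be a caller-supplied default value.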

 

Of course, it is expected that many projects will take the “pure” OO design route of creating persistence-capable classes for each type of reference data.  We provide an example in which an Airport class is made persistence-capable, and show how to write an AirportClassification through which Airport instances can be managed with appropriate type-safety.

 

Using the Framework

 

Before describing how the framework is implemented it seems appropriate to illustrate its use.

 

The first illustration shows how data may be populated into various classifications.  The second considers iterating through a classification, and the third looks at determining the existence of and extracting specific objects.

 

Bootstrapping JDO

 

Each of these examples requires that a PersistenceManager be available.  We use the class com.ogilviepartners.jdo.JDOBootstrap to achieve this.  It loads JDO property values from a file called “jdo.properties” in the CLASSPATH or current working directory, and passes these to the standard getPersistenceManagerFactory(Properties) method of JDOHelper.

 

The class is imported from its package:

 

 

import com.ogilviepartners.jdo.JDOBootstrap;

 

 

The extract below bootstraps the implementation, printing out the vendor name and version details (vendor properties) before obtaining the PersistenceManager from its factory.

 

 

JDOBootstrap bootstrap = new JDOBootstrap();

bootstrap.listVendorProperties();

PersistenceManagerFactory pmf = bootstrap.getPersistenceManagerFactory();

PersistenceManager pm = pmf.getPersistenceManager();

Transaction t = pm.currentTransaction();

 

 

For further details of the JDOBootstrap class, refer to Robin Roos’ book Java Data Objects, published by Addison Wesley.
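The property-loading step can be sketched with the standard library alone.  The class below is not JDOBootstrap itself: JdoPropertiesSketch and its load method are hypothetical, and the search order (CLASSPATH first, then the working directory) is an assumption based on the description above.  The resulting Properties object is what would be handed to JDOHelper.getPersistenceManagerFactory(Properties).

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical sketch; not the JDOBootstrap class described in the book.
class JdoPropertiesSketch {

    // Looks for the named file on the CLASSPATH first, then in the file system
    // (a relative name is resolved against the current working directory).
    static Properties load(String fileName) throws IOException {
        InputStream in =
            JdoPropertiesSketch.class.getClassLoader().getResourceAsStream(fileName);
        if (in == null) {
            in = new FileInputStream(fileName); // FileNotFoundException if absent
        }
        Properties props = new Properties();
        try {
            props.load(in);
        } finally {
            in.close();
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props = load("jdo.properties");
        // With a JDO implementation on the CLASSPATH, the next step would be:
        //   PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(props);
        System.out.println(props.getProperty("javax.jdo.PersistenceManagerFactoryClass"));
    }
}
```

Keeping the vendor-specific settings in a properties file means the sample applications themselves need not name any particular JDO implementation.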

 

Populating Classifications

 

Here’s an extract from the sample application com.affy.app.reference.Populate.  It uses the ReferenceRoot class to create and populate three classifications.

 

Before getting going, it first obtains a reference to the persistent singleton ReferenceRoot instance.

 

 

t.begin();

System.out.println("getting reference root");

ReferenceRoot root = ReferenceRoot.getRoot(pm);

 

 

The first classification is a StringClassification (the default) of Country data. 

 

 

System.out.println("creating country classification");

iClassify countries = root.createClassification("Country");

 

 

The second is an ObjectClassification of Currency data. 

 

 

System.out.println("creating currency classification");

iClassify currencies = root.addClassification(new ObjectClassification("Currencies"));

 

 

We expect ObjectClassification to be used only rarely, since data that is more complex than single String key/value pairs would warrant its own type-safe concrete classification. 

 

This is illustrated by the third classification, in which the AirportClassification class provides type-safety for Airport instances.  AirportClassification and Airport belong to the com.affy.domain.travel package and, along with all of the framework classes, are persistence-capable.

 

 

System.out.println("creating airport classification");

iClassify airports = root.addClassification(new AirportClassification("IATA.Airport"));

 

 

Finally, the transaction within which classifications were created is completed.

 

 

t.commit();

 

 

Now that the classifications exist, reference data can be added.  We do this programmatically in these examples, but sourcing the data from external sources (such as XML documents) is a logical extension that we are considering.

 

Country data is added into the StringClassification “countries” as follows:

 

 

t.begin();

System.out.println("creating countries");

countries.put("UK", "United Kingdom");

countries.put("US", "United States");

countries.put("SA", "Saudi Arabia");

countries.put("ZA", "South Africa");

countries.put("IE", "Ireland");

countries.put("FR", "France");

System.out.println("committing countries");

t.commit();

 

 

We use String as the underlying object for currency data, even though the ObjectClassification can store arbitrary instances.

 

 

t.begin();

System.out.println("creating currencies");

currencies.put("GBP", "Pounds Sterling");

currencies.put("CHF", "Swiss Franc");

currencies.put("EUR", "Euro");

currencies.put("USD", "US Dollar");

currencies.put("AUD", "Australian Dollar");

currencies.put("ZAR", "South African Rand");

System.out.println("committing currencies");

t.commit();

 

 

A more interesting classification is Airports.  Here we construct instances of the Airport class and add these. 

 

 

t.begin();

System.out.println("creating airports");

Airport lax = new Airport("LAX");

lax.setAirportName("Los Angeles International");

lax.setCityName("Los Angeles, CA");

lax.setCustomsFacilities(true);

 

Airport iad = new Airport("IAD");

iad.setAirportName("Washington Dulles");

iad.setCityName("Washington, DC");

iad.setCustomsFacilities(true);

 

Airport npn = new Airport("NPN");

npn.setAirportName("Williamsburg / Newport-News");

npn.setCityName("Newport News, VA");

npn.setCustomsFacilities(false);

 

Airport jfk = new Airport("JFK");

jfk.setAirportName("John F. Kennedy Intl");

jfk.setCityName("New York City, NY");

jfk.setCustomsFacilities(false);

 

 

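// In the listings in this paper put(Object) is declared on ClassificationAdapter

// rather than on iClassify, so the classification is cast before use.

AirportClassification airportClassification = (AirportClassification) airports;

airportClassification.put(lax);

airportClassification.put(iad);

airportClassification.put(npn);

airportClassification.put(jfk);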

 

System.out.println("committing airports");

t.commit();

 

Iterating Classifications

 

Now that we have some data loaded into the reference extents let’s examine how this might be used.

 

Reference data commonly has to be iterated in order to provide lists from which the user can make selections.  To iterate a classification, obtain an iterator over the values it holds.  Here’s an example that iterates the “country” classification and displays the results.  The objects retrieved from the iterator are cast to String; this is safe, since the classification “country” was constructed as the default StringClassification, which enforces the appropriate type-safety.

 

 

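t.begin();

// static lookup, as listed for ReferenceRoot later in this paper

iClassify countries = ReferenceRoot.getClassification(pm, "Country");

// iClassify extends java.util.Map, so the values are reached through values()

Iterator iterCountries = countries.values().iterator();

while (iterCountries.hasNext()) {

    String country = (String) iterCountries.next();

    System.out.println(country);

}

t.commit();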

 

 

The “currency” and “airport” classifications would be iterated identically, except that the returned objects are guaranteed to be instances of Object and Airport respectively.

 

Existence Validation

 

Reference data is also used to check that codes received by the application (typically through user input) exist in the data set.  Because iClassify extends java.util.Map, this is supported by the containsKey() method; here’s the test to see if a particular currency exists:

 

 

t.begin();

iClassify currencies = ReferenceRoot.getClassification(pm, "Currencies");

String currencyCode = "EUR";

if (currencies.containsKey(currencyCode)) {

    System.out.println("Currency " + currencyCode + " does exist.");

} else {

    System.out.println("Currency " + currencyCode + " does not exist.");

}

t.commit();

 

 

Reference Data Retrieval

 

Finally, an application may wish to retrieve a specific object from the classification.  Here are some examples working with Airports.  Firstly the airport is retrieved from the ReferenceRoot using its fully qualified classification name. 

 

 

t.begin();

Airport iad = (Airport) ReferenceRoot.get(pm, "Iata.Airport.IAD");

System.out.println(iad);

t.commit();

 

 

Secondly the classification called “Iata.Airport” is retrieved and cast to the appropriate type (AirportClassification).  The get() method on this class returns instances of Airport, making subsequent type casting unnecessary.

 

 

t.begin();

AirportClassification airports = (AirportClassification)

    ReferenceRoot.getClassification(pm, "Iata.Airport");

Airport lax = airports.get("LAX");

System.out.println(lax);

t.commit();

 

 

The implementation

 

Access to the framework is provided through static methods on the ReferenceRoot class.  ReferenceRoot is a persistent singleton: a request for the instance iterates the ReferenceRoot extent, and if no instance is found a new one is constructed and made persistent.  This is the only use of pm.makePersistent(Object) in the entire framework.  It illustrates the benefit of JDO’s transparent persistence: beyond this one call, the intrusion of JDO-specific code into application objects is reduced to transaction demarcation.

 

The state maintained by a ReferenceRoot instance is limited to a map of classifications, and a static “instance” to resolve the singleton pattern.

 

 

private Map classifications = null;

private static ReferenceRoot instance = null;

 

 

The persistent singleton strategy is implemented by the getRoot(PersistenceManager) method:

 

 

public static ReferenceRoot getRoot(PersistenceManager pm){

    boolean demarcate;

    Transaction t = pm.currentTransaction();

    demarcate=!t.isActive();

    if (demarcate) t.begin();

    if (instance == null) {

        Extent e = pm.getExtent(ReferenceRoot.class, true);

        Iterator i = e.iterator();

        if (i.hasNext()) {

            instance = (ReferenceRoot) i.next();

        } else {

            instance = new ReferenceRoot();

            pm.makePersistent(instance);

        }

    }

    if (demarcate) t.commit();

    return instance;

}

 

// would like this to be private, but need to check JDO support

public ReferenceRoot() {

    classifications = new HashMap();

}

 

 

The remaining method implementations manipulate the map of classifications.  The createClassification(String) method creates StringClassifications, with other types of classification being created by the application and then passed to ReferenceRoot through the addClassification(iClassify) method.  Some of the methods have static equivalents that additionally require a PersistenceManager argument.

 

 

public iClassify getClassification(String name) {

    return (iClassify) classifications.get(name.toUpperCase());

}

 

public static iClassify getClassification(PersistenceManager pm,

                                              String name) {

    return (iClassify) ReferenceRoot.getRoot(pm).classifications.

        get(name.toUpperCase());

}

 

public iClassify addClassification(iClassify classification) {

    if (classifications.put(classification.getClassificationId(),

                            classification) != null) {

        // it was already present

        throw new RuntimeException("Classification already exists " +

             "keyed on: " + classification.getClassificationId());

    }

    return classification;

}

 

public iClassify createClassification(String name) {

    iClassify c = new StringClassification(name);

    if (classifications.put(name.toUpperCase(), c) != null) {

        // it was already present

        throw new RuntimeException("Classification already exists " +

            "keyed on: " + name.toUpperCase());

    }

    return c;

}

 

public Iterator iterator() {

    return classifications.values().iterator();

}

 

 

The iClassify interface merely adds some useful methods to the Map interface.

 

 

package com.affy.domain.reference;

 

import java.util.Map;

 

public interface iClassify extends Map {

    String getClassificationId();

    String getDisplayName();

}

 

 

All classifications are named.  The name may include upper and lower-case characters.  However the key with which the classification is indexed (its classificationId) is the name converted to upper-case only.
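One consequence of this rule is that classification lookups are case-insensitive, which is why “IATA.Airport” and “Iata.Airport” refer to the same classification in the examples above.  The helper below is a hypothetical stand-in for the toUpperCase() calls in the framework; passing Locale.ROOT is an addition of ours, made so that the result does not depend on the default locale.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the classificationId rule.
class ClassificationIdSketch {

    // Derives the index key from a mixed-case classification name.
    static String toClassificationId(String name) {
        return name.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Map<String, String> index = new HashMap<>();
        index.put(toClassificationId("IATA.Airport"), "airport classification");

        // A name that differs only in case resolves to the same entry.
        System.out.println(index.get(toClassificationId("Iata.Airport")));

        // The raw, unconverted name is not itself a key.
        System.out.println(index.containsKey("Iata.Airport"));
    }
}
```

The original mixed-case name is not lost; it remains available through getDisplayName().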

 

Concrete classification classes are defined by extending the abstract class ClassificationAdapter.  ClassificationAdapter implements most of the methods in the iClassify interface, leaving a few to be implemented by type-aware concrete subclasses.

 

Most of the methods merely delegate to the map instance.

 

 

package com.affy.domain.reference;

 

import java.util.Map;

import java.util.HashMap;

import java.util.Set;

import java.util.Collection;

abstract public class ClassificationAdapter implements iClassify {

    protected ClassificationAdapter(String name) {

        // the name still needs to be validated: no spaces; no punctuation

        this.classificationId = name.toUpperCase();

        this.displayName = name;

    }

 

    // type-specific methods, implemented by concrete subclasses

    public abstract boolean testObjectType(Object o);

    public abstract String getObjectTypeName();

 

    // methods that delegate to map

    public boolean  containsKey(Object p1)  {return map.containsKey(p1);}

    public boolean  containsValue(Object p1){return map.containsValue(p1);}

 

    public int        size()            { return map.size(); }

    public boolean    isEmpty()         { return map.isEmpty(); }

    public Object     get(Object p1)    { return map.get(p1); }

    public Set        keySet()          { return map.keySet(); }

    public Collection values()          { return map.values(); }

    public Set        entrySet()        { return map.entrySet(); }

    public boolean    equals(Object p1) { return map.equals(p1); }

    public int        hashCode()        { return map.hashCode(); }

    public Object     remove(Object p1) { return map.remove(p1); }

 

 

    public Object put(Object key, Object value){

        if (!(key instanceof String))

            throw new ClassCastException("Key must be a String");

        if (!(testObjectType(value)))

            throw new ClassCastException("Object is of invalid type.  " +

                "Expected: " + getObjectTypeName());

        return map.put(key, value);

    }

 

    public String getKey(Object value) {

        throw new RuntimeException("Only domain-specific classifications " +

            "can support the put(Object) method");

    }

 

    public Object put(Object value){

        if (!(testObjectType(value))) throw new ClassCastException(

            "Object is of invalid type.  Expected: " + getObjectTypeName());

        String key = getKey(value);       

        return map.put(key, value);

    }

 

    public String getClassificationId() {

        return classificationId;

    }

 

    public void putAll(Map p1){

        // compare by value: != on Strings would compare object references

        if (!"java.lang.Object".equals(getObjectTypeName()))

            throw new RuntimeException("putAll(Map) Not yet type-safe");

        map.putAll(p1);

    }

 

    public void clear(){

        throw new RuntimeException("clearing is not supported directly");

    }

 

    public String getDisplayName() {

        return displayName;

    }

 

    private Map map = new HashMap();

    private String classificationId;

    private String displayName;

}

 

 

The following methods must be implemented by a concrete classification:  (Type represents the class name for which the classification is type-safe.)

 

public TypeClassification(String name)

Constructor which must call super(name).

public String getObjectTypeName()

Returns the class name for which the classification is type-safe.  This class name is used in the exception thrown by the ClassificationAdapter when the application attempts to “put” inappropriate instances.

public boolean testObjectType(Object o)

This method is responsible for testing the type of the object.  It must return true if the parameter is an instance of the appropriate type.  It is called immediately before each object is “put” into the classification.

public Type get(String name)

By implementing get(String), the domain-specific classification can return objects cast to the appropriate type.  This serves to reduce the incidence of typecasting in applications.

 

Here’s the implementation of StringClassification:

 

 

package com.affy.domain.reference;

 

public class StringClassification extends ClassificationAdapter {

    public StringClassification(String name) {

        super(name);

    }

 

    public String get(String name) {

        return (String) super.get(name);

    }

 

    public String getObjectTypeName() {

        return String.class.getName();

    }

 

    public boolean testObjectType(Object o) {

        return (o instanceof String);

    }

}

 

 

Extensibility

 

Extending the classification concept to domain-specific classes is remarkably straightforward.  In our example we create an AirportClassification that is part of the package com.affy.domain.travel. 

 

Figure 2 – UML for package com.affy.domain.travel

 

 

 

The pattern applied is the same as that used for the StringClassification.

 

 

package com.affy.domain.travel;

 

import com.affy.domain.reference.ClassificationAdapter;

 

public class AirportClassification extends ClassificationAdapter {

    public AirportClassification(String name) {

        super(name);

    }

 

    public String getKey(Object value) {

        return ((Airport) value).getIataCode();

    }

 

    public Airport get(String name) {

        return (Airport) super.get(name);

    }

 

    public String getObjectTypeName() {

        return Airport.class.getName();

    }

 

    public boolean testObjectType(Object o) {

        return (o instanceof Airport);

    }

}
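The Airport class itself is not listed in this paper.  A minimal version consistent with the calls made on it in the examples (the constructor taking a code, the three setters, and getIataCode()) might look like the sketch below.  The field names, the additional getters and the toString() format are assumptions, and the class is shown without its package and public modifier so that it stands alone; the downloadable source remains authoritative.  The no-argument constructor is included because JDO implementations generally expect one.

```java
// Hypothetical reconstruction of the Airport class used in the examples.
class Airport {
    private String iataCode;
    private String airportName;
    private String cityName;
    private boolean customsFacilities;

    // No-argument constructor, generally expected of persistence-capable classes.
    Airport() {
    }

    Airport(String iataCode) {
        this.iataCode = iataCode;
    }

    // The code under which AirportClassification.getKey(Object) files the airport.
    public String getIataCode() {
        return iataCode;
    }

    public String getAirportName() {
        return airportName;
    }

    public void setAirportName(String airportName) {
        this.airportName = airportName;
    }

    public String getCityName() {
        return cityName;
    }

    public void setCityName(String cityName) {
        this.cityName = cityName;
    }

    public boolean hasCustomsFacilities() {
        return customsFacilities;
    }

    public void setCustomsFacilities(boolean customsFacilities) {
        this.customsFacilities = customsFacilities;
    }

    public String toString() {
        return iataCode + " (" + airportName + ", " + cityName + ")";
    }

    public static void main(String[] args) {
        Airport iad = new Airport("IAD");
        iad.setAirportName("Washington Dulles");
        iad.setCityName("Washington, DC");
        iad.setCustomsFacilities(true);
        System.out.println(iad);
    }
}
```

Note that the sketch contains no JDO-specific code at all; the class becomes persistence-capable through the persistence descriptor and the enhancement step.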

 

 

Where did all the JDO stuff go?

 

The only JDO-specific code present in this framework is to be found in the ReferenceRoot class.  This is where the root object is searched for (through extent iteration) and where the initial root object is made persistent.

 

For the rest of the framework, JDO is exactly where it should be – transparent.  We manipulate persistent data without concerning ourselves with the underlying infrastructure, which happens to be JDO.  Of course we do have to demarcate transactions (which any useful application should be doing anyway).

 

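The last part of this implementation discussion is the persistence descriptor.  All of the Map fields are explicitly identified to the implementation.  In both maps the keys are second-class (embedded-key=“true”).  The classifications held by ReferenceRoot are first-class (embedded-value=“false”), whilst the descriptor declares the values held by a ClassificationAdapter as embedded (embedded-value=“true”).  Here it is: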

 

 

<?xml version = "1.0" encoding = "UTF-8"?>

<!DOCTYPE jdo SYSTEM "file:///jdo/dtd/jdo.dtd">

<jdo>

    <package name = "com.affy.domain.reference">

        <class name = "ClassificationAdapter">

            <field name="map">

                <map    

                    key-type="java.lang.String"

                    embedded-key="true"

                    value-type="java.lang.Object"

                    embedded-value="true" />

            </field>

        </class>

 

        <class name = "ReferenceRoot">

            <field name="classifications">

                <map   

                    key-type="java.lang.String"

                    embedded-key="true"

                    value-type=

                           "com.affy.domain.reference.ClassificationAdapter"

                    embedded-value="false" />

            </field>

        </class>

 

        <class

             name = "ObjectClassification"

             persistence-capable-superclass=

                 "com.affy.domain.reference.ClassificationAdapter"/>

        <class

                name = "StringClassification"

                persistence-capable-superclass=

                "com.affy.domain.reference.ClassificationAdapter"/>

    </package>

 

    <package name="com.affy.domain.travel">

        <class

            name = "AirportClassification"

            persistence-capable-superclass=

                "com.affy.domain.reference.ClassificationAdapter"/>

        <class name = "Airport"/>

    </package>

</jdo>

 

 

 

Issues for further consideration

 

The following issues pertaining to this framework remain for future consideration:

 

Usage in the Managed Environment

 

Within a J2EE context it is inappropriate to maintain PersistenceManager instances for long periods of time.  Instead, a PersistenceManagerFactory should be used to get a PersistenceManager instance that is bound to the current transaction.  This persistence manager should then be used, and closed as soon as possible.  (The closure of persistence managers in the managed environment merely returns them to a pool maintained by the factory.)

 

When using the Reference Data Framework from within a J2EE component, only the static methods of the ReferenceRoot should be used.  These require a PersistenceManager instance to be passed in on each invocation, and do not maintain references to them beyond each individual method invocation.

 

A future enhancement to this project would be to include a constructor that takes a PersistenceManagerFactory instance.  This would then be used as a source for PersistenceManagers on an as-needed basis.  The same interface could then be used from the Managed and the Non-Managed environments – in the first case the ReferenceRoot would be obtained with a PersistenceManagerFactory, and in the second it would be obtained with a PersistenceManager.

 

 

Scalability of Persistent HashMap Implementations

 

At the time of writing it is our intention to test the Reference Data Framework with large volumes of data.  It will be interesting to gauge the performance of the framework once the volume of data in an individual classification makes it impractical to retrieve the entire classification at one time.  JDO vendors are free to instantiate Second Class Objects that implement the Map interface but lazily load their contents on an as-needed basis.  Lookups on the Map could then be implemented as keyed lookups against a data store table.
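The idea of a Map that resolves keys on demand can be sketched outside JDO.  In the class below, which is purely illustrative, a java.util.function.Function stands in for the keyed lookup against the data store, and values are kept in memory after first use.  LazyLookupSketch and its members are hypothetical and do not correspond to any vendor’s implementation; a real second-class Map would also have to support iteration, updates and transaction boundaries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Purely illustrative; not a vendor implementation of a persistent Map.
class LazyLookupSketch<V> {
    private final Function<String, V> keyedLookup;   // stands in for a data store query
    private final Map<String, V> loaded = new HashMap<>();
    private int lookupsPerformed = 0;

    LazyLookupSketch(Function<String, V> keyedLookup) {
        this.keyedLookup = keyedLookup;
    }

    // Resolves a single code, going to the "data store" only when it is not yet loaded.
    V get(String code) {
        V value = loaded.get(code);
        if (value == null) {
            value = keyedLookup.apply(code);
            lookupsPerformed++;
            if (value != null) {
                loaded.put(code, value);
            }
        }
        return value;
    }

    boolean containsKey(String code) {
        return get(code) != null;
    }

    int getLookupsPerformed() {
        return lookupsPerformed;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();   // pretend data store table
        table.put("EUR", "Euro");
        table.put("USD", "US Dollar");

        LazyLookupSketch<String> currencies = new LazyLookupSketch<>(table::get);
        System.out.println(currencies.get("EUR"));           // first use: one keyed lookup
        System.out.println(currencies.get("EUR"));           // served from memory
        System.out.println(currencies.containsKey("XXX"));   // unknown code: second lookup
        System.out.println(currencies.getLookupsPerformed());
    }
}
```

In this sketch unknown codes are not cached, so each repeated miss costs another lookup; whether that is acceptable would depend on how often invalid codes are presented.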

 

Since JDO is still in its early days of adoption it is unlikely that many vendors have contemplated such implementations for their persistent Maps.  However this is an area where we foresee significant opportunities.

 

XML Read/Write

 

Another avenue that we intend to explore is that of the exporting and importing of reference data through an intermediate XML format.  We see having a human-readable intermediate form as beneficial, since it facilitates the manipulation of data by means other than through JDO. 

 

Implementation of this capability would presumably hinge on readXml() and writeXml() methods being implemented in each concrete classification.  In this way, each classification would take on responsibility for transforming its specific content between Object and XML representations.
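For a classification of String values, such a pair of methods might be sketched with the XML APIs of the standard library as follows.  No XML format has been defined for the framework: the element and attribute names (classification, entry, name, code), the method signatures and the class StringClassificationXmlSketch are all assumptions made for illustration, and a plain Map stands in for the classification.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; the XML vocabulary used here is an assumption.
class StringClassificationXmlSketch {

    // Writes each code/value pair as <entry code="...">value</entry>.
    static String writeXml(String name, Map<String, String> entries) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("classification");
        root.setAttribute("name", name);
        doc.appendChild(root);
        for (Map.Entry<String, String> e : entries.entrySet()) {
            Element entry = doc.createElement("entry");
            entry.setAttribute("code", e.getKey());
            entry.setTextContent(e.getValue());
            root.appendChild(entry);
        }
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Reads the entries back, preserving document order.
    static Map<String, String> readXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        Map<String, String> entries = new LinkedHashMap<>();
        NodeList nodes = doc.getDocumentElement().getElementsByTagName("entry");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element entry = (Element) nodes.item(i);
            entries.put(entry.getAttribute("code"), entry.getTextContent());
        }
        return entries;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> countries = new LinkedHashMap<>();
        countries.put("FR", "France");
        countries.put("IE", "Ireland");
        String xml = writeXml("Country", countries);
        System.out.println(xml);
        System.out.println(readXml(xml).equals(countries));
    }
}
```

A domain-specific classification such as AirportClassification would need a richer entry format, with one element or attribute per field, which is the reason for placing the responsibility in each concrete class.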

 

Contacts

 

The Reference Data Framework is downloadable as source and compiled classes from:

 

http://www.OgilviePartners.com/Download.html

 

Copyright remains with Eclectic Consulting and Ogilvie Partners Ltd.  Please feel free to alter the framework for your own usage.  If you have suggestions for its extension then we’d love to hear from you.

 

For further information, please contact:

 

Mr Robin Roos

Principal Consultant

Ogilvie Partners Ltd

 

Email: Robin@OgilviePartners.com

URL:    http://www.OgilviePartners.com

Mr David Medinets

Consultant

Eclectic Consulting

 

Email: medined@mtolive.com

URL:    http://www.CodeBits.com