GeoTools : Feature Model Design Discussion


This document represents the ongoing design discussion centered around moving the GeoAPI and GeoTools feature model forward.
The goal is to support GML3 and to provide a complete model for the definition and generation of XPath expressions (allowing Filter, Expression and SLD to work).

For more details please see:

Aspects of the Feature Model

At a glance:

Feature Model ID Design

A Feature ID in a GML document is required to be unique. However, in the OGC Feature model a Feature ID is supposed to always refer to the same physical construct in the real world. The "Eiffel Tower" is unique and there is only one actual thing referred to by that ID. Different applications may use different sets of attributes to model the "Eiffel Tower".

It seems desirable that these IDs are stable across Application Models as well as GML documents. A tall order.

(star) The concept of an ID that is generated external to the attribute model matches our understanding of the problem.

Generating the ID from a KEY column in your database seems to be an implementation detail: as long as the column is not part of the published schema of attributes, nobody needs to (or can) know. This is the same conceptual model as the use of ID above.

Using more than one key in this opaque fashion does not change matters.

Using a key attribute column to generate our ID while also making that value available as an attribute is scary. The worst-case scenario is that the ID may be modified.

Same drawbacks as the use of a single key attribute column.

It is worth pointing out that in a real-world system each of the different "Identity" Analysis Patterns has its own pros and cons. For example, the same physical object could be registered in two different remote users' applications. When these users sync with the server application, the concept of identity will need to be resolved; furthermore, in a system with archival needs (for lawsuits) these changes of identity will need to be tracked and traceable.

For now all we can do is build the model for other subsystems to work against; a system that requires management of identity as outlined above will need to take responsibility for the generation of identity and the management thereof.

In GML, references may be used to allow a Feature to appear inside another Feature, or in more than one collection. XML also allows validation constraints based on unique and id (that is, unique and required). I suspect these XMLSchema facilities may have muddied the waters of understanding.

Design Decisions:

interface Feature {
  String getID();
}

(warning) Generation of IDs is considered to be implementation or application specific. At worst we will need a FIDSequence interface to generate new IDs as required. Currently GeoAPI leaves generation of ID up to the feature source (as such, IDs are not available until after a commit). GeoTools fakes it with QA rather than as an explicit part of the workflow.

(warning) It has come to our attention that the concept of ID is also used by GML Geometry to allow for either externally defined Geometry, or shared Geometries.
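
As a rough illustration of the FIDSequence idea, the sketch below shows one possible shape. Only the name FIDSequence comes from this discussion; the `next()` method, the `PrefixedFidSequence` class and the `prefix.number` format are assumptions made for the example, not an agreed design.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical interface: the discussion only names "FIDSequence";
// the single method below is an assumption.
interface FIDSequence {
    /** Returns a new ID that this sequence has not handed out before. */
    String next();
}

// Illustrative in-memory implementation. A real feature source would more
// likely derive IDs from its own storage (for example an opaque database key).
class PrefixedFidSequence implements FIDSequence {
    private final String prefix;
    private final AtomicLong counter = new AtomicLong();

    PrefixedFidSequence(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public String next() {
        // The ID is generated outside the attribute model and is opaque to clients.
        return prefix + "." + counter.incrementAndGet();
    }
}
```

With something like this, a feature could be given an ID before commit instead of waiting for the feature source to assign one.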

Feature Model Attribute Access Design

(star) Use of a Qualified name, or an actual AttributeType object when in Object-Oriented land, seems to be the only way to go. The AttributeType will need to have knowledge of its QName for document generation. It is worth pointing out that this is required to support super types, as the same name may be under different restrictions depending on what type it originated from.

(star) Name breaks down in the presence of super types; we require super types and thus cannot expect name to work in all situations. However, it is so direct that we should include it in our API as a convenience method.
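
To make the ambiguity concrete, here is a small sketch using the standard `javax.xml.namespace.QName` class. The namespaces and attribute names are invented for the example, and `NameLookupDemo` is a hypothetical helper rather than part of any proposed API.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;

class NameLookupDemo {
    /** Returns every qualified name whose local part equals the simple name. */
    static List<QName> matchesByLocalName(List<QName> attributes, String name) {
        List<QName> matches = new ArrayList<>();
        for (QName candidate : attributes) {
            if (candidate.getLocalPart().equals(name)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical namespaces: one attribute inherited from a super type,
        // one declared by the sub type, both with the local name "name".
        List<QName> attributes = List.of(
                new QName("http://example.org/base", "name"),
                new QName("http://example.org/roads", "name"),
                new QName("http://example.org/roads", "lanes"));

        // Lookup by simple name finds more than one candidate ...
        System.out.println(matchesByLocalName(attributes, "name").size());
        // ... while the qualified name identifies exactly one attribute.
        System.out.println(attributes.contains(new QName("http://example.org/roads", "name")));
    }
}
```

A name-based convenience method would have to pick a rule for the ambiguous case (first match, or an error), which is why it can only be a convenience.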

Index-based access does not work when used with super types; we require super types and thus cannot support it in the general case.
However, many applications call for a flat feature model, and we may be able to provide those with simple data with a simple API to access it.
I recommend supporting index-based access at the GeoTools level in order to ease migration pain. This can be accomplished by declaring a "FlatFeatureType" and "FlatFeature" model, so we can explicitly model simple data.

XPath support at the model level is desirable from a client code point of view; however, no implementation to date has managed to deliver it. Furthermore, if we expect XPath to work, our "chain" becomes only as strong as the weakest implementation. We would do better to have a strong Feature Model and construct our XPath support at arm's length (using JXPath from Jakarta is the recommended course of action).

There are downsides to this practice of making values typed by qualified name: in GeoServer a lot of effort is spent munging FeatureType/AttributeType information for writing, and by strongly typing the players involved we are making the process more difficult (but more specific). We hope the trade-off is worthwhile.

Design Decisions:
Access by qualified name or AttributeType is the general mechanism; name-based and index-based access are recommended only for a simplified FlatFeature.

interface Attribute {
   String getName();
   Type getType();
   Object get();
   void set( Object newValue );
}
interface Type {
   GenericName name();
   Class getType();
}
interface Feature {
   List<Attribute> attributes();
   List<Type> types();
   List<Object> values();
}
interface FlatFeature extends Feature { // simplified access
   Object get( String name );
   Object get( AttributeType type );
   Object get( int index );
   void set( AttributeType type, Object value );
   void set( String name, Object value );
   void set( int index, Object value );
}
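
One way the simplified FlatFeature access could be backed is a plain array indexed by schema order. The class below is a hypothetical stand-alone sketch (it does not implement the interfaces above, and `ArrayFlatFeature` is an invented name); it only illustrates why name and index access are cheap when there are no super types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone class, not part of the proposed API.
class ArrayFlatFeature {
    private final List<String> names;   // attribute names, in schema order
    private final Object[] values;      // one value per name, same order

    ArrayFlatFeature(List<String> names) {
        this.names = new ArrayList<>(names);
        this.values = new Object[names.size()];
    }

    Object get(int index) {
        return values[index];
    }

    void set(int index, Object value) {
        values[index] = value;
    }

    Object get(String name) {
        return values[indexOf(name)];
    }

    void set(String name, Object value) {
        values[indexOf(name)] = value;
    }

    // Name access is only safe here because a flat type has no super types,
    // so each name is assumed to appear exactly once.
    private int indexOf(String name) {
        int index = names.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("No attribute named " + name);
        }
        return index;
    }
}
```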

FeatureType Super Type Design

Support for extension of a parent super type. The resulting feature should be made up of the attributes of both the super type and the sub type. Sub types have an opportunity to further restrict individual attributes defined by the super type.
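
The merging rule described here can be sketched as follows. `SimpleFeatureType` and its methods are hypothetical, restrictions are represented as plain strings, and attributes are keyed by simple name purely for brevity; a real model would presumably key on qualified name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical simplified type: attribute name -> restriction description.
class SimpleFeatureType {
    private final SimpleFeatureType superType;   // null for a root type
    private final Map<String, String> declared = new LinkedHashMap<>();

    SimpleFeatureType(SimpleFeatureType superType) {
        this.superType = superType;
    }

    SimpleFeatureType declare(String name, String restriction) {
        declared.put(name, restriction);
        return this;
    }

    /** Attributes of the super type first, then this type's own declarations. */
    Map<String, String> allAttributes() {
        Map<String, String> result = new LinkedHashMap<>();
        if (superType != null) {
            result.putAll(superType.allAttributes());
        }
        // A sub type redeclaring an inherited name replaces its restriction
        // while keeping the inherited position in the attribute order.
        result.putAll(declared);
        return result;
    }
}
```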

Nobody was interested in supporting multiple super types at this time. It should be kept in mind as a possible avenue for future growth (we should not, for example, support index-based attribute access, which would limit our support for this construct).

In GML there is a separation between substitution type and extension. It may also be useful to provide for a super AttributeType as a way of capturing the use of Atomic types defined by XMLSchema.

Design Decision:

interface FeatureType {
    boolean isAbstract();
    FeatureType getSuperType();
}

(warning) The issue of super type may also be needed for handling complex, and even simple content. Gabriel has expressed a wish to capture the XMLSchema Atomic types and be able to explicitly reuse their restrictions.

Feature Model FeatureCollection Design

The OGC overview document clearly indicates that Collections are a "derived" Feature from their contents. Without contents a Collection cannot exist. In addition FeatureCollections are actual Features with FeatureTypes in their own right. Because of this we need to be sure that the FeatureType for a FeatureCollection contains a way to describe the FeatureType of the contained Features.

FeatureCollection FeatureType contains a reference to allowable child FeatureType
FeatureCollection FeatureType contains a reference to several allowable FeatureTypes
Children represented as a "normal" attribute (of type features) with multiplicity

There is an attraction to representing child features as a normal attribute (called "featureMembers"). It allows us to capture all the use cases, and it gives XPath generation a consistent model to work from (all attributes all the time, with no special case for featureMembers). It is, however, impossible to tell the difference between this required featureMembers attribute and any other attribute of type FeatureType with a multiplicity. That may or may not be a good thing?

For now let's go with explicit modeling power at the cost of making XPath a little bit harder.

It is unclear if we need to separately define FeatureCollectionType (that extends FeatureType).

The OGC overview document also defines the FeatureCollection as having references to its children.

Parent references to child features. Often this is done by way of an expression that can generate the references (such as a

Children contain a back pointer to their containing Feature. This limits the modeling ability of the system, as it prevents a child from being in more than one collection. It also begs the question of why attributes that are features do not have a back reference to their containing Feature (even though it may not be a FeatureCollection).

It should be pointed out that all known implementations have the reference reversed (Features contain a reference to their Parent FeatureCollection). This is probably a convenience, but it does limit our modeling power (we cannot model a Feature as being part of more than one collection). Even in GML this is possible by use of references.
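
The modeling difference can be shown with a minimal sketch in which collections reference their members and the feature carries no parent pointer. `SharedFeature` and `ReferencingCollection` are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal feature: only an ID, and deliberately no getParent().
class SharedFeature {
    final String id;

    SharedFeature(String id) {
        this.id = id;
    }
}

// Hypothetical collection that references its members, not the other way round.
class ReferencingCollection {
    private final List<SharedFeature> members = new ArrayList<>();

    void add(SharedFeature feature) {
        members.add(feature);
    }

    boolean contains(SharedFeature feature) {
        // Identity comparison: the same feature object, not a copy with the same ID.
        for (SharedFeature member : members) {
            if (member == feature) {
                return true;
            }
        }
        return false;
    }
}
```

Because the feature holds no single parent reference, the same instance can be added to any number of collections, which a `getParent()` style back pointer cannot express.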

Design Decision:

interface FeatureType extends Type {
}
interface FeatureCollectionType extends FeatureType {
  FeatureType getChildFeatureType(); // featureMember/featureMembers restriction
}
interface Feature {
  FeatureType getType();
}
interface FeatureCollection extends Feature {
  FeatureCollectionType getType();
  Iterator<Feature> features(); // featureMember/featureMembers access
  void close( Iterator<Feature> iterator ); // release any resources held by the iterator
}

(warning) I have included the ability to "close" iterators (in case they hold OS resources they need to return); the practice was taken from JDO and is used by GeoTools.
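
The intended calling pattern for closeable iterators is roughly the following. `TrackedCollection` is a hypothetical stand-in that iterates over feature IDs and merely counts open iterators; a real implementation would release a cursor, stream or file handle in `close`.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical collection of feature IDs that tracks the iterators it hands out.
class TrackedCollection {
    private final List<String> featureIds;
    private final Set<Iterator<String>> open =
            Collections.newSetFromMap(new IdentityHashMap<>());

    TrackedCollection(List<String> featureIds) {
        this.featureIds = featureIds;
    }

    Iterator<String> features() {
        Iterator<String> iterator = featureIds.iterator();
        open.add(iterator);    // stands in for acquiring a resource
        return iterator;
    }

    void close(Iterator<String> iterator) {
        open.remove(iterator); // stands in for releasing that resource
    }

    int openIterators() {
        return open.size();
    }

    /** Client-side pattern: always close in a finally block. */
    static int countAndClose(TrackedCollection collection) {
        Iterator<String> iterator = collection.features();
        int count = 0;
        try {
            while (iterator.hasNext()) {
                iterator.next();
                count++;
            }
        } finally {
            collection.close(iterator);
        }
        return count;
    }
}
```

The cost of this design is that every caller has to remember the `finally` block; forgetting it leaks whatever the iterator holds.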

(warning) Note this represents GML "infecting" our modeling system, we are explicitly capturing featureMembers with the features() method. Any XPath system will have to try a search for featureMember/*,featureMembers

Feature Model Simple Design

Gabriel would like to capture the mapping of XMLSchema Atomic types to known Java interfaces: AtomicTypes

We considered making a SimpleAttribute with helper methods for dealing with Integer, String and so on. We will wait to see how Gabriel's need for AtomicType support shakes out.

Design Decisions:

interface Attribute {
   String getName();
   Type getType();
   Object get();
   void set( Object newValue );
}
interface Type {
   QName name();
   Class getType();
   boolean isNilable();
}

Feature Model Complex Design

Most of the discussion has centered on the debate over whether FeatureType is an AttributeType or not (specifically a ComplexAttribute).

Argument for FeatureType as a ComplexType: Feature represents content that can be navigated by XPath in exactly the same manner as complex content. Whatever problems we run into with Feature we will also run into with complex content.

Argument for AttributeType referring to a FeatureType: Some of the concerns of AttributeType are not applicable to FeatureType (multiplicity, for example).

Design Ideas (No Decision reached yet):

interface Type {
}
interface ComplexType extends Type {
   List<Type> types();
}
interface Complex extends Attribute {
   ComplexType getType();
   List<Attribute> attributes();
   List<Type> types();
   List<Object> values();
}
interface FeatureType extends ComplexType {
}
interface FeatureCollectionType extends FeatureType {
}

Feature Model Multiplicity Design

Remember this is a description of schema; no changes to Attributes are required. The fact that none are needed is a verification that we are on the right track.

It is very tempting to make SequenceType extend AttributeType and provide a sequence() method. I am caught between making client code explicit (so clients know what they are doing) and making the class hierarchy explicit. On the whole I think the decision below is the correct one. The convenience method could always be added as required.

The decision to represent FeatureType directly in the AttributeType hierarchy is apparently a bad idea; someone is going to search through the GeoTools email list and hunt down Ian Turton's message that tells us why. Until then ...

We have a conflict between our type system for XPath (i.e. what is turned into a Node in a GML document, or an Object in our system) and what is required for validation. We have captured this below using Type and Schema to distinguish between the two.

Design Decision:

interface ComplexAttributeType {
   List<Schema> getSchema();
}
interface FeatureCollectionType {
   Schema getMemberSchema(); // based on featureMember/featureMembers
}
interface Schema {
   int getMinOccurs();
   int getMaxOccurs();
   List<Filter> facets();
}
interface SimpleSchema extends Schema {
   AttributeType getType();
}
interface ChoiceSchema extends Schema {
   List<Schema> choices();
}
interface SequenceSchema extends Schema {
   List<Schema> sequence();
}
interface AllSchema extends Schema {
   List<Schema> all();
}

(warning) The distinction I am going for is this: AttributeType refers to a real navigable object, Schema refers to validation constraints.
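
To illustrate Schema acting purely as a validation constraint, the sketch below checks a list of values against occurrence bounds. `OccurrenceSchema`, its `accepts` method and the `UNBOUNDED` encoding are assumptions made for the example; only `getMinOccurs()` and `getMaxOccurs()` come from the design above, and facets are left out.

```java
import java.util.List;

// Hypothetical concrete stand-in for the Schema sketch: only the occurrence
// bounds are modelled, facets are omitted.
class OccurrenceSchema {
    // Assumption: "unbounded" is encoded as the largest int.
    static final int UNBOUNDED = Integer.MAX_VALUE;

    private final int minOccurs;
    private final int maxOccurs;

    OccurrenceSchema(int minOccurs, int maxOccurs) {
        if (minOccurs < 0 || maxOccurs < minOccurs) {
            throw new IllegalArgumentException("Invalid bounds: " + minOccurs + ".." + maxOccurs);
        }
        this.minOccurs = minOccurs;
        this.maxOccurs = maxOccurs;
    }

    int getMinOccurs() {
        return minOccurs;
    }

    int getMaxOccurs() {
        return maxOccurs;
    }

    /** True when the number of values falls within [minOccurs, maxOccurs]. */
    boolean accepts(List<?> values) {
        int count = values.size();
        return count >= minOccurs && count <= maxOccurs;
    }
}
```

Note that the check never looks at the values themselves: what an attribute is (its Type) and how often it may occur (its Schema) stay separate, which is the split described above.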

Feature Model Discussion

Ongoing design discussion, as we attempt to address the above concerns:

Feature Model Proposal

The final Feature Model proposal will be submitted to the GeoAPI pending directory:

Feature Model Letter of Support

The following letter is being drafted as feedback to Gabriel (and his Sponsor) by the GeoTools PMC:
-Feature Model Letter of Support


On Feature.getParent()

This could cause serialization problems for the Feature, and depending on the implementation, not all FeatureCollections could be easily serialized.

Paul Selormey.

Posted by paulus at Aug 19, 2005 05:49