GeoTools : ComplexDataStore Documentation

ComplexDataStore documentation

This page documents the use and configuration of a ComplexDataStore.

Author note: although this "Complex" datastore was born to provide complex features out of a simple features data source, it might be better called a DerivativeDataStore or something similar; you'll see why later.

Introduction

This DataStore implementation acts as a wrapper over one or more DataStores (from now on, the surrogate DataStores) and lets you specify a series of mappings between the properties of surrogate FeatureTypes and output schemas. These mappings, in turn, let you define properties of the target FeatureTypes as derived from the evaluation of an org.geotools.filter.Expression defined against the surrogate FeatureType.

So, what is this useful for?

Suppose you have a database of feature types that you need to serve outside your office or organization. Furthermore, suppose you need to serve that data in an externally defined schema (like one defined by INSPIRE or any other organization). Obviously you don't want to rearchitect your database to conform to that schema! And indeed you probably couldn't do that without the assistance of some kind of object-relational mapping layer.
You can get a better idea of what this ComplexDataStore is about if you think of it as a kind of object-relational mapping layer, but targeted at GIS data. Though not exact, this pseudo-definition may help if this is your first time reading this document.

Of course it has nothing to do with relational databases directly, but with mapping an existing
GeoTools FeatureType from your internal storage schema to an externally defined one, which we
have come to call a "community schema".

How does ComplexDataStore achieve that?
You need:

  • An output (community) schema. This schema exists independently of your actual data so it will be loaded from a GML schema file, defined in XML Schema language.
  • An input FeatureType. GeoTools FeatureTypes are exposed by GeoTools DataStores, so you need a way to specify the DataStore's connection parameters and the source FeatureType name.
  • The attribute and attribute id mapping definitions. They consist of a series of pairs, each made of an XPath expression and an OGC Filter 1.0 expression. The former addresses a property of the output schema and the latter defines how the value of that property is derived from the source Feature instances.

All this information is held by a FeatureTypeMapping object. A ComplexDataStore, in turn, may hold an arbitrary number of these objects, each one defining one (derived) FeatureType that the DataStore exposes.

To persist this information, use an XML file that contains these definitions. Its location, in the form of a URL, must be used to create a DataStore instance through the GeoTools DataStoreFinder lookup system.

GeoServer configuration

To use a ComplexDataStore in GeoServer you need a mappings file like the one described in the next section, and you need to reference it in GeoServer's catalog.xml as usual.
The expected datastore parameters are as follows:

parameter   description
dbtype      fixed value, must be complex
config      URL to a mappings configuration file (for example: "file:/home/gabriel/mappings/RoadSegments.xml")
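
As a rough illustration, a datastore entry carrying these two parameters might look like the sketch below. The surrounding element layout follows the catalog.xml convention of GeoServer 1.x releases and is an assumption here, as are the id and namespace values, which are hypothetical. Only the dbtype and config parameters come from the list above.

```xml
<!-- Sketch of a catalog.xml datastore entry; id and namespace are hypothetical -->
<datastore id="roadsegments_complex" enabled="true" namespace="topp">
  <connectionParams>
    <!-- fixed value that selects the ComplexDataStore -->
    <parameter name="dbtype" value="complex"/>
    <!-- URL of the mappings file described in the next section -->
    <parameter name="config" value="file:/home/gabriel/mappings/RoadSegments.xml"/>
  </connectionParams>
</datastore>
```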

Requirements

  1. Run GeoServer with a JDK version 1.5 or later. This limitation will be removed once the new GeoTools FM is backported to Java 1.4.
  2. You cannot yet use GeoServer's trunk version with this datastore. Use the one at https://svn.codehaus.org/geoserver/branches/complex-features/, which is already tuned to use the new GeoTools FM.

ComplexDataStore Configuration

We'll use a very simple example to illustrate how to configure a ComplexDataStore.

In this example, you have to specify the following mappings from the input to the output schema.

The mappings file

The FeatureTypes exposed by a ComplexDataStore are defined in an XML file, which must comply with the ComplexDataStore.xsd XML Schema.
The root element of such a configuration file is ComplexDataStore. The elements of this file follow the Object/property convention.
The mappings file may have any name; its location is passed to the DataStore as a configuration parameter.

Configure source DataStores

So, to create a mappings file for our example, we first need to specify the surrogate datastores to use. Suppose we're going to use a properties file DataStore as the data source.

<?xml version="1.0" encoding="UTF-8"?>
<c:ComplexDataStore xmlns:c="http://www.geotools.org/complex" xmlns:ogc="http://www.opengis.net/ogc"
  xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.geotools.org/complex ComplexDataStore.xsd">

  <sourceDataStores>
    <DataStore>
      <id>directory1</id>
      <parameters>
        <Parameter>
          <name>directory</name>
          <value>/home/gabriel/workspaces/svn/complex-features/gt/plugin/complexds/test/org/geotools/data/complex/test-data</value>
        </Parameter>
      </parameters>
    </DataStore>
  </sourceDataStores>
 ...

Add as many Parameter/name and Parameter/value elements as needed to configure your source datastore, and as many DataStore elements as needed. You can use multiple datastores to obtain surrogate FeatureTypes, but for now each FeatureTypeMapping can only use a single surrogate FeatureType (that is, there are no cross-DataStore joins yet).
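
For example, a sourceDataStores element declaring two surrogate datastores could be sketched as follows. The second entry assumes a PostGIS datastore. Its parameter names (dbtype, host, port, database, user, passwd) are those commonly used by the GeoTools PostGIS datastore and should be checked against the version in use. Every value in that entry, including the postgis1 id, is hypothetical.

```xml
<sourceDataStores>
  <!-- the properties file datastore from the example above -->
  <DataStore>
    <id>directory1</id>
    <parameters>
      <Parameter>
        <name>directory</name>
        <value>/home/gabriel/workspaces/svn/complex-features/gt/plugin/complexds/test/org/geotools/data/complex/test-data</value>
      </Parameter>
    </parameters>
  </DataStore>
  <!-- a second, hypothetical datastore; parameter names assumed from the GeoTools PostGIS datastore -->
  <DataStore>
    <id>postgis1</id>
    <parameters>
      <Parameter>
        <name>dbtype</name>
        <value>postgis</value>
      </Parameter>
      <Parameter>
        <name>host</name>
        <value>localhost</value>
      </Parameter>
      <Parameter>
        <name>port</name>
        <value>5432</value>
      </Parameter>
      <Parameter>
        <name>database</name>
        <value>roads</value>
      </Parameter>
      <Parameter>
        <name>user</name>
        <value>gis</value>
      </Parameter>
      <Parameter>
        <name>passwd</name>
        <value>secret</value>
      </Parameter>
    </parameters>
  </DataStore>
</sourceDataStores>
```

Each FeatureTypeMapping then selects one of these datastores by its id in the sourceDataStore element.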

Reference target types

Below the sourceDataStores element, you have to reference the GML schema locations of the target FeatureTypes (the ones your DataStore is going to serve).

...
  <targetTypes>
    <FeatureType>
      <schemaUri>RoadSegment.xsd</schemaUri>
    </FeatureType>
  </targetTypes>
...

Schema location: schemaUri is a path relative to the location of the mappings file.

Warning: at this time, the GML schema parsing capabilities are somewhat limited, in the sense that it is not possible to parse includes and imports.

Attribute and id(s) mappings

The last step is to define how to map attributes from one source FeatureType to a target one.
The typeMappings property contains a series of FeatureTypeMapping elements, each of which defines exactly the information that an org.geotools.data.complex.FeatureTypeMapping object holds.

This information is:

  • a reference to a source FeatureSource:
    <sourceDataStore>directory1</sourceDataStore>
    <sourceType>SimpleRoads</sourceType>
    
  • a reference to an output FeatureType (the "community" schema):
    <targetType>RoadSegments</targetType>
  • the mappings between them:

The mappings consist of two sets. One is for the identifiable attributes (the ones with gml:id), such as instances of the FeatureType itself and any direct or nested property that is also identifiable. The other is for the target attribute values.

Note that for both id and attribute mappings, the expression used to assign a value is written as an OGC Common Query Language expression. That way, you can easily define the expression without having to write the lengthier Filter encoding equivalent.

FID mapping examples:

      <fidMappings>
        <FidMapping>
          <targetAttribute>RoadSegments</targetAttribute>
          <sourceExpression>
            <OCQL>getId()</OCQL>
          </sourceExpression>
        </FidMapping>
        <FidMapping>
          <targetAttribute>RoadSegments/fromToNodes</targetAttribute>
          <sourceExpression>
            <OCQL>FID</OCQL>
          </sourceExpression>
        </FidMapping>
      </fidMappings>

Important: if you want the target Features to have the same feature id as the source features, use the getId() function as in the example above. Otherwise, you can use any other expression that evaluates to a character string.
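
As a sketch of the second case, the fragment below builds the target id from a source attribute instead of reusing the source feature id. It assumes that the strConcat filter function is available in the GeoTools version in use and that NAME values are unique within SimpleRoads. Neither assumption is confirmed by this page.

```xml
<FidMapping>
  <targetAttribute>RoadSegments</targetAttribute>
  <sourceExpression>
    <!-- assumed: strConcat(a, b) returns the concatenation of its two string arguments -->
    <OCQL>strConcat('road.', NAME)</OCQL>
  </sourceExpression>
</FidMapping>
```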

Attribute mapping examples:

        <AttributeMapping>
          <targetAttribute>RoadSegments/fromToNodes</targetAttribute>
          <sourceExpression>
            <OCQL>null</OCQL>
          </sourceExpression>
        </AttributeMapping>

Important:
  1. Attributes of derived Features are assigned in the order in which they appear in this mapping file. When defining the attribute mappings, make sure that the order in which you declare them is valid for the output schema.
  2. When a complex attribute has an id mapping, like RoadSegments/fromToNodes in the example above, also declare an attribute mapping for it with a null OGC Common Query Language expression, so that the attribute is first created with the correct id and its children are then appended.

The complete FeatureTypeMapping looks like this:

 <typeMappings>
    <FeatureTypeMapping>
      <sourceDataStore>directory1</sourceDataStore>
      <sourceType>SimpleRoads</sourceType>
      <targetType>RoadSegments</targetType>
      <groupBy/>
      <fidMappings>
        <FidMapping>
          <targetAttribute>RoadSegments</targetAttribute>
          <sourceExpression>
            <OCQL>getId()</OCQL>
          </sourceExpression>
        </FidMapping>
        <FidMapping>
          <targetAttribute>RoadSegments/fromToNodes</targetAttribute>
          <sourceExpression>
            <OCQL>FID</OCQL>
          </sourceExpression>
        </FidMapping>
      </fidMappings>
      <attributeMappings>
        <AttributeMapping>
          <targetAttribute>RoadSegments/fromToNodes</targetAttribute>
          <sourceExpression>
            <OCQL>null</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>RoadSegments/fromToNodes/fromNode</targetAttribute>
          <sourceExpression>
            <OCQL>fromNode</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>RoadSegments/fromToNodes/toNode</targetAttribute>
          <sourceExpression>
            <OCQL>toNode</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>RoadSegments/name</targetAttribute>
          <sourceExpression>
            <OCQL>NAME</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>RoadSegments/the_geom</targetAttribute>
          <sourceExpression>
            <OCQL>the_geom</OCQL>
          </sourceExpression>
        </AttributeMapping>
      </attributeMappings>
    </FeatureTypeMapping>
  </typeMappings>
</c:ComplexDataStore>

Advanced configuration

In the example shown previously, we used a one-to-one relationship between the source and target schemas, meaning that exactly one target Feature is created from each source feature.

But what if you have an output schema where one of the attributes has multiplicity greater than 1?
The functionality described below addresses this need in one specific case: you can configure your source DataStore to serve features of a given type as a "view", as in a SQL join, ordered so that a single read is enough to "group" the "rows" and produce a single multivalued attribute out of a consecutive set of input Features.

Input data:
wq_view

fid     mid     result  determinand_description  location          station_name
fid.1   mid.1   10      Turbidity                POINT(0.75 1.45)  name1
fid.1   mid.2   7       Turbidity                POINT(0.75 1.45)  name1
fid.1   mid.3   5       Turbidity                POINT(0.75 1.45)  name1
fid.2   mid.4   9       Turbidity                POINT(0.1 2.2)    name2
fid.2   mid.5   8       Turbidity                POINT(0.1 2.2)    name2

Output schema:
wq_plus

measurement id=true (0:N)
  determinand_description
  result
location
sitename

In this case you have to tell ComplexDataStore that the attributes location and sitename are "grouping" attributes, and that result and determinand_description are multivalued, using the GroupByAttribute elements:

    <FeatureTypeMapping>
      <sourceDataStore>ds_id</sourceDataStore>
      <sourceType>wq_view</sourceType>
      <targetType>wq_plus</targetType>
      <groupBy>
        <GroupByAttribute>location</GroupByAttribute>
        <GroupByAttribute>sitename</GroupByAttribute>
      </groupBy>
      <fidMappings>
        <FidMapping>
          <targetAttribute>wq_plus</targetAttribute>
          <sourceExpression>
            <OCQL>fid</OCQL>
          </sourceExpression>
        </FidMapping>
        <FidMapping>
          <targetAttribute>wq_plus/measurement</targetAttribute>
          <sourceExpression>
            <OCQL>mid</OCQL>
          </sourceExpression>
        </FidMapping>
      </fidMappings>
      <attributeMappings>
        <AttributeMapping>
          <targetAttribute>wq_plus/measurement</targetAttribute>
          <sourceExpression>
            <OCQL>null</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>wq_plus/measurement/determinand_description</targetAttribute>
          <sourceExpression>
            <OCQL>determinand_description</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>wq_plus/measurement/result</targetAttribute>
          <sourceExpression>
            <OCQL>result</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>wq_plus/location</targetAttribute>
          <sourceExpression>
            <OCQL>location</OCQL>
          </sourceExpression>
        </AttributeMapping>
        <AttributeMapping>
          <targetAttribute>wq_plus/sitename</targetAttribute>
          <sourceExpression>
            <OCQL>station_name</OCQL>
          </sourceExpression>
        </AttributeMapping>
      </attributeMappings>
    </FeatureTypeMapping>

Now you can expect a result like this:

<wq_plus gml:id="fid.1">
 <measurement gml:id="mid.1">
   <determinand_description>Turbidity</determinand_description>
   <result>10</result>
 </measurement>
 <measurement gml:id="mid.2">
   <determinand_description>Turbidity</determinand_description>
   <result>7</result>
 </measurement>
 <measurement gml:id="mid.3">
   <determinand_description>Turbidity</determinand_description>
   <result>5</result>
 </measurement>
 <location><gml:pointProperty><gml:coords>0.75 1.45</gml:coords></gml:pointProperty></location>
 <sitename>name1</sitename> 
</wq_plus>
<wq_plus gml:id="fid.2">
 <measurement gml:id="mid.4">
   <determinand_description>Turbidity</determinand_description>
   <result>9</result>
 </measurement>
 <measurement gml:id="mid.5">
   <determinand_description>Turbidity</determinand_description>
   <result>8</result>
 </measurement>
 <location><gml:pointProperty><gml:coords>0.1 2.2</gml:coords></gml:pointProperty></location>
 <sitename>name2</sitename> 
</wq_plus>

That's all. Now you should be able to serve community schemas out of your internal data structures!

Attachments:

complexdsscheme.png (image/png)
simplemappingesqueme.png (image/png)
ComplexDataStore.xsd (application/octet-stream)