GeoTools : Complex schemas - Business driver examples

This document compiles a sample set of GML-schema-related needs, drawn from real-world uses where available, that the ComplexDataStore and the GeoTools core must address so that community-schema-based projects can be deployed easily.

Now that the Feature Model Proposal has been consolidated, I am adding below each example an outline of the corresponding GeoTools FeatureType structure.
Please use both documents to evaluate how closely the GeoTools model matches the GML one, and provide feedback.

Business driver schema examples.

The document is divided into two parts: the first covers the modeling power required of the GeoTools Feature API, and the second covers the FeatureType mapping abilities that the ComplexDataStore must provide.

Supporting examples

The following XML fragments define the structures that the core GeoTools FeatureType model needs to support.

Consider each schema/instance pair as a story card saying "the GeoTools Feature API should be able to model a feature type that mirrors this schema".

Follow the link in each title to download the full example XML document. The documents validate against the GML 3.1.1 schemas.

A feature may have a multi-valued property

Instance

<gml:featureMember>
  <sco:wq_plus gml:id="_41010901">
    <sco:sitename>BALRANALD WEIR</sco:sitename>
    <sco:anzlic_no>ANZNS0359100023</sco:anzlic_no>
    <sco:location>
      <gml:Point  srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
        <gml:pos>22 143.53399658</gml:pos>
      </gml:Point>
    </sco:location>
    <sco:measurement gml:id="_16JAN94002001002003000000">
      <sco:determinand_description>16/JAN/94</sco:determinand_description>
      <sco:result>Turbidity</sco:result>
    </sco:measurement>
    <sco:measurement gml:id="_24JAN94002001002003000000">
      <sco:determinand_description>24/JAN/94</sco:determinand_description>
      <sco:result>Turbidity</sco:result>
    </sco:measurement>
    <sco:project_no>RWWQ0004</sco:project_no>
  </sco:wq_plus>
</gml:featureMember>

Schema

<xs:complexType xmlns:xs="http://www.w3.org/2001/XMLSchema" name="wq_plus_Type">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="sitename" minOccurs="1" nillable="false" type="xs:string" />
        <xs:element name="anzlic_no" minOccurs="0" nillable="true" type="xs:string" />
        <xs:element name="location" minOccurs="0" nillable="true" type="gml:LocationPropertyType" />
        <xs:element name="measurement" minOccurs="0" maxOccurs="unbounded" nillable="true">
         <xs:complexType>
          <xs:sequence>
            <xs:element name="determinand_description" type="xs:string" minOccurs="1"/>
            <xs:element name="result" type="xs:string" minOccurs="1"/>
          </xs:sequence>            
          <xs:attribute ref="gml:id" use="optional"/>
         </xs:complexType>
        </xs:element>

        <xs:element name="project_no" minOccurs="0" nillable="true" type="xs:string" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
<xs:element name='wq_plus' type='sco:wq_plus_Type' substitutionGroup="gml:_Feature" />

GeoTools Feature Type

FeatureType[
	name = wq_plus
	identified = true
	super = Feature
	abstract = false
	binding = Feature.class
	restrictions = EMPTY_SET
	nillable = false
	defaultGeometry = #location
	descriptor = OrderedDescriptor(1, 1)[
		sequence = List[
			AttributeDescriptor(1, 1)[
				type = AttributeType[
					name = sitename
					identified = false
					super = null (?????????)
					abstract = false
					binding = String.class
					restrictions = EMPTY_SET
					nillable = false
				]
			],
			AttributeDescriptor(0, 1)[
				type = AttributeType[
					name = anzlic_no
					identified = false
					super = null (?????????)
					abstract = false
					binding = String.class
					restrictions = EMPTY_SET
					nillable = true
				]
			],
			AttributeDescriptor(0, 1)[
				type = GeometryAttribute[
					name = location
					identified = false
					super = HERE WE NEED TO REFER TO  gml:LocationPropertyType
					abstract = false
					binding = Point.class
					restrictions = EMPTY_SET
					nillable = true
				]
			],
			AttributeDescriptor(0, Integer.MAX_VALUE)[
				type = ComplexType[
					name = measurement
					identified = true
					super = null (?????????????????)
					abstract = false
					binding = null
					restrictions = EMPTY_SET
					nillable = true
					descriptor = OrderedDescriptor(1, 1)[
						AttributeDescriptor(1, 1)[
							type = AttributeType[
								name = determinand_description
								identified = false
								super = null (?????????)
								abstract = false
								binding = String.class
								restrictions = EMPTY_SET
								nillable = false
							]
						],
						AttributeDescriptor(1, 1)[
							type = AttributeType[
								name = result
								identified = false
								super = null (?????????)
								abstract = false
								binding = String.class
								restrictions = EMPTY_SET
								nillable = false
							]
						]
					]
				]
			], //measurement
			AttributeDescriptor(0, 1)[
				type = AttributeType[
					name = project_no
					identified = false
					super = null (?????????)
					abstract = false
					binding = String.class
					restrictions = EMPTY_SET
					nillable = true
				]
			]
		]
	]
]	 
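
The occurrence ranges in the structure above, such as (0, 1) or (0, Integer.MAX_VALUE), can be pictured with a small stand-alone sketch. The Python classes below only borrow the descriptor names from the outline; they are hypothetical stand-ins, not the GeoTools API, and the check looks at occurrence counts only, not at element order.

```python
from dataclasses import dataclass, field
from typing import List, Optional

UNBOUNDED = None  # stands in for Integer.MAX_VALUE / maxOccurs="unbounded"


@dataclass
class AttributeDescriptor:
    """One property slot: a name plus its allowed occurrence range."""
    name: str
    min_occurs: int = 1
    max_occurs: Optional[int] = 1  # None means unbounded
    nillable: bool = False


@dataclass
class OrderedDescriptor:
    """An xs:sequence-like list of attribute descriptors."""
    sequence: List[AttributeDescriptor] = field(default_factory=list)

    def violations(self, property_names: List[str]) -> List[str]:
        """Return occurrence-count violations for a list of property names."""
        problems = []
        for d in self.sequence:
            count = property_names.count(d.name)
            if count < d.min_occurs:
                problems.append(f"{d.name}: expected at least {d.min_occurs}, found {count}")
            if d.max_occurs is not None and count > d.max_occurs:
                problems.append(f"{d.name}: expected at most {d.max_occurs}, found {count}")
        return problems


# Occurrence ranges taken from the wq_plus_Type schema above.
wq_plus = OrderedDescriptor([
    AttributeDescriptor("sitename", 1, 1),
    AttributeDescriptor("anzlic_no", 0, 1, nillable=True),
    AttributeDescriptor("location", 0, 1, nillable=True),
    AttributeDescriptor("measurement", 0, UNBOUNDED, nillable=True),
    AttributeDescriptor("project_no", 0, 1, nillable=True),
])

# Property names in the order they appear in the instance above.
instance = ["sitename", "anzlic_no", "location",
            "measurement", "measurement", "project_no"]
print(wq_plus.violations(instance))
print(wq_plus.violations(["measurement"]))
```

The first call reports no problems because two measurement elements fit the unbounded range; the second reports the missing mandatory sitename.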

A feature may have various multi-valued properties

For example:

  • An object may have several recorded positions
  • A land parcel may have several owners
  • A traffic light may have multiple sequencing rule sets
  • A classroom may have multiple bookable time slots

Schema

<xs:complexType name="wq_plus_Type" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="measurement" minOccurs="0" maxOccurs="unbounded" nillable="true">
         <xs:complexType>
          <xs:sequence>
            <xs:element name="determinand_description" type="xs:string" minOccurs="1"/>
            <xs:element name="result" type="xs:string" minOccurs="1"/>
          </xs:sequence>            
          <xs:attribute ref="gml:id" use="optional"/>
         </xs:complexType>
        </xs:element>

        <xs:element name="the_geom" type="gml:GeometryPropertyType"/>

        <xs:element name="sitename" maxOccurs="unbounded" nillable="false" type="xs:string" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
<xs:element name='wq_plus' type='sco:wq_plus_Type' substitutionGroup="gml:_Feature" />

Instance

<gml:featureMember>
  <sco:wq_plus gml:id="_41010901">
    <sco:measurement gml:id="_16JAN94002001002003000000">
      <sco:determinand_description>16/JAN/94</sco:determinand_description>
      <sco:result>Turbidity</sco:result>
    </sco:measurement>
    <sco:measurement gml:id="_24JAN94002001002003000000">
      <sco:determinand_description>24/JAN/94</sco:determinand_description>
      <sco:result>Turbidity</sco:result>
    </sco:measurement>
   
    <sco:the_geom>
      <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
        <gml:coordinates decimal="." cs="," ts=" ">22,143.53399658</gml:coordinates>
      </gml:Point>
    </sco:the_geom>

    <sco:sitename>BALRANALD WEIR</sco:sitename>
    <sco:sitename>RWWQ0004</sco:sitename>
  </sco:wq_plus>
</gml:featureMember>
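
To illustrate what a consumer sees when several properties repeat, here is a minimal sketch that reads a trimmed copy of the instance above and groups the property values by name. The sco namespace URI is an assumption (the fragments do not declare one), the geometry property is omitted for brevity, and read_properties is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

GML = "http://www.opengis.net/gml"
SCO = "http://example.org/sco"  # assumed: the source does not give the sco namespace URI

doc = f"""
<sco:wq_plus xmlns:sco="{SCO}" xmlns:gml="{GML}" gml:id="_41010901">
  <sco:measurement gml:id="_16JAN94002001002003000000">
    <sco:determinand_description>16/JAN/94</sco:determinand_description>
    <sco:result>Turbidity</sco:result>
  </sco:measurement>
  <sco:measurement gml:id="_24JAN94002001002003000000">
    <sco:determinand_description>24/JAN/94</sco:determinand_description>
    <sco:result>Turbidity</sco:result>
  </sco:measurement>
  <sco:sitename>BALRANALD WEIR</sco:sitename>
  <sco:sitename>RWWQ0004</sco:sitename>
</sco:wq_plus>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_properties(feature):
    """Group property values by name; every property maps to a list of values."""
    props = defaultdict(list)
    for child in feature:
        if len(child):  # complex property: keep its simple children as a dict
            value = {local_name(c.tag): c.text for c in child}
        else:           # simple property: keep its text
            value = child.text
        props[local_name(child.tag)].append(value)
    return dict(props)


props = read_properties(ET.fromstring(doc))
print(props["sitename"])
print(len(props["measurement"]))
```

Storing every property as a list keeps single-valued and multi-valued properties uniform, which is one possible design and not necessarily the one the Feature API will adopt.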

A feature may have multiple geometries

Schema

<xs:complexType name="measurement_Type">
  <xs:sequence>
    <xs:element name="determinand_description" type="xs:string"/>
    <xs:element name="result" type="xs:string"/>
  </xs:sequence>            
  <xs:attribute ref="gml:id" use="optional"/>
</xs:complexType> 

<xs:complexType name="wq_plus_Type" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="measurement" maxOccurs="unbounded" type="sco:measurement_Type"/>
        
        <xs:element name="location" type="gml:LocationPropertyType"/>
        <xs:element name="nearestSlimePit" type="gml:PointPropertyType"/>

        <xs:element name="sitename" maxOccurs="unbounded" nillable="false" type="xs:string" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:element name='wq_plus' type='sco:wq_plus_Type' substitutionGroup="gml:_Feature" />

Instance

  <gml:featureMember>
    <sco:wq_plus gml:id="_41010901">
      <sco:measurement gml:id="_16JAN94002001002003000000">
        <sco:determinand_description>16/JAN/94</sco:determinand_description>
        <sco:result>Turbidity</sco:result>
      </sco:measurement>
      <sco:measurement gml:id="_24JAN94002001002003000000">
        <sco:determinand_description>24/JAN/94</sco:determinand_description>
        <sco:result>Turbidity</sco:result>
      </sco:measurement>
      <sco:location>
        <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
          <gml:coordinates decimal="." cs="," ts=" ">22,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:location>
      <sco:nearestSlimePit>
        <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
          <gml:coordinates decimal="." cs="," ts=" ">22.1,143.53399658</gml:coordinates>
        </gml:Point>
      </sco:nearestSlimePit>

      <sco:sitename>BALRANALD WEIR</sco:sitename>
      <sco:sitename>RWWQ0004</sco:sitename>
    </sco:wq_plus>
  </gml:featureMember>

A feature may be defined to include properties from many namespaces.

Schema

<xs:complexType name="wq_plus_Type" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="name" maxOccurs="unbounded" type="xs:string" />
        <xs:element ref="gml:name" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

Instance

<sco:wq_plus gml:id="_41010901">
  <gml:location>
    <gml:Point srsName="http://www.opengis.net/gml/srs/epsg.xml#4283">
      <gml:coordinates decimal="." cs="," ts=" ">22,143.53399658</gml:coordinates>
    </gml:Point>
  </gml:location>
  <sco:name>BALRANALD WEIR</sco:name>
  <gml:name>RWWQ0004</gml:name>
</sco:wq_plus>

A set of features may be inter-related (bidirectional association):

For example, a sampling location and its series of samples, or a road and its junctions.

Schema

<xs:element name="roadRef" type="sco:RoadPropertyType"/>
<xs:element name="junctionRef" type="sco:JunctionPropertyType"/>

<xs:complexType name="RoadPropertyType">
  <xs:annotation>
    <xs:documentation>Container for a road - follow gml:AssociationType pattern.</xs:documentation>
  </xs:annotation>
  <xs:sequence minOccurs="0">
    <xs:element ref="sco:Road" />
  </xs:sequence>
  <xs:attributeGroup ref="gml:AssociationAttributeGroup" />
</xs:complexType>

<xs:complexType name="JunctionPropertyType">
  <xs:annotation>
    <xs:documentation>Container for a junction - follow gml:AssociationType pattern.</xs:documentation>
  </xs:annotation>
  <xs:sequence minOccurs="0">
    <xs:element ref="sco:Junction" />
  </xs:sequence>
  <xs:attributeGroup ref="gml:AssociationAttributeGroup" />
</xs:complexType>

<xs:complexType name="RoadType">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="geom" type="gml:CurvePropertyType" minOccurs="0" />
        <xs:element ref="sco:junctionRef" minOccurs="0" maxOccurs="unbounded" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:complexType name="JunctionType">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element ref="sco:roadRef" />
        <xs:element name="direction" type="xs:int" />
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:element name='Junction' type='sco:JunctionType' substitutionGroup="gml:_Feature" />
<xs:element name='Road' type='sco:RoadType' substitutionGroup="gml:_Feature" />

Instance

<gml:featureMember>
  <sco:Road gml:id="r1">
    <sco:junctionRef xlink:href="#j1" />
  </sco:Road>
</gml:featureMember>
<gml:featureMember>
  <sco:Road gml:id="r2">
    <sco:junctionRef xlink:href="#j2" />
  </sco:Road>
</gml:featureMember>
<gml:featureMember >
  <sco:Junction gml:id="j1">
    <sco:roadRef xlink:href="#r1" />
    <sco:direction>220</sco:direction>
  </sco:Junction>
</gml:featureMember>
<gml:featureMember >
  <sco:Junction gml:id="j2">
    <sco:roadRef xlink:href="#r2" />
    <sco:direction>90</sco:direction>
  </sco:Junction>
</gml:featureMember>
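
One way to picture how such references can be followed in both directions is to index features by gml:id and resolve each xlink:href against that index. The sketch below does this for a trimmed copy of the instance; the sco namespace URI and the resolve helper are assumptions, and only same-document references of the form "#id" are handled.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"
XLINK = "http://www.w3.org/1999/xlink"
SCO = "http://example.org/sco"  # assumed: the source does not give the sco namespace URI

doc = f"""
<gml:FeatureCollection xmlns:gml="{GML}" xmlns:xlink="{XLINK}" xmlns:sco="{SCO}">
  <gml:featureMember>
    <sco:Road gml:id="r1"><sco:junctionRef xlink:href="#j1"/></sco:Road>
  </gml:featureMember>
  <gml:featureMember>
    <sco:Junction gml:id="j1">
      <sco:roadRef xlink:href="#r1"/>
      <sco:direction>220</sco:direction>
    </sco:Junction>
  </gml:featureMember>
</gml:FeatureCollection>
"""

root = ET.fromstring(doc)
GML_ID = f"{{{GML}}}id"

# Index every element carrying a gml:id so that local references can be resolved.
by_id = {el.get(GML_ID): el for el in root.iter() if el.get(GML_ID)}


def resolve(prop):
    """Return the feature a property points to, inline or through xlink:href."""
    href = prop.get(f"{{{XLINK}}}href")
    if href is not None:
        return by_id[href.lstrip("#")]  # same-document reference only
    return prop[0]  # the feature is nested inline


road = by_id["r1"]
junction = resolve(road.find(f"{{{SCO}}}junctionRef"))
back = resolve(junction.find(f"{{{SCO}}}roadRef"))
print(junction.get(GML_ID), back is road)
```

Because references are resolved on demand rather than copied, the road reached through the junction is the same object as the starting road, so the cycle does not cause infinite nesting.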

Many features may share the same feature as a property

For example:

  • many samples at a sampling location
  • many measurements on a sample

Schema

<xs:element name="locationRef" type="sco:LocationPropertyType"/>

<xs:complexType name="LocationPropertyType">
  <xs:annotation>
    <xs:documentation>Container for a sampling location - follow gml:AssociationType pattern.</xs:documentation>
  </xs:annotation>
  <xs:sequence minOccurs="0">
    <xs:element ref="sco:Location" />
  </xs:sequence>
  <xs:attributeGroup ref="gml:AssociationAttributeGroup" />
</xs:complexType>

<xs:complexType name="LocationType">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="geom" type="gml:LocationPropertyType"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:complexType name="SampleType">
  <xs:complexContent>
    <xs:extension base="gml:AbstractFeatureType">
      <xs:sequence>
        <xs:element name="type" type="xs:string" />
        <xs:element name="ammount" type="xs:float" />
        <xs:element ref="sco:locationRef"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

<xs:element name='Sample' type='sco:SampleType' substitutionGroup="gml:_Feature" />
<xs:element name='Location' type='sco:LocationType' substitutionGroup="gml:_Feature" />

Instance

<gml:featureMember>
  <sco:Sample gml:id="s1">
    <sco:type>sampleType1</sco:type>
    <sco:ammount>1.0</sco:ammount>
    <sco:locationRef>
      <sco:Location gml:id="l1">
        <sco:geom>
          <gml:Point>
            <gml:coordinates>10,10</gml:coordinates>
          </gml:Point>
        </sco:geom>
      </sco:Location>
    </sco:locationRef>
  </sco:Sample>
</gml:featureMember>
<gml:featureMember>
  <sco:Sample gml:id="s2">
    <sco:type>sampleType2</sco:type>
    <sco:ammount>2.0</sco:ammount>
    <sco:locationRef xlink:href="#l1" />
  </sco:Sample>
</gml:featureMember>
<gml:featureMember>
  <sco:Sample gml:id="s3">
    <sco:type>sampleType3</sco:type>
    <sco:ammount>3.0</sco:ammount>
    <sco:locationRef xlink:href="#l1" />
  </sco:Sample>
</gml:featureMember>

Back-end capabilities that make the product easy to apply:

Data may exist in a pre-existing database structure

This structure may include multiple tables and business rules, and may not map efficiently into flat views. The WFS will need to be able to map queries onto the real data.

Apart from the modeling power that the Feature API needs in order to express complex schemas like the examples above, there is a set of requirements that the product must address to be useful. These requirements arise because an organization's back-end data structure is often not modeled in the same way as the community schema it needs to serve. So, although the needed information is there, some kind of adaptation layer is needed between the organization's internal model and the community schema model.

The following is a list of the recognized requirements that this kind of adaptation layer should cover in order to easily map such an internal model to an externally defined schema:

Features may not exist except as a result of queries

For example, a feature collection showing the number of pollution incidents by catchment may be created dynamically from an incident database. This is particularly true of datasets with highly variable spatio-temporal density.

A FeatureType may be created from a predefined query in the back-end query language, for example a predefined SQL query.
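
As a rough illustration of a feature type that exists only as a query result, the sketch below runs a predefined SQL aggregate against an in-memory SQLite database and turns each result row into a feature. The table layout, the sample rows, and the features_from_query helper are all invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE incident (id INTEGER PRIMARY KEY, catchment TEXT, occurred TEXT);
    INSERT INTO incident (catchment, occurred) VALUES
        ('catchment_b', '1994-01-16'),
        ('catchment_b', '1994-01-24'),
        ('catchment_a', '1994-02-02');
""")

# The predefined query plays the role of the feature type definition:
# each result row becomes one feature, each column one property.
INCIDENTS_BY_CATCHMENT = """
    SELECT catchment, COUNT(*) AS incident_count
    FROM incident
    GROUP BY catchment
    ORDER BY catchment
"""


def features_from_query(connection, sql, type_name):
    """Run the query and wrap every row as a dict with a generated feature id."""
    cursor = connection.execute(sql)
    columns = [c[0] for c in cursor.description]
    return [
        {"id": f"{type_name}.{i}", **dict(zip(columns, row))}
        for i, row in enumerate(cursor, start=1)
    ]


features = features_from_query(conn, INCIDENTS_BY_CATCHMENT, "incidents_by_catchment")
for feature in features:
    print(feature)
```

No incidents_by_catchment table exists in the database; the features are produced on demand, which is the situation this requirement describes.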

Multiple columns could be mapped to a multi-value property:

Suppose you have a table with the following columns: watersampleid, ph, temp, turbidity.
This table maps directly to a flat FeatureType with the column names as simple properties.

But the externally defined schema to serve has a single complex property holding all this information:

table instance:

watersampleid    ph    temp    turbidity
watersample.1    7     21      0.6

flat instance:

<sample gml:id="watersample.1">
 <ph>7</ph>
 <temp>21</temp>
 <turbidity>0.6</turbidity>
</sample>

complex instance:

<sample gml:id="watersample.1">
   <measurement>
      <parameter>ph</parameter>
      <value>7</value>
   </measurement>
   <measurement>
      <parameter>temp</parameter>
      <value>21</value>
   </measurement>
   <measurement>
      <parameter>turbidity</parameter>
      <value>0.6</value>
   </measurement>
</sample>

So you need to map the flat schema of your table to a complex one containing a sequence of measurement properties.
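
The column-to-property mapping can be sketched as follows: each measurement column of the flat row becomes one measurement element holding a parameter/value pair. MEASUREMENT_COLUMNS and row_to_complex_sample are hypothetical names, and the gml:id attribute is written as a plain string because the fragments above do not declare namespaces.

```python
import xml.etree.ElementTree as ET

# Columns of the flat table that should become repeated <measurement> properties.
MEASUREMENT_COLUMNS = ("ph", "temp", "turbidity")


def row_to_complex_sample(row):
    """Build the complex <sample> element from one flat table row."""
    sample = ET.Element("sample", {"gml:id": row["watersampleid"]})
    for column in MEASUREMENT_COLUMNS:
        if row.get(column) is None:
            continue  # a NULL column produces no measurement at all
        measurement = ET.SubElement(sample, "measurement")
        ET.SubElement(measurement, "parameter").text = column
        ET.SubElement(measurement, "value").text = str(row[column])
    return sample


row = {"watersampleid": "watersample.1", "ph": 7, "temp": 21, "turbidity": 0.6}
sample = row_to_complex_sample(row)
print(ET.tostring(sample, encoding="unicode"))
```

Skipping NULL columns is a design choice made for this sketch; a real mapping might instead emit a nil measurement.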

It is typically necessary to run queries across related features:

Suppose you have a complex type, sampling location, that has a Feature property of type River, and River has a simple attribute, flow.

It should be possible to run a query like:

show me sampling locations where the current river flow = 0

This example can be generalized to say that it must be possible to execute queries involving complex properties, whether they refer to the same feature type or a related one.
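
A minimal in-memory sketch of such a query follows, with features represented as nested dictionaries and the filter expressed as a slash-separated property path. The path syntax, the helper names, and the sample data are assumptions made for illustration; a real implementation would translate the filter into the back-end query language instead of scanning features.

```python
def get_path(feature, path):
    """Follow a slash-separated property path through nested features."""
    value = feature
    for step in path.split("/"):
        if not isinstance(value, dict):
            return None  # the path runs past a simple value or a missing property
        value = value.get(step)
    return value


def select(features, path, expected):
    """Return the features whose property at `path` equals `expected`."""
    return [f for f in features if get_path(f, path) == expected]


dry_river = {"name": "river.1", "flow": 0}
wet_river = {"name": "river.2", "flow": 12.5}

sampling_locations = [
    {"id": "loc.1", "river": dry_river},
    {"id": "loc.2", "river": wet_river},
    {"id": "loc.3", "river": dry_river},
    {"id": "loc.4"},  # no river recorded
]

# "show me sampling locations where the current river flow = 0"
dry = select(sampling_locations, "river/flow", 0)
print([f["id"] for f in dry])
```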

A feature property may be mapped to part of or more than one storage schema:

Often you have a normalized database, but your externally defined schema exposes some information in a form that is denormalized relative to your internal storage schema.

For example, you have a table:
phone(area_code, number, extension),
but the schema for a phone number property is defined as a single string property with the pattern area_code + number + " x" + extension.

It would be an extremely powerful capability if the product could use an Expression to define a type's value range from another FeatureType.
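
The phone example can be read as a mapping from a target property name to an expression over the source columns. The sketch below assumes the columns are stored as strings; PHONE_MAPPING and map_row are hypothetical names, and the sample values are invented.

```python
# Target property name -> expression over the columns of one source row.
PHONE_MAPPING = {
    "phone": lambda r: r["area_code"] + r["number"] + " x" + r["extension"],
}


def map_row(row, mapping):
    """Evaluate every mapping expression against a single storage row."""
    return {target: expression(row) for target, expression in mapping.items()}


row = {"area_code": "02", "number": "55501234", "extension": "12"}
print(map_row(row, PHONE_MAPPING))
```

Keeping the expression as data, separate from the code that applies it, is what would let the same mechanism serve other denormalized properties.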