GeoTools : Multidimensional Grid Motivation and Scope

Why do you want more?

Some applications require a toolkit that works with n-dimensional data, where n is greater than 2. These applications have traditionally been the domain of data visualization packages. These packages perform very well and permit many forms of data exploration that typical GIS packages (including GeoTools) do not support. However, geospatial support in these packages tends to be very sparse, which matters when the data being processed or visualized is strongly geospatial (e.g., numerical weather prediction). One can either use the custom programs developed by a specific community for specific purposes (and perform no other analyses), or one can "dumb down" the data so that a lowly GIS can understand it (and perform some basic analyses).

We (the USDA Forest Service) have been presented with a data integration problem in which several models from different domains are required to interoperate. This project also requires that we set up a distributed modeling system, in which the output of the coarse-scale model run at our center must be used as input at the high-resolution regional modeling centers. Without going into more detail, I hope it is possible to see that the path of least resistance led us to augment GeoTools' current coverage functionality using pre-existing international standards. The alternative is to cobble together a data integration, processing, and distribution system unique to our needs, which could probably never be reused with any other combination of models or tools.

Our partners (the Naval Undersea Research Center, a division of NATO) have a similar problem in a different application domain. Additional information about that project should come from them, to avoid any misrepresentation, omission, embellishment, or misunderstanding. But specifics aside, both NURC and the USDA Forest Service recognize the value of producing a toolkit instead of a non-reusable, one-off solution.

Therefore, I write this wiki page.

How many dimensions can OGC represent?

It is common to require 5D data in fields like computational fluid dynamics and numerical weather prediction. Of these five dimensions, three are spatial, one is temporal, and one is a "category" axis, which represents many different quantities (e.g., a component of a wind vector, the concentration of a chemical, pressure, or temperature). Generalizing this problem to n-dimensional data, we can see that of the "n" dimensions, some are spatial, some are temporal, and the remainder are "other". The essential problem is that OGC specifications only contain a vocabulary for describing spatial or temporal dimensions, leaving the case of "other" unhandled. The root cause of this is stated in OGC Abstract Specification Topic 2, which is also ISO 19111:

From ISO 19111

This International Standard defines the conceptual schema for the description of spatial referencing by coordinates, optionally extended to spatio-temporal referencing. It describes the minimum data required to define 1-, 2-, and 3-dimensional spatial coordinate reference systems with an extension to merged spatial-temporal reference systems. It allows additional descriptive information to be provided.

Any dimension which is not spatial or temporal is not representable in the OGC/ISO CRS framework. Assuming the maximum of 3 spatial dimensions and 1 temporal dimension, this allows us to represent, at most, a 4D + bands configuration. It also leaves us unable to use an axis that measures a quantity such as voltage, mass, or velocity. This shortcoming is important because it means that derived products (like histograms) are not expressible in the OGC framework. One cannot even write valid metadata to describe the dataset, because ISO 19115 is based upon the same CRS framework.
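The constraint described above can be sketched as a simple check. This is an illustration only: `CrsCheck`, `AxisKind`, and `representable` are hypothetical names, not part of GeoTools or of any OGC/ISO interface, and the limits of three spatial axes and one temporal axis are the ones assumed in the preceding paragraph.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not a GeoTools or OGC/ISO API. */
final class CrsCheck {

    /** The three kinds of axis discussed in the text. */
    enum AxisKind { SPATIAL, TEMPORAL, OTHER }

    /**
     * Returns true when every axis is spatial or temporal and the counts
     * stay within the assumed maximum of 3 spatial + 1 temporal axes.
     */
    static boolean representable(List<AxisKind> axes) {
        int spatial = 0;
        int temporal = 0;
        for (AxisKind kind : axes) {
            switch (kind) {
                case SPATIAL:
                    spatial++;
                    break;
                case TEMPORAL:
                    temporal++;
                    break;
                default:
                    // An "other" axis (voltage, mass, a histogram bin, ...)
                    // has no vocabulary in the CRS framework.
                    return false;
            }
        }
        return spatial <= 3 && temporal <= 1;
    }

    public static void main(String[] args) {
        List<AxisKind> fourD = Arrays.asList(
                AxisKind.SPATIAL, AxisKind.SPATIAL, AxisKind.SPATIAL, AxisKind.TEMPORAL);
        List<AxisKind> fiveD = Arrays.asList(
                AxisKind.SPATIAL, AxisKind.SPATIAL, AxisKind.SPATIAL,
                AxisKind.TEMPORAL, AxisKind.OTHER);
        System.out.println("4D representable: " + representable(fourD));
        System.out.println("5D representable: " + representable(fiveD));
    }
}
```

Under these assumptions the 4D case passes and the 5D case fails solely because of its "other" axis, which is the gap this page describes.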

Essential Problem


The essential problem is that OGC specifications only contain a vocabulary for describing spatial or temporal dimensions, leaving the case of "other" unhandled.

What would a hack look like?

One could choose to represent the fifth dimension as a series of bands. Alternatively, one could create one 4D dataset for each value present in the fifth axis. Both schemes lead directly to the problem of mapping values along the fifth dimension into four-dimensional space: one maps each value along the fifth axis either to a particular band or to a particular 4D coverage.

The problem with representing the fifth dimension as sample bands is that it fragments the index into the grid. Values along the fifth dimension become the result of the evaluate() operation on the other four dimensions: one maintains a 4D index, then retrieves only the i-th property of the result. There is no easy way to subset using the fifth dimension.
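A minimal sketch of the band hack may make the fragmentation concrete. `BandedGrid4D` is a hypothetical class written for this page, not the GeoTools coverage API; its `evaluate` method only mimics the idea of "4D position in, all band values out", and the storage layout is an assumption.

```java
import java.util.Arrays;

/** Hypothetical 4D grid whose fifth ("category") axis is stored as sample bands. */
final class BandedGrid4D {
    private final int nx, ny, nz, nt, bands;
    private final double[] samples; // assumed layout: [t][z][y][x][band]

    BandedGrid4D(int nx, int ny, int nz, int nt, int bands) {
        this.nx = nx;
        this.ny = ny;
        this.nz = nz;
        this.nt = nt;
        this.bands = bands;
        this.samples = new double[nx * ny * nz * nt * bands];
    }

    /** Offset of the first band of the cell at (x, y, z, t). */
    private int offset(int x, int y, int z, int t) {
        return (((t * nz + z) * ny + y) * nx + x) * bands;
    }

    void set(int x, int y, int z, int t, int band, double value) {
        samples[offset(x, y, z, t) + band] = value;
    }

    /** Mimics evaluate(): the index is 4D, the result holds every band. */
    double[] evaluate(int x, int y, int z, int t) {
        int start = offset(x, y, z, t);
        return Arrays.copyOfRange(samples, start, start + bands);
    }

    /**
     * Subsetting along the fifth axis: there is no index for it, so every
     * 4D cell is evaluated and the i-th property is picked from each result.
     */
    double[] sliceBand(int band) {
        double[] out = new double[nx * ny * nz * nt];
        int k = 0;
        for (int t = 0; t < nt; t++)
            for (int z = 0; z < nz; z++)
                for (int y = 0; y < ny; y++)
                    for (int x = 0; x < nx; x++)
                        out[k++] = evaluate(x, y, z, t)[band];
        return out;
    }
}
```

In this sketch, selecting one quantity (say, band 1) requires a full pass over the 4D index, whereas a true fifth axis could be subset in the same way as the other four.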

Likewise, the index is fragmented when the fifth axis is represented by completely independent 4D grids. Data selection becomes complex.
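The second hack can be sketched the same way. `GridPerValue` is again a hypothetical class, and keying the grids by a quantity name such as "temperature" is an assumption made for illustration; each 4D grid is flattened to a plain array to keep the example short.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical store holding one independent 4D grid per fifth-axis value. */
final class GridPerValue {
    // Key: value on the fifth axis (here a quantity name).
    // Value: the samples of one 4D grid, flattened to a single array.
    private final TreeMap<String, double[]> grids = new TreeMap<>();

    void put(String fifthAxisValue, double[] flattenedGrid) {
        grids.put(fifthAxisValue, flattenedGrid);
    }

    /** Selection is two-step: first choose a grid, then index into it. */
    double evaluate(String fifthAxisValue, int cellIndex) {
        double[] grid = grids.get(fifthAxisValue);
        if (grid == null) {
            throw new IllegalArgumentException("No grid for " + fifthAxisValue);
        }
        return grid[cellIndex];
    }

    /**
     * A profile along the fifth axis at one 4D cell has to touch every
     * independent grid; results follow the sorted order of the keys.
     */
    double[] profile(int cellIndex) {
        double[] out = new double[grids.size()];
        int k = 0;
        for (Map.Entry<String, double[]> entry : grids.entrySet()) {
            out[k++] = entry.getValue()[cellIndex];
        }
        return out;
    }
}
```

Nothing in this structure guarantees that the separate grids share the same extent or resolution, which is one reason data selection becomes complex.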

Are you proposing a solution?

Unfortunately, no. This topic was investigated because our current effort involves revisiting coverages. Our immediate needs are not for generic N-dimensional functionality, but for 4D + bands functionality. We had hoped to create a system in which our 4D + bands requirement was just a subset of the capabilities of a more general data processing and visualization system.

Scope of the Multidimensional Grid Project


This page should be taken to define the scope of the multidimensional coverage effort within GeoTools. At the completion of this project, GeoTools will implement as much coverage functionality as the OGC/ISO 191xx specifications permit. To go further and produce a truly N-dimensional data processing system would require changes to the specifications upon which GeoTools is based. No such changes are being proposed at this time.