On 31 May 2007, I asked my mentor (Simone Giannecchini) some questions regarding the ImageIO-level capabilities and requirements in the context of data access.
Below is the log of our conversation.
Daniele: Just a question: from the abstraction layer above the ImageIO ImageReader layer, I need to specify the required section of 2D data (by means of a specific time, level, and x-y region).
My question is: should I extend or set up a kind of ImageReadParam and then provide it as an input parameter of the ImageReader read operation?
Simone: ok, here is my vision.
I think that the ImageIO ImageReader and ImageWriter should be usable outside GeoTools, or better, outside the GIS world in general, because people might want to use ecw or jp2 or whatever we do for simple images, not coverages.
Hence I am convinced that at the base of everything we should provide readers and writers for each format that only provide information not bound to the gis world. An example of gis-bound information is the crs.
Simone: let's take ecw or better a gdal plugin: it gives us back the crs in wkt. In my opinion the baseline ImageReader should mostly ignore this, or at most it should simply give us back a string without even thinking about parsing and the like.
Simone: I would like to keep the dependencies between geotools and this work down to zero.
The dependence should be [geotools depends on jiio-ext]. Not [jiio-ext depends on geotools] or [jiio-ext depends on geotools "and" geotools depends on jiio-ext].
Simone: I want people from the image processing world to use jiio without problems and without having to download geotools itself.
This is like a wish or better we could call it a requirement.
Simone: In this vision, the baseline access to 2d layers in multidimensional stores is done by a simple integer, because from the simplest point of view the different layers are simply different images in the same file.
Simone: Our duty is to find a way to convert a request like "give me the layer that corresponds to z==z0 and t==t0" into xxxxreader.read(index).
Simone: we need some way of indexing the images using time and z. I am aware that, to implement the mapping, we need to get the information to build it from somewhere.
Well, IIOMetadata or some replacement of it is the way to go but, again, I think that at the very base level we should avoid embedding spatial info at the imageio level.
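A minimal sketch of the mapping Simone describes, turning a request for (z0, t0) into the plain integer passed to reader.read(index). The class LayerIndex and its methods are hypothetical names, not part of ImageIO, GeoTools or jiio-ext; time is assumed to be a long (for example seconds since some epoch), and the step that fills the table from metadata is not shown.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical lookup from a (z, t) pair to a flat image index. */
final class LayerIndex {

    /** Map key: one vertical level and one time step. */
    private static final class Key {
        final double z;
        final long t;

        Key(double z, long t) {
            this.z = z;
            this.t = t;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return Double.compare(z, k.z) == 0 && t == k.t;
        }

        @Override
        public int hashCode() {
            return Objects.hash(z, t);
        }
    }

    private final Map<Key, Integer> indexByKey = new HashMap<>();

    /** Registers the image index that holds the 2D layer at (z, t). */
    public void register(double z, long t, int imageIndex) {
        indexByKey.put(new Key(z, t), imageIndex);
    }

    /** Returns the image index for (z, t), or -1 if no such layer is known. */
    public int indexOf(double z, long t) {
        Integer index = indexByKey.get(new Key(z, t));
        return index == null ? -1 : index;
    }

    public static void main(String[] args) {
        LayerIndex index = new LayerIndex();
        index.register(0.0, 0L, 0);     // z = 0, first time step
        index.register(10.0, 0L, 1);    // z = 10, first time step
        index.register(0.0, 3600L, 2);  // z = 0, second time step
        // The result is what a caller would hand to reader.read(index).
        System.out.println(index.indexOf(10.0, 0L));
    }
}
```

Keeping this table outside the reader means the base plugin still only sees integers, which matches the requirement that the base level carry no spatial or temporal notions.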
What I am thinking about is something like this:
every single plugin we develop must have public methods only to access single layers in the ImageIO fashion, with no talk of spatial or temporal mappings and the like.
But we should then develop wrappers or specializations of these plugins, specific to geotools, that access the relevant info not available in the base implementations.
Simone: and they make it available to the gis user. At this level we could create new profiles of metadata to handle more info than just the basic one, and we could depend on geotools with no worries.
Just to give an example, let's take hdf. HDF is a generic container, you can put whatever you want inside it and that's basically what people do!
I don't think we should try the approach of putting a lot of logic in the first base layer.
What I would do is build something powerful but not too smart to access the single 2d layers with all their bands, and just forget about the other info like strange metadata and the like.
Simone: but expose methods and objects to access them for the next layer of the HDF plugin, the one that is spatially aware.
I would probably make good use of protected methods and objects, so that the next layer would be built by inheritance and information would be exposed as little as possible to avoid information explosion (too much info is worse than not enough).
Am I talking nonsense?
Daniele: nope. I understand and agree. Thank you very much for your suggestions
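A rough sketch of the two-layer idea discussed above: a base reader whose public API is flat, integer-indexed access, plus a spatially aware subclass built by inheritance. All class and method names here are hypothetical, the data is held in memory instead of being read from a real HDF file, and the attribute name "crs_wkt" is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical base layer: flat access to 2D layers, no spatial or temporal notions. */
class FlatLayerReader {
    private final List<float[][]> layers = new ArrayList<>();
    private final List<Map<String, String>> attributes = new ArrayList<>();

    /** Adds one layer with its raw attributes; stands in for parsing a real file. */
    public void addLayer(float[][] data, Map<String, String> rawAttributes) {
        layers.add(data);
        attributes.add(new HashMap<>(rawAttributes));
    }

    /** Public API: how many layers there are. */
    public int getNumLayers() {
        return layers.size();
    }

    /** Public API: access to one layer by plain integer index. */
    public float[][] readLayer(int index) {
        return layers.get(index);
    }

    /**
     * Protected hook: a raw, unparsed attribute string. It is not part of the
     * public API; in Java it is visible to subclasses and to the same package.
     */
    protected String rawAttribute(int index, String name) {
        return attributes.get(index).get(name);
    }
}

/** Hypothetical spatially aware layer, built by inheritance on the base reader. */
class SpatialLayerReader extends FlatLayerReader {

    /** Returns the CRS of a layer as a WKT string, or null if none was stored. */
    public String getCrsWkt(int index) {
        return rawAttribute(index, "crs_wkt");
    }
}
```

In this arrangement only the subclass would need to depend on GeoTools (for example to parse the WKT), so the dependency would run in one direction only.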
Simone: what do you think? do you get the point? It is much like the options management in gdal.
You can have one class doing most of the job, at least the common part
Simone: and have the plugins handle the details of the access to the data because, in the end, the geospatial part would be pretty much the same
if we use a good level of indirection and abstraction
Simone: the only thing that should be tricky would be correctly handling complex metadata.
Daniele: Ok. I have taken some notes in order to proceed with the basic architectural design.
Simone: It would be nice to have your impression on Martin's work. I have not had a chance yet to look at it
Daniele: I took a look at it.
Essentially, in its current state (I'm talking about the NetCDF reader), there is a DefaultReader which accesses data following a rule like this:
the imageIndex refers to a specific variable contained in the data Source.
Z is used as a Band, x-y simply refers to the 2d region.
Currently, time is not handled.
Simone: different Z as a band could be a problem.
I don't mind not being bound to the current ogc-iso standards, but managing different Z levels as different bands does not seem correct.
I am thinking especially about a possible WCS implementation and WMS also.
I would rather access them separately.
Simone: I guess I'll have a better understanding once I have a chance to look at it myself next week.
by the way, this is something we have to think about in general:
a file could contain more than one single coverage
Simone: one example is a grib file that contains temperature, pressure and humidity in a certain spatiotemporal cube
or an HDF that contains modis radiance for 3 different areas. Each file would have multiple coverages.
Simone: again, at the base level, each 2d layer is simply a layer, but for the levels above we need to take that into account somehow, or at least we need to be aware of the problem.
you follow me?
Daniele: yes, in that case we need to add a step to the imageIndex-relation logic.
Simone: I think your gsoc should be focused on trying to propose solutions to these problems
What do you mean?
Daniele: just a simple additional computation step in the intermediate layer when determining the imageIndex... just that simple
Simone: again, I think that as far as talking about the base ImageIO level everything should be flat: no notion of time z level or even coverage.
Daniele: yes... I agree
Simone: above that, group by coverage, then by z and t.
do you agree?
Daniele: Yes. It will be a task of the intermediate level to understand which 2D layer to retrieve.
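The "additional computation step" in the intermediate level could look roughly like the sketch below: group first by coverage, then by t and z, and resolve everything to one flat image index. This assumes, purely for illustration, that each coverage occupies a contiguous run of images ordered by time step and then by z level; a real file need not be laid out this way. CoverageLayout and its fields are hypothetical names.

```java
/** Hypothetical description of one coverage stored as a contiguous run of 2D layers. */
final class CoverageLayout {
    final String name;
    final int firstIndex; // flat image index of the layer at (z = 0, t = 0)
    final int numZ;       // number of z levels per time step
    final int numT;       // number of time steps

    CoverageLayout(String name, int firstIndex, int numZ, int numT) {
        this.name = name;
        this.firstIndex = firstIndex;
        this.numZ = numZ;
        this.numT = numT;
    }

    /** Flat image index of the layer at z level zIndex and time step tIndex. */
    public int imageIndex(int zIndex, int tIndex) {
        if (zIndex < 0 || zIndex >= numZ || tIndex < 0 || tIndex >= numT) {
            throw new IndexOutOfBoundsException("z=" + zIndex + ", t=" + tIndex);
        }
        // Skip this coverage's offset, then whole time steps, then z levels.
        return firstIndex + tIndex * numZ + zIndex;
    }

    public static void main(String[] args) {
        // Two coverages in one file, each with 3 z levels and 4 time steps.
        CoverageLayout temperature = new CoverageLayout("temperature", 0, 3, 4);
        CoverageLayout pressure = new CoverageLayout("pressure", 12, 3, 4);
        System.out.println(temperature.imageIndex(2, 1)); // 0 + 1 * 3 + 2 = 5
        System.out.println(pressure.imageIndex(1, 2));    // 12 + 2 * 3 + 1 = 19
    }
}
```

The base reader never sees the coverage name, z or t; it only receives the integer this level computes.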
Simone: I think that if we do this layering in the correct way we would achieve a lot, and we could reuse Martin's work on GridCoverage2D, which is actually pretty cool
Simone: exactly. good to see that we are on the same line... well, if we weren't you wouldn't get the GSOC money
Simone: anything else to discuss?
Daniele: Not really... thx
Daniele: Ok. Thanks for your explanations.