Reading data from a data store requires the following steps:
- get the name of the feature type you want to read by calling the getTypeNames() method;
- get a FeatureSource that allows you to read that feature type;
- perform a Query on the FeatureSource and get a FeatureResults object;
- ask for a FeatureReader and iterate over the results.
If that sounds way too complicated, you may ask the DataStore for a FeatureReader directly. Be aware, though, that by doing so you may miss out on caching and other features that are specific to FeatureSource and FeatureResults.
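The four steps above can be sketched as follows. The interfaces in this block are small local stand-ins that reuse the names from the text; they are not the GeoTools types. Features are plain strings, getFeatures() takes no Query, and apart from getTypeNames() the method names are assumptions chosen for illustration, so check them against the GeoTools version you are using.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins that reuse the names from the text.
// They are NOT the GeoTools interfaces; they only model the call chain.
interface FeatureReader {
    boolean hasNext();
    String next();      // assumption: a feature is just a String here
    void close();
}

interface FeatureResults {
    FeatureReader reader();
}

interface FeatureSource {
    FeatureResults getFeatures();   // assumption: no Query argument in this sketch
}

interface DataStore {
    String[] getTypeNames();
    FeatureSource getFeatureSource(String typeName);
}

final class ReadStepsSketch {

    // Builds an in-memory store that exposes a single feature type, "roads".
    static DataStore inMemoryStore(final List<String> features) {
        return new DataStore() {
            @Override public String[] getTypeNames() {
                return new String[] {"roads"};
            }
            @Override public FeatureSource getFeatureSource(String typeName) {
                if (!"roads".equals(typeName)) {
                    throw new IllegalArgumentException("Unknown type: " + typeName);
                }
                // Each call to reader() starts a fresh pass over the data.
                FeatureResults results = () -> readerOver(features.iterator());
                return () -> results;
            }
        };
    }

    // Wraps an iterator so that features are handed out one at a time.
    static FeatureReader readerOver(final Iterator<String> it) {
        return new FeatureReader() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public String next() { return it.next(); }
            @Override public void close() { /* nothing to release in memory */ }
        };
    }

    // Follows the four steps: type name, source, results, reader.
    static List<String> readAll(DataStore store) {
        String typeName = store.getTypeNames()[0];                // step 1
        FeatureSource source = store.getFeatureSource(typeName);  // step 2
        FeatureResults results = source.getFeatures();            // step 3
        FeatureReader reader = results.reader();                  // step 4
        List<String> seen = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                seen.add(reader.next());
            }
        } finally {
            reader.close();   // always release the reader, even on failure
        }
        return seen;
    }

    public static void main(String[] args) {
        DataStore store = inMemoryStore(Arrays.asList("roads.1", "roads.2"));
        System.out.println(readAll(store));
    }
}
```

Closing the reader in a finally block matters more with a real data store, where an open reader usually holds a file handle or a database connection.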
A Query object is a wrapper around a Filter that provides more capabilities than simple filtering:
- select which attributes you want to load by means of the getPropertyNames() method;
- limit the number of resulting features by means of the getMaxFeatures() method;
- change or override the coordinate system by means of the CoordinateSystemReproject and CoordinateSystem properties (unfortunately, this is not implemented at the moment).
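To make the idea of "a filter plus extra capabilities" concrete, here is a minimal sketch. QuerySketch is a hypothetical class written for this page, not the GeoTools Query: features are modelled as plain maps, the filter as a java.util.function.Predicate, and only the attribute selection and the feature limit are covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for the idea of a Query; not the GeoTools class.
final class QuerySketch {
    private final Predicate<Map<String, Object>> filter;
    private final List<String> propertyNames;
    private final int maxFeatures;

    QuerySketch(Predicate<Map<String, Object>> filter,
                List<String> propertyNames,
                int maxFeatures) {
        this.filter = filter;
        this.propertyNames = propertyNames;
        this.maxFeatures = maxFeatures;
    }

    List<String> getPropertyNames() { return propertyNames; }

    int getMaxFeatures() { return maxFeatures; }

    // Applies the filter, stops at the feature limit, and keeps only the
    // requested attributes of each matching feature.
    List<Map<String, Object>> apply(List<Map<String, Object>> features) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> feature : features) {
            if (out.size() >= maxFeatures) {
                break;
            }
            if (!filter.test(feature)) {
                continue;
            }
            Map<String, Object> selected = new LinkedHashMap<>();
            for (String name : propertyNames) {
                selected.put(name, feature.get(name));
            }
            out.add(selected);
        }
        return out;
    }

    // Builds one sample feature (made-up data for the demo).
    static Map<String, Object> feature(String name, int lanes, String surface) {
        Map<String, Object> f = new LinkedHashMap<>();
        f.put("name", name);
        f.put("lanes", lanes);
        f.put("surface", surface);
        return f;
    }

    static List<Map<String, Object>> sampleRoads() {
        return Arrays.asList(
                feature("A1", 4, "asphalt"),
                feature("B7", 2, "gravel"),
                feature("A2", 6, "asphalt"));
    }

    public static void main(String[] args) {
        // Roads with at least four lanes, "name" attribute only, at most one result.
        QuerySketch query = new QuerySketch(
                f -> (Integer) f.get("lanes") >= 4, Arrays.asList("name"), 1);
        System.out.println(query.apply(sampleRoads()));   // prints [{name=A1}]
    }
}
```

Two roads pass the filter, but the limit of one stops the scan after the first match, and the attribute selection strips everything except the name.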
FeatureSource and FeatureResults also add the following functionality:
- feature modification listeners (FeatureSource);
- bounds computation and possibly caching (FeatureSource and FeatureResults);
- direct loading of all the results into a feature collection (FeatureResults).
An example: the shapefile reading tool
Now let's have a look at a simple example. The ShapeReader class, attached to this page, has just a simple main method that:
- asks the user for a shapefile (and falls back on a shapefile contained in the classpath if none is provided);
- creates a shapefile datastore from the URL and retrieves the single feature type in this datastore;
- prints the name and type of each of the feature type's non-geometric attributes as a header;
- prints every feature's id and non-geometric attributes;
- and finally prints every feature's id and geometry in WKT format.
As you can see, the code uses the FeatureReader interface, which loads only one feature at a time, instead of loading the whole feature collection with FeatureResults.collection(). This means that the shapefile is read twice, but it also means that this little program can work with shapefiles of any size without running out of memory.
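The two-pass, one-feature-at-a-time pattern can be illustrated without GeoTools. The sketch below is not the attached ShapeReader class: TwoPassSketch and Row are hypothetical names, and a Supplier of iterators stands in for "open a new reader on the shapefile".

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

final class TwoPassSketch {

    // Hypothetical feature: an id, one attribute, and a geometry as WKT text.
    static final class Row {
        final String id;
        final String name;
        final String wkt;

        Row(String id, String name, String wkt) {
            this.id = id;
            this.name = name;
            this.wkt = wkt;
        }
    }

    // Streams the source twice. Only the current row is referenced at any
    // time, so memory use does not grow with the number of rows.
    static String report(Supplier<Iterator<Row>> openReader) {
        StringBuilder out = new StringBuilder();

        Iterator<Row> attributes = openReader.get();   // first pass: attributes
        while (attributes.hasNext()) {
            Row row = attributes.next();
            out.append(row.id).append('\t').append(row.name).append('\n');
        }

        Iterator<Row> geometries = openReader.get();   // second pass: geometries
        while (geometries.hasNext()) {
            Row row = geometries.next();
            out.append(row.id).append('\t').append(row.wkt).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In-memory sample data; a real reader would pull rows from disk.
        List<Row> rows = Arrays.asList(
                new Row("roads.1", "A1", "POINT (1 2)"),
                new Row("roads.2", "B7", "POINT (3 4)"));
        System.out.print(report(rows::iterator));
    }
}
```

The cost of this design is the second scan of the source; the benefit is that nothing but the current row has to fit in memory.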
Of course, if you are going to process different attributes in different ways, you can always build a Query that makes the datastore load only the attributes that you need. At the time of writing, however, this will not improve shapefile performance. Under the hood, the shapefile data store reads every attribute anyway and casts the features to a smaller feature type once they are in memory (so we still have some work to do on optimizing shapefile reading).