Working with GML

From OpenJUMP Wiki

An extract from the JUMP User Guide:

Section: APPENDIX: GML INPUT & OUTPUT TEMPLATES

To read a general GML file, JUMP makes use of a mapping file called a GML input template. This template specifies how the contents of the GML file are mapped to JUMP features. Similarly, when saving a GML file, JUMP uses a GML output template to specify the structure of the output GML. The following sections explain how to create GML input and output templates.

Note: You don’t need to create input or output templates for JUMP GML files. For more information, see Sections 4.1 Loading A Layer and 4.2 Saving A Layer.

12.1 WRITING A GML INPUT TEMPLATE

GML Input Templates are able to extract a single FeatureCollection of features from a GML file. Attribute values for each feature can be extracted from the GML describing the feature in a variety of ways. Listing 12-1 below shows an example of an input template (the column definitions are omitted and will be discussed later).

<?xml version='1.0' encoding='UTF-8'?>
<JCSGMLInputTemplate>
<CollectionElement>dataFeatures</CollectionElement>
<FeatureElement>Feature</FeatureElement>
<GeometryElement>gml:polygonProperty</GeometryElement>
<ColumnDefinitions>
.............
</ColumnDefinitions>
</JCSGMLInputTemplate>

Listing 12-1 – Example of an input template (column definitions omitted)

The input template begins by specifying the GML document’s collection element (dataFeatures) and the feature element (Feature). This information tells JUMP how to identify each feature in the GML document.
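As a rough illustration, a GML document that the template in Listing 12-1 could read might look like the sketch below. The element names dataFeatures, Feature and gml:polygonProperty come from the listing; the rainfall element, the coordinate values and the namespace declaration are assumptions made for this example and are not taken from the User Guide.

```xml
<?xml version='1.0' encoding='UTF-8'?>
<!-- Hypothetical input document; only the element names come from Listing 12-1 -->
<dataFeatures xmlns:gml="http://www.opengis.net/gml">
  <!-- Each Feature element becomes one JUMP feature -->
  <Feature>
    <!-- Non-spatial attribute (assumed name and value) -->
    <rainfall>12.5</rainfall>
    <!-- Geometry element named in the template's GeometryElement tag -->
    <gml:polygonProperty>
      <gml:Polygon>
        <gml:outerBoundaryIs>
          <gml:LinearRing>
            <gml:coordinates>0,0 10,0 10,10 0,10 0,0</gml:coordinates>
          </gml:LinearRing>
        </gml:outerBoundaryIs>
      </gml:Polygon>
    </gml:polygonProperty>
  </Feature>
</dataFeatures>
```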

Next, the geometry element and column definitions are given. These specify the spatial and non-spatial attributes of each feature. They name child elements of the feature element (e.g. gml:polygonProperty is a child element of Feature). If there is more than one kind of geometry element in the file (e.g. Polygons and MultiPolygons), you can specify multiple GeometryElement tags. Note, however, that JUMP still assumes that each feature has only one geometry. If a feature is found to have more than one geometry, only the last one is read. Within the ColumnDefinitions tags is one column tag for each of the feature’s non-spatial attributes. Listing 12-2 below shows an example of a column definition.


<column>
<name>Rainfall</name>
<type>DOUBLE</type>
<valueelement elementname="rainfall"/>
<valuelocation position="body"/>
</column>

Listing 12-2 – Example of a column definition

name is the name that you want the column to have in JUMP. type may be STRING, INTEGER, DOUBLE, or DATE. (JUMP can identify and parse a variety of date formats, but the recommended format for your data is yyyy-mm-dd.) valueelement tells JUMP how to find the XML element containing the column value. In the example, the element is the one named “rainfall”. In some cases there may be multiple elements with the same name in the GML for a feature. To handle these cases, elements may be identified more precisely by providing a combination of the following attributes:

  • elementname: the name of the element (required)
  • attributename: the name of an attribute on the element (optional)
  • attributevalue: the value of the given attribute (optional)

valuelocation tells JUMP how to extract the actual value of the column from the identified element. In the example the value is being extracted from the body of the element. The template also supports specifying that the value is located as the value of an attribute of the element by using the attributeName attribute:

<valuelocation position="attribute" attributeName="average-rainfall"/>
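
To show how these attributes might combine, the sketch below assumes a hypothetical feature whose values all sit in elements named property, distinguished by a name attribute, with the actual value stored in a value attribute. The element and attribute names in the GML fragment are invented for this example; the template tags and attributes are the ones described above.

```xml
<!-- Hypothetical GML for one feature: two elements share the name "property" -->
<Feature>
  <property name="station" value="A17"/>
  <property name="rainfall" value="12.5"/>
</Feature>
```

A column definition that picks out the rainfall value could then look like this:

```xml
<column>
  <name>Rainfall</name>
  <type>DOUBLE</type>
  <!-- Select the property element whose name attribute equals "rainfall" -->
  <valueelement elementname="property" attributename="name" attributevalue="rainfall"/>
  <!-- Read the column value from that element's value attribute -->
  <valuelocation position="attribute" attributeName="value"/>
</column>
```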

12.2 WRITING A GML OUTPUT TEMPLATE

Output templates are literally a “template” for the text in the desired output file. They consist of constant GML markup, together with symbols which will be replaced by the geometry and attribute information in a JUMP feature collection. This allows complete flexibility in the GML produced. (In fact, the output does not have to be GML at all, although the only Geometry output format currently supported is GML). GML output templates have the following structure:

Header Section
<%FEATURE%>
Feature-Definition Section
<%ENDFEATURE%>
Footer Section

Listing 12-3 – Output-template structure

The header and footer sections can contain arbitrary GML markup or data. They will appear at the beginning and end of the output GML file. They contain the opening and closing markup for the GML file elements as well as any elements which open and close the GML FeatureCollection. The Feature-Definition Section can contain arbitrary GML markup, as well as special output template tags. In the output GML the template tags will be replaced by the actual data for the geometry and attributes of a JUMP feature collection. The Feature-Definition Section will be repeated once for each feature in the JUMP feature collection. The supported output template tags are given below.

Table 12-1 – Special output-template tags

TAG DESCRIPTION
<%=COLUMN columnname%> Inserts the value of the attribute named columnname.
<%=GEOMETRY%> Inserts a GML representation of the geometry.

An example of an output template is given in Listing 12-4 below.

<?xml version='1.0' encoding='UTF-8'?>
<dataset>
<%FEATURE%>
<Feature>
<property name="FID"><%=COLUMN fid%></property>
<property name="DESCRIPTION">
<%=COLUMN description%>
</property>
<GEOMETRY>
<%=GEOMETRY%>
</GEOMETRY>
</Feature>
<%ENDFEATURE%>
</dataset>

Listing 12-4 – Example of an output template.
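
As a further sketch, an output template could mirror the structure expected by the input template in Listings 12-1 and 12-2, so that a saved file can be read back with that input template. This assumes that the layer has a column named Rainfall, that <%=GEOMETRY%> emits a bare GML geometry element using the gml prefix, and that the gml namespace therefore needs to be declared in the header. None of these points is confirmed by the User Guide.

```xml
<?xml version='1.0' encoding='UTF-8'?>
<!-- Header section: opens the collection element and declares the gml prefix (assumed) -->
<dataFeatures xmlns:gml="http://www.opengis.net/gml">
<%FEATURE%>
<Feature>
<rainfall><%=COLUMN Rainfall%></rainfall>
<gml:polygonProperty>
<%=GEOMETRY%>
</gml:polygonProperty>
</Feature>
<%ENDFEATURE%>
<!-- Footer section: closes the collection element -->
</dataFeatures>
```

The comments are placed only in the header and footer because anything inside the Feature-Definition Section is repeated for every feature.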

Note: If you open a GML or FME GML file then save it again in the same format, some information may be lost. The Workbench preserves only the information it uses:

  • one spatial attribute for each feature (the “geometry”)
  • some non-spatial attributes for each feature (strings, dates and numbers)

Any information that the Workbench does not use will not be present in the document that gets saved. Therefore, you should generally avoid using the Workbench to overwrite existing files, unless you are sure that you won’t need all of the information in the old file.

A note by Andreas

DeeJUMP provides a GML3-Input/Output function. I've not used it much, so I'm not sure how practical it is, but deegree is generally capable of "guessing" the types of simple properties if no schema is given.

http://wiki.deegree.org/deegreeWiki/deeJUMP

For people who like packaged versions (this is very unofficial stuff):

http://wiki.deegree.org/deegreeWiki/AndreasSchmitz