Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

The XML Feature contains operators that allow the processing of XML documents. Add de.uniol.inf.is.odysseus.server.xml.feature to your Odysseus in order to use this feature.

Table of Contents

Operators

ToXML

This operator allows the transformation from a Tuple object into a XML document. The transformation requieres a XML schema definition (XSD) which describes the structure of the XML document and the mapping between that document and the Tuple object. However, there are different approaches to provide that schema information to the operator:

  • rootElement
  • rootAttribute
  • xsdFile
  • xsdString
  • xsdAttribute 
  • xPathAttributes

Firstly, the schema can be provided via file access (whereas the file can be located either locally or on a webserver).

Code Block
transform = TOXML({
                rootelement = 'user',
                xsdfile = '../schema.xsd'              
              },
              input
            )

Secondly, the schema information can be passed via string.

Code Block
transform = TOXML({
                rootelement = 'user',
                xsdstring = '<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
							  <xs:element name="user">
							    <xs:complexType>
							      <xs:sequence>
							        <xs:element type="xs:string" name="nickname"/>
							        <xs:element type="xs:string" name="fname"/>
							        <xs:element type="xs:string" name="lname"/>
							        <xs:element type="xs:string" name="email"/>
							        <xs:element type="xs:byte" name="age"/>
							        <xs:element name="roles">
							          <xs:complexType>
							            <xs:sequence>
							              <xs:element type="xs:string" name="role"/>
							            </xs:sequence>
							          </xs:complexType>
							        </xs:element>
							      </xs:sequence>
							    </xs:complexType>
							  </xs:element>
							</xs:schema>'             
              },
              input
            )

Thirdly, the schema information can be a payload on the current Tuple object which can be advantageous if the XSD has to change dynamically (xsdattribute = 'xsd'). Thus, it is possible to produces different documents within one data stream. Besides that, the root element of the document might change and therefore it is possible to define an attribute of the Tuple object that carries this information as well (rootattribute = 'root').

Code Block
transform = TOXML({
                rootattribute = 'root',
                xsdattribute = 'xsd'           
              },
              input
            ) 

Furthermore, the operator allows to use XPath expressions for an enriched mapping between attributes of the Tuple object and the XML document. The expressions can be applied as options or again as a payload on the Tuple object (xpathattributes = ['xpaths']). If xpathattributes is used, each defined Tuple attribute can contain several expressions. Every expression is followed by an attribute name that has to match with an attribute of the Tuple object (e.g. [attributename][delimeter][expression]). These pairs are also separated by a delimiter (default is the , symbol). If the options alternative is used, the attribute-expression pair is given explicitly as option parameter.

Code Block
transform = TOXML({
                rootelement = 'user',
                xsdfile = '../schema.xsd',
                options = [
                  ['user_id', '/user/@id'],
                  ['role_1', '/user/roles/role[1]'],
                  ['role_2', '/user/roles/role[2]'],
                  ['role_3', '/user/roles/role[3]']
                ]                            
              },
              input
            )
Code Block
transform = TOXML({
                rootelement = 'user',
                xpathattributes = ['xpaths'],
                xsdfile = '../schema.xsd',
                options = [['delimeter', ';']]                                 
              },
			  input
            )

XMLToTuple

It is also possible to transform XML documents to Tuple objects whereas the mapping between attributes is done by XPath expressions and Tuple schema.

Code Block
transform1 = XMLTOTUPLE({
                  schema=[
                    ['nickname','String'],
                    ['fname', 'String'],
                    ['lname', 'String'],
                    ['email', 'String'],
                    ['role', 'String'],
                    ['age', 'Integer']
                  ],
                  expressions = [
                    ['nickname', '/user/nickname'],
                    ['fname', '/user/fname'],
                    ['lname', '/user/lname'],
                    ['email', '/user/email'],
                    ['role', '/user/roles/role'],
                    ['age', '/user/age']
                  ]              
                },
                input
              )

XMLEnrich

With this operator, it is possible to combine two different documents into one single document:

  • target
  • predicate

The operator functions as the Enrich operator that evaluates a given predicate and only merges the two documents if the predicate is true. In order to compare two documents, predicates can consist of XPath expressions that return a single value. The second parameter is the target parameter which specifies a location in the left document where the right document should be inserted. However, the right document is the enrichment for the left document.

Code Block
out = XMLENRICH({
          target = '/user',
          predicate = 'toString(/user/@id)==toString(/roles/@id)'
        },
        input,
        enrichment
      ) 

XMLMap

It is possible to use MEP Functions and XPath expression together within XML documents to manipulate values or project only a subset of the origin document (e.g. this operator works like Map operator).

Code Block
mapped = XMLMAP({
                expressions = [
                		['toLong({/user/age}) + 10000', '/user/age'],
                		['concat(toString({/user/fname}), "__XX01")', '/user/fname'],
                		['{/user/nickname}', '/user/nickname']
                ]
              },
              input
            )

XMLSplit

This operator splits a document in subsets whereas the each split point is defined by an XPath expression.


Code Block
splitted = XMLSPLIT({EXPRESSIONS = ['/user/roles', '/user/fname']}, input)

XMLTransform

The XMLTransform operator allows to transform a given XML with Extensible Stylesheet Language Transformations (XSLT):

  • xsltstring
  • xsltfile
  • dynamic

There are three different ways to provide the XSLT information. First, the information given by a string in the operator definition.

Code Block
trans1 = XMLTRANSFORM({
			xsltstring='<?xml version="1.0"?>
						<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
						    <xsl:template match="/">
						        <html>
						            <body>
						                <h2>My CD Collection</h2>
						                <table border="1">
						                    <tr bgcolor="#9acd32">
						                        <th>Title</th>
						                        <th>Artist</th>
						                    </tr>
						                    <xsl:for-each select="catalog/cd">
						                        <tr>
						                            <td><xsl:value-of select="title"/></td>
						                            <td><xsl:value-of select="artist"/></td>
						                        </tr>
						                    </xsl:for-each>
						                </table>
						            </body>
						        </html>
						    </xsl:template>
						</xsl:stylesheet>'
          },
          input
        ) 

Secondly, the XSLT can be read from file (whereas the file can be located either locally or on a web server).

Code Block
trans2 = XMLTRANSFORM({
			xsltfile = '../transform.xslt'
			},
			input	
		  )

And thirdly, the operator can be configured (dynamic = true) in a way that the XML document will be scanned for an embedded XSLT.

Code Block
trans3 = XMLTRANSFORM({
              dynamic = true			
            },
            input
          )

Wrapper

XML2 ProtocolHandler

This Protocol handler allows to load XML documents which will be represented as the KeyValueObject data type.

Code Block
inputXML2 = ACCESS({
                source='xml',
                protocol='XML2',
                transport='File',
                wrapper='GenericPull',
                datahandler='KeyValueObject',
                metaattribute = ['TimeInterval'],
                options = [
                  ['filename', '../document.xml']
                ]                                                                               
              }                                                                  

XML3 ProtocolHandler

This Protocol handler allows to load XML documents into the Odysseus system whereas the document will be represented by the XMLStreamObject data type. Furthermore, with the tag_to_strip option it is possible to shorten the XML document and thus process only a part of the document. The option awaits an element name that is present in the document.

Code Block
input = ACCESS({
            source='xml',
            protocol='XML3',
            transport='File',
            wrapper='GenericPull',
            datahandler='XMLStreamObject',
            options=[
              ['filename', '../document.xml'],
              ['tag_to_strip', 'user']
            ]                                                                                                                                                                                                   
          }                                                                                                                                                                 
        )