MEP Functions

MEP functions can be used to perform arbitrary operations on your data (e.g., mathematical operations, string operations, etc.). These functions can be used in different operators such as Map, Select, or Join. To implement your own MEP function, extend the AbstractFunction class, implement the getValue function that calculates the return value, and call the super constructor with the configuration of your MEP function. The configuration contains at least the symbol, the number of parameters, the accepted data types for the parameters, and the data type of the return value. In addition, the configuration can contain a flag that indicates whether the MEP function should be evaluated each time or can be treated as a constant, as well as the time and space complexity of the MEP function.

MEP Function stub

public class MyFunction extends AbstractFunction<Double> {

    // Accepted data types, one row per parameter
    public static final SDFDatatype[][] accTypes = new SDFDatatype[][] {
            { SDFDatatype.DOUBLE },
            { SDFDatatype.DOUBLE } };

    public MyFunction() {
        // symbol, number of parameters, accepted types, return type,
        // constant optimization flag, time complexity, space complexity
        super("myFunction", 2, accTypes, SDFDatatype.DOUBLE, true, 3, 5);
    }

    @Override
    public Double getValue() {
        double a = (double) this.getInputValue(0);
        double b = this.getNumericalInputValue(1);

        return a + b;
    }

}

In this example, a MEP function is defined with the symbol myFunction. It can be used in a predicate or map expression with two parameters of type Double and returns a value of type Double, namely the sum of the two input parameters, as defined by the implementation of the getValue() function. Furthermore, the MEP function can be optimized if the two input parameters are constant: the result is then calculated only once. In addition, the MEP function declares a time complexity score of 3 and a space complexity score of 5, which means that during optimization the optimizer will try to place the call of myFunction before any other MEP function with a higher time or space complexity and behind any MEP function with a lower time or space complexity.

Access to function attributes

To access the attributes of the function, you can use the getInputValue or the getNumericalInputValue method. While the first method returns an object, the second already casts the input value to a double value. Both methods take the position index of the attribute as an argument. The name of the function, the total number of attributes, and the data types of the accepted attributes are set in the constructor. Thus, a MEP function can handle multiple data types for each attribute.
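The difference between the two accessors matters when a parameter accepts several numeric types. The following is a minimal sketch, not the Odysseus implementation: the class InputAccessSketch and its field are hypothetical stand-ins, and it assumes that the numeric accessor converts its input through java.lang.Number.

```java
// Hypothetical stand-in for a function's input handling; not an Odysseus class.
final class InputAccessSketch {
    private final Object[] inputs;

    InputAccessSketch(Object... inputs) {
        this.inputs = inputs;
    }

    // Returns the raw input object at the given position.
    Object getInputValue(int pos) {
        return inputs[pos];
    }

    // Assumption: numeric access converts any Number to a double.
    double getNumericalInputValue(int pos) {
        return ((Number) inputs[pos]).doubleValue();
    }

    public static void main(String[] args) {
        // The first input is boxed as an Integer, the second as a Double.
        InputAccessSketch f = new InputAccessSketch(2, 1.5);
        System.out.println(f.getNumericalInputValue(0) + f.getNumericalInputValue(1)); // prints 3.5
        // (double) f.getInputValue(0) would throw a ClassCastException here,
        // because an Integer cannot be cast to a Double.
    }
}
```

Under this assumption, the numeric accessor is the safer choice whenever a parameter accepts more than one numeric data type, while the object accessor is needed for non-numeric inputs.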

Access to meta attributes
To access the meta attributes of an incoming streaming object you can use the getMetaAttribute function.

Access to additional content

To access the additional content of an incoming streaming object, you can use the getAdditionalContents method to access all contents. If you only want to access a specific field, you can call the getAdditionalContent(fieldName) method.

Support for optimization
The MEP optimizer tries to determine whether an expression is a constant and therefore does not need to be evaluated each time. For this, the getValue method is called. This behavior can be changed by setting the fifth parameter in the constructor to false. To support the optimization of predicates, the time and space complexity can be set in the constructor as the last two parameters. Both values should be in the range from 0 to 9, depending on their average expected complexity. Depending on the values, the MEP function will be placed differently in the resulting optimized predicate.
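The idea behind the constant check can be sketched as follows. This is only an illustration under stated assumptions: all types below (Expr, Const, Attribute, Call) are hypothetical and not Odysseus classes, and the field optimizeConstants merely plays the role of the fifth constructor parameter.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical expression model; none of these types are Odysseus classes.
final class ConstantFoldingSketch {

    interface Expr {
        double eval();
        boolean isConstant();
    }

    static final class Const implements Expr {
        final double value;
        Const(double value) { this.value = value; }
        public double eval() { return value; }
        public boolean isConstant() { return true; }
    }

    // An attribute whose value can change with every stream element.
    static final class Attribute implements Expr {
        double current;
        public double eval() { return current; }
        public boolean isConstant() { return false; }
    }

    static final class Call implements Expr {
        final DoubleBinaryOperator fn;
        final Expr left;
        final Expr right;
        final boolean optimizeConstants; // stands in for the fifth constructor parameter

        Call(DoubleBinaryOperator fn, Expr left, Expr right, boolean optimizeConstants) {
            this.fn = fn;
            this.left = left;
            this.right = right;
            this.optimizeConstants = optimizeConstants;
        }

        public double eval() { return fn.applyAsDouble(left.eval(), right.eval()); }

        // A call is constant only if folding is allowed and both inputs are constant.
        public boolean isConstant() {
            return optimizeConstants && left.isConstant() && right.isConstant();
        }
    }

    // Evaluates a constant expression once and replaces it with its value.
    static Expr optimize(Expr e) {
        return e.isConstant() ? new Const(e.eval()) : e;
    }
}
```

In this sketch, a call with two constant inputs is replaced by its result when the flag is true and is left untouched when the flag is false, for example for a function whose result depends on something other than its inputs.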

Imagine the following scenario with a predicate expression that checks whether an attribute x holds a value higher than the return value of a function called lastPrimeNumber(x) (which might estimate the largest prime number lower than or equal to x), higher than the result of the function defined above, or higher than 0:

(x > lastPrimeNumber(x)) || (x > myFunction(y, z)) || (x > 0)
During predicate optimization, the optimizer checks the complexity values and reorders the terms according to their values, resulting in an optimized predicate as follows:

(x > 0) || (x > myFunction(y, z)) || (x > lastPrimeNumber(x))
Here, the last term comparing the value of x with 0 is moved to the front and the more expensive function is moved to the tail.
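The reordering can be illustrated with a small sketch. It is not the optimizer's actual code: the classes Term and PredicateReorderSketch are hypothetical, the scores for lastPrimeNumber (8 and 2) and for the comparison with 0 (0 and 0) are assumed for illustration, and sorting by time complexity first and space complexity second is an assumption, since the way the two scores are combined is not specified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for one term of a disjunction; not an Odysseus class.
final class Term {
    final String expression;
    final int timeComplexity;  // 0-9
    final int spaceComplexity; // 0-9

    Term(String expression, int timeComplexity, int spaceComplexity) {
        this.expression = expression;
        this.timeComplexity = timeComplexity;
        this.spaceComplexity = spaceComplexity;
    }
}

final class PredicateReorderSketch {

    // Returns a new list with cheaper terms first. The sort is stable,
    // so terms with equal scores keep their original order.
    static List<Term> reorder(List<Term> terms) {
        List<Term> sorted = new ArrayList<>(terms);
        sorted.sort(Comparator.comparingInt((Term t) -> t.timeComplexity)
                .thenComparingInt(t -> t.spaceComplexity));
        return sorted;
    }

    // Joins the terms into a disjunction.
    static String render(List<Term> terms) {
        StringBuilder sb = new StringBuilder();
        for (Term t : terms) {
            if (sb.length() > 0) {
                sb.append(" || ");
            }
            sb.append('(').append(t.expression).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Term> terms = Arrays.asList(
                new Term("x > lastPrimeNumber(x)", 8, 2), // assumed scores
                new Term("x > myFunction(y, z)", 3, 5),   // scores from the stub above
                new Term("x > 0", 0, 0));                 // assumed scores
        System.out.println(render(reorder(terms)));
        // prints (x > 0) || (x > myFunction(y, z)) || (x > lastPrimeNumber(x))
    }
}
```

Because a disjunction can stop at the first term that evaluates to true, placing cheap terms first means the expensive call is reached only when the cheaper checks fail.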

As a rule of thumb, MEP functions with logarithmic complexity should have a value between 0 and 3, those with linear complexity a value between 4 and 6, and those with exponential complexity a value between 7 and 9. However, this is just a first draft. The basic idea is to evaluate cheap functions first and avoid expensive functions if possible.
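The rule of thumb can be written down as a small lookup. The enum ComplexityClass below is a hypothetical helper for illustration only and is not part of Odysseus.

```java
// Hypothetical helper that encodes the rule of thumb; not part of Odysseus.
enum ComplexityClass {
    LOGARITHMIC(0, 3),
    LINEAR(4, 6),
    EXPONENTIAL(7, 9);

    final int min;
    final int max;

    ComplexityClass(int min, int max) {
        this.min = min;
        this.max = max;
    }

    // Returns the class whose range contains the given score (0-9).
    static ComplexityClass of(int score) {
        for (ComplexityClass c : values()) {
            if (score >= c.min && score <= c.max) {
                return c;
            }
        }
        throw new IllegalArgumentException("score must be between 0 and 9: " + score);
    }
}
```

With this mapping, the scores 3 and 5 used for myFunction fall into the logarithmic band for time and the linear band for space.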
