...
To enable probabilistic processing, include the probabilistic feature, set the StandardProbabilistic transformation configuration (#TRANSCFG StandardProbabilistic), and use the probabilistic metadata (#METADATA Probabilistic) in your Odysseus script.
Estimating probabilistic values
ToDo:
Expectation Maximization
The EM operator allows you to fit a Gaussian mixture model (GMM) with a predefined number of mixtures to the values of a data stream.
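To illustrate what fitting a GMM with a fixed number of mixtures involves, here is a minimal sketch of expectation maximization for one-dimensional values in plain Python. It is not the Odysseus EM operator: the function name `fit_gmm_1d`, the quantile-based initialisation, and the fixed iteration count are assumptions made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(values, k, iterations=50):
    """Fit a k-component 1D Gaussian mixture with EM.

    Returns (weights, means, variances). Hypothetical helper, not an Odysseus API.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: spread the initial means over the sorted data.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    variances = [overall_var] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each mixture component for each value.
        resp = []
        for x in values:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(spread, 1e-6)  # floor keeps the density well defined
    return weights, means, variances


# Example: two well-separated clusters of sensor readings.
samples = [0.9, 1.0, 1.1, 1.2, 0.8, 5.0, 5.1, 4.9, 5.2, 4.8]
weights, means, variances = fit_gmm_1d(samples, k=2)
print(weights, means, variances)
```

The number of mixtures `k` is fixed up front, which mirrors the "predefined number of mixtures" mentioned above.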
Kalman Filter
The Kalman Filter operator can be used if the variance of the values in the data stream is known, for example from a datasheet.
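The role of the known variance can be seen in a minimal scalar Kalman filter sketch. This is plain Python for illustration and not the Odysseus operator; the function name `kalman_1d`, the constant-state model, and the default prior are assumptions.

```python
def kalman_1d(measurements, measurement_var, process_var=0.0,
              initial_mean=0.0, initial_var=1e6):
    """Scalar Kalman filter for a (nearly) constant value.

    measurement_var is the sensor variance, e.g. taken from a datasheet.
    Returns a list of (mean, variance) estimates, one per measurement.
    """
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, uncertainty grows by process noise.
        var += process_var
        # Update: blend prediction and measurement according to their variances.
        gain = var / (var + measurement_var)
        mean += gain * (z - mean)
        var *= (1.0 - gain)
        estimates.append((mean, var))
    return estimates


# Example: a sensor with a documented variance of 0.04.
for mean, var in kalman_1d([9.8, 10.2, 10.1, 9.9], measurement_var=0.04):
    print(mean, var)
```

Each estimate is a mean together with a variance, which is the kind of continuous probabilistic value the later sections operate on.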
Filtering probabilistic values
For filtering probabilistic values you can use the same syntax that you already use for deterministic values. However, the results of the operators differ. In the case of discrete probabilistic values, the Select operator returns a tuple with a lower tuple existence probability.
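The effect on the tuple existence probability can be sketched in plain Python. This is an illustration under assumptions, not the Odysseus implementation: the helper `select_discrete` is hypothetical, and renormalising the remaining values is one possible design that the source does not confirm.

```python
def select_discrete(distribution, existence, predicate):
    """Filter a discrete probabilistic value.

    distribution: dict mapping value -> probability (sums to 1).
    existence: tuple existence probability before the selection.
    Returns (filtered distribution, new existence), or None if no value passes.
    """
    kept = {v: p for v, p in distribution.items() if predicate(v)}
    mass = sum(kept.values())
    if mass == 0.0:
        return None  # no value satisfies the predicate, the tuple is dropped
    # Assumption: remaining probabilities are renormalised and the
    # discarded mass is moved into the tuple existence probability.
    renormalised = {v: p / mass for v, p in kept.items()}
    return renormalised, existence * mass


x = {0.5: 0.25, 2.0: 0.5, 3.5: 0.25}
print(select_discrete(x, 1.0, lambda v: 1.0 < v < 3.0))
print(select_discrete(x, 1.0, lambda v: 1.0 < v < 4.0))
```

With the wider predicate more of the probability mass passes the filter, so the existence probability is reduced less.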
...
```
filteroutput = SELECT({predicate = RelationalPredicate('x > 1.0 AND x < 3.0')}, input)
```
...
```
filteroutput = SELECT({predicate = RelationalPredicate('x > 1.0 AND x < 4.0')}, input)
```
...
```
SELECT * FROM input1, input2 WHERE input1.x = input2.y;
```
...
Now that you know how to filter and join probabilistic values, you probably want to perform mathematical operations on them. To do so, you can use the algebraic operators (+, *, -, /, ^) on probabilistic values in, for example, a Map operator. Note that when multiplication or division is applied to continuous probabilistic values, the result is estimated by fitting Gaussian mixture models to the resulting distribution.
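For discrete probabilistic values, an algebraic operator can be understood as combining every pair of possible values. The following plain Python sketch shows this for two independent discrete values; the helper `combine_discrete` and the independence assumption are illustrative and not taken from Odysseus.

```python
import operator
from collections import defaultdict


def combine_discrete(left, right, op):
    """Apply a binary operator to two independent discrete distributions.

    left, right: dicts mapping value -> probability.
    Returns the distribution of op(left, right).
    """
    result = defaultdict(float)
    for a, pa in left.items():
        for b, pb in right.items():
            # Independence assumption: joint probability is the product.
            result[op(a, b)] += pa * pb
    return dict(result)


x = {1.0: 0.5, 2.0: 0.5}
y = {1.0: 0.5, 2.0: 0.5}
print(combine_discrete(x, y, operator.add))
print(combine_discrete(x, y, operator.mul))
```

Different value pairs that produce the same result, such as 1.0 + 2.0 and 2.0 + 1.0, have their probabilities summed.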
...
Code Block: Algebraic operator on probabilistic discrete values
Mathematical Functions
Int(Distribution, Lower Limit, Upper Limit)
Estimates the probability of a multivariate normal distribution between the lower and upper integration limits.
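For the univariate special case, this probability has a closed form in terms of the normal cumulative distribution function. The sketch below is plain Python for illustration only; the function names are assumptions, and the multivariate case handled by Int requires numerical integration that is not shown here.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a univariate normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def interval_probability(mean, std, lower, upper):
    """Probability that a normal value lies between lower and upper."""
    return normal_cdf(upper, mean, std) - normal_cdf(lower, mean, std)


# Probability mass of a standard normal within one standard deviation.
print(interval_probability(0.0, 1.0, -1.0, 1.0))
```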
Statistical Functions
Similarity(Distribution, Distribution)
Calculates the Bhattacharyya distance between two distributions.
```
SELECT similarity(as2DVector(x1,y1), as2DVector(x2,y2)) FROM stream
```
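For two univariate normal distributions the Bhattacharyya distance has a simple closed form, sketched here in plain Python. This only illustrates the measure; the function name is an assumption, and the Similarity function itself works on the distributions held in the stream, such as the 2D vectors in the example above.

```python
import math


def bhattacharyya_normal(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two univariate normal distributions."""
    # Term driven by the difference of the means.
    mean_term = 0.25 * (mean1 - mean2) ** 2 / (var1 + var2)
    # Term driven by the difference of the variances.
    var_term = 0.5 * math.log((var1 + var2) / (2.0 * math.sqrt(var1 * var2)))
    return mean_term + var_term


print(bhattacharyya_normal(0.0, 1.0, 0.0, 1.0))  # identical distributions
print(bhattacharyya_normal(0.0, 1.0, 2.0, 1.0))  # same variance, shifted mean
```

The distance is zero for identical distributions and grows as the means or variances move apart.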
Distance(Distribution, Value)
Calculates the Mahalanobis distance between the distribution and the value. The value can be a scalar value or a vector.
```
SELECT distance(as3DVector(x, y, z), [1.0;2.0;3.0]) FROM stream
```
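The Mahalanobis distance scales each deviation from the mean by the spread of the distribution. The following plain Python sketch covers only the simplified case of a diagonal covariance matrix (independent attributes); the function name and that restriction are assumptions made to keep the example short.

```python
import math


def mahalanobis_diagonal(means, variances, value):
    """Mahalanobis distance for a normal distribution with diagonal covariance.

    means, variances: per-dimension mean and variance of the distribution.
    value: the point to compare, with the same number of dimensions.
    """
    if not (len(means) == len(variances) == len(value)):
        raise ValueError("all arguments must have the same dimension")
    squared = sum((x - m) ** 2 / v for x, m, v in zip(value, means, variances))
    return math.sqrt(squared)


# A 3D distribution centred at the origin, compared with the point (1, 2, 3).
print(mahalanobis_diagonal([0.0, 0.0, 0.0], [1.0, 4.0, 9.0], [1.0, 2.0, 3.0]))
```

With a single dimension this reduces to the absolute deviation divided by the standard deviation, which corresponds to the scalar case mentioned above.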
Datatype Functions
as2DVector(Object, Object)
Converts the two objects into a 2D vector.
as3DVector(Object, Object, Object)
Similar to the as2DVector function, this function creates a 3D vector with the given objects.
...
Access to tuple existence
...
```
filteroutput = ExistenceToPayload(SELECT({predicate = RelationalPredicate('x > 1.0 AND x < 4.0')}, input))
```