The probabilistic feature provides functions and operators to process discrete and continuous probabilistic values in a data stream. Continuous probabilistic values are represented using Gaussian Mixtures.


Filtering probabilistic values

To filter probabilistic values you can use the same syntax that you already use for deterministic values. However, the results of the operators differ. For discrete probabilistic values, the Select operator returns a tuple with a lower tuple existence probability.

Let's assume you have an attribute x that is 1.0 with probability 0.25, 2.0 with probability 0.25, and 3.0 with probability 0.5. The following Select operation filters the attribute so that the resulting value can only be instantiated to 2.0, and the resulting tuple existence is reduced to 0.25.

Probabilistic discrete select
filter = SELECT({predicate = RelationalPredicate('x > 1.0 AND x < 3.0')}, probabilistic:data)
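The arithmetic behind this example can be reproduced with a minimal Python sketch. It is not the operator's implementation: the function name select_discrete is hypothetical, and renormalizing the surviving values so that they sum to 1 is an assumption made for illustration only.

```python
def select_discrete(distribution, predicate, existence=1.0):
    """Filter a discrete distribution given as {value: probability}.

    Hypothetical helper, not part of the probabilistic feature.
    Returns the surviving values and the reduced tuple existence.
    """
    # Keep only the values that satisfy the predicate.
    kept = {v: p for v, p in distribution.items() if predicate(v)}
    # The probability mass that survives scales the tuple existence.
    mass = sum(kept.values())
    new_existence = existence * mass
    # Assumption: the surviving values are renormalized to sum to 1.
    normalized = {v: p / mass for v, p in kept.items()} if mass > 0 else {}
    return normalized, new_existence


x = {1.0: 0.25, 2.0: 0.25, 3.0: 0.5}
values, existence = select_discrete(x, lambda v: 1.0 < v < 3.0)
print(values, existence)
```

Only the value 2.0 satisfies the predicate, so the tuple existence equals the probability mass of that value, 0.25.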

Filtering continuous probabilistic distributions is similar to filtering discrete probabilistic values in that it may reduce the tuple existence probability.

Let's assume you have a random variable x with mean 0.0 and variance σ² = 1.0. The following query sets the tuple existence to the cumulative probability that this random variable takes a value between the lower and upper bound of the predicate, which is ~0.1586235826896239.

Probabilistic continuous select
filter = SELECT({predicate = RelationalPredicate('x > 1.0 AND x < 4.0')}, probabilistic:data)
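The value above can be checked with a short Python sketch. It assumes x follows a single Gaussian component (a mixture with one component of weight 1); the function names normal_cdf and interval_probability are hypothetical and only illustrate the calculation, not how the operator is implemented.

```python
import math


def normal_cdf(x, mean=0.0, variance=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


def interval_probability(lower, upper, mean=0.0, variance=1.0):
    """Probability that the variable falls between lower and upper."""
    return normal_cdf(upper, mean, variance) - normal_cdf(lower, mean, variance)


# Predicate 'x > 1.0 AND x < 4.0' on a variable with mean 0.0 and variance 1.0.
existence = interval_probability(1.0, 4.0)
print(existence)  # approximately 0.1586
```

For a continuous variable the strict and non-strict bounds give the same probability, so the result is the difference of the cumulative distribution function at 4.0 and at 1.0.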


Access to tuple existence

To access the tuple existence during processing, you can use the ExistenceToPayload operator. It copies the tuple existence to the payload, where you can access it under the attribute name "meta_existence".
