The RareSequence operator finds rare sequences in data streams. It is built for discrete values, e.g. states. The operator relies on recurring tuples, so remove any attributes that make otherwise "equal" tuples unique. For example, if each tuple carries a counting number and a recurring state, as in [(1, "state x"), (2, "state y"), (3, "state x"), (4, "state x"), ...], this operator won't work, because no tuple ever recurs. In this case, use a projection to remove the counter and get tuples like [("state x"), ("state y"), ("state x"), ("state x")], where each "()" is a single tuple.
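
To illustrate why the counter has to go, here is a minimal Python sketch of the projection step. It is not Odysseus code; the names `raw_tuples` and `projected` are made up for this illustration. With the counter in place every tuple is unique; after the projection the states recur.

```python
# Hypothetical input: each tuple carries a counter and a state.
raw_tuples = [(1, "state x"), (2, "state y"), (3, "state x"), (4, "state x")]

# With the counter, no tuple occurs more than once.
distinct_before = len(set(raw_tuples))

# Projection: keep only the state attribute (drop the counter at index 0).
projected = [(state,) for _counter, state in raw_tuples]

# Without the counter, equal states produce equal tuples.
distinct_after = len(set(projected))
```

Here `distinct_before` equals the number of tuples, while `distinct_after` is smaller, which is the recurrence the operator needs.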

The operator builds a tree from the incoming tuples and counts how often each node is used. With these counts, it can calculate the probability of each node relative to its siblings, and from that the probability of each path in the tree. If the probability is below the value given by the user, the current tuple is marked as an anomaly and reported.
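
The counting idea can be sketched in Python as follows. This is only an illustration under assumptions, not the operator's actual implementation: the `Node` class, the `observe` function, and the `threshold` value are hypothetical, a node's relative frequency is assumed to be its count divided by the summed counts of it and its siblings, and a path's probability is assumed to be the product of those frequencies along the path.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    count: int = 0
    children: dict = field(default_factory=dict)


def observe(root, sequence):
    """Walk the sequence down the tree, incrementing each visited node's count.

    Returns the path probability, assumed here to be the product of each
    node's frequency relative to its siblings.
    """
    node = root
    path_probability = 1.0
    for state in sequence:
        child = node.children.setdefault(state, Node())
        child.count += 1
        sibling_total = sum(c.count for c in node.children.values())
        path_probability *= child.count / sibling_total
        node = child
    return path_probability


root = Node()
# Three common sequences followed by one that deviates in its second state.
for seq in [("x", "y"), ("x", "y"), ("x", "y"), ("x", "z")]:
    probability = observe(root, seq)

threshold = 0.3  # hypothetical user-supplied minimum relative frequency
is_anomaly = probability < threshold
```

In this sketch the last sequence ends in a node that was used once among four uses at that level, so its path probability falls below the assumed threshold and it is flagged.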

Parameters

 

Example

stateAnalysis = RARESEQUENCE({
                    treedepth = 100,
                    minrelativefrequencyPath = 0.1,
                    minrelativefrequencyNode = 0.3,
                    firsttupleisroot = 'true'
                  },
                  state
                )

Using the backup functionality
