Monday, April 7, 2014

Stream Mining with Rfx Framework

Estimate the number of unique words in a data stream read from the URL http://en.wikipedia.org/wiki/List_of_United_States_counties_and_county_equivalents
This uses HyperLogLog, a data structure available in Redis since version 2.8.9:
http://redis.io/commands#hyperloglog

Open-source stream library from AddThis (stream-lib):
https://github.com/addthis/stream-lib

HyperLogLog: the analysis of a near-optimal cardinality estimation algorithm
Original Paper: http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf
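
To make the idea in the paper concrete, here is a minimal sketch of a HyperLogLog counter in plain Python. It is not the Redis or stream-lib implementation. The class name `HyperLogLog`, the default precision `p=12`, and the use of a 64-bit prefix of SHA-1 as the hash function are assumptions made for illustration (the paper works with 32-bit hashes, which is why its large-range correction is left out here).

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch with m = 2**p registers."""

    def __init__(self, p=12):
        # The alpha formula below is the one the paper gives for m >= 128.
        if not 7 <= p <= 16:
            raise ValueError("p must be between 7 and 16")
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Assumption: take the first 8 bytes of SHA-1 as a 64-bit hash.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        index = h >> width                  # first p bits pick a register
        rest = h & ((1 << width) - 1)       # remaining bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        # Raw estimate: alpha * m^2 / sum(2^-register).
        harmonic = sum(2.0 ** -r for r in self.registers)
        raw = self.alpha * self.m * self.m / harmonic
        zeros = self.registers.count(0)
        # Small-range correction from the paper: fall back to linear counting.
        if raw <= 2.5 * self.m and zeros:
            return self.m * math.log(self.m / zeros)
        return raw


if __name__ == "__main__":
    text = "los angeles county orange county cook county harris county"
    hll = HyperLogLog()
    for word in text.split():
        hll.add(word)
    print("estimated unique words:", round(hll.count()))
```

With p=12 the sketch keeps 4096 small registers regardless of how many words pass through it, and the paper's analysis gives a typical relative error of about 1.04/sqrt(4096), roughly 1.6%.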

Mining Data Streams
Slides: http://www.stanford.edu/class/cs246/slides/16-streams.pdf

Applicable Problems:
  • Estimating the number of unique elements in a continuous data stream
  • Cardinality estimation for Big Data
  • Networking and traffic monitoring, where applications keep growing: detecting worm propagation, network attacks (e.g., Denial of Service), and link-based spam on the web
  • Counting distinct active flows, an important indicator for detecting attacks and monitoring traffic
