TY - GEN
T1 - Mining sequential patterns from probabilistic databases by pattern-growth
AU - Muzammal, Muhammad
PY - 2011
Y1 - 2011
AB - We propose a pattern-growth approach for mining sequential patterns from probabilistic databases. The model of uncertainty we consider covers situations where the association of an event with a source is uncertain, and we study the problem of enumerating all sequences whose expected support satisfies a user-defined threshold θ. In earlier work, Muzammal and Raman [PAKDD'11] adapted representative candidate generate-and-test approaches, GSP (breadth-first sequence lattice traversal) and SPADE/SPAM (depth-first sequence lattice traversal), to the probabilistic case. The authors also noted the difficulties in generalizing PrefixSpan to the probabilistic case (PrefixSpan is a pattern-growth algorithm, considered to be the best performer for deterministic sequential pattern mining). We overcome these difficulties in this note and adapt PrefixSpan to work under probabilistic settings. We then report on an experimental evaluation of the candidate generate-and-test approaches against the pattern-growth approach.
KW - Mining complex sequential data
KW - Mining Uncertain Data
KW - Novel models and algorithms
KW - Probabilistic Databases
UR - http://www.scopus.com/inward/record.url?scp=80655146271&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-24577-0_12
DO - 10.1007/978-3-642-24577-0_12
M3 - Conference contribution
AN - SCOPUS:80655146271
SN - 9783642245763
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 118
EP - 127
BT - Advances in Databases - 28th British National Conference on Databases, BNCOD 28, Revised Selected Papers
T2 - 28th British National Conference on Databases, BNCOD 2011
Y2 - 12 July 2011 through 14 July 2011
ER -