Mining sequential patterns from probabilistic databases by pattern-growth

Muhammad Muzammal*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

We propose a pattern-growth approach for mining sequential patterns from probabilistic databases. Our model of uncertainty covers situations where the association of an event with a source is uncertain, and we consider the problem of enumerating all sequences whose expected support satisfies a user-defined threshold θ. In earlier work [Muzammal and Raman, PAKDD'11], the authors adapted representative candidate generate-and-test approaches, GSP (breadth-first sequence lattice traversal) and SPADE/SPAM (depth-first sequence lattice traversal), to the probabilistic case. They also noted the difficulties in generalizing PrefixSpan to the probabilistic case (PrefixSpan is a pattern-growth algorithm, considered to be the best performer for deterministic sequential pattern mining). In this note we overcome these difficulties and adapt PrefixSpan to work under probabilistic settings. We then report on an experimental evaluation of the candidate generate-and-test approaches against the pattern-growth approach.
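To make the expected-support criterion concrete, the following is a minimal sketch and not the paper's algorithm. It assumes each event carries an itemset and a probability distribution over sources, that events are assigned to sources independently, and that a pattern is reported when its expected support is at least θ. All names (`Event`, `support_probability`, `expected_support`) and the toy data are hypothetical. The recurrence is a standard dynamic program over events chosen for illustration; it does not implement PrefixSpan or any pattern-growth step.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: an event is an itemset plus a distribution
# giving the probability that the event belongs to each source.
Event = Tuple[FrozenSet[str], Dict[str, float]]
Pattern = List[FrozenSet[str]]


def support_probability(pattern: Pattern, events: List[Event], source: str) -> float:
    """Probability that `source` supports `pattern` (events assumed independent)."""
    k = len(pattern)
    # prob[i] = probability that the first i pattern elements have been
    # matched, in order, by the events seen so far that belong to `source`.
    prob = [1.0] + [0.0] * k
    for itemset, dist in events:
        p = dist.get(source, 0.0)
        # Sweep right to left so prob[i - 1] still holds the value
        # computed before the current event was considered.
        for i in range(k, 0, -1):
            if pattern[i - 1] <= itemset:
                prob[i] = p * prob[i - 1] + (1.0 - p) * prob[i]
    return prob[k]


def expected_support(pattern: Pattern, events: List[Event], sources: List[str]) -> float:
    """Expected support is the sum of per-source support probabilities."""
    return sum(support_probability(pattern, events, s) for s in sources)


# Toy database: two events, two candidate sources X and Y.
events: List[Event] = [
    (frozenset({"a"}), {"X": 0.6, "Y": 0.4}),
    (frozenset({"b"}), {"X": 0.5, "Y": 0.5}),
]
sources = ["X", "Y"]
theta = 0.5  # user-defined expected-support threshold

pattern_ab: Pattern = [frozenset({"a"}), frozenset({"b"})]
es_ab = expected_support(pattern_ab, events, sources)
is_frequent = es_ab >= theta - 1e-12  # small tolerance for float rounding
```

Summing per-source probabilities relies only on linearity of expectation, so no independence between sources is needed for that step.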

Original language: English
Title of host publication: Advances in Databases - 28th British National Conference on Databases, BNCOD 28, Revised Selected Papers
Pages: 118-127
Number of pages: 10
Publication status: Published - 2011
Externally published: Yes
Event: 28th British National Conference on Databases, BNCOD 2011 - Manchester, United Kingdom
Duration: 12 Jul 2011 to 14 Jul 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7051 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th British National Conference on Databases, BNCOD 2011
Country/Territory: United Kingdom
City: Manchester
Period: 12/07/11 to 14/07/11

Keywords

  • Mining complex sequential data
  • Mining Uncertain Data
  • Novel models and algorithms
  • Probabilistic Databases
