AN EFFICIENT ALGORITHM FOR SEQUENCE GENERATION IN DATA MINING

Authors

Dr. S. Vijayarani and Ms. S. Deepa


Bharathiar University, India
Abstract

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Several major data mining techniques have been developed and are used in data mining projects, including association, classification, clustering, sequential patterns, prediction and decision trees. Among these tasks, sequential pattern mining is one of the most important. It involves mining the subsequences that appear frequently in a set of sequences. It has applications in several domains, such as the analysis of customer purchase patterns, protein sequence analysis, DNA analysis, gene sequence analysis, web access patterns, seismological data and weather observations. Various models and algorithms have been developed for the efficient mining of sequential patterns in large amounts of data. This research paper analyzes the efficiency of three sequence generation algorithms, namely GSP, SPADE and PrefixSpan, on a retail dataset using various performance factors. The experimental results show that the PrefixSpan algorithm is more efficient than the other two algorithms.
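To make the idea of projection-based mining concrete, the following is a minimal sketch of a PrefixSpan-style search, not the implementation evaluated in this paper. It assumes a simplified setting in which every element of a sequence is a single item (no itemsets), and the function name `prefixspan`, the toy purchase sequences and the threshold value are all illustrative assumptions.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent sequential pattern.

    Simplifying assumption: each sequence is a list of single items,
    so patterns are plain tuples of items.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs that
        # describe the suffixes remaining after matching the prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Build the projected database for the extended prefix:
            # keep what follows the first occurrence of the item.
            new_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue  # item does not occur in this suffix
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical retail-style purchase sequences (illustrative only).
purchases = [
    ["bread", "milk", "eggs"],
    ["bread", "eggs"],
    ["milk", "bread", "eggs"],
    ["bread", "milk"],
]

patterns = prefixspan(purchases, min_support=3)
for pattern, support in sorted(patterns.items()):
    print(pattern, support)
```

With a minimum support of 3, the only frequent pattern longer than one item in this toy data is ("bread", "eggs"). Unlike a candidate generation approach such as GSP, the sketch never enumerates candidate sequences up front; it only extends a prefix with items that actually occur in the projected suffixes.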

Keywords

Sequential Pattern, GSP, SPADE, PrefixSpan, candidate generation, minimum_support, projection-based.