A study of frequent pattern mining in transaction datasets : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand

dc.contributor.authorXu, Luofeng
dc.date.accessioned2011-09-21T20:25:27Z
dc.date.available2011-09-21T20:25:27Z
dc.date.issued2011
dc.description.abstractWithin data mining, the efficient discovery of frequent patterns—sets of items that occur together in a dataset—is an important task, particularly in transaction datasets. This thesis develops effective and efficient algorithms for frequent pattern mining, and considers the related problem of how to learn, and utilise, the characteristics of the particular datasets being investigated. The first problem considered is how to mine frequent closed patterns in dynamic datasets, where updates to the dataset are performed. The standard approach to this problem is to use a standard pattern mining algorithm and simply rerun it on the updated dataset. An alternative method is proposed in this thesis that is significantly more efficient provided that the size of the updates is relatively small. Following this is an investigation of the pattern support distribution of transaction datasets, which measures the numbers of times each pattern appears within the dataset. The evidence for the pattern support distribution of real retail datasets obeying a power law is investigated using qualitative appraisals and statistical goodness-of-fit tests, and the power law is found to be a good model. Based on this, the thesis demonstrates how to efficiently estimate the pattern support distribution based on sampling techniques, reducing the computational cost of finding this distribution. The last major contribution of the thesis is to consider novel ways to set the main user-specified parameters of frequent pattern mining, the minimum support, which defines how many times a pattern needs to be seen before it is ‘frequent’. This is a critical parameter, and very hard to set without a lot of knowledge of the dataset. A method to enable the user to specify rather looser requirements for what they require from the mining is proposed based on the assumption of a power-law-based pattern support distribution and fuzzy logic techniques.en_US
dc.identifier.urihttp://hdl.handle.net/10179/2687
dc.language.isoenen_US
dc.publisherMassey Universityen_US
dc.rightsThe Authoren_US
dc.subjectData miningen_US
dc.subjectComputer scienceen_US
dc.subjectPattern support distributionen_US
dc.subjectPower-law relationshipen_US
dc.subjectFuzzy logicen_US
dc.subjectFrequent pattern miningen_US
dc.subjectDynamic transaction dataseten_US
dc.subjectIncremental miningen_US
dc.titleA study of frequent pattern mining in transaction datasets : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealanden_US
dc.typeThesisen_US
massey.contributor.authorXu, Luofeng
thesis.degree.disciplineComputer Science
thesis.degree.grantorMassey University
thesis.degree.levelDoctoral
thesis.degree.levelDoctoralen
thesis.degree.nameDoctor of Philosophy (Ph.D.)
Files
Original bundle
Now showing 1 - 2 of 2
Loading...
Thumbnail Image
Name:
02_whole.pdf
Size:
2.51 MB
Format:
Adobe Portable Document Format
Description:
Loading...
Thumbnail Image
Name:
01_front.pdf
Size:
98.7 KB
Format:
Adobe Portable Document Format
Description:
License bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
license.txt
Size:
804 B
Format:
Item-specific license agreed upon to submission
Description: