Workshop on Frequent Itemset Mining Implementations (FIMI'03)
19 November 2003, Melbourne, Florida, USA in conjunction with ICDM'03
Proceedings
The proceedings of the workshop are published online in the
CEUR Workshop Proceedings.
Workshop Schedule.
Implementations
The implementations can be found in the FIMI repository.
FIMI Award
The FIMI'03 best implementation award was granted to Gosta Grahne and Jianfei Zhu, for their
implementations described in "Efficiently Using Prefix-trees in Mining Frequent Itemsets".
The award consisted of the most frequent itemset: {diapers, beer}.
Scope & Objectives
During the last decade, a huge number of algorithms have been
developed to solve the frequent itemset mining
(FIM) problem efficiently. Several new algorithms were shown by their authors to
run faster than previously existing algorithms. However, as was shown by
Kohavi et al. at KDD 2001,
several of these algorithms do not always perform
as their authors claimed when
tested on different datasets. Also, from personal experience, we
noticed that different implementations of the same algorithm could
behave completely differently on different datasets.
Given this proliferation of FIM algorithms, and the sometimes
contradictory claims, there is a pressing need to completely
characterize and understand the algorithmic performance space. We
would like to understand precisely why and under what conditions one
algorithm would outperform another. This means testing the methods over
a wide variety of problem parameters: for instance, datasets that are dense or
sparse, real or synthetic, small or large, with hundreds to tens of
thousands of items and thousands to millions of transactions. Also
of interest are methods that mine closed and maximal frequent
itemsets.
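To make the three mining tasks concrete, here is a minimal brute-force sketch in Python. It is only an illustration and not any submitted implementation: the function names, the toy transactions, and the absolute support threshold are all assumptions chosen for this example. An itemset is frequent if at least `min_support` transactions contain it, closed if no proper superset has the same support, and maximal if no proper superset is frequent.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset contained in at least
    min_support transactions. Brute force, for illustration only."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            # No frequent itemset of this size, so none of a larger size either.
            break
    return result


def closed_itemsets(frequent):
    """Frequent itemsets with no proper superset of equal support."""
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == frequent[t] for t in frequent)}


def maximal_itemsets(frequent):
    """Frequent itemsets with no frequent proper superset."""
    return {s: c for s, c in frequent.items()
            if not any(s < t for t in frequent)}


# Hypothetical toy dataset: each transaction is a set of items.
transactions = [
    {"beer", "diapers", "milk"},
    {"beer", "diapers"},
    {"beer", "diapers", "bread"},
    {"milk", "bread"},
    {"beer"},
]

all_sets = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(all_sets)
maximal = maximal_itemsets(all_sets)
```

On this toy data, {diapers} is frequent but not closed, because {beer, diapers} has the same support, while {beer} is closed but not maximal. Every maximal itemset is closed and every closed itemset is frequent, which is why the three tasks usually produce outputs of decreasing size.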
The main goal of this workshop is to identify the main
implementation aspects of the FIM problem for all, closed, and
maximal pattern mining, and to study the behavior of the proposed
algorithms with respect to different types of datasets. In
essence, this workshop is a contest to solicit the best algorithms
for solving these problems according to a well-specified
set of rules.
One of the most important aspects is that only open-source code submissions
are allowed (with flexible licensing) and that all submissions will
become freely available (for research purposes only) in an online FIM
repository, along with several new datasets for benchmarking purposes.
Note that we are looking for good implementations of existing algorithms as
well as new algorithms.
This is a unique workshop. Each submission will consist of code as well
as a paper describing the algorithm and a detailed
performance study by the authors on publicly provided datasets, along
with a detailed explanation of when and why their algorithm performs
better than existing implementations or algorithms. The submissions
will be tested independently by the co-chairs and other members of the
program committee. The algorithms will be judged on three main tasks:
mining all frequent itemsets, closed itemset mining, and maximal itemset
mining. All submissions will be tested on test
datasets which will not be made public until after the submission
deadline. The workshop participants will be invited to come and
discuss their submissions; there will be a heavy focus on critical
evaluation, i.e., what the limitations are, under what conditions
the algorithm works well, why it fails in other cases, and what the
open areas are. One outcome of the workshop will be to outline the focus
for research on new problems in the field.
The conditions for "acceptance" of a submission are
a correct implementation for the given task, together
with at least one of two criteria: an efficient implementation
compared with other submissions in the same category, or a
submission that provides new insight into the FIM problem. The idea
is to highlight both successful ideas and unsuccessful but interesting
ones.
Important dates
- Intent to submit: before September 19, 2003
- Submission Deadline: September 26, 2003
- Notification: October 22, 2003
- Camera-ready Copies: October 31, 2003
- Workshop date: November 19, 2003
Submission guidelines
All submissions should be sent electronically to
fimi@cs.helsinki.fi.
Please send only one email per algorithm. The email
should contain exactly two files: 1) the tar or zip file of the entire
source code directory (including the Makefile), and 2) the accompanying
paper describing the implementation.
The body of the email should contain the title of the paper, the name of
the implemented algorithm, and the list of authors with their respective
affiliations and email addresses.
If the algorithm solves more than one FIM problem (all, maximal,
closed), then the Makefile should produce a separate executable for each:
fim_all, fim_maximal, and fim_closed respectively. Send only one email
in this case. If there are different algorithms for different tasks,
please submit separate emails.
The accompanying paper should be in PDF or PostScript format only,
and should not exceed 25 pages, double-spaced, in 12pt font, including all figures, tables, and references.
Also see the submission rules, utilities and FAQ.
Workshop Committee
Program co-chairs
Program Committee