Workshop on Frequent Itemset Mining Implementations (FIMI'03)

19 November 2003, Melbourne, Florida, USA
in conjunction with ICDM'03

  • Proceedings

    The proceedings of the workshop are published online in the CEUR Workshop Proceedings.

    Workshop Schedule.
  • Implementations

    The implementations can be found in the FIMI repository.
  • FIMI Award

    The FIMI'03 best implementation award was granted to Gosta Grahne and Jianfei Zhu, for their implementations described in "Efficiently Using Prefix-trees in Mining Frequent Itemsets".
    The award consisted of the most frequent itemset: {diapers, beer}.
  • Scope & Objectives

    Over the last decade, a large number of algorithms have been developed to solve the frequent itemset mining (FIM) problem efficiently. Several new algorithms were shown by their authors to run faster than previously existing ones. However, as Kohavi et al. showed at KDD 2001, several of these algorithms do not always perform as their authors claimed when tested on different datasets. From personal experience, we have also noticed that different implementations of the same algorithm can behave completely differently on different datasets.

    Given this proliferation of FIM algorithms and the sometimes contradictory claims, there is a pressing need to characterize and understand the algorithmic performance space completely. We would like to understand precisely why and under what conditions one algorithm outperforms another. This means testing the methods over a wide variety of problem parameters, for instance on datasets that are dense and sparse, real and synthetic, small and large, with hundreds to tens of thousands of items and thousands to millions of transactions. Also of interest are methods that mine closed and maximal frequent itemsets.
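    To make the three mining tasks concrete, the following is a minimal brute-force sketch, not any of the workshop's submitted algorithms. It assumes the standard definitions: an itemset is frequent if its support (the number of transactions containing it) reaches a minimum threshold, closed if no proper superset has the same support, and maximal if no proper superset is frequent. The function names, the toy transactions, and the threshold are illustrative assumptions.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all non-empty frequent itemsets.

    Brute force over every subset of the item universe, so this is
    exponential in the number of items and only suitable for toy data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            s = support(candidate, transactions)
            if s >= min_support:
                result[candidate] = s
    return result


def closed_itemsets(frequent):
    """Keep frequent itemsets that have no proper superset of equal support."""
    return {
        x: s
        for x, s in frequent.items()
        if not any(x < y and s == sy for y, sy in frequent.items())
    }


def maximal_itemsets(frequent):
    """Keep frequent itemsets that have no frequent proper superset."""
    return {x: s for x, s in frequent.items() if not any(x < y for y in frequent)}


# Toy data (an assumption for illustration, not a workshop dataset).
transactions = [
    frozenset({"beer", "diapers", "milk"}),
    frozenset({"beer", "diapers"}),
    frozenset({"beer", "bread"}),
    frozenset({"diapers", "milk"}),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
maximal = maximal_itemsets(frequent)
```

    On this toy input every maximal itemset is also closed, and every closed itemset is frequent, which is why the closed and maximal tasks yield progressively smaller outputs than mining all frequent itemsets.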

    The main goal of this workshop is to identify the main implementation aspects of the FIM problem for all, closed, and maximal pattern mining, and to study the behavior of the proposed algorithms on different types of datasets. In essence, this workshop is a contest to solicit the best algorithms for solving these problems according to a well-specified set of rules. An important aspect is that only open-source code submissions are allowed (with flexible licensing) and that all submissions will become freely available (for research purposes only) in an online FIM repository, along with several new datasets for benchmarking. Note that we are looking for good implementations of existing algorithms as well as new algorithms.

    This is a unique workshop. Each submission consists of code and a paper. The paper describes the algorithm, reports a detailed performance study by the authors on publicly provided datasets, and explains in detail when and why the algorithm performs better than existing implementations or algorithms. The submissions will be tested independently by the co-chairs and other members of the program committee. The algorithms will be judged on three main tasks: mining all frequent itemsets, closed set mining, and maximal itemset mining. All submissions will be tested on test datasets that will not be made public until after the submission deadline. The workshop participants will be invited to come and discuss their submissions; there will be a heavy focus on critical evaluation, i.e., what the limitations are, under what conditions the algorithm works well, why it fails in other cases, and what the open areas are. One outcome of the workshop will be to outline the focus for research on new problems in the field.

    A submission will be "accepted" if it is a correct implementation for the given task and meets either of two criteria: it is efficient compared with other submissions in the same category, or it provides new insight into the FIM problem. The idea is to highlight both successful ideas and unsuccessful but interesting ones.
  • Important dates

    • Intent to submit: before September 19, 2003
    • Submission Deadline: September 26, 2003
    • Notification: October 22, 2003
    • Camera-ready Copies: October 31, 2003
    • Workshop date: November 19, 2003
  • Submission Guidelines

    All submissions should be sent electronically. Please send one email per algorithm. The email should contain exactly two files: 1) the tar or zip file of the entire source code directory (including the Makefile), and 2) the accompanying paper describing the implementation.

    The body of the email should contain the title of the paper, the name of the implemented algorithm, and the list of authors with their respective affiliations and email addresses.

    If the algorithm solves more than one FIM problem (all, maximal, closed), then the Makefile should produce a separate executable for each: fim_all, fim_maximal, or fim_closed, respectively. Send only one email in this case. If there are different algorithms for different tasks, please submit separate emails.
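    As a rough pre-submission check, the naming rule above can be sketched as follows. This is a hypothetical helper, not one of the official workshop utilities; the function names and the idea of checking a plain directory listing are assumptions made for illustration.

```python
# Hypothetical helper, not part of the official FIMI utilities.
# Maps each FIM task to the executable name requested in the guidelines.
EXECUTABLES = {"all": "fim_all", "maximal": "fim_maximal", "closed": "fim_closed"}


def expected_executables(tasks):
    """Return the sorted executable names for the tasks a submission solves."""
    unknown = set(tasks) - set(EXECUTABLES)
    if unknown:
        raise ValueError(f"unknown task(s): {sorted(unknown)}")
    return sorted(EXECUTABLES[t] for t in set(tasks))


def missing_from_build(tasks, built_files):
    """Return required files that are absent from a build directory listing.

    `built_files` is a plain list of file names, e.g. from os.listdir().
    Either spelling of the makefile name is accepted (an assumption).
    """
    present = set(built_files)
    missing = sorted(set(expected_executables(tasks)) - present)
    if not ({"Makefile", "makefile"} & present):
        missing.append("Makefile")
    return missing
```

    For example, `missing_from_build(["all", "closed"], ["Makefile", "fim_all"])` reports that `fim_closed` has not been built yet.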

    The accompanying paper should be in PDF or PostScript format only and should not exceed 25 pages, double-spaced, in 12pt font, including all figures, tables, and references.

    Also see the submission rules, utilities and FAQ.
  • Workshop Committee

    Program co-chairs | Program Committee