The following two datasets were generated using the generator from the IBM Almaden Quest
research group.
This generator can no longer be downloaded from their website.
Another implementation that can be compiled with g++ can be downloaded from
Paolo Palmerini's website.
The following datasets were prepared by Roberto Bayardo from the UCI datasets and PUMSB.
The next dataset was provided to us by Ferenc Bodon and contains
(anonymized) click-stream data of a Hungarian on-line news portal.
The KDD CUP 2000 datasets are available at:
The following dataset was donated by Tom Brijs and contains
the (anonymized) retail market basket data from an anonymous Belgian retail store.
The data are provided ‘as is’. Basically, any use of the data is allowed as long as
proper acknowledgment is given and a copy of the work is provided to Tom Brijs.
More details can be found here.
The following dataset was donated by Karolien Geurts and
contains (anonymized) traffic accident data.
The data are provided ‘as is’. Basically, any use of the data is allowed as long as
proper acknowledgment is given and a copy of the work is provided to Karolien Geurts.
More details can be found here.
The following dataset was donated by Claudio Lucchese, Salvatore Orlando, Raffaele Perego, and Fabrizio Silvestri
and was built from a spidered collection of HTML web documents.
More details can be found here.