AutoDL Benchmark


In preparation for the AutoDL challenges, we formatted around 100 datasets and used 66 of them in the challenges. Some meta-features of these datasets are shown in the table below. Note that in the AutoDL challenges all tasks are multi-label classification tasks. You can also format your own data with our code.

The public datasets provided in the final AutoDL challenge can be found here.

Winning methods

The implementation of the strongest baseline (Baseline 3) we provided in AutoDL challenges can be found here.

Benchmark results

As a first step towards a rich AutoDL benchmark, we ran Baseline 3 on all 66 AutoDL datasets. The Area under Learning Curve (ALC) scores and final NAUC scores (with time budget T=1200s and t0=60) are shown in the following figures. The second figure zooms in on the rectangular area marked in the first.
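For readers unfamiliar with these metrics: NAUC rescales the ROC AUC from [0.5, 1] to [0, 1], and ALC integrates the (step-function) learning curve of NAUC over a log-scaled time axis parameterized by the time budget T and the reference time t0. The official scoring code lives in the AutoDL starting kit; the snippet below is only a simplified sketch of how these two quantities can be computed, with function names of our own choosing.

```python
import numpy as np

def nauc(auc):
    # Normalized AUC: maps AUC in [0.5, 1] (random to perfect) onto [0, 1].
    return 2 * auc - 1

def alc(timestamps, scores, T=1200, t0=60):
    """Area under the Learning Curve with log-scaled time (a sketch).

    The learning curve is treated as a step function: the score achieved
    at timestamps[i] is held until timestamps[i+1] (or until the budget T).
    Time is transformed as t_tilde(t) = log(1 + t/t0) / log(1 + T/t0),
    so that t_tilde(0) = 0 and t_tilde(T) = 1, and the ALC of a curve with
    scores in [0, 1] also lies in [0, 1].
    """
    def t_tilde(t):
        return np.log(1 + t / t0) / np.log(1 + T / t0)

    ts = list(timestamps) + [T]  # close the last step at the time budget
    area = 0.0
    for i, s in enumerate(scores):
        area += s * (t_tilde(ts[i + 1]) - t_tilde(ts[i]))
    return area
```

Under this transformation, early predictions are rewarded: a score reached at t=60s already covers roughly a quarter of the transformed time axis when T=1200s and t0=60.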

We also ran the solution of DeepWisdom, the top-ranked winner of the AutoDL challenge, on these 66 datasets; the results are shown below.

Numerical values are shown in the following table.

A complete table (CSV file) including all results from the AutoDL challenge's feedback phase, final phase and post-challenge analyses can be found here.

(updated on 7 May 2020)

If you wish to use the above results and data, please consider citing the following reference:

@article{liu_winning_2020,
  title   = {Winning solutions and post-challenge analyses of the {ChaLearn} {AutoDL} challenge 2019},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  author  = {Liu, Zhengying and Pavao, Adrien and Xu, Zhen and Escalera, Sergio and Ferreira, Fabio and Guyon, Isabelle and Hong, Sirui and Hutter, Frank and Ji, Rongrong and Nierhoff, Thomas and Niu, Kangning and Pan, Chunguang and Stoll, Danny and Treguer, Sebastien and Wang, Jin and Wang, Peng and Wu, Chenglin and Xiong, Youcheng},
  year    = {2020},
  pages   = {17},
  annote  = {Under review}
}