Evaluation dataset

The evaluation dataset will be composed of 9 books, split into a set of training images and a set of test images. The training dataset will contain a reduced number of book pages, along with their ground truth in TXT/PNG format. The training images are representative of the different contents and layouts of the book pages. The test dataset will be composed of images of the remaining book pages.

Only the ground truths of a few pages of the 9 selected books will be provided. These pages constitute the training dataset of the evaluation dataset.

The participants are free to use the evaluation dataset for training, testing or any other purpose related to the HBA competition.

The 9 books selected to form the evaluation dataset are: Book 01, Book 02, Book 03 and Book 05 as manuscript books, and Book 07, Book 08, Book 09, Book 10 and Book 11 as printed ones.

The number of training images for each book of the evaluation dataset is as follows.

Book Id.    Number of training images
Book 01     22
Book 02     42
Book 03     56
Book 05     27
Book 07     24
Book 08     45
Book 09     20
Book 10     26
Book 11     32

All evaluation dataset files are available from this link.

The evaluation dataset is available only to registered participants. A login and a password will be sent to each registered participant of the HBA competition.

Two versions of the evaluation dataset are available:

1- TXT

2- PNG

The two versions differ in the format of the training and test files (TXT and PNG).

Each book of the evaluation dataset is composed of 3 folders, namely “images”, “train” and “test”.

1- “images”: It contains all TIFF images of a book.

2- “train”: It contains the TXT/PNG files that form the training dataset. The training dataset is representative of the different contents and layouts of the analyzed book pages. Each line of a TXT file is composed of three values: the coordinates of a selected foreground pixel and its class label, which represents the content type in the analyzed book. The label value ranges from 1 to 6. A label value of 1 denotes graphical content; any other value denotes textual content. In the PNG version, pixel-labeled images are provided. At most 6 BGR values are used to encode the 6 content classes in the pixel-labeled images.

3- “test”: It contains the remaining TXT/PNG files of the book, that is, those not included in the training dataset, which form the test dataset. Each line of a TXT file contains only the coordinates of a selected foreground pixel. The participants should fill out these files with the predicted class label for each foreground pixel. In the PNG version, pixel-labeled images are provided in which the selected foreground pixels are colored white. In that case, the participants should provide as output a pixel-labeled image that uses the BGR values defined in the training files.
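
As an illustration of the TXT version described above, the following Python sketch reads training lines and fills test lines with predicted labels. It is not an official tool of the competition. Several details are assumptions, because they are not specified here: that each pixel has exactly two integer coordinates, that values on a line are separated by whitespace, commas or semicolons, and that a filled test line is written as the two coordinates followed by the label, separated by spaces. The order of the two coordinates is left open, and the function names (parse_train_line, is_graphical, fill_test_lines) are hypothetical.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Assumption: values are separated by whitespace, commas or semicolons.
_SEP = re.compile(r"[\s,;]+")


def parse_train_line(line: str) -> Tuple[int, int, int]:
    """Parse one training line into (coord1, coord2, label).

    The label must lie in the range 1..6 stated in the description.
    """
    c1, c2, label = (int(v) for v in _SEP.split(line.strip()))
    if not 1 <= label <= 6:
        raise ValueError(f"label out of range 1..6: {label}")
    return c1, c2, label


def is_graphical(label: int) -> bool:
    """Label 1 denotes graphical content; labels 2..6 denote textual content."""
    return label == 1


def fill_test_lines(
    lines: Iterable[str],
    predict: Callable[[int, int], int],
) -> List[str]:
    """Append a predicted label to each test line (coordinates only).

    `predict` is a placeholder for the participant's own classifier.
    Assumption: output values are separated by single spaces.
    """
    filled = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        c1, c2 = (int(v) for v in _SEP.split(line.strip()))
        label = predict(c1, c2)
        if not 1 <= label <= 6:
            raise ValueError(f"predicted label out of range 1..6: {label}")
        filled.append(f"{c1} {c2} {label}")
    return filled
```

A typical use would read the lines of one file from the “test” folder, pass them to fill_test_lines together with a classifier, and write the returned lines back to a file. The separator and output layout should be checked against the actual training files before use.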