Software: TINTO - Converting Tidy Data into Image for Classification with 2-Dimensional Convolutional Neural Networks

Abstract

TINTO is an open-source, user-extensible framework that converts tidy data into images by representing the data as characteristic pixels. For this transformation, TINTO implements two dimensionality reduction algorithms, PCA and t-SNE. It also includes blurring, a technique borrowed from painting, which adds more ordered information to the image and can improve classification with CNNs.

Citing TINTO: If you use TINTO in your work, please cite the INFFUS paper:

@article{inffus_TINTO,
    title = {A novel deep learning approach using blurring image techniques for Bluetooth-based indoor localisation},
    journal = {Information Fusion},
    author = {Reewos Talla-Chumpitaz and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro},
    volume = {91},
    pages = {173-186},
    year = {2023},
    issn = {1566-2535},
    doi = {https://doi.org/10.1016/j.inffus.2022.10.011}
}

And the SoftwareX paper:

@article{softwarex_TINTO,
    title = {TINTO: Converting Tidy Data into Image for Classification with 2-Dimensional Convolutional Neural Networks},
    journal = {SoftwareX},
    author = {Manuel Castillo-Cara and Reewos Talla-Chumpitaz and Raúl García-Castro and Luis Orozco-Barbosa},
    year = {2023},
    issn = {2352-7110},
    volume = {22},
    pages = {101391},
    doi = {https://doi.org/10.1016/j.softx.2023.101391}
}

Documentation

You can find all the documentation and source code of TINTO in the OEG GitHub repository.

Main Features

Supports any CSV dataset in Tidy Data format.
Currently, TINTO converts tabular data for binary and multi-class classification problems in machine learning.
Input data formats:
- Tabular files: The input data must be a CSV file in Tidy Data format.
- Tidy Data: The target (the variable to be predicted) must be the last column of the dataset; all preceding columns are the features.
- All data must be numeric. TINTO does not accept strings or any other non-numeric format.
Two dimensionality reduction algorithms are available for image creation: PCA and t-SNE, both from the scikit-learn Python library.
The synthetic images are created in black and white, i.e. with a single channel.
The dimensions of the synthetic images can be set as a parameter at creation time.
The synthetic images can be created using either characteristic pixels or the blurring painting technique (where overlapping pixels are combined by taking their maximum or their average).
Runs on Linux, Windows and macOS systems.
Compatible with Python 3.7 or higher.
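
The characteristic-pixel idea described above can be illustrated with a minimal sketch. It is not TINTO's implementation: the function names `feature_coordinates` and `to_image` are hypothetical, PCA is computed with NumPy's SVD instead of scikit-learn to keep the example dependency-light, and the maximum-or-average rule is applied here to features that land on the same pixel, which is an assumption of this sketch. Feature values are assumed to be scaled to [0, 1], and the data must have at least two samples and two features.

```python
import numpy as np


def feature_coordinates(X, size):
    """Map each feature (column of X) to a pixel position in a size x size grid."""
    # Treat every feature as a point described by its values across all samples.
    F = X.T - X.T.mean(axis=0)
    # PCA via SVD: project the feature points onto the first two principal axes.
    U, S, _ = np.linalg.svd(F, full_matrices=False)
    coords = U[:, :2] * S[:2]
    # Rescale each axis to [0, size - 1] and round to integer pixel indices.
    low = coords.min(axis=0)
    span = coords.max(axis=0) - low
    span[span == 0] = 1.0  # avoid division by zero on a degenerate axis
    return np.rint((coords - low) / span * (size - 1)).astype(int)


def to_image(row, coords, size, overlap="max"):
    """Draw one sample as a single-channel image of characteristic pixels."""
    total = np.zeros((size, size))
    peak = np.zeros((size, size))
    count = np.zeros((size, size))
    for value, (r, c) in zip(row, coords):
        total[r, c] += value
        peak[r, c] = max(peak[r, c], value)
        count[r, c] += 1
    if overlap == "max":
        return peak
    # Average of all features sharing a pixel; empty pixels stay at zero.
    return np.divide(total, count, out=np.zeros_like(total), where=count > 0)


# Illustrative random data: 20 samples, 4 features, values already in [0, 1].
rng = np.random.default_rng(0)
X = rng.random((20, 4))
coords = feature_coordinates(X, size=8)
image = to_image(X[0], coords, size=8, overlap="mean")
print(image.shape)
```

Every sample shares the same `coords`, so a given feature always appears at the same pixel and only its intensity changes from image to image.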

Input

The following table shows a classic example, the IRIS dataset in CSV form, as it should look before running TINTO:

sepal length	sepal width	petal length	petal width	target
4.9	3.0	1.4	0.2	1
7.0	3.2	4.7	1.4	2
6.3	3.3	6.0	2.5	3
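
The input rules above (CSV, all values numeric, target in the last column) can be checked before conversion with a short sketch like the following. The helper `load_tidy_csv` is hypothetical and not part of TINTO; it uses only the Python standard library and assumes a comma-separated file with a header row.

```python
import csv
import io


def load_tidy_csv(handle):
    """Read a tidy CSV: feature columns first, target in the last column."""
    reader = csv.reader(handle)
    header = next(reader)
    features, targets = [], []
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
        try:
            values = [float(cell) for cell in row]
        except ValueError:
            raise ValueError(f"line {line_no}: non-numeric value in {row}") from None
        features.append(values[:-1])  # all columns but the last are features
        targets.append(values[-1])    # the last column is the target
    return header[:-1], features, targets


# The three IRIS rows from the table above, written as comma-separated text.
sample = io.StringIO(
    "sepal length,sepal width,petal length,petal width,target\n"
    "4.9,3.0,1.4,0.2,1\n"
    "7.0,3.2,4.7,1.4,2\n"
    "6.3,3.3,6.0,2.5,3\n"
)
names, X, y = load_tidy_csv(sample)
print(names, len(X), y)
```

A row containing a string, such as a species name in place of a numeric class label, raises a `ValueError` that reports the offending line.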

Output

The following figure shows the output of TINTO:

Ph.D. Manuel Castillo-Cara