AI That Describes Images in Italian

CLIP-ITALIAN, RECENTLY DEVELOPED BY A TEAM INCLUDING FEDERICO BIANCHI, IS THE FIRST AND ONLY AI MODEL THAT ASSOCIATES IMAGES WITH THEIR DESCRIPTIONS IN ITALIAN ON A LARGE SCALE

Searching for images with keywords is familiar to everyone. It is possible thanks to machine learning models that can classify what an image contains. CLIP-Italian is the first and only large-scale Artificial Intelligence model to classify images in Italian. It was recently developed by Federico Bianchi, researcher at the Data and Marketing Insights (DMI) unit at Bocconi, Giuseppe Attanasio (Politecnico di Torino), Raphael Pisoni (independent researcher), Silvia Terragni (Università degli Studi di Milano-Bicocca), Gabriele Sarti (University of Groningen) and Sri Lakshmi (independent researcher).
 
The CLIP-Italian model associates images with their descriptions, making it possible to perform tasks such as image search and classification in Italian. Models of this type are trained on a dataset of examples (the training set). CLIP-Italian is based on CLIP, one of the most advanced machine learning models released by OpenAI, and is able to perform "zero-shot" classification, i.e. it can correctly classify objects and concepts in images it has never seen during training.
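
To give an idea of how "zero-shot" classification works in practice, here is a minimal sketch of the mechanism behind CLIP-style models, written in Python with PyTorch. The encoders are replaced by stand-in random embeddings, and the embedding size and temperature are illustrative assumptions; in CLIP-Italian the image embedding comes from a vision encoder and the text embeddings from an Italian text encoder trained to share the same space (see the project's GitHub repository for the actual code).

import torch
import torch.nn.functional as F

embed_dim = 512  # hypothetical embedding size, for illustration only
labels = ["un gatto", "un cane", "una bicicletta"]  # candidate Italian descriptions

# Stand-ins for encode_image(photo) and encode_text(labels) in a CLIP-style model.
image_embedding = torch.randn(1, embed_dim)
text_embeddings = torch.randn(len(labels), embed_dim)

# Both sides are L2-normalized so that a dot product is a cosine similarity.
image_embedding = F.normalize(image_embedding, dim=-1)
text_embeddings = F.normalize(text_embeddings, dim=-1)

# Similarity of the image to each candidate description, scaled by a temperature
# and turned into probabilities: the most similar description is the predicted
# label, even if that label never appeared during training ("zero-shot").
temperature = 100.0
logits = temperature * image_embedding @ text_embeddings.T
probs = logits.softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.2%}")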
 
The training of CLIP-Italian relied on a dataset of about 1.4 million images, each paired with a description in Italian. Preparing the dataset involved machine-translating the captions of pre-existing datasets in other languages, in addition to using original Italian data.
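
As an illustration of that translation step, the sketch below turns English captions into Italian ones with an off-the-shelf model from the HuggingFace hub; the specific translation model is an example choice, not necessarily the one used by the CLIP-Italian team.

from transformers import pipeline

# Example machine-translation model (an illustrative assumption);
# the article does not specify which translation tooling the team used.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-it")

english_captions = [
    "A cat sleeping on a red sofa.",
    "Two children playing football in the park.",
]

# Each English caption becomes an Italian caption paired with the same image.
italian_captions = [out["translation_text"] for out in translator(english_captions)]
print(italian_captions)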
 
Large-scale models are difficult and expensive to train. The CLIP-Italian project was made possible by taking part in the international Flax/JAX Community Week competition, in which Google and HuggingFace provided computing power and funding. CLIP-Italian was among the finalists of the competition and received a special mention in the second round, which will grant access to additional resources to further develop the project.
 
The code used to train the model is openly available on GitHub, and both classification and image search can be tried in the official demo on HuggingFace.

by Weiwei Chen
Bocconi Knowledge newsletter
