Machine learning: machines learn from cultural heritage
“Data Scientist” has been called the most appealing job title of the 21st century. This is due to the explosion in the amount of data produced and stored over the last decade. These data contain valuable information that companies can use to make decisions and predictions in many fields: market analysis, image processing for facial recognition and computational biology for cancer detection are some of the possible uses.
Machine learning is one of the typical skills of a data scientist. It teaches machines something that comes easily and naturally to humans and animals: learning from experience.
Machine learning algorithms use computational methods that learn information directly from data, without requiring a predetermined equation as a model. The model, and eventually the equation, is derived from the data themselves. The algorithm's performance generally improves as the amount of data grows, much as in real life, where accumulated experience helps us make better-informed decisions.
Hence, the aim of machine learning is to find natural patterns in the data that give deeper insight and help us make choices and predictions. There are two main types of algorithm.
Supervised machine learning works on a data set in which both inputs and outputs are known, and the model is trained to produce reasonable outputs for new data. This category includes classification methods, which predict a discrete response (for example, whether a tumour is malignant or not), and regression methods, which predict a continuous response, e.g. temperature fluctuations.
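To make the difference between classification and regression concrete, here is a minimal sketch in plain Python. The data, the function names and the choice of methods (a 1-nearest-neighbour classifier and a least-squares line) are assumptions made for illustration only; they are not taken from the references below.

```python
from math import dist

def nearest_neighbour_predict(train_x, train_y, query):
    """Classification: return the label of the closest training example."""
    closest = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[closest]

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented toy data: two measurements per tumour, with known labels.
tumours = [(1.0, 1.2), (1.5, 0.9), (4.0, 4.2), (4.5, 3.8)]
labels = ["benign", "benign", "malignant", "malignant"]
predicted_label = nearest_neighbour_predict(tumours, labels, (4.2, 4.0))

# Invented toy data: temperature (continuous output) recorded at each hour.
hours = [0, 1, 2, 3]
temps = [10.0, 12.0, 14.0, 16.0]
slope, intercept = fit_line(hours, temps)
predicted_temp = slope * 4 + intercept  # prediction for a new input, hour 4

print(predicted_label, predicted_temp)
```

In both cases the known outputs (labels or temperatures) guide the model, which is what makes the learning "supervised".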
By contrast, unsupervised machine learning looks for hidden patterns in data that have no clear output to predict. The most common method is clustering, which is used to explore the data and identify natural groups.
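As an illustration of clustering, the sketch below implements a very small version of k-means, one common clustering algorithm, in plain Python. The points are invented, and starting from the first k points is a simplification chosen to keep the example deterministic; real implementations usually start from random or carefully chosen centroids.

```python
from math import dist

def k_means(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, assignments)."""
    centroids = list(points[:k])  # simplification: start from the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignments

# Invented toy data: no labels are given, only the points themselves.
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.9, 1.1), (4.8, 5.1)]
centroids, assignments = k_means(points, k=2)

print(assignments)
```

No output is supplied to the algorithm: the two groups emerge only from the distances between the points.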
How does the learning process take place? How are the data analysed? The most challenging steps in machine learning are data processing and the identification of the right model. We can divide the learning process into three main steps:
1) Data pre-processing: data can differ widely in size and type. Real datasets are often incomplete and imprecise, so the first step is to organize and analyse the data (e.g. to check for missing values and outliers). The dataset is made up of instances, each described by a set of attributes that are used to predict the output (called the target attribute).
2) Choice of the model and training: a portion of the initial dataset forms the training set, which the model uses to learn the patterns and rules that characterize the data.
3) Model evaluation: once the model has been chosen and trained, it has to be tested and evaluated on a test dataset. This step is fundamental to understanding the accuracy and precision of the predictions. We may then decide to modify the model, removing or adding complexity so that it describes the data better.
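The three steps above can be sketched end to end in a few lines of plain Python. Everything here is an assumption made for illustration: the toy dataset, the use of None to mark missing values, the fixed split (in practice the split is usually random) and the nearest-centroid model.

```python
from math import dist

# Invented toy dataset: (attributes, target attribute); None is a missing value.
raw = [
    ((1.0, 1.1), "A"), ((0.9, None), "A"), ((1.2, 0.8), "A"), ((1.1, 1.0), "A"),
    ((5.0, 5.2), "B"), ((4.8, 5.1), "B"), ((None, 4.9), "B"), ((5.1, 4.7), "B"),
    ((0.8, 1.3), "A"), ((5.3, 5.0), "B"),
]

# 1) Data pre-processing: drop the instances that have missing attributes.
clean = [(x, y) for x, y in raw if None not in x]

# 2) Choice of the model and training: keep the last two instances aside as a
#    test set and train a nearest-centroid classifier on the rest.
train, test = clean[:6], clean[6:]

def train_centroids(rows):
    """Compute the mean attribute vector (centroid) of each class."""
    by_label = {}
    for x, y in rows:
        by_label.setdefault(y, []).append(x)
    return {
        label: tuple(sum(col) / len(xs) for col in zip(*xs))
        for label, xs in by_label.items()
    }

def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda label: dist(model[label], x))

model = train_centroids(train)

# 3) Model evaluation: accuracy on instances the model has never seen.
correct = sum(predict(model, x) == y for x, y in test)
accuracy = correct / len(test)

print(len(clean), accuracy)
```

If the accuracy on the test set were poor, this is the point at which one would go back and change the model or its complexity.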
Making decisions and predictions plays an essential role in many fields, from finance to diagnostics, but how does machine learning find its way into the world of cultural heritage? The research group of Dr. Elgammal at Rutgers’ Art and Artificial Intelligence Laboratory proposed an algorithm to evaluate the level of creativity of a painting, taking into account its historical and artistic context. The model was evaluated on 1700 artworks and produced several interesting results. For example, among the paintings made between 1904 and 1911, the algorithm assigns the highest creativity score to “Les Demoiselles d’Avignon” by Picasso. This result agrees with the opinion of art historians, who consider its flat painting style and its use of Primitivism to be the direct precursor of Picasso’s Cubist style.
This work shows that, thanks to machine learning, human beings are not the only ones able to evaluate creativity! Machines can do the same, perhaps in a more objective way!
The graph shows the trend of the creativity index (y-axis) between 1850 and 1950 (x-axis). Adapted from arXiv:1506.00711.
Bibliography:
- Tom Mitchell, Machine Learning, McGraw Hill, 1997.
- Ian H. Witten, Eibe Frank, Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann.
- Ahmed Elgammal, Babak Saleh, Quantifying Creativity in Art Networks, arXiv:1506.00711.
- https://theconversation.com/which-paintings-were-the-most-creative-of-their-time-an-algorithm-may-hold-the-answers-43157