Data Analytics in Production

Davide Cacciarelli: Active learning-based sampling method to facilitate product-related data collection in big data environments

The term Big Data is ubiquitous across many application areas, and industrial production is no different. In production, however, the term can be misleading, as it often refers to process data obtained through automated data collection schemes with minimal manual intervention. Product-related data is usually scarcer, particularly in high-volume manufacturing, because of the cost of inspection. This creates an imbalance in the amount of available data, which can at times be quite substantial. Yet in many cases, predictive models relating process variables to product characteristics are sought. It would therefore be beneficial to guide the data collection schemes for product characteristics through a real-time sampling methodology that actively dictates when and how new product data should be collected.

The main goal of this project is to investigate and develop active learning-based sampling methods to facilitate effective data collection schemes for manufacturing processes where product characteristics are expensive to obtain. Moreover, one of the goals of the push for digitalization in production is to digitalize the knowledge and experience gained over the years, for better reproducibility of desired outcomes. In that sense, a real-time sampling scheme will not only be more cost-effective but can also extend process knowledge beyond its current level. We believe active learning, which strategically selects the most informative data points, offers a good starting point for developing such sampling plans in data-rich environments. We also propose to consider process surveillance methods that are traditionally used to detect anomalous situations in real-time industrial applications. We will investigate how to adapt multivariate process surveillance techniques to detect the need for sampling when increased variation in the process data is detected. The project is a joint effort with NTNU (Norwegian University of Science and Technology) in Trondheim and involves a reciprocal PhD project at that university.
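To make the idea of surveillance-triggered sampling concrete, the following is a minimal sketch and not the project's actual method. It assumes a Hotelling T^2 statistic as the multivariate surveillance tool and an empirical quantile of in-control data as the control limit. The function names, the 0.99 quantile, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_reference(X_ref):
    """Estimate mean and inverse covariance from in-control process data."""
    mean = X_ref.mean(axis=0)
    cov = np.cov(X_ref, rowvar=False)
    return mean, np.linalg.inv(cov)

def t2_statistic(X, mean, cov_inv):
    """Hotelling T^2 distance of each row of X from the reference mean."""
    diff = np.atleast_2d(X) - mean
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

def sampling_decisions(X_stream, X_ref, quantile=0.99):
    """Flag the process observations for which a product measurement is requested.

    Assumption: the control limit is an empirical quantile of the T^2 values
    of the reference data, rather than a distribution-based limit.
    """
    mean, cov_inv = fit_reference(X_ref)
    limit = np.quantile(t2_statistic(X_ref, mean, cov_inv), quantile)
    return t2_statistic(X_stream, mean, cov_inv) > limit

# Simulated data (hypothetical): 500 in-control observations of 3 process
# variables, followed by a short stream whose last point is clearly shifted.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(500, 3))
X_stream = np.vstack([rng.normal(size=(5, 3)), np.full((1, 3), 6.0)])

flags = sampling_decisions(X_stream, X_ref)
print(flags)  # True marks time points where product data would be collected
```

In this sketch, product inspection is requested only when the process variables move unusually far from their reference behaviour, so inspection effort is concentrated where the process data carries new information. An active learning criterion, such as the predictive uncertainty of a model, could replace or complement the T^2 trigger.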

PhD project

By: Davide Cacciarelli

Section: Statistics and Data Analysis

Principal supervisor: Murat Kulahci

Co-supervisor: John Tyssedal (NTNU)

Project title: Data Analytics in Production

Term: 01/11/2020 → 31/10/2023

Contact

Davide Cacciarelli
Research Assistant
DTU Compute

Murat Külahci
Professor
DTU Compute
+45 45 25 33 82