Telosaes.it / Primo piano scalac

December 2024, Year XVI, no. 12

Daniele Panfilo

Artificial Intelligence that generates data. Privacy proof.

“Synthetic data maintain the statistical utility of the original data, but since they are generated using an algorithm, they have no sensitive information and therefore cease to be personal data. This way any risk of re-identification is eliminated definitively.”

Telos: For you, generative artificial intelligence was, first, your academic specialisation, then, an entrepreneurial challenge. Could you tell us about your experience?

Daniele Panfilo: I graduated from Sapienza University of Rome in Industrial and Management Engineering and later earned two master’s degrees: the first, again at Sapienza, in Optimization and Modelling; the second in operations research from the University of Maastricht, Department of Knowledge Engineering and Data Science, one of the first specialist degrees in Europe dedicated to Artificial Intelligence (AI). I stayed on as a researcher at Medtronic Bakken Research Center, where I started as a data scientist in the biomedical field. I studied pacemakers and AV Delay Optimization. As I was studying and doing research in the area of AI, I became curious about how AI could be applied in an industrial setting. I have always tried to reconcile these two spheres in order to translate academic research into practical applications with an impact on industry. Once I got back to Italy, I worked as a consultant in the Med-Pharma sector, then I joined the group of data scientists at Allianz Technology. This is where I met Sebastiano Saccani, who started the Aindo adventure with me. While I was working for Allianz, I started a PhD in Artificial Intelligence at the University of Trieste, where I specialised in generative machine-learning models. Since the beginning of my academic career, I have known that artificial intelligence would have broad applications not only in the field of research but, especially, in industry and business. When I started studying, generative AI was not nearly as widely known as it is now and was a totally different paradigm within the field of artificial intelligence. I intuitively knew that it would lead to significant turning points. Until then, applications of artificial intelligence had mainly been envisioned for classification or prediction. Generative models, instead, extended the capabilities of AI to much broader domains.
Sebastiano and I got the idea for Aindo while we were working as AI algorithm developers in various sectors (Finance, Health, Marketing and Retail). We realised that it took a long time to get access to data, and since we were already working with generative models in our academic research, we had the insight to extend the capabilities of these models to the generation of artificial tabular data that mimic the statistical behaviour of real data and therefore preserve their utility. Because these data are artificial, however, they no longer contain personal information, which gets around the problem of privacy beautifully.
In 2020 we decided to submit this idea to a European accelerator programme, the European Data Incubator. We came in first out of 500 applicants and were awarded 100,000 euros, equity-free, to develop the first product. In 2021 Aindo received its first round of financing from the Vertis fund. In 2023 it raised another 6 million from the United Ventures and Vertis funds.

The ethical and responsible use of artificial intelligence has come up over and over again in public debate, but we suspect that few people really know what it means or what its implications are. Aindo, however, has made this its mission. Could you describe the company and its outlook?

Aindo is a scale-up (a more evolved start-up) founded in 2018 within the International School for Advanced Studies (SISSA) in Trieste from the idea that artificial intelligence can bring great value to society without infringing upon people’s rights and freedoms. This is why our synthetic data generation platform makes it possible to use personal data while respecting privacy, reconciling AI innovation with data protection, impartiality and reliability. This solution is the result of state-of-the-art research on generative machine-learning models: starting with real data, these models can generate artificial data that faithfully replicate the patterns and behaviours of the population of real data. So, synthetic data maintain the statistical utility of the original data, but since they are generated using an algorithm, they have no sensitive information and therefore cease to be personal data. This way any risk of re-identification is eliminated definitively. Aindo’s patented technology, the first in Europe to obtain Europrivacy certification, incentivises the secure exchange of data, democratises innovation and facilitates collaboration and R&D, in line with Europe’s drive to create open data spaces that respect privacy legislation. This allows us to take full advantage of the potential of artificial intelligence in business and in strategic areas with high social impact, like healthcare and pharmaceutical research and the development of technologies for financial and insurance markets.

The EU Regulation on artificial intelligence has expressly recognised synthetic data. Can you explain what data synthesis is in layman’s terms and how this technology can be used to benefit scientific research and public policy planning?

Artificial intelligence is revolutionising every aspect of our lives. Unfortunately, however, over 85% of AI projects never get to the production phase. This happens because these projects need huge amounts of data in order to get started. Organisations need access to data and need to know for certain that this data is complete and secure. This process is costly in terms of both time and money. Synthetic data technology is becoming a key part of the successful development of artificial intelligence and data analytics projects. Synthetic data cannot be traced to real people; rather, they simulate the behaviour of authentic data. Today, various sectors are asking for synthetic data. A hospital, for example, might need to share its patients’ data to develop tools to better diagnose and treat specific pathologies. But electronic medical records are confidential. These data are sensitive and hard to use, even for R&D and in collaborations between public bodies. In these cases, the platform developed by Aindo converts the information and generates a database of synthetic data that can be used for R&D while ensuring that high privacy standards are met. Synthetic data can also be applied in research and study to help us understand socio-cultural phenomena and improve people’s lives. When information cannot be accessed because it is subject to privacy protocols, synthetic data, which are compatible with international database systems, can be a valuable source of information for scientific investigation.

As always happens when faced with a change in the technological paradigm, artificial intelligence is often associated with the ghost of epoch-making economic and social change. Are the catastrophists right in being afraid or not, and why?

Innovation always comes with challenges that need to be gradually overcome: innovation means working on something that doesn’t exist yet, and it is often hard to predict how it will be perceived. People are always afraid of what they don’t understand, and in general, catastrophists are people who are insufficiently or badly informed. Like other emerging or groundbreaking technologies, sometimes artificial intelligence has important, useful applications and sometimes it is misused and has implications that could generate negative consequences. This doesn’t mean we should stop innovating. Instead, it’s important to study it, learn about it, regulate it and apply it properly. I don’t think the catastrophists, or the conspiracy theorists, are working toward these goals. We need quality information and education. When you fully understand something, your fear vanishes.

Mariella Palazzolo

Editorial

The USA innovates, China copies, Europe regulates. This overused, worn-out refrain reflects an unforgiving image of the Old Continent’s decline. However, it leaves one crucial question open: does regulating necessarily mean putting a burden on innovators or can a regulatory framework be created that encourages technological innovation and fosters its positive impact on the economy? The advent of artificial intelligence is an excellent testing ground, given its applications not only in the industrial and financial sectors, but also in scientific research and public policy planning. And yet, as Panfilo explains, AI project development is always coming up against the wall of insufficient data.
This might look like a paradox: the digital transition created the premises for an exponential increase in available data, but their potential is still largely unexpressed, especially because their use is limited by privacy protection regulations. Take this practical example: the digitalisation of medical records enables a huge amount of data to be shared and entered into the system, and therefore potentially enables the building of predictive models that could really take the quality of National Health Service planning to the next level. But we are talking about personal data, and health-related data to boot, which are therefore particularly sensitive. Using them to train AI systems means using them for purposes other than those for which they were collected, an activity that privacy protection regulations severely limit. And herein lies the tangle that legislators are being called to unravel: on the one hand, there is the freedom to carry out research and the massive collective benefits deriving from it, and on the other, the safeguarding of individual rights. In Italy the secondary use of personal health data for scientific research was subject to even stricter restrictions than those of the already not particularly flexible EU Regulation (the much-maligned GDPR). Finally, this year the Parliament decided to do something about this anomaly, thanks in part to the advocacy initiatives of scientific companies, associations of healthcare managers and other stakeholders. Now, instead of requiring the Data Protection Authority to be consulted case by case, the rules rely on general safeguards, identified by the Authority, that must be complied with. But the same fundamental dilemma remains, and it is generative AI itself that has provided lawmakers with the answer, through data synthesis. Starting from real datasets, artificial data can be generated that mimic the properties of the data they were generated from but can no longer be traced back to them.
This truly is a paradigm shift. Personal data is not being pseudonymised, which would leave it at risk of being reattributed to a person using other information. Synthetic data is being generated and subsequently used to train AI systems. We can say that this is no longer a mere prospect; it is a reality, which European lawmakers have fully recognised. First of all, the 2022 Data Governance Act, which lays out the conditions for third-party sharing and reuse of data kept by public bodies, added synthesis as one of the safeguards deemed capable of preventing individuals from being re-identified. Then the 2024 Artificial Intelligence Regulation (better known as the “AI Act”) took the decisive step: the secondary processing of personal data to develop and train high-risk AI systems is only allowed when it is impossible to use synthetic or anonymised data. So Europe, which is synonymous with regulation, did its duty. What we hope is that Italian lawmakers, currently caught up in the process of drafting an Enabling Bill on AI, will take up the baton.
This issue concludes PRIMOPIANOSCALAc’s 2024 cover series, inspired by the works of Romano Gazzera, a Piedmontese painter known for his ‘giant’, ‘talking’, ‘flying’ flowers which, along with other iconographic themes connected to historical and collective memory, characterised and distinguished him as the frontrunner of the Italian Neo-floral school.
For Daniele we have chosen the tulip, in tribute to his studies and early career in Holland. We didn’t have the full-body shot of him that we needed for this cover, so we couldn’t resist… and reconstructed it using the AI image extender of a graphic design programme, which we asked to complete the bottom part of the image with the missing elements (footrest and his legs and feet). Daniele wouldn’t dare complain! With this December issue, all of us here at Telos A&S would like to wish you a Merry Christmas and a peaceful 2025.

Marco Sonsini

Daniele Panfilo is a scientist and founder of Aindo, a scale-up that has developed and patented technology to generate synthetic data using AI. In our interview, Daniele talks in-depth about his academic and professional career and about the history of Aindo, so his bio will be totally different from our usual ones.
We would like to dedicate some space to Daniele the man, not Panfilo the engineer. Daniele loves good food, especially Italian food, but since he is very curious, he often tries rather exotic cuisine as well. He has a layman’s interest in wine and believes it is one of the things Italy can truly be proud of, so much so that he is “thinking about taking a course to understand it better.”
One of his hobbies is undoubtedly reading: “I really like Russian and German novels, from Tolstoy to Thomas Mann.” In his free time, he prefers reading things that have nothing to do with his job “to get my mind off technology, which takes up the majority of my day.”
Obviously, he constantly keeps abreast of all things relating to AI and science, but “for this kind of reading I prefer primary literature and so choose scientific articles to avoid unqualified ‘opinions’. Science is made up of events that can be proven and repeated, not opinions.” Finally, he tries to find a little time every day for sport and his physical wellbeing, essential to withstand the particularly intense rhythm of his life. In his free time, he goes to the gym or to the pool. In recent years, since he lives in Trieste, he has really gotten into sailing, to the point that he and some friends even bought a small boat. “As a kid my father passed on his passion for fishing to me, and I still continue to cultivate this passion, even though it’s a bit more difficult.”
Daniele was born in Rome, is 36 years old and is “an only child and very close to my family and friends.”

Marco Sonsini

A member of the Fipra Network
Corporate Member of the American Chamber of Commerce in Italy
