Can artificial intelligence models be trusted?

AI models are rapidly making inroads into a wide range of industrial use cases. To be deployed in critical systems, however, these models must be robust to data variability. CEA-List has developed automated testing software called AIMOS that leverages formal methods to simplify reliability testing. The software was developed under the multi-partner trustworthy AI program, part of the “France 2030” national investment plan.

AI is being adopted at lightning speed across virtually all industries, making trustworthy AI both a major challenge and an opportunity for France’s economic competitiveness and industrial sovereignty. CEA-List is contributing to the trustworthy AI program by developing new testing software to help make AI systems more robust. The software (AIMOS, for AI Metamorphism Observing Software) checks whether AI models remain robust to disturbances in their inputs. The goal is to make sure systems that use AI operate as intended, generating reliable, reproducible results in all situations.

CEA-List’s software is based on metamorphic testing, which checks that a known relation between inputs and outputs still holds when the input is transformed. For example, a symmetrical system should map symmetrical inputs to symmetrical outputs, and a blurry, overexposed, or underexposed input image should not produce an incorrect result. AIMOS was applied to two use cases provided by industrial companies in the program.
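The idea behind metamorphic testing can be illustrated with a minimal Python sketch. It is not AIMOS code: `toy_model`, `overexpose`, the `stability` helper, and the sample images are all hypothetical stand-ins chosen for illustration. The sketch applies a transformation to each input and reports the fraction of inputs for which the expected relation between the two outputs holds.

```python
from typing import Callable, List

Image = List[float]  # flattened grayscale pixels in [0, 1] (illustrative format)


def stability(model: Callable, inputs: List[Image],
              transform: Callable, relation: Callable) -> float:
    """Fraction of inputs for which the metamorphic relation holds."""
    held = sum(1 for x in inputs if relation(model(x), model(transform(x))))
    return held / len(inputs)


def toy_model(img: Image) -> str:
    """Hypothetical stand-in for an AI model: labels an image by mean brightness."""
    return "bright" if sum(img) / len(img) > 0.5 else "dark"


def overexpose(img: Image, gain: float = 1.2) -> Image:
    """Simulated overexposure: scale every pixel and clip at 1.0."""
    return [min(1.0, p * gain) for p in img]


def same_label(before: str, after: str) -> bool:
    """Relation under test: the label must not change."""
    return before == after


images = [
    [0.1, 0.2, 0.3],
    [0.45, 0.45, 0.45],  # close to the decision boundary
    [0.8, 0.9, 0.7],
    [0.6, 0.7, 0.65],
]

score = stability(toy_model, images, overexpose, same_label)
print(f"stable on {score:.0%} of inputs")
```

Only the image near the decision boundary changes label under overexposure, so the relation holds on three of the four inputs. No ground-truth labels are needed: the test only compares the model with itself.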

Software validated on two industrial use cases

The first test case, from car maker Renault, focused on quality control of rear-axle welds based on photographs. AIMOS compared the reliability of different AI models on low-quality images. An automated process generated images of varying quality (focus, exposure, etc.). AIMOS then measured to what extent the AI models tested still produced accurate results when the “disturbances” to the inputs remained within predetermined acceptable limits. This information was then used to rank the AI models according to stability.


Renault test case: AIMOS analysis of the stability of different AI models on increasingly blurry images
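The ranking step can be sketched in Python as follows. This is an illustration under stated assumptions, not the Renault or AIMOS implementation: the "images" are one-dimensional intensity profiles, `box_blur` is a simple stand-in for loss of focus, and `model_a` and `model_b` are hypothetical contrast-threshold detectors. Each model is scored at every blur level up to an assumed acceptable limit, and the models are then ranked by mean stability.

```python
def box_blur(signal, radius):
    """Blur a 1D intensity profile with a box filter of the given radius."""
    n = len(signal)
    out = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def make_detector(threshold):
    """Hypothetical model: flags a defect when contrast exceeds a threshold."""
    def detect(signal):
        return (max(signal) - min(signal)) > threshold
    return detect


models = {"model_a": make_detector(0.2), "model_b": make_detector(0.5)}

samples = [
    [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0],  # wide defect
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # narrow defect
    [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3],  # no defect
]


def stability_by_blur(model, samples, radii):
    """For each blur radius, fraction of samples whose prediction is unchanged."""
    result = {}
    for r in radii:
        same = sum(model(s) == model(box_blur(s, r)) for s in samples)
        result[r] = same / len(samples)
    return result


radii = [1, 2, 3]  # assumed acceptable range of disturbance
report = {name: stability_by_blur(m, samples, radii) for name, m in models.items()}

# Rank models from most to least stable (mean stability over all radii).
ranking = sorted(report,
                 key=lambda name: sum(report[name].values()) / len(radii),
                 reverse=True)

for name in ranking:
    print(name, {r: round(v, 2) for r, v in report[name].items()})
```

In this toy setup the detector with the lower threshold keeps its predictions longer as blur increases, so it ranks first.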


The second test case, on the ACAS Xu detect-and-avoid system, was provided by Airbus. AIMOS was used to test 45 neural networks developed to analyze the speeds and positions of other drones in the immediate vicinity. Here, the networks had to predict symmetrical avoidance maneuvers for symmetrical angles of approach. The AIMOS analysis showed that, of the 45 networks, 42 were more than 95% stable. The other three were 60% to 70% stable.
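A symmetry check of this kind can be sketched in Python as follows. It is a simplified illustration, not the Airbus networks or the AIMOS implementation: `toy_policy` is a hypothetical stand-in that depends on a single bearing angle, the advisory names and angle thresholds are assumptions, and the policy is deliberately made asymmetric on one side so that the check has something to find.

```python
# Expected output relation: mirroring the approach angle should swap left and right.
MIRROR = {
    "clear": "clear",
    "weak_left": "weak_right",
    "weak_right": "weak_left",
    "strong_left": "strong_right",
    "strong_right": "strong_left",
}


def toy_policy(theta_deg):
    """Hypothetical stand-in for one network: advisory from intruder bearing."""
    if abs(theta_deg) > 90:
        return "clear"
    if theta_deg > 30:
        return "weak_left"
    if theta_deg >= 0:
        return "strong_left"
    if theta_deg < -40:  # deliberately asymmetric: 40 instead of 30
        return "weak_right"
    return "strong_right"


def symmetry_stability(policy, angles):
    """Fraction of angles whose mirrored input yields the mirrored advisory."""
    held = sum(1 for a in angles if policy(-a) == MIRROR[policy(a)])
    return held / len(angles)


angles = list(range(5, 180, 10))  # 5, 15, ..., 175 degrees
score = symmetry_stability(toy_policy, angles)
violations = [a for a in angles if toy_policy(-a) != MIRROR[toy_policy(a)]]

print(f"symmetry holds on {score:.1%} of tested angles; violations at {violations}")
```

Running the same check over each network in a set would give one stability percentage per network, which is the kind of figure reported above.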

What’s next

These two test cases showed that AIMOS is easy to use, that the results are reproducible, and that it is applicable to very different use cases. Renault is planning to integrate the software into its quality control process. The software will undergo further development and testing on other program partners’ use cases.
