Improving Web Accessibility With an LLM-Based Tool: A Preliminary Evaluation for STEM Images
Maurizio Leotta; Marina Ribaudo
2025-01-01
Abstract
Ensuring equitable access to web-based visual content in Science, Technology, Engineering, and Mathematics (STEM) disciplines remains a significant challenge for visually impaired users. This preliminary study explores the use of Large Language Models (LLMs) to automatically generate high-quality alternative texts for complex web images in these domains, contributing to the development of an accessibility tool. First, we analyzed the outputs of various LLM-based image-captioning systems, selected the most suitable one (Gemini), and developed a browser extension, AlternAtIve, capable of generating alternative descriptions at varying verbosity levels. To evaluate AlternAtIve, we assessed its perceived usefulness in a study involving 35 participants, including a blind user. Additionally, we manually compared the quality of the outputs generated by AlternAtIve with those provided by two state-of-the-practice tools from the Google Web Store, using a custom metric that computes the quality of the descriptions considering their correctness, usefulness, and completeness. The results show that the descriptions generated with AlternAtIve achieved high quality scores, almost always better than those of the other two tools. Although conveying the meaning of complex images to visually impaired users through descriptions remains challenging, the findings suggest that AI-based tools, such as AlternAtIve, can significantly improve the web navigation experience for screen reader users.



