A Proposal for Revising SQL Error Taxonomies Based on Automated Detection

Ponzini D.; Guerrini G.; Catania B.
2026-01-01

Abstract

Several research efforts have sought to characterize the mistakes students make when writing database queries; among the resulting frameworks, the SQL error taxonomy proposed by Taipalus et al. [1] has become widely adopted. In this paper, we report ongoing work that stems from our experience operationalizing this taxonomy within an automated SQL error-detection pipeline, where we encountered practical limitations and ambiguities that arise when taxonomy categories are interpreted algorithmically rather than by human annotators. Building on extensive manual annotation and iterative refinement carried out during the development of an automated SQL error correction tool, we outline a practice-driven revision of the taxonomy aimed at supporting fine-grained automated error classification. The proposed revision refines error definitions, regroups certain categories, clarifies labels, and addresses gaps and redundancies identified in the original taxonomy. We conclude by discussing how the revised taxonomy can support educational tools and by outlining future directions for validation and empirical assessment.
Files in this record:
DataEd-paper9.pdf (closed access)
Type: Post-print document
Size: 1.13 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1308717