SMARE - Structure Matching and Recognition Engine for Hand-Drawn Chemical Formulas
Expressing chemical compounds in various representations is challenging, especially for novices, since the task demands extensive domain-specific knowledge and spatial visualization skills. To address this challenge, we propose SMARE, our Structure Matching and Recognition Engine for chemical formulas. It interprets hand-drawn molecular structures, identifies and highlights errors, and can thereby serve as a fundamental component of educational applications. SMARE leverages a YOLO (You Only Look Once) model to recognize fundamental entities in chemical structures, such as atoms and bonds. The dataset for training, validating, and testing the model consists of 1,844 hand-drawn images of chemical molecules collected from students. SMARE processes the identified entities to construct an abstract molecular graph. The engine then compares this molecular graph against a database of known molecules and detects errors such as incorrect bonding and valency violations. Our fine-tuned YOLO model achieves an accuracy of 93.8% in recognizing chemical entities in hand-drawn molecules. SMARE was tested on 7,909 hand-drawn chemical structures from 519 school students under real-world conditions and successfully identified numerous errors in their chemical drawings. These results demonstrate the effectiveness of SMARE as a powerful and practical tool for chemistry education.
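To make the graph-based error detection concrete, the following is a minimal sketch of how recognized atoms and bonds might be assembled into a molecular graph and checked. It is not SMARE's actual implementation: the function names (`build_graph`, `valency_violations`, `bond_signature`), the input format, and the simplified valence table are all assumptions made for illustration. The sketch further assumes that every atom, including hydrogen, is drawn explicitly, and it ignores charges and hypervalent atoms.

```python
from collections import Counter, defaultdict

# Simplified typical valences (assumption; no charges or hypervalency).
TYPICAL_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


def build_graph(atoms, bonds):
    """Build a graph from hypothetical detector output.

    atoms: {atom_id: element symbol}
    bonds: [(atom_id_a, atom_id_b, bond_order)]
    """
    adjacency = defaultdict(dict)
    for a, b, order in bonds:
        adjacency[a][b] = order
        adjacency[b][a] = order
    return {"atoms": dict(atoms), "adjacency": dict(adjacency)}


def valency_violations(graph):
    """Return (atom_id, element, used, expected) for each atom whose
    summed bond orders differ from its typical valence."""
    violations = []
    for atom_id, element in graph["atoms"].items():
        used = sum(graph["adjacency"].get(atom_id, {}).values())
        expected = TYPICAL_VALENCE.get(element)
        if expected is not None and used != expected:
            violations.append((atom_id, element, used, expected))
    return violations


def bond_signature(graph):
    """Count bonds by (element, element, order).

    Equal signatures are necessary but not sufficient for two graphs
    to describe the same molecule; this is not a full graph match.
    """
    atoms = graph["atoms"]
    signature = Counter()
    for a, neighbours in graph["adjacency"].items():
        for b, order in neighbours.items():
            if a < b:  # count each undirected bond once
                pair = tuple(sorted((atoms[a], atoms[b])))
                signature[pair + (order,)] += 1
    return signature


# Reference water molecule and a drawing with an erroneous O=H double bond.
water_atoms = {0: "O", 1: "H", 2: "H"}
reference = build_graph(water_atoms, [(0, 1, 1), (0, 2, 1)])
drawn = build_graph(water_atoms, [(0, 1, 2), (0, 2, 1)])

errors = valency_violations(drawn)
missing_bonds = bond_signature(reference) - bond_signature(drawn)
```

In this example, `errors` flags the oxygen atom and the doubly bonded hydrogen, and `missing_bonds` shows that one O-H single bond from the reference is absent in the drawing. A complete matcher would need a proper graph comparison, for example a graph isomorphism test, rather than this signature count.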