Figure 5. Visualization of recognition results of the proposed method on the RGB and event modalities. "Correct" marks a prediction that matches the reference; all other cases are incorrect predictions. The symbol "*" is an auxiliary gloss placeholder introduced during WER alignment to represent insertion operations when a predicted sequence is aligned with its reference sequence. The bottom row shows Grad-CAM [62] attention heatmaps computed from the backbone network output. The data in this figure come from the open-source EvCSLR dataset, available at https://github.com/diamondxx/EvCSLR.
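The role of the "*" placeholder can be illustrated with a minimal sketch of a word-level edit-distance alignment. This is not the evaluation script used for the figure; the function name `align_glosses`, its parameters, and the example glosses are hypothetical. The sketch also assumes that the same placeholder fills the gap on the prediction side for deletions, whereas the caption only describes its use for insertions.

```python
def align_glosses(reference, hypothesis, placeholder="*"):
    """Align two gloss sequences with Levenshtein distance (illustrative sketch).

    Returns the padded reference, the padded hypothesis, and the WER.
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = edit distance between reference[:i] and hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = reference[i - 1] != hypothesis[j - 1]
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Backtrace from the bottom-right corner to recover one optimal alignment.
    aligned_ref, aligned_hyp = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (
            reference[i - 1] != hypothesis[j - 1]
        ):
            aligned_ref.append(reference[i - 1])
            aligned_hyp.append(hypothesis[j - 1])
            i, j = i - 1, j - 1
        elif j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            # Insertion: the prediction has an extra gloss, so the
            # reference row is padded with the placeholder.
            aligned_ref.append(placeholder)
            aligned_hyp.append(hypothesis[j - 1])
            j -= 1
        else:
            # Deletion: the prediction misses a gloss (assumed padding).
            aligned_ref.append(reference[i - 1])
            aligned_hyp.append(placeholder)
            i -= 1
    aligned_ref.reverse()
    aligned_hyp.reverse()

    wer = cost[n][m] / max(n, 1)  # errors divided by reference length
    return aligned_ref, aligned_hyp, wer
```

For example, aligning the hypothetical reference `["I", "LIKE", "APPLE"]` with the prediction `["I", "REALLY", "LIKE", "APPLE"]` pads the reference to `["I", "*", "LIKE", "APPLE"]`, which is how a single inserted gloss would appear in a visualization of this kind.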





