Figure 1. Self-supervised named entity recognition model for robot behavior control. (a) Input conversion module, which converts speech input to text; it is not needed when the input is already text. (b) Extraction, enhancement, and fusion of text features (generally including Pinyin, radical, and part-of-speech information), followed by self-supervised extraction of named entities from the fused features. (c) Driving the robot's movement based on the extracted named entities.