Figure 2. Framework for leveraging multimodal features in few-shot malware classification. Malware images and API invocation sequences are first derived from malware binary files and fed into two separate encoders, which yield the respective unimodal features. In the feature fusion module, the features of each modality are pooled and normalized separately and then segmented. The segmented features are fused by a GNN, and classification is performed with a prototypical network. API: application programming interface; GNN: graph neural network.
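The data flow in Figure 2 can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the mean pooling, the L2 normalization, the equal-width segmentation, and especially the single averaging step that stands in for the GNN are all assumptions made for illustration. Only the final step, nearest-prototype classification with class means as prototypes, follows the standard definition of a prototypical network.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def segment(vec, num_segments):
    """Split a vector into equal-width segments (length must be divisible)."""
    size = len(vec) // num_segments
    return [vec[i * size:(i + 1) * size] for i in range(num_segments)]


def fuse(image_feats, api_feats, num_segments=2):
    """Hypothetical fusion: pool, normalize, and segment each modality,
    then mix the segments as nodes of a fully connected graph."""
    nodes = []
    for feats in (image_feats, api_feats):
        pooled = l2_normalize(mean_pool(feats))
        nodes.extend(segment(pooled, num_segments))
    # Assumption: one round of mean aggregation stands in for the GNN.
    graph_mean = mean_pool(nodes)
    updated = [[0.5 * a + 0.5 * b for a, b in zip(node, graph_mean)]
               for node in nodes]
    # Concatenate the updated node features into one fused embedding.
    return [x for node in updated for x in node]


def prototypes(support):
    """Class prototype = mean of that class's support embeddings."""
    return {label: mean_pool(embs) for label, embs in support.items()}


def classify(query, protos):
    """Assign the query to the class with the nearest prototype
    (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda label: sqdist(query, protos[label]))
```

In a few-shot episode, `fuse` would be applied to every support and query sample, `prototypes` would be computed from the fused support embeddings, and `classify` would label each fused query embedding. In practice the two encoders and the GNN would be trained networks rather than the fixed operations used here.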







