Journal · SCIE Q1 · IF 6.3

Dilated Multi-Layer Perceptron Mixer for Faster Neural Networks

Neural Networks
2026, vol. 201, pp. 108939, Elsevier

The Multi-Layer Perceptron Mixer (MLP-Mixer) replaces self-attention layers with simple spatial MLPs and achieves promising performance across visual tasks. However, existing MLP-Mixer methods have quadratic complexity in image resolution and can only process inputs of a fixed size. When a pre-trained MLP-Mixer is transferred to downstream tasks, the model incurs high computational costs and requires additional operations to interpolate the MLP weights. This paper addresses these issues by proposing Dilated MLP-Mixers (DMLP), faster backbones for image classification and downstream tasks. The DMLP backbone performs spatial mixing on non-overlapping and dilated windows to achieve better efficiency while still exchanging information across windows. Unlike traditional MLP-based approaches such as the Spatial MLP-Mixer architecture, the proposed method creates sparse cross-connections between windows in the window MLP-Mixer block, referred to as dilated windows. This reduces the number of windows without increasing the number of connections within each window, compared with previous approaches. The design lets the architecture adapt to different input sizes and yields linear complexity in image resolution. The proposed method has been validated in the domain of automobile insurance and car damage recognition, and extensive experiments were conducted on benchmark datasets to evaluate its performance. Results indicate that the proposed DMLP backbone achieves performance comparable to MLP and Transformer models while demonstrating superior processing speed. Comparative evaluations show that the proposed method outperforms recent approaches in both accuracy and computational efficiency. Furthermore, the method shows potential for application in systems with limited computational resources.
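
The window idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the `(H, W, C)` layout, the single shared mixing matrix, and the multiply-count formulas are all assumptions made for illustration, and `H` and `W` are assumed to be divisible by the window size `w`. It contrasts contiguous non-overlapping windows with strided (dilated) windows of the same size, and compares the cost of mixing tokens globally against mixing them within fixed-size windows.

```python
import numpy as np

def local_windows(x, w):
    """Split an (H, W, C) map into non-overlapping, contiguous w x w windows.
    Returns an array of shape (num_windows, w*w, C)."""
    H, W, C = x.shape
    x = x.reshape(H // w, w, W // w, w, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, w * w, C)

def dilated_windows(x, w):
    """Split an (H, W, C) map into w x w windows whose tokens are sampled
    with stride H//w (rows) and W//w (columns), so each window spans the map.
    Returns an array of shape (num_windows, w*w, C)."""
    H, W, C = x.shape
    dh, dw = H // w, W // w  # assumed dilation rates
    x = x.reshape(w, dh, w, dw, C)  # row index = i * dh + a
    return x.transpose(1, 3, 0, 2, 4).reshape(-1, w * w, C)

def mix_tokens(windows, weight):
    """Apply one shared (w*w, w*w) token-mixing matrix inside every window."""
    return np.einsum("ts,nsc->ntc", weight, windows)

def global_mix_cost(H, W, C):
    """Multiplies for one dense spatial MLP over all H*W tokens (quadratic)."""
    return (H * W) ** 2 * C

def window_mix_cost(H, W, C, w):
    """Multiplies for window mixing with fixed w: (H*W / w^2) * w^4 * C,
    which grows linearly with H*W."""
    return (H * W // (w * w)) * (w * w) ** 2 * C
```

For an 8 x 8 map with `w = 4`, the first local window covers rows and columns 0 to 3, while the first dilated window samples rows and columns 0, 2, 4, and 6. Both partitions produce the same number of windows with the same number of tokens per window, so the mixing weight shape does not depend on the input resolution in this sketch.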

Deep Learning, Image Classification, Instance Segmentation, MLP-Mixer, Object Detection