TRANSWALKCOTNET: A SWIN TRANSFORMER AND DEEPWALK-BASED CROSS-ATTENTION FUSION FRAMEWORK FOR COTTON PLANT DISEASE IDENTIFICATION HYBRID GRAPH + TRANSFORMER MODEL
DOI:
https://doi.org/10.4238/0zf7vy26
Keywords:
DeepWalk, Swin-Transformer, Cross-Attention Fusion, Hybrid Graph, ResNet
Abstract
In contrast to traditional CNN-based disease recognition algorithms, the proposed TransWalkCotNet combines hierarchical Swin Transformer image representations with DeepWalk-based graph embeddings through a cross-attention fusion mechanism, allowing it to model visual lesion patterns and inter-image relational structure simultaneously.
Cotton disease identification is vital to precision agriculture and requires robust models that capture both visual and relational patterns. This paper proposes a hybrid model, TransWalkCotNet, which combines Swin Transformer visual feature extraction with graph representation learning based on DeepWalk embeddings. A similarity graph is built using k-nearest neighbors (KNN) to capture relationships among images, and a cross-attention fusion module then integrates the visual and structural features.
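As a rough illustration of the graph side of the pipeline described above, the following sketch builds a KNN similarity graph over per-image feature vectors and generates the uniform random walks that DeepWalk would then pass to a skip-gram model (that training step is omitted here). The cosine similarity metric, the value of k, the walk settings, the toy feature vectors, and the function names `knn_graph` and `random_walks` are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_graph(features, k):
    """Undirected graph linking each image to its k most similar images."""
    n = len(features)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(features[i], features[j]),
            reverse=True,
        )
        for j in ranked[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrise so walks can traverse both ways
    return adj


def random_walks(adj, walks_per_node, walk_length, seed=0):
    """Uniform random walks, the input DeepWalk feeds to skip-gram."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = sorted(adj[walk[-1]])
                if not neighbours:
                    break  # isolated node: stop the walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy stand-ins for image features (hypothetical values).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
adj = knn_graph(features, k=1)
walks = random_walks(adj, walks_per_node=2, walk_length=5)
```

In a full pipeline the walks would be treated as sentences and the node IDs as words, so that images appearing in similar walk contexts receive similar structural embeddings.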
The proposed model is evaluated with 5-fold stratified cross-validation and compared against baseline deep learning models (ResNet18 and Swin Transformer) and conventional machine learning models. In the experiments, TransWalkCotNet achieves 98.17% accuracy and a ROC-AUC of 0.996, outperforming both the baseline and conventional models. These results demonstrate the benefit of combining transformer-based visual learning with graph-based embeddings for agricultural disease classification.
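The evaluation protocol mentioned above can be sketched as follows: stratified splitting keeps the class proportions roughly equal in every fold, and each fold is held out once while the rest is used for training. This is a minimal sketch under stated assumptions, not the authors' evaluation code; the class names, sample counts, and the helpers `stratified_folds` and `accuracy` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Split sample indices into folds that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels: 10 samples of one class, 5 of another.
labels = ["healthy"] * 10 + ["diseased"] * 5
folds = stratified_folds(labels, n_folds=5)

splits = []
for held_out in folds:
    train = [i for fold in folds if fold is not held_out for i in fold]
    # A model would be fitted on `train` and scored on `held_out` here.
    splits.append((train, held_out))
```

Reported metrics such as accuracy would then typically be averaged over the five held-out folds.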
License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

