Yang Zhang

Ph.D., Computer Vision and Multimedia @ NJU

Associate Professor @ School of Mechanical Engineering, Hubei University of Technology

yzhangcst[at]gmail.com; yzhangcst[at]hbut.edu.cn

I am currently an Associate Professor at the School of Mechanical Engineering, Hubei University of Technology, China. My current research interests are machine learning and computer vision, including 2D & 3D scene understanding. My full Chinese CV is available here.

I was previously a Postdoctoral Fellow with Medical Robotics Perception & AI at The Chinese University of Hong Kong. I received my Ph.D. degree in 2021 from the National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, where I worked with Yanwen Guo.

I am looking for self-motivated Ph.D. students, research assistants, and visiting students to work together on exciting machine vision and artificial intelligence projects. If you are interested in working with me, please drop me an email with your resume.

Selected Publications

Spatial-wise Dynamic Distillation for MLP-like Efficient Visual Fault Detection of Freight Trains

IEEE Transactions on Industrial Electronics, 2024

Yang Zhang, Huilin Pan, Mingying Li, An Wang, Yang Zhou, Hongliang Ren

A spatial-wise dynamic distillation framework based on multi-layer perceptron (MLP) is designed for visual fault detection of freight trains. We first present the axial shift strategy, which allows the MLP-like architecture to overcome the challenge of spatial invariance and effectively incorporate both local and global cues. We then propose a dynamic distillation method without a pre-trained teacher, including a dynamic teacher mechanism that can effectively eliminate the semantic discrepancy with the student model. Such an approach mines more abundant details from lower-level feature appearances and higher-level label semantics as the extra supervision signal, and it utilizes efficient instance embedding to model the global spatial and semantic information.
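For readers unfamiliar with the idea of axial shifting, the following is a minimal sketch of one common form of it, not the implementation from the paper. It assumes a feature map stored as a list of channels (each a 2D list), splits the channels into groups, and shifts each group by a different offset along one axis with zero padding, so that a later per-pixel channel-mixing layer sees neighboring positions. The names `shift_row` and `axial_shift`, the default `shift_size`, and the grouping rule are all illustrative assumptions.

```python
from typing import List

Channel = List[List[float]]


def shift_row(row: List[float], offset: int) -> List[float]:
    """Shift a 1D list by `offset` positions, filling vacated slots with zeros."""
    n = len(row)
    out = [0] * n
    for i, value in enumerate(row):
        j = i + offset
        if 0 <= j < n:
            out[j] = value
    return out


def axial_shift(feature: List[Channel], shift_size: int = 3,
                axis: str = "horizontal") -> List[Channel]:
    """Shift channel groups by different offsets along one axis (hypothetical sketch).

    Channels are split into `shift_size` consecutive groups; group g is
    shifted by g - shift_size // 2 positions, e.g. -1, 0, +1 for shift_size=3.
    """
    num_channels = len(feature)
    chunk = -(-num_channels // shift_size)  # ceiling division: channels per group
    half = shift_size // 2
    shifted = []
    for c, channel in enumerate(feature):
        offset = c // chunk - half
        if axis == "horizontal":
            shifted.append([shift_row(row, offset) for row in channel])
        else:
            # Vertical shift: transpose, shift each column, transpose back.
            columns = [list(col) for col in zip(*channel)]
            moved = [shift_row(col, offset) for col in columns]
            shifted.append([list(r) for r in zip(*moved)])
    return shifted
```

After the shift, each spatial position holds values that originally sat at neighboring positions in different channel groups, which is how a purely channel-wise MLP can pick up local spatial context in this kind of design.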

Efficient Visual Fault Detection for Freight Train via Neural Architecture Search with Data Volume Robustness

IEEE Transactions on Industrial Informatics, 2024

Yang Zhang, Mingying Li, Huilin Pan, Moyun Liu, Yang Zhou

An efficient NAS-based framework for visual fault detection of freight trains is proposed to search for the task-specific detection head with capacities of multi-scale representation. First, we design a scale-aware search space for discovering an effective receptive field in the head. Second, we explore the robustness of data volume to reduce search costs based on the specifically designed search space, and a novel sharing strategy is proposed to reduce memory and further improve search efficiency.

An Efficient MLP-based Point-guided Segmentation Network for Ore Images with Ambiguous Boundary

IEEE Transactions on Industrial Informatics, 2024

Guodong Sun, Yuting Peng, Le Cheng, Mengya Xu, An Wang, Bo Wu, Hongliang Ren, Yang Zhang*

The homogeneous appearance of ores leads to low contrast and unclear boundaries, which makes accurate segmentation and recognition challenging. This paper proposes a lightweight framework based on Multi-Layer Perceptron (MLP), which focuses on solving the problem of edge blurring. Specifically, we introduce a lightweight backbone better suited for efficiently extracting low-level features. Besides, we design a feature pyramid network consisting of two MLP structures that balance local and global information, thus enhancing detection accuracy. Furthermore, we propose a novel loss function that guides the prediction points to match the instance edge points to achieve clear object boundaries.

Efficient segmentation with texture in ore images based on box-supervised approach

Engineering Applications of Artificial Intelligence, 2024

Guodong Sun, Delong Huang, Yuting Peng, Le Cheng, Bo Wu, Yang Zhang*

An effective box-supervised technique with texture features is provided for ore image segmentation that can identify complete and independent ores. Firstly, a ghost feature pyramid network (Ghost-FPN) is proposed to process the features obtained from the backbone to reduce redundant semantic information and computation generated by complex networks. Then, an optimized detection head is proposed to obtain the feature to maintain accuracy. Finally, Lab color space (Lab) and local binary patterns (LBP) texture features are combined to form a fusion feature similarity-based loss function to improve accuracy while incurring no loss.
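As background on the texture descriptor mentioned above, here is a minimal sketch of the basic 8-neighbor local binary pattern (LBP) on a grayscale image. It illustrates the general LBP technique only; it is not the fusion loss from the paper, and the function names `lbp_codes` and `lbp_histogram`, the bit ordering, and the "greater than or equal" comparison are assumptions chosen for illustration.

```python
from typing import List

# Clockwise neighbor offsets (row, col), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image: List[List[int]]) -> List[List[int]]:
    """Return the 8-bit LBP code of every interior pixel of a grayscale image.

    Bit k of a code is set when the k-th neighbor (in OFFSETS order)
    is greater than or equal to the center pixel.
    """
    height, width = len(image), len(image[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def lbp_histogram(codes: List[List[int]]) -> List[float]:
    """Normalized 256-bin histogram of LBP codes, usable as a texture feature."""
    hist = [0.0] * 256
    for row in codes:
        for code in row:
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist
```

Two regions with similar LBP histograms have similar local texture, which is the kind of cue a texture-based similarity term can exploit alongside color.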

Efficient Visual Fault Detection for Freight Train Braking System via Heterogeneous Self Distillation in the Wild

Advanced Engineering Informatics, 2023

Yang Zhang, Huilin Pan, Yang Zhou, Mingying Li, Guodong Sun

This paper proposes a heterogeneous self-distillation framework to ensure detection accuracy and speed while satisfying low resource requirements. The privileged information in the output feature knowledge can be transferred from the teacher to the student model through distillation to boost performance. We first adopt a lightweight backbone to extract features and generate a new heterogeneous knowledge neck. This neck models positional information and long-range dependencies among channels through parallel encoding to optimize feature extraction capabilities. Then, we utilize the general distribution to obtain more credible and accurate bounding box estimates. Finally, we employ a novel loss function that makes the network easily concentrate on values near the label to improve learning efficiency.

Spatial-information Guided Adaptive Context-aware Network for Efficient RGB-D Semantic Segmentation

IEEE Sensors Journal, 2023

Yang Zhang, Chenyun Xiong, Junjie Liu, Xuhui Ye, Guodong Sun

This paper proposes an efficient lightweight encoder-decoder network that reduces the computational parameters and guarantees the robustness of the algorithm. Working with channel and spatial fusion attention modules, our network effectively captures multi-level RGB-D features. A globally guided local affinity context module is proposed to obtain sufficient high-level context information. The decoder utilizes a lightweight residual unit that combines short- and long-distance information with few redundant computations. Experimental results on NYUv2, SUN RGB-D, and Cityscapes datasets show that our method achieves a better trade-off among segmentation accuracy, inference time, and parameters than the state-of-the-art methods.

Faster OreFSDet: A Lightweight and Effective Few-shot Object Detector for Ore Images

Pattern Recognition, 2023

Yang Zhang, Le Cheng, Yuting Peng, Chengming Xu, Yanwei Fu, Bo Wu, Guodong Sun

A lightweight and effective few-shot detector is proposed to achieve competitive performance with general object detection with only a few samples for ore images. First, the proposed support feature mining block characterizes the importance of location information in support features. Next, the relationship guidance block makes full use of support features to guide the generation of accurate candidate proposals. Finally, the dual-scale semantic aggregation module retrieves detailed features at different resolutions to contribute to the prediction process. Our method achieves the smallest model size of 19 MB and a competitive detection speed of 50 FPS compared with general object detectors.

Adaptive Fusion Affinity Graph with Noise-free Online Low-rank Representation for Natural Image Segmentation

Pattern Recognition, 2023

Yang Zhang, Moyun Liu, Huiming Zhang, Guodong Sun, Jingwu He

We propose an adaptive fusion affinity graph (AFA-graph) with noise-free low-rank representation in an online manner for natural image segmentation. An input image is first over-segmented into superpixels at different scales and then filtered by the proposed improved kernel density estimation method. Moreover, we select global nodes of these superpixels on the basis of their subspace-preserving representation, which reveals the feature distribution of superpixels exactly. To reduce time complexity while improving performance, a sparse representation of global nodes based on noise-free online low-rank representation is used to obtain a global graph at each scale. The global graph is finally used to update a local graph which is built upon all superpixels at each scale.

Visual Fault Detection of Multi-scale Key Components in Freight Trains

IEEE Transactions on Industrial Informatics, 2022

Yang Zhang, Yang Zhou, Huilin Pan, Bo Wu, Guodong Sun

Although deep learning-based methods are frequently employed, these fault detectors are extremely reliant on hardware resources and complex to implement. In addition, no train fault detectors consider the drop in accuracy induced by scale variation of fault parts. This paper proposes a lightweight anchor-free framework to solve the above problems. Specifically, to reduce the amount of computation and model size, we introduce a lightweight backbone and adopt an anchor-free method for localization and regression. To improve detection accuracy for multi-scale parts, we design a feature pyramid network to generate rectangular layers of different sizes to map parts with similar aspect ratios.

A Lightweight NMS-free Framework for Real-time Visual Fault Detection System of Freight Trains

IEEE Transactions on Instrumentation and Measurement, 2022, 71: 1-11

Guodong Sun, Yang Zhou, Huilin Pan, Bo Wu, Ye Hu, Yang Zhang*

A lightweight NMS-free framework is proposed to achieve real-time detection and high accuracy simultaneously. We use a lightweight backbone for feature extraction and design a fault detection pyramid to process features. This fault detection pyramid includes three novel individual modules using an attention mechanism, bottleneck, and dilated convolution for feature enhancement and computation reduction. Instead of using NMS, we calculate different loss functions, including classification and location costs, in the head to reduce computation.

A Unified Light Framework for Real-time Fault Detection of Freight Train Images

IEEE Transactions on Industrial Informatics, 2021, 17(11): 7423-7432

Yang Zhang, Moyun Liu, Yang Yang, Yanwen Guo, Huiming Zhang

A unified light framework is designed to improve detection accuracy while supporting real-time operation with a low resource requirement. We first design a novel lightweight backbone (RFDNet) to improve the accuracy and reduce computational cost. Then, we propose a multi-region proposal network using multi-scale feature maps generated from RFDNet to improve the detection performance. Finally, we present multi-level position-sensitive score maps and region of interest pooling to further improve accuracy with few redundant computations.

Affinity Fusion Graph-based Framework for Natural Image Segmentation

IEEE Transactions on Multimedia, 2022, 24: 440-450

Yang Zhang, Moyun Liu, Jingwu He, Fei Pan, Yanwen Guo

The proposed framework combines adjacency-graphs and kernel spectral clustering based graphs (KSC-graphs) according to a new definition named affinity nodes of multi-scale superpixels. These affinity nodes are selected based on a better affiliation of superpixels, namely subspace-preserving representation, which is generated by sparse subspace clustering based on subspace pursuit. Then a KSC-graph is built via a novel kernel spectral clustering to explore the nonlinear relationships among these affinity nodes. Moreover, an adjacency-graph at each scale is constructed, which is further used to update the proposed KSC-graph at affinity nodes. The fusion graph is built across different scales, and it is partitioned to obtain the final segmentation result.

Real-time Vision Based System of Fault Detection for Freight Trains

IEEE Transactions on Instrumentation and Measurement, 2020, 69(7): 5274-5284

Yang Zhang, Moyun Liu, Yunian Chen, Hongjie Zhang, Yanwen Guo

Real-time vision based system of fault detection (RVBS-FD) for freight trains aims to complete routine maintenance tasks efficiently to ensure railway security. Recently, the rapid development of deep learning techniques has enabled systems to provide a robust solution for the RVBS-FD of freight trains. We propose a CNN-based detector called Light FTI-FDet for the RVBS-FD of freight trains. The results on five typical fault benchmarks indicate that our Light FTI-FDet achieves higher accuracy and faster speed with about 17% of the model size of the well-known Faster R-CNN detector, substantially outperforming state-of-the-art methods.

Experiences

Research Intern | Tencent Youtu Lab, Tencent, Shanghai, China | Feb. 2021 – Jun. 2021 (Advisor: Yuqiang Ren. Topic: Object Detection.)

Postdoctoral Fellow | The Chinese University of Hong Kong, Hong Kong, China | Nov. 2022 – May 2023 (Advisor: Hongliang Ren. Topic: Industrial Robot.)

Activities

Reviewer for IEEE Transactions on Industrial Informatics, IEEE Transactions on Systems, Man and Cybernetics: Systems, and IEEE Transactions on Cognitive and Developmental Systems.

Reviewer for the IEEE International Conference on Robotics and Automation (ICRA 2024).

Reviewer for Information Fusion, Computers in Biology and Medicine, Computers and Electrical Engineering, Computers and Electronics in Agriculture, and Pattern Recognition Letters.

Reviewer for Infrared Physics and Technology, Computers in Industry, and Image and Vision Computing.

Reviewer for Pattern Recognition, Advanced Engineering Informatics, Engineering Applications of Artificial Intelligence, and Expert Systems with Applications.

Reviewer for the Journal of Energy Storage, Displays, Optics Communications, Measurement, Applied Soft Computing, and Scientific Reports.

Awards

Wuhan Talent Plan (2021)

The Program B for Outstanding PhD candidate of Nanjing University (2020)

China Telecom Scholarship (2017) (Only 1700 students receive this honor annually in China.)

National Scholarship for Graduate Students (2016, 2015)

Top Award in the 10th “Challenge Cup” Hubei College Students’ Extracurricular Academic Science and Technology Works Contest (2015)

Gold Award in the Progressive Innovation Award of the 15th “Challenge Cup” National College Students’ Extracurricular Academic Science and Technology Works Contest (2013)

Last modified: Mar. 7, 2024
