National Science & Technology Development Agency - Thailand; National Electronics & Computer Technology Center (NECTEC)
Type
Article
Source Title
JOURNAL OF IMAGING
Year
2021
Volume
7
Open Access
Green Published, gold
Publisher
MDPI
DOI
10.3390/jimaging7120264
Format
PDF
Abstract
This paper presents an extended pedestrian attribute recognition network that uses skeleton data as a soft attention model to extract the local features corresponding to a specific attribute. This technique preserves valuable information surrounding the target area and handles variation in human posture. The attention masks are designed to focus on both partial and whole-body regions. An augmented layer inside the network performs data augmentation to reduce overfitting. The network was evaluated on two datasets (RAP and PETA) with various backbone networks (ResNet-50, Inception V3, and Inception-ResNet V2). The experimental results show that the network improves overall classification performance, raising mean accuracy by about 2-3% over the same backbone network, especially for local attributes and varied human postures.
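
As a rough illustration of the skeleton-based soft attention idea described in the abstract, the sketch below builds a soft mask from a few skeleton keypoints and uses it to pool a feature map into a local feature vector. It is a minimal sketch under stated assumptions, not the paper's implementation: the Gaussian mask shape, the function names `gaussian_attention_mask` and `attended_pool`, the `sigma` parameter, and the example joint coordinates are all hypothetical, and the abstract does not specify how the masks are actually constructed.

```python
import math


def gaussian_attention_mask(keypoints, height, width, sigma=2.0):
    """Build a soft mask with values in (0, 1] from (x, y) keypoints.

    Assumption: each keypoint contributes a 2D Gaussian bump and the mask
    is the pixel-wise maximum over all bumps.
    """
    mask = [[0.0] * width for _ in range(height)]
    for kx, ky in keypoints:
        for y in range(height):
            for x in range(width):
                d2 = (x - kx) ** 2 + (y - ky) ** 2
                bump = math.exp(-d2 / (2.0 * sigma ** 2))
                mask[y][x] = max(mask[y][x], bump)
    return mask


def attended_pool(feature_map, mask):
    """Mask-weighted average pooling of a (C, H, W) nested-list feature map.

    Returns one value per channel. Pixels near the keypoints dominate, but
    the surrounding context still contributes with a smaller weight.
    """
    total = sum(sum(row) for row in mask)
    pooled = []
    for channel in feature_map:
        weighted = sum(
            channel[y][x] * mask[y][x]
            for y in range(len(mask))
            for x in range(len(mask[0]))
        )
        pooled.append(weighted / (total + 1e-8))
    return pooled


# Hypothetical upper-body joints (head and two shoulders) on an 8x5 grid.
upper_body = [(2, 1), (1, 2), (3, 2)]
mask = gaussian_attention_mask(upper_body, height=8, width=5, sigma=1.0)

# Toy feature map: channel c is filled with the constant value c.
features = [[[float(c)] * 5 for _ in range(8)] for c in range(3)]
local_feature = attended_pool(features, mask)
```

Because the mask is soft rather than a hard crop, features just outside the keypoint region are down-weighted instead of discarded, which matches the abstract's stated goal of keeping information surrounding the target area. A whole-body mask could be approximated in the same way by passing all skeleton joints at once.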