This section covers research on sign language recognition, with a focus on continuous sign language translation (CSLT). To improve the recognition accuracy of isolated sign words, some early works design an adaptive hidden Markov model (HMM) framework, which explores the intrinsic properties of the hidden sign states and the complementary relationships among them. CSLT is harder, because it requires learning hybrid semantics across sequential variations of visual representations, sign linguistics, and textual grammar. To capture spatio-temporal transitions at different granularities, a hierarchical recurrent neural network (RNN) that combines visual content with word embeddings is adopted to encode and decode sign language features; in the encoding stage, key segments in the temporal stream are adaptively captured. RNNs are not the only option for sequential learning: convolutional neural networks (CNNs) can also encode temporal cues in continuous gestures. Combining CNNs and RNNs makes the network more robust, and running CNN-based and RNN-based temporal learning in parallel further improves the feature representation of the visual content. CSLT is a weakly supervised task, because gestures vary and the videos carry no word-level alignment annotation. Models based on a hybrid of RNNs and connectionist temporal classification (CTC) are proposed to align clip features with the words of the target text. In addition, a pseudo-supervised learning mechanism helps to solve the word alignment problem.
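The role CTC plays in this weakly supervised alignment can be illustrated with a minimal pure-Python sketch. It is not the model from any of the papers listed here. It only shows the generic CTC idea: every clip gets a distribution over words plus a blank symbol, and the probability of a word sequence is summed over all clip-to-word alignments that collapse to it. The blank index `0`, the function names `ctc_collapse` and `ctc_sequence_prob`, and the toy numbers are assumptions made for illustration.

```python
import math

BLANK = 0  # index of the CTC blank symbol (an assumption of this sketch)


def ctc_collapse(path):
    """Map a per-clip label path to a word sequence: merge repeats, then drop blanks."""
    out = []
    prev = None
    for label in path:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return out


def ctc_sequence_prob(probs, target):
    """Probability of `target`, summed over all valid clip-to-word alignments.

    probs:  list of T rows, one per video clip; each row is a probability
            distribution over the vocabulary, with index 0 as the blank.
    target: list of word indices (no blanks).
    """
    # Interleave blanks around the words: [blank, w1, blank, w2, ..., blank].
    ext = [BLANK]
    for word in target:
        ext += [word, BLANK]
    num_states = len(ext)

    # alpha[s]: probability of all partial alignments ending in state s.
    alpha = [0.0] * num_states
    alpha[0] = probs[0][ext[0]]
    if num_states > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new_alpha = [0.0] * num_states
        for s in range(num_states):
            total = alpha[s]                      # stay in the same state
            if s >= 1:
                total += alpha[s - 1]             # advance by one state
            # Skipping the blank is allowed only between two different words.
            if s >= 2 and ext[s] != BLANK and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new_alpha[s] = total * probs[t][ext[s]]
        alpha = new_alpha

    # A valid alignment ends on the last word or on the trailing blank.
    return alpha[-1] + (alpha[-2] if num_states > 1 else 0.0)


# Toy example: 4 clips, vocabulary {blank, word 1, word 2}.
clip_probs = [
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.3, 0.3, 0.4],
]
loss = -math.log(ctc_sequence_prob(clip_probs, [1, 2]))
print("CTC negative log-likelihood:", loss)
```

Minimizing this negative log-likelihood trains the clip-level predictions without ever specifying which clip corresponds to which word, which is why CTC suits data without word alignment annotation. In practice a framework implementation operating in log space would be used instead of this direct version, which can underflow on long videos.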

Graph-Based Multimodal Sequential Embedding for Sign Language Translation. Shengeng Tang, Dan Guo, Richang Hong, and Meng Wang. IEEE Transactions on Multimedia (TMM), 2021.

Review of Sign Language Recognition, Translation and Generation. Dan Guo, Shengeng Tang, Richang Hong, and Meng Wang. Computer Science, 2021.

Hierarchical Recurrent Deep Fusion Using Adaptive Clip Summarization for Sign Language Translation. Dan Guo, Wengang Zhou, Anyang Li, Houqiang Li, and Meng Wang. IEEE Transactions on Image Processing (TIP), 2020.

Parallel Temporal Encoder For Sign Language Translation. Peipei Song, Dan Guo, Haoran Xin, and Meng Wang. IEEE International Conference on Image Processing (ICIP), 2019.

Connectionist Temporal Modeling of Video and Language: A Joint Model for Translation and Sign Labeling. Dan Guo, Shengeng Tang, and Meng Wang. International Joint Conference on Artificial Intelligence (IJCAI), 2019.

Dense Temporal Convolution Network for Sign Language Translation. Dan Guo, Shuo Wang, Qi Tian, and Meng Wang. International Joint Conference on Artificial Intelligence (IJCAI), 2019.

Connectionist Temporal Fusion for Sign Language Translation. Shuo Wang, Dan Guo, Wengang Zhou, Zhengjun Zha, and Meng Wang. ACM International Conference on Multimedia (ACM MM), 2018.

Hierarchical LSTM for Sign Language Translation. Dan Guo, Wengang Zhou, Houqiang Li, and Meng Wang. AAAI Conference on Artificial Intelligence (AAAI), 2018.

Online Early-Late Fusion Based on Adaptive HMM for Sign Language Recognition. Dan Guo, Wengang Zhou, Houqiang Li, and Meng Wang. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), 2018.

Sign Language Recognition Based on Adaptive HMMs with Data Augmentation. Dan Guo, Wengang Zhou, Houqiang Li, and Meng Wang. IEEE International Conference on Image Processing (ICIP), 2016.