Biography

I began my research career at Shanghai Jiao Tong University, earning a Bachelor of Engineering (B.E.) in 2015 and a Ph.D. in 2021. My doctoral research, supervised by Prof. Xiaokang Yang, focused on multimodal human motion understanding.

My research interests lie in leveraging cutting-edge AI to transform the digital content creation pipeline, particularly how artistic assets, character performances, interactive mechanics (e.g., gameplay design), and immersive environments (e.g., level design) are conceived and realized.

Currently, I am a researcher in Generative AI for 3D Animation, specializing in generating stylized, physically plausible human body and cloth motion.

My research focuses on:

  • Developing AI-assisted generative frameworks for artistic design
  • Advancing learning methodologies that integrate data-driven approaches with physical simulation
  • Building animation foundation models with multimodal data
  • Establishing human-AI co-creative pipelines with artist-steerable control

Publications

  • Skeleton2Mesh: Kinematics prior injected unsupervised human mesh recovery.
    Zhenbo Yu, Junjie Wang, Jingwei Xu, Bingbing Ni, Chenglong Zhao, Minsi Wang, Wenjun Zhang
    International Conference on Computer Vision (ICCV), 2021.

    [Paper] / [Project]

  • 3D human action representation learning via cross-view consistency pursuit.
    Linguo Li*, Minsi Wang*, Bingbing Ni, Hang Wang, Jiancheng Yang, Wenjun Zhang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

    [Paper] / [Code]

  • Learning multi-view interactional skeleton graph for action recognition.
    Minsi Wang, Bingbing Ni, Xiaokang Yang
    IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2020.

    [Paper] / [Code]

  • Loopy residual hashing: Filling the quantization gap for image retrieval.
    Jiale Bai, Zefan Li, Bingbing Ni, Minsi Wang, Xiaokang Yang, Chuanping Hu, Wen Gao
    IEEE Transactions on Multimedia (TMM), 2019.

    [Paper]

  • Multiple granularity group interaction prediction.
    Taiping Yao*, Minsi Wang*, Bingbing Ni*, Huawei Wei, Xiaokang Yang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

    [Paper]

  • Fine-grained video captioning for sports narrative.
    Huanyu Yu, Shuo Cheng, Bingbing Ni, Minsi Wang, Jian Zhang, Xiaokang Yang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

    [Paper]

  • Crowd counting via adversarial cross-scale consistency pursuit.
    Zan Shen, Yi Xu, Bingbing Ni, Minsi Wang, Jianguo Hu, Xiaokang Yang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

    [Paper]

  • Deep progressive hashing for image retrieval.
    Jiale Bai, Bingbing Ni, Minsi Wang, Yang Shen, Hanjiang Lai, Chongyang Zhang, Lin Mei, Chuanping Hu, Chen Yao
    ACM International Conference on Multimedia (ACM MM), 2017.

    [Paper]

  • Recurrent modeling of interaction context for collective activity recognition.
    Minsi Wang, Bingbing Ni, Xiaokang Yang
    IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.

    [Paper]

  • A parallel-fusion RNN-LSTM architecture for image caption generation.
    Minsi Wang, Li Song, Xiaokang Yang, Chuanfei Luo
    IEEE International Conference on Image Processing (ICIP), 2016.

    [Paper]