About me

I am a researcher at OpenRobotLab, Shanghai AI Laboratory, working on embodied AI. My research focuses on building a comprehensive 3D understanding of the world from ego-centric, multi-modal inputs, thereby enabling embodied planning and physical interaction. In recent years, we have made several fundamental contributions to general 3D perception from LiDAR point clouds, monocular images, and videos, together with an open-source codebase, MMDetection3D.

I work with Dr. Jiangmiao Pang and Prof. Dahua Lin. Our group is dedicated to building Embodied AGI systems and empowering academia and industry through open-source initiatives. If you are interested, please reach out to us about potential positions or collaborations.

I earned my Ph.D. degree from MMLab, The Chinese University of Hong Kong. Before that, I received my B.Eng. degree from Zhejiang University with the highest honors in 2019.

News

  • [2024/03] EmbodiedScan and GenNBV were accepted to CVPR 2024. The Challenge Server is online!
  • [2024/02] We will host the Multi-View 3D Visual Grounding track in the Autonomous Grand Challenge.
  • [2024/01] UniHSI was accepted to ICLR 2024 as a Spotlight.
  • [2023/12] We released EmbodiedScan, the first ego-centric, multi-modal 3D perception suite for holistic 3D scene understanding.
  • [2023/08] We released PointLLM, the first work empowering LLMs to understand point clouds with solid evaluation and benchmarks.

Education

 The Chinese University of Hong Kong (CUHK)
  August 2019 - July 2023
  Ph.D. in Information Engineering
 Zhejiang University (ZJU)
  August 2015 - July 2019
  Major: B.E. in Information Engineering
  Minor: Advanced Honor Class of Engineering Education (ACEE), Chu Kochen Honors College

Publications

Multi-Modal 3D Perception
 EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite
 Towards Embodied AI
  Tai Wang*, Xiaohan Mao*, Chenming Zhu*, et al.
  IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024
  [Project Page] [Paper] [Code] [中文解读]
 Object2Scene: Putting Objects in Context for Open-Vocabulary 3D Detection
  Chenming Zhu, Wenwei Zhang, Tai Wang, Xihui Liu, Kai Chen
  arXiv preprint
  [Paper] [Code] (Coming Soon)

Perception & Interaction with LLMs
 UniHSI: Unified Human-Scene Interaction via Prompted Chain-of-Contacts
  Zeqi Xiao, Tai Wang, Jingbo Wang, Jinkun Cao, Wenwei Zhang, Bo Dai, Dahua Lin, Jiangmiao Pang
  International Conference on Learning Representations (ICLR) 2024, Spotlight
  [Project Page] [Paper] [Code]
 PointLLM: Empowering Large Language Models to Understand Point Clouds
  Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
  arXiv preprint
  [Project Page] [Paper] [Code]

Active 3D Perception & Reconstruction
 GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction
  Xiao Chen, Quanyi Li, Tai Wang, Tianfan Xue, Jiangmiao Pang
  IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024
  [Project Page] [Paper] [Code]

Vision-Based 3D Perception
 DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera
 3D Object Detection and Tracking
  Qing Lian, Tai Wang, Jiangmiao Pang, Dahua Lin
  Conference on Robot Learning (CoRL) 2023
  [Paper] [Code]
 Vision-Centric BEV Perception: A Survey
  Yuexin Ma*, Tai Wang*, Xuyang Bai*, Huitong Yang, Yuenan Hou, Yaming Wang,
  Yu Qiao, Ruigang Yang, Dinesh Manocha, Xinge Zhu
  arXiv preprint
  [Paper] [Code]
 Scene as Occupancy
  Chonghao Sima*, Wenwen Tong*, Tai Wang, Li Chen, Silei Wu, Hanming Deng, Yi Gu, Lewei Lu,
  Ping Luo, Dahua Lin, Hongyang Li
  End-to-End Autonomous Driving, CVPR 2023 Workshop and Challenge
  IEEE/CVF International Conference on Computer Vision (ICCV) 2023
  [Paper] [Code]
 GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling
 for Multi-view 3D Understanding
  Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu, Hongsheng Li
  IEEE/CVF International Conference on Computer Vision (ICCV) 2023
  [Paper] [Code]
 MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection
  Renrui Zhang, Han Qiu, Tai Wang, Ziyu Guo, Ziteng Cui, Peng Gao, Yu Qiao, Hongsheng Li
  IEEE/CVF International Conference on Computer Vision (ICCV) 2023
  [Paper] [Code]
 Monocular 3D Object Detection with Depth from Motion
  Tai Wang, Jiangmiao Pang, Dahua Lin
  European Conference on Computer Vision (ECCV) 2022, Oral
  [Paper] [Code]
 MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection
 with Pretrained Monocular Backbones
  Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang
  Runner-up solution in the Waymo Camera-Only 3D detection challenge, CVPR 2022
  [Preliminary Tech Report] [Code]
 Probabilistic and Geometric Depth: Detecting Objects in Perspective
  Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
  Conference on Robot Learning (CoRL) 2021
  [Paper] [Code] [Poster]
 FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection
  Tai Wang, Xinge Zhu, Jiangmiao Pang, Dahua Lin
  ICCV Workshop on 3D Object Detection from Images (ICCVW) 2021, Best Paper Award
  1st place solution of vision-only methods in the nuScenes 3D detection challenge, NeurIPS 2020
  [Paper] [Code] [Slides] [Zhihu]
 SIDE: Center-Based Stereo 3D Detector with Structure-Aware
 Instance Depth Estimation
  Xidong Peng, Xinge Zhu, Tai Wang, Yuexin Ma
  IEEE Winter Conference on Applications of Computer Vision (WACV) 2022
  [Paper] [Code]

Voxel Representation Learning in LiDAR-Based Perception
 Position-Guided Point Cloud Panoptic Segmentation Transformer
  Zeqi Xiao*, Wenwei Zhang*, Tai Wang*, Chen Change Loy, Dahua Lin, Jiangmiao Pang
  International Journal of Computer Vision (IJCV) 2024
  [Paper] [Code]
 MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based
 Self-Supervised Pre-Training
  Runsen Xu, Tai Wang, Wenwei Zhang, Runjian Chen, Jinkun Cao, Jiangmiao Pang, Dahua Lin
  IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023
  [Paper] [Code]
 Cylindrical and Asymmetrical 3D Convolution Networks for
 LiDAR Segmentation
  Xinge Zhu*, Hui Zhou*, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin
  IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021, Oral
  IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 2021
  [Paper] [Code] [TPAMI version] [Bibtex]
 Reconfigurable Voxels: A New Representation for LiDAR-Based
 Point Clouds
  Tai Wang, Xinge Zhu, Dahua Lin
  Conference on Robot Learning (CoRL) 2020
  [Paper] [Spotlight Talk]
 SSN: Shape Signature Networks for Object Detection from
 Point Clouds
  Xinge Zhu, Yuexin Ma, Tai Wang, Yan Xu, Jianping Shi, Dahua Lin
  European Conference on Computer Vision (ECCV) 2020
  [Paper] [Code]

Efficient Annotation of LiDAR Point Clouds
 FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-based
 Point Clouds
  Tai Wang, Conghui He, Zhe Wang, Jianping Shi, Dahua Lin
  ACM Symposium on User Interface Software and Technology (UIST) 2020, Poster
  [Full Tech Report] [Poster] [Poster Summary] [Demo]

Other 3D Vision Research
 Density-aware Chamfer Distance as a Comprehensive Metric for
 Point Cloud Completion
  Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, Dahua Lin
  Advances in Neural Information Processing Systems (NeurIPS) 2021
  [Paper] [Code]

Research Projects

 MMDetection3D: The Next-Generation Platform for General 3D Detection
  A versatile, open-source 3D object detection toolbox based on PyTorch
  MMDetection3D Contributors
  May 2020 – Now
  [Code] [Doc] [Bibtex]
 Spherical Convolutional Networks for 3D Mesh Processing
  New approaches to generating 3D meshes from scratch with S2 parametrization & extended spherical CNNs
  Tai Wang, Weiwei Zhou and Zicheng Liao
  Under revision and further development
  Mar 2018 – Nov 2018

Selected Awards

Teaching

  • Computer Vision (Undergraduate Course), Winter 2018 @ ZJU
  • IERG2080: Introduction to Systems Programming, Fall 2020 @ CUHK
  • IERG2470B/ESTR2308: Probability Models and Applications (Elite Students), Spring 2021 @ CUHK

Miscellaneous

Academic Services
I have served as a reviewer for CVPR, ICCV, ECCV, CoRL, NeurIPS, ICLR, ICML, WACV, TPAMI, IJCV, and TVCG.

Hobbies
I love 🏀 basketball (I am a big fan of Stephen Curry and Tracy McGrady) and 🎵 music/🎤 singing, and I am good at 🖌️ Chinese calligraphy (learned from MA Liangchen and MA Shanshuang).