Jiahao Ma

Hello, I'm Jiahao Ma

I am a third-year PhD student in Computer Science at the Australian National University, where I am advised by Miaomiao Liu, David Ahmedt-Aristizabal, and Chuong Nguyen. Before that, I received my Master of Philosophy degree from the Australian National University in 2023, supervised by Liang Zheng. I am currently a research intern at X-Humanoid, working closely with Jony Zhang. My research interests include control and planning in robotics, reinforcement learning, visual perception, and 3D reconstruction.


Selected Publications

FlowOp: Morphology-Agnostic Animation-to-Robot Motion Retargeting via Sceneflow-Conditioned Diffusion

Zeyu Gao, Zeran Su, Peiran Liu, Zicheng Duan, Zelin Tao, Qiang Zhang, Jiahao Ma
arXiv, 2026

FlowOp uses sceneflow-conditioned diffusion to retarget arbitrary animation character motions to humanoid robots without body correspondence.

RobotPan: A 360° Surround-View Robotic Vision System for Embodied Perception

Jiahao Ma*, Qiang Zhang*, X-Humanoid Team
arXiv, 2026

RobotPan predicts metric-scaled compact 3D Gaussians from sparse surround-view inputs for real-time 360° rendering and reconstruction on humanoid robots.

Heracles: Bridging Precise Tracking and Generative Synthesis for General Humanoid Control

X-Humanoid Team
arXiv, 2026

Heracles is a state-conditioned diffusion middleware that bridges precise motion tracking and generative synthesis for general-purpose humanoid control.

MeshMimic: Geometry-Aware Humanoid Motion Learning through 3D Scene Reconstruction

Qiang Zhang*, Jiahao Ma*, Peiran Liu*, Shuai Shi*, X-Humanoid Team
arXiv, 2026

MeshMimic bridges 3D scene reconstruction and embodied intelligence to enable humanoid robots to learn coupled motion-terrain interactions directly from monocular video.


Uncertainty-aware 3D Edge Reconstruction with Difference of Gaussians

3DV, 2026

EdgeDoG uses DoG kernels and dual uncertainty to improve 3D edge reconstruction.

DCHM: Depth-Consistent Human Modeling for Multiview Detection

ICCV, 2025

DCHM leverages superpixel-wise Gaussian Splatting to generate consistent point clouds for label-free multiview detection.

HashPoint: Accelerated Point Searching and Sampling for Neural Rendering

CVPR, 2024 (Highlight)

HashPoint accelerates volume rendering by combining rasterization with ray tracing.

Multiview Detection with Cardboard Human Modeling

ACCV, 2024

Cardboard human modeling aggregates multiview pedestrian features.

Voxelized 3D Feature Aggregation for Multiview Detection

DICTA, 2024

VFA, a voxelized 3D feature aggregation method, improves multiview detection accuracy by reducing occlusion and projection errors.

* denotes equal contribution.

Projects

Multiview animal monitoring

Multiview detection algorithm design and synthetic dataset generation.

  • Publication: Accepted at DICTA 2024.
  • MultiviewC dataset: Proposed the MultiviewC dataset for multiview animal action recognition, 3D detection, and tracking.

Humanoid Occupancy

Enabling A Generalized Multimodal Occupancy Perception System on Humanoid.

  • Publication: IROS 2025 Workshop - Perception and Planning for Mobile Manipulation in Changing Environments.