Staff Machine Learning Engineer - Foundation Model
Published: 2025-11-14
Job details
Santa Clara, United States
$215k - $364k
On-site
Full-time
XPENG is a leading smart technology company at the forefront of innovation, integrating advanced AI and autonomous driving technologies into its vehicles, including electric vehicles (EVs), electric vertical take-off and landing (eVTOL) aircraft, and robotics. With a strong focus on intelligent mobility, XPENG is dedicated to reshaping the future of transportation through cutting-edge R&D in AI, machine learning, and smart connectivity.
We are looking for a full-time Machine Learning Engineer / Research Scientist to drive the modeling and algorithmic development of XPENG’s next-generation Vision-Language-Action (VLA) Foundation Model — the core brain that powers our end-to-end autonomous driving systems.
You will work closely with world-class researchers, perception and planning engineers, and infrastructure experts to design, train, and deploy large-scale multi-modal models that unify vision, language, and control. Your work will directly shape the intelligence that enables XPENG’s future L3/L4 autonomous driving products.
Key Responsibilities
- Design and implement large-scale multi-modal architectures (e.g., vision–language–action transformers) for end-to-end autonomous driving.
- Develop pretraining and fine-tuning strategies leveraging massive labeled and unlabeled fleet data (images, video, LiDAR, CAN bus, maps, human driving behaviors, etc.).
- Research and integrate cross-modal alignment and policy-learning techniques (e.g., visual grounding, temporal reasoning, policy distillation, imitation learning, and reinforcement learning) to improve model interpretability and action quality.
- Collaborate with infrastructure engineers to scale training across thousands of GPUs using distributed training frameworks (FSDP, DDP, etc.).
- Conduct systematic ablation, evaluation, and visualization of model behavior across perception, reasoning, and planning tasks.
- Contribute to model deployment optimization, including quantization, export, and latency–accuracy trade-offs for onboard execution.
Minimum Qualifications
- Master’s degree or higher in Computer Science, Electrical/Computer Engineering, or a related field, with 3+ years of experience in deep learning research or productization.
- Strong proficiency in PyTorch and modern transformer-based model design.
- Experience in large-scale pretraining or multi-modal modeling (vision, language, or planning).
- Deep understanding of representation learning, temporal modeling, and self-supervised or reinforcement learning techniques.
- Familiarity with distributed training (DDP, FSDP) and large-batch optimization.
Preferred Qualifications
- PhD in CS/CE/EE or a related field, with 1+ years of relevant industry experience.
- Publication record in top-tier AI conferences (CVPR, ICCV, NeurIPS, ICLR, ICML, ECCV).
- Prior experience building foundation or end-to-end driving models, or LLM/VLM architectures (e.g., ViT, Flamingo, BEVFormer, RT-2, or GRPO-style policies).
- Familiarity with RLHF/DPO/GRPO, trajectory prediction, or policy learning for control tasks.
- Proven ability to collaborate cross-functionally with infra, perception, and planning teams to deliver production-ready models.
What We Offer
- A collaborative, research-driven environment with access to massive real-world data and industry-scale compute.
- An opportunity to work with top-tier researchers and engineers advancing the frontier of foundation models for autonomous driving.
- Direct impact on the next generation of intelligent mobility systems.
- An opportunity to make a significant impact on the transportation revolution by advancing autonomous driving.
- Competitive compensation package.
- Snacks, lunches, dinners, and fun activities.