Chen Fuhao


Work

XPeng Motors

Guangzhou, China

Big Data Architect - platform team

Jun 2021 - Present
  • Built a real-time autonomous driving data processing platform serving domestic and international collection vehicles and user vehicles. The platform handles images, videos, LiDAR, and CAN bus data, processing over 60 TB per day on average with processing time under 12 hours. It covers ingestion, cleaning, filtering, and anonymization across multiple data sources, sensors, and scenarios, so downstream users can quickly access accurate and compliant data.
  • Data from collection vehicles serves as cold-start data for the perception team to build a foundational model. User vehicles, in turn, remotely trigger events when they hit corner cases or NGP (Navigation Guided Pilot) failures that require user intervention. Each trigger event captures the scene at that moment, including images, videos, LiDAR, CAN bus data, text logs, and DDS data from the on-vehicle model output, and uploads it to the cloud. There, the data processing platform injects the data into a simulation system so the perception team can reconstruct and understand the scenario. This process addresses the long-tail problem of perception models and contributes significantly to improving KPIs.
  • Built a cloud-based synthetic data generation pipeline on Unreal Engine (UE). The pipeline generates large volumes of datasets targeting specialized corner cases, complementing real-world data to improve model KPIs while reducing data collection costs.
  • Developed the company's first image and video anonymization system, deployed both in the cloud and on the vehicle side and applied to domestic and international data collection tasks. The system protects user privacy by ensuring sensitive data is properly handled (1 patent, company quality project award).
  • Aggregated data from multiple business systems via CDC (Change Data Capture), including data pipelines, map POIs (Points of Interest), annotation platforms, and full-stack testing management, with support for multiple data types. Analyzed and refined the aggregated data to provide users with actionable insights.
  • As Technical Lead, designed parts of the data flow architecture, improved data quality, and reduced costs. Responsible for project delivery, coordinating team resources, and minimizing technical debt.
  • Keywords: Autonomous Driving Big Data Processing

Grab

Singapore

Big Data Architect - traffic team

Jun 2019 - Jun 2021
  • Established a real-time geospatial traffic flow platform covering over 100 cities in Southeast Asia by integrating real-time location data from drivers and delivery riders. The platform provides real-time traffic conditions, optimal route planning, estimated time of arrival (ETA), and fare prediction services for Grab's core ride-hailing and food delivery businesses.
  • Rebuilt the real-time traffic computation system by modularizing and separating its components. Added the ability to quickly integrate third-party map data sources such as Google Maps, Here Maps, and SKT Maps. Implemented the full continuous delivery process on Kubernetes for efficient, streamlined deployment of updates and new features.
  • Decoupled offline data from real-time computation through source modularization, reducing system complexity and operational costs. Introduced real-time aggregation of vehicle GPS data and watermark calculations, improving ETA accuracy and billing fairness.
  • Keywords: Traffic Flow Big Data Processing

Tencent

Shenzhen, China

Big Data Engineer - platform team

May 2018 - Jun 2019
  • Introduced automated sharding and indexing with hierarchical splitting, improving log indexing speed by 30% on the game log retrieval platform and enabling faster flow of user data.
  • Designed a unified interface framework that supports routing, migration, hierarchical structuring, scaling, monitoring, and fault recovery of game data across different database components, reducing complexity for users.
  • Developed a management system for multiple database components, improving the efficiency of automated database management and operations.
  • Open-sourced a standalone, comprehensive Elasticsearch automated monitoring dashboard.
  • Keywords: Gaming Big Data Storage

JD.com

Beijing, China

Big Data Engineer - ads quality team

Jul 2015 - Apr 2018
  • Designed an intelligent advertising bidding platform that incorporates a Multi-Touch Attribution (MTA) model and confidence intervals to address sparse advertising data across different dimensions. The platform lets advertisers place advertisements easily and intelligently, yielding a 5% increase in Return on Investment (ROI) on both the platform's internal and external pages (1 patent, company gold project award).
  • Implemented a multi-dimensional valuation system that helps advertisers bid automatically and differentially across channels, allowing precise allocation of their advertising budgets.
  • Built a product retrieval system integrated with WeChat's social user base, reaching a monthly Gross Merchandise Volume (GMV) in the millions and attracting new users.
  • Keywords: Advertising Big Data Retrieval

Ancient history

Sun Yat-sen University

Guangzhou, China

Master of Engineering | Computer Vision | GPA 4.5/5.0

Sep 2012 - Jun 2015

Wuhan University of Technology

Wuhan, China

Bachelor of Engineering | Computer Vision | GPA 4.8/5.0

Sep 2008 - Jun 2012

JD.com

Beijing, China

Software Engineer Intern, ads quality team

May 2015 - Jun 2015

Meitu

Xiamen, China

Software Engineer Intern, machine learning team

Dec 2014 - Jan 2015

China Telecom Academy

Guangzhou, China

Software Engineer Intern, incubation team

Jul 2014 - Aug 2014

First-author publications

Scholar

Method for 3D gray-to-gray crosstalk uniformity measurement with high perceptual relevance

A straightforward method to assess motion blur for different types of displays

Precise evaluation of LCD gray-to-gray response time based on a reference pattern synchronous measurement using high speed charge-coupled device camera

Patents

Picture processing method and device

A kind of method and apparatus for assessing disaggregated model

A kind of fuzzy method of direct measurement display motion based on motion square width

A kind of liquid crystal display response time measuring method based on reference brightness