Mobile Car Projects


Zhi Xing Car I - SLAM Navigation Robot

A comprehensive autonomous navigation robot system integrating SLAM mapping, path planning, voice control, and face recognition. Built on ROS framework with LiDAR-based localization and multi-sensor fusion, achieving robust autonomous navigation and intelligent interaction in indoor environments.


1. Vehicle Kinematics Analysis

This section summarizes the differential-drive vehicle kinematics and control models, including forward/inverse kinematics, relationships between wheel speeds and robot linear/angular velocities, and common trajectory tracking algorithms.

1.1 Odometry & Coordinate Transforms

In ROS-based robots the most common frames are map, odom and base_link. The map frame is a fixed, long-term global reference (Z axis up). Because localization may apply discrete corrections when new sensor data arrives, poses expressed in map can occasionally "jump"; this makes map suitable as a global reference but not ideal as a fast local control frame.

The odom frame is a locally-continuous reference produced by dead-reckoning sources (wheel encoders, visual odometry, IMU). odom guarantees smooth, continuous pose updates (no sudden jumps) and is useful as a short-term reference for sensors and controllers, but it accumulates drift and therefore cannot serve as a precise long-term global frame. Typically map and odom start aligned; localization modules compute the transform map → odom (via tf) to correct odometry drift.

Note the distinction between the odom coordinate frame and the /odom topic: the frame is the coordinate reference, whereas the /odom topic publishes nav_msgs/Odometry messages giving the robot pose (base_link) in the odom frame; the corresponding odom → base_link transform is broadcast separately via tf.

The base_link frame is the robot body frame (usually centered on the platform); the core localization task is estimating base_link's pose in the map frame. Each sensor has its own fixed sensor frame (for example laser or base_laser for a LiDAR), and the transform between that sensor frame and base_link is constant and should be published via tf.
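
As a concrete illustration, here is a minimal ROS 1 (rospy) sketch that broadcasts such a fixed sensor transform with tf2; the frame names follow the convention above, while the mounting offset values are assumptions for this example:

import rospy
import tf2_ros
from geometry_msgs.msg import TransformStamped

rospy.init_node('laser_tf_broadcaster')
broadcaster = tf2_ros.StaticTransformBroadcaster()

t = TransformStamped()
t.header.stamp = rospy.Time.now()
t.header.frame_id = 'base_link'     # parent: robot body frame
t.child_frame_id = 'laser'          # child: LiDAR sensor frame
t.transform.translation.x = 0.10    # assumed mounting offset, meters
t.transform.translation.y = 0.0
t.transform.translation.z = 0.15
t.transform.rotation.w = 1.0        # identity rotation (sensor not tilted)

broadcaster.sendTransform(t)
rospy.spin()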

Figure: coordinate frame tree (map → odom → base_link → laser). The map → odom transform is the localization correction published via tf; odom → base_link comes from odometry; the laser frame has a fixed transform to base_link. map may "jump" when localization updates, while odom is smooth but accumulates drift, so sensor fusion (IMU, AMCL) computes the map → odom correction.

1.2 Kinematic Transformation from Robot Velocity to Motor Speeds

The kinematic model of the differential wheel chassis is established as follows:
Figure: differential-drive chassis kinematic model

Below are the key points extracted from the image. Let the linear speeds of the two wheels be V_a (wheel A) and V_b (wheel B), and let the track width be 2L. When only one wheel is turning, the contact point of the stationary wheel is the instantaneous center of rotation (ICR), which yields special-case kinematic relations.

Figure: single-wheel rotation special cases (instantaneous center at the stationary wheel)

1) Only wheel A rotates (wheel B stationary):

$$ \begin{cases} V_{ox}=\frac{V_a}{2}\\ V_{oy}=0\\ \dot{\theta}=-\frac{V_a}{2L} \end{cases} $$

2) Only wheel B rotates (wheel A stationary):

$$ \begin{cases} V_{ox}=\frac{V_b}{2}\\ V_{oy}=0\\ \dot{\theta}=\frac{V_b}{2L} \end{cases} $$

Combining the two special cases gives the general form (superposition of wheel speeds):

$$ \begin{cases} V_{ox}=\tfrac{1}{2}(V_a+V_b)\\ V_{oy}=0\\ \dot{\theta}=\tfrac{1}{2L}(V_b-V_a) \end{cases} $$

Based on the forward kinematic formulation, the corresponding inverse kinematic equations can be directly obtained, establishing the mapping from the robot’s body-frame velocity to each wheel’s linear velocity:

$$ \begin{cases} V_a = V_{ox} - L\,\dot{\theta}\\ V_b = V_{ox} + L\,\dot{\theta} \end{cases} $$
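
As a quick sketch, this inverse mapping can be written as a small Python helper; the half track width L is an assumed placeholder value:

def body_to_wheel_speeds(v_ox, theta_dot, L=0.15):
    # Inverse kinematics: body-frame forward speed and yaw rate
    # to the linear speeds of wheels A and B (track width = 2L).
    v_a = v_ox - L * theta_dot
    v_b = v_ox + L * theta_dot
    return v_a, v_b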

1.3 Kinematic Transformation from Motor Speeds to Robot Displacement

Figure: body-to-global coordinate transform

The following summarizes the coordinate transform and velocity relationships illustrated in the diagram above. Let the robot body frame be (x',y') rotated by heading \(\theta\) relative to the global frame (x,y). The coordinate transform from body to global is:

$$ \begin{cases} x = x'\cos\theta - y'\sin\theta\\ y = x'\sin\theta + y'\cos\theta \end{cases} $$

For the two-wheel differential drive with wheel linear speeds \(V_a\) and \(V_b\) (as labeled in the figure), the velocity of the robot center expressed in the global frame is:

$$ \begin{cases} V_{ox}=\tfrac{1}{2}(V_a+V_b)\cos\theta\\ V_{oy}=\tfrac{1}{2}(V_a+V_b)\sin\theta\\ \dot{\theta}=\tfrac{1}{2L}(V_b-V_a) \end{cases} $$

This set of equations represents the transformation from the angular velocities of the two motors to the robot’s motion velocity in the global (world) coordinate frame. During odometry computation, the robot’s linear and angular velocities are integrated over each time interval, and the accumulated result yields the current displacement, which corresponds to the data published on the /odom topic. With this, the derivation of the differential-drive kinematic model is complete.
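
For illustration, here is a minimal dead-reckoning sketch of this integration in Python; the time step dt and half track width L are assumed placeholder values:

import math

def integrate_odometry(pose, v_a, v_b, L=0.15, dt=0.02):
    # Forward kinematics + Euler integration: wheel speeds -> new global pose.
    x, y, theta = pose
    v = 0.5 * (v_a + v_b)           # body-frame forward speed
    omega = (v_b - v_a) / (2 * L)   # yaw rate
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += omega * dt
    return x, y, theta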

2. SLAM Navigation Workflow

2.1 System Architecture & Hardware Setup

2.2 GMapping Algorithm Based on RBPF


GMapping is a widely used SLAM (Simultaneous Localization and Mapping) algorithm based on the Rao-Blackwellized Particle Filter (RBPF) framework.

It factorizes the full SLAM posterior into two parts: the robot trajectory estimation and the map estimation.

$$ p(x_{1:t}, m \mid z_{1:t}, u_{1:t-1}) = p(m \mid x_{1:t}, z_{1:t}) \, p(x_{1:t} \mid z_{1:t}, u_{1:t-1}) $$

where:

  • \(x_{1:t}\) is the robot trajectory up to time \(t\);
  • \(m\) is the map;
  • \(z_{1:t}\) are the sensor observations;
  • \(u_{1:t-1}\) are the control (odometry) inputs.

Since the map \(m\) can be determined given the trajectory \(x_{1:t}\), the SLAM problem reduces to estimating the trajectory posterior with a particle filter; each particle carries its own trajectory hypothesis and the map built from it.

1. Improved Proposal Distribution

In traditional FastSLAM 1.0, the proposal distribution for sampling new particle poses only depends on the motion model:

$$p(x_t\mid x_{t-1}, u_t)$$

This approach suffers from high uncertainty due to odometry drift and noise, often causing particle degeneracy — only a few particles have significant weights.

GMapping improves this by incorporating the current observation \(z_t\) into the proposal distribution:

$$p(x_t\mid x_{t-1}, u_t, z_t)$$

Key idea: combining motion and observation information corrects the motion-model prediction using the latest laser scan, reducing uncertainty of sampled particles.

Implementation steps:

  1. For each particle \(i\), predict a set of pose candidates using the motion model.
  2. Evaluate each candidate by computing the likelihood \(p(z_t\mid x_t^{(i)}, m_{t-1}^{(i)})\) through scan matching.
  3. Approximate this likelihood as a Gaussian distribution around the best-matched pose.
  4. Sample the new particle pose from this corrected Gaussian proposal.

The particle weight update becomes:

$$w_t^{(i)} \propto w_{t-1}^{(i)} \dfrac{p(z_t\mid x_t^{(i)}, m_{t-1}^{(i)})\, p(x_t^{(i)}\mid x_{t-1}^{(i)}, u_t)}{q(x_t^{(i)}\mid x_{t-1}^{(i)}, u_t, z_t)}$$
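
To make the steps concrete, here is a rough numpy sketch of one proposal step for a single particle; sample_motion_model and scan_match_likelihood are hypothetical helpers standing in for GMapping's odometry model and scan matcher:

import numpy as np

def sample_improved_proposal(x_prev, u, z, map_prev, K=30, rng=np.random.default_rng()):
    # 1. candidate poses (x, y, theta) from the motion model (hypothetical helper)
    candidates = np.array([sample_motion_model(x_prev, u) for _ in range(K)])
    # 2. observation likelihood of each candidate via scan matching (hypothetical helper)
    lik = np.array([scan_match_likelihood(z, x, map_prev) for x in candidates])
    w = lik / lik.sum()
    # 3. Gaussian approximation of the corrected proposal
    mu = np.average(candidates, axis=0, weights=w)
    d = candidates - mu
    cov = (w[:, None, None] * d[:, :, None] * d[:, None, :]).sum(axis=0)
    # 4. sample the new particle pose from the Gaussian
    return rng.multivariate_normal(mu, cov)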

Effect: the sampled particles concentrate in the high-likelihood region indicated by the laser scan, so the weight variance stays low, particle degeneracy is reduced, and far fewer particles are needed for an accurate map.

2. Selective Resampling

In a standard particle filter, resampling is performed at every iteration to prevent weight degeneracy. Frequent resampling can cause sample impoverishment, reducing particle diversity and potentially leading to local minima. GMapping uses selective resampling:

Only perform resampling when the effective sample size \(N_{\text{eff}}\) falls below a threshold.

The effective sample size is defined as:

$$N_{\text{eff}} = \dfrac{1}{\sum_{i=1}^{N} (w_t^{(i)})^2}$$

When \(N_{\text{eff}} < N_{\text{threshold}}\) (typically \(N/2\) or \(N/3\)), resampling is triggered.

Benefits: particle diversity is preserved, sample impoverishment is avoided, and the cost of unnecessary resampling iterations is saved.
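
A small self-contained numpy sketch of the \(N_{\text{eff}}\) test with systematic resampling (threshold \(N/2\), as above):

import numpy as np

rng = np.random.default_rng(0)

def neff(weights):
    # effective sample size of normalized weights
    return 1.0 / np.sum(weights ** 2)

def systematic_resample(particles, weights):
    # draw N new particles with probability proportional to the weights
    N = len(weights)
    positions = (rng.random() + np.arange(N)) / N
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx], np.full(N, 1.0 / N)

# toy example: 100 particles (x, y, theta) with random normalized weights
particles = rng.normal(size=(100, 3))
weights = rng.random(100)
weights /= weights.sum()

if neff(weights) < len(weights) / 2:   # selective resampling trigger
    particles, weights = systematic_resample(particles, weights)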

2.3 Navigation

Figure: move_base navigation framework

At the core of the navigation system lies the move_base node, which integrates inputs from odometry, robot pose estimation, and map data. This node performs both global planning and local planning: the global planner generates an overall path to the target, while the local planner continuously adjusts the trajectory in response to environmental changes, thus achieving autonomous obstacle avoidance during navigation.

1. Localization

Below is the standard MCL (Monte Carlo Localization) algorithm presented as pseudocode. Variables: \(\mathcal{X}_{t-1}\) is the particle set at time \(t-1\), \(u_t\) the control input, \(z_t\) the current observation, and \(m\) the map; the particle index is written \(k\) to avoid clashing with the map symbol.

function MCL(\(\mathcal{X}_{t-1}\), \(u_t\), \(z_t\), \(m\)):
    \(\bar{\mathcal{X}}_t := \varnothing\)
    for \(k = 1\) to \(M\) do
        \(x_t^{[k]} = \text{sample\_motion\_model}(u_t, x_{t-1}^{[k]})\)
        \(w_t^{[k]} = \text{measurement\_model}(z_t, x_t^{[k]}, m)\)
        add \(\langle x_t^{[k]}, w_t^{[k]} \rangle\) to \(\bar{\mathcal{X}}_t\)
    endfor
    for \(k = 1\) to \(M\) do
        draw index \(i\) with probability \(\propto w_t^{[i]}\)
        add \(x_t^{[i]}\) to \(\mathcal{X}_t\)
    endfor
    return \(\mathcal{X}_t\)

AMCL (Adaptive Monte Carlo Localization) is a particle-filter-based method for robot localization. Its objective is to estimate the robot pose over time from sensor measurements and control inputs. We denote the robot pose at time \(t\) by \(x_t\); the control (odometry) input applied between time \(t-1\) and \(t\) by \(u_t\); and the sensor observation at time \(t\) by \(z_t\). When referring to histories we use the shorthand sequences \(x_{1:t}\), \(u_{1:t}\) and \(z_{1:t}\). Using the Bayes filter formulation, the posterior can be written as:

$$ p(x_t\mid z_{1:t}, u_{1:t}) = \eta\; p(z_t\mid x_t)\; p(x_t\mid z_{1:t-1}, u_{1:t}) $$

Here the prediction (motion update) step is given by the Chapman–Kolmogorov equation:

$$ p(x_t\mid z_{1:t-1}, u_{1:t}) = \int p(x_t\mid x_{t-1}, u_t)\; p(x_{t-1}\mid z_{1:t-1}, u_{1:t-1})\, dx_{t-1}. $$

The measurement update then incorporates the current observation \(z_t\) via the likelihood \(p(z_t\mid x_t)\), and \(\eta\) is a normalizing constant.

AMCL approximates the posterior with a set of \(M\) weighted particles \(\{x_t^{(i)}, w_t^{(i)}\}_{i=1}^M\). Each iteration of the particle filter loop performs:

  1. Prediction: sample a new pose for each particle from the motion model \(x_t^{(i)} \sim p(x_t\mid x_{t-1}^{(i)}, u_t)\).
  2. Update: compute the importance weight from the observation likelihood, \(w_t^{(i)} \propto w_{t-1}^{(i)}\; p(z_t\mid x_t^{(i)})\), then normalize the weights.
  3. Resample: when particle degeneracy is detected, draw a new particle set according to the normalized weights.

A common metric to decide whether to resample is the Effective Sample Size (ESS):

$$ N_{\text{eff}} = \dfrac{1}{\sum_{i=1}^{M} (w_t^{(i)})^2} $$

When \(N_{\text{eff}}\) falls below a chosen threshold (for example \(M/2\)), resampling is performed. AMCL uses adaptive resampling strategies (and can vary the number of particles) based on ESS to balance computational cost and localization accuracy. This adaptive behavior helps maintain robustness in real-time robot localization.
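
As a toy illustration of the predict/update/resample loop (not the actual AMCL implementation), here is a self-contained numpy sketch using a 1-D pose and Gaussian motion and measurement models as stand-in assumptions:

import numpy as np

rng = np.random.default_rng(1)
M = 200                                # number of particles
particles = rng.normal(0.0, 1.0, M)    # toy 1-D poses
weights = np.full(M, 1.0 / M)

def amcl_step(particles, weights, u, z, motion_std=0.1, meas_std=0.2):
    # Prediction: propagate each particle through a noisy motion model
    particles = particles + u + rng.normal(0.0, motion_std, len(particles))
    # Update: weight by a Gaussian observation likelihood p(z | x)
    weights = weights * np.exp(-0.5 * ((z - particles) / meas_std) ** 2)
    weights /= weights.sum()
    # Resample only when the effective sample size drops below M/2
    if 1.0 / np.sum(weights ** 2) < len(particles) / 2:
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights

particles, weights = amcl_step(particles, weights, u=0.5, z=0.45)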

2. move_base

Global path planning is primarily used to generate an optimal route from the robot’s current position to the target within a known map, while local path planning focuses on real-time motion adjustment along the global path to ensure obstacle avoidance in partially unknown environments.

The move_base stack commonly uses costmaps to represent the environment and perform planning. In ROS, the costmap_2d package generates different costmap layers on top of the base map to support both local and global planners. Typical layers include:

  • Static layer: costs taken from the static map built by SLAM (occupied / free / unknown cells).
  • Obstacle layer: obstacles inserted and cleared in real time from sensor data such as laser scans and point clouds.
  • Inflation layer: cost propagated outward from obstacles so planned paths keep a safety margin related to the robot footprint.

3. Common Global and Local Path Planners

The navigation stack typically separates planning into global planners (produce a full path on a known map) and local planners (reactive, operate in velocity space to follow the global path while avoiding obstacles).

Global Planners

  • navfn / global_planner: search the global costmap with Dijkstra's algorithm (global_planner can also use A*) to produce a minimum-cost path from the current pose to the goal.
  • carrot_planner: a simple fallback that pulls the goal point toward the robot until it becomes reachable.

Local Planners

  • base_local_planner: Trajectory Rollout / DWA; samples velocity commands, forward-simulates the resulting trajectories, and scores them against the local costmap.
  • dwa_local_planner: the Dynamic Window Approach, sampling \((v, \omega)\) commands within the robot's dynamic window and picking the best-scoring one.
  • teb_local_planner: Timed Elastic Band optimization that locally deforms the trajectory for time-optimal, collision-free motion.

References

  1. G. Grisetti, C. Stachniss, and W. Burgard, "Improving grid-based SLAM with Rao-Blackwellized particle filters by adaptive proposals and selective resampling," in Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2005, pp. 2432–2437.
  2. Z. Wang, "GMapping and AMCL summary," CSDN Blog, 2024. https://blog.csdn.net/m0_55202222/article/details/131065091

Zhi Xing Car II

We designed and deployed a SLAM system on an NVIDIA Jetson, integrating a 2D LiDAR and a RealSense D435 camera for robust multi-sensor fusion. We implemented Google Cartographer for high-accuracy 2D localization and mapping, and extended the system to 3D ESDF-based mapping for precise environmental reconstruction and optimized path planning in real-world navigation tasks.

Figure: Zhi Xing Car II SLAM

Auto-tracking Car

We used OpenCV to rectify the camera-captured map and extract obstacle coordinates for path planning. Based on the map, we designed an efficient motion planning algorithm using graph search (Dijkstra, with an A* variant), enabling the car to avoid randomly positioned obstacles. We then implemented a PID controller for precise speed and steering control, ensuring smooth navigation.


1. Rectify the Camera-captured Map
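
The rectification code itself is not shown here; a minimal OpenCV sketch of the idea follows, where the detected corner coordinates and the input file name are assumptions for illustration. A perspective transform maps the four corners of the map in the camera image onto a 720x720 top-down view:

import cv2
import numpy as np

# Assumed corner pixels of the map in the raw camera frame (found e.g. by
# color/contour detection); these values are placeholders.
src = np.float32([[102, 88], [1180, 95], [1215, 690], [85, 702]])
dst = np.float32([[0, 0], [720, 0], [720, 720], [0, 720]])  # top-down 720x720 map

img = cv2.imread('frame.jpg')                 # raw camera frame (hypothetical file)
M = cv2.getPerspectiveTransform(src, dst)     # 3x3 homography
rectified = cv2.warpPerspective(img, M, (720, 720))
hsv = cv2.cvtColor(rectified, cv2.COLOR_BGR2HSV)  # HSV image sampled in the next step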

2. Extract Obstacle Coordinates

Sampling strategy (9x9 grid center sampling):

# Start at (40, 40) and step 80 px: the center of each 80x80 cell in the
# 720x720 rectified image, giving 9x9 = 81 tiles in row-major order.
tiles = []
for i in range(40, 720, 80):          # row (y) centers
    for j in range(40, 720, 80):      # column (x) centers
        tiles.append(hsv2color(hsv[i, j]))  # classify the HSV pixel at the cell center

Color to state mapping for each sampled grid center:

  • 0: black (obstacle)
  • 2: blue (treasure)
  • 3: green (exit)
  • 4: yellow (start)
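
The hsv2color helper used in the sampling loop is not listed in the original; a hypothetical sketch under OpenCV's HSV convention (H in [0, 180)) might look like the following, with all thresholds being rough assumptions that would need tuning for the actual lighting:

def hsv2color(pixel):
    # pixel is one HSV sample from the rectified map image
    h, s, v = int(pixel[0]), int(pixel[1]), int(pixel[2])
    if v < 60:
        return 0          # black -> obstacle
    if s > 80:            # saturated pixel: decide by hue
        if 100 <= h <= 130:
            return 2      # blue -> treasure
        if 40 <= h <= 80:
            return 3      # green -> exit
        if 20 <= h <= 35:
            return 4      # yellow -> start
    return -1             # unclassified / free; only 0 is penalized by the planner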

3. Path Planning

Figure: path finding

Graph construction logic:

# Build an undirected graph over the 81 tiles (nodes 0-80, row-major
# from top-left to bottom-right on the 9x9 grid).
# Weight strategy: an edge touching an obstacle tile costs 1000
# (heavily penalized but not strictly forbidden); all other moves cost 1.
# Note: nx.Graph is undirected and a repeated add_edge overwrites the
# weight, so the penalty checks both endpoints to keep edge costs symmetric.
import networkx as nx

def edge_weight(a, b):
    return 1000 if tiles[a] == 0 or tiles[b] == 0 else 1

G = nx.Graph()
for target in range(81):
    up, down = target - 9, target + 9
    left, right = target - 1, target + 1
    if up >= 0:
        G.add_edge(target, up, weight=edge_weight(target, up))
    if down < 81:
        G.add_edge(target, down, weight=edge_weight(target, down))
    # left/right neighbors must stay within the same row
    if left >= 0 and left // 9 == target // 9:
        G.add_edge(target, left, weight=edge_weight(target, left))
    if right < 81 and right // 9 == target // 9:
        G.add_edge(target, right, weight=edge_weight(target, right))

# Two-stage routing: start -> treasure, then treasure -> exit
find_treasure = nx.dijkstra_path(G, source=start, target=treasure)
find_exit = nx.dijkstra_path(G, source=treasure, target=destination)

Using A* (recommended for grid maps)

A* uses a heuristic to guide the search. For a 2D grid, the Manhattan distance is a simple admissible heuristic:

def manhattan(u, v):
    # convert node index to (row, col) on the 9x9 grid
    ux, uy = divmod(u, 9)
    vx, vy = divmod(v, 9)
    return abs(ux - vx) + abs(uy - vy)

# Using networkx's A* (assuming G is built as above and weights represent cost):
path_to_treasure = nx.astar_path(G, source=start, target=treasure, heuristic=manhattan, weight='weight')
path_to_exit = nx.astar_path(G, source=treasure, target=destination, heuristic=manhattan, weight='weight')
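
4. PID Control

The PID controller mentioned in the project overview is not detailed above; a minimal sketch of the idea follows, where the gains and update rate are assumed placeholders that must be tuned on the vehicle:

class PID:
    def __init__(self, kp, ki, kd, dt=0.02):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        # classic PID: proportional + integral + derivative terms
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: steer toward the next waypoint using the heading error (radians);
# the gain values here are illustrative assumptions.
steering_pid = PID(kp=1.2, ki=0.0, kd=0.1)
# steering_cmd = steering_pid.update(heading_error)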