
Package Summary

Tags: No category tags.
Version: 0.0.0
License: MIT
Build type: CATKIN
Use: RECOMMENDED

Repository Summary

Description: F1Tenth Simulation Code: Platooning, Computer Vision, Reinforcement Learning, Path Planning
Checkout URI: https://github.com/pmusau17/platooning-f1tenth.git
VCS Type: git
VCS Version: noetic-port
Last Updated: 2022-05-20
Dev Status: UNKNOWN
CI status: No Continuous Integration
Released: UNRELEASED
Tags: No category tags.
Contributing: Help Wanted (0), Good First Issues (0), Pull Requests to Review (0)

Package Description

The reinforcement learning package

Additional Links

No additional links.

Maintainers

  • Nathaniel Hamilton

Authors

No additional authors.

Reinforcement Learning

These methods are designed for training a vehicle follower, which we plan to expand to platooning. We may also extend them to train a single racer in the future.

Both methods require that a parameter file be loaded before running. These files are explained in more detail below; they contain all of the hyperparameters for the learning algorithm. Modifying them can produce different, possibly better, results. We use the nominal values discussed in the original papers, opting for consistent results instead of peak performance.
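As a rough illustration of how a training node can pick up values that rosparam load has placed on the parameter server, the snippet below reads a few common hyperparameters. The parameter names here are placeholders, not necessarily the keys defined in config_ddpg.yaml or ppo_config.yaml.

#!/usr/bin/env python
import rospy

# Placeholder parameter names for illustration only; the real keys live in
# the package's YAML files and may differ.
rospy.init_node('rl_hyperparameter_demo')
gamma = rospy.get_param('gamma', 0.99)                  # discount factor
learning_rate = rospy.get_param('learning_rate', 1e-3)  # optimizer step size
batch_size = rospy.get_param('batch_size', 64)          # replay/update batch size
rospy.loginfo("gamma=%s lr=%s batch_size=%s", gamma, learning_rate, batch_size)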

DDPG

Deep Deterministic Policy Gradient (DDPG) is explained in Lillicrap et al., 2015. The method uses model-free, off-policy learning to determine an optimal deterministic policy for the agent to follow. During training, the agent explores by executing noisy actions rather than following the learned policy exactly. These exploratory transitions are stored in a replay buffer, which is sampled after each step to learn an estimate of the state-action value (a.k.a. Q-value) function. This Q-function is used to train the policy to take actions that result in the largest Q-value (a minimal sketch of these updates follows the file list below). The folder containing this method, DDPG, is made up of the following parts:

  • config_ddpg.yaml: YAML file with all of the configuration information including the hyperparameters used for training.
  • ddpg.py: The main file for running this method. Every step of the DDPG algorithm is implemented in this file.
  • nn_ddpg.py: The neural network architecture described in Lillicrap et al., 2015 is implemented here using PyTorch. Forward passes and initialization are handled in this class as well.
  • noise_OU.py: Ornstein-Uhlenbeck process noise is implemented in this file. This noise is applied to create the random exploration actions. It is the same noise used in Lillicrap et al., 2015.
  • replay_buffer.py: Implements the replay buffer that stores the agent's transitions so they can be sampled for off-policy updates.
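The sketch below shows the ingredients described above: OU exploration noise, a critic regressed onto a bootstrapped Q-target computed from a replay-buffer sample, an actor trained to maximize the critic, and Polyak-averaged target networks. It is a minimal PyTorch illustration, not the code in nn_ddpg.py or ddpg.py; the network sizes, variable names, and hyperparameters are assumptions.

import copy
import numpy as np
import torch
import torch.nn as nn

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""
    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2):
        self.mu, self.theta, self.sigma = mu, theta, sigma
        self.state = np.ones(dim) * mu

    def sample(self):
        self.state += self.theta * (self.mu - self.state) + self.sigma * np.random.randn(*self.state.shape)
        return self.state

state_dim, action_dim = 8, 2   # illustrative sizes, not the real observation/action spaces
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
actor_target, critic_target = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma, tau = 0.99, 0.005

def ddpg_update(s, a, r, s2, done):
    """One DDPG step on a replay-buffer sample; r and done are (batch, 1) tensors."""
    # Critic: regress Q(s, a) toward r + gamma * Q_target(s', pi_target(s')).
    with torch.no_grad():
        q_next = critic_target(torch.cat([s2, actor_target(s2)], dim=1))
        q_target = r + gamma * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), q_target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()
    # Actor: maximize the critic's value of the actor's own actions.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
    # Polyak-average the target networks toward the learned networks.
    for tgt, src in ((actor_target, actor), (critic_target, critic)):
        for pt, p in zip(tgt.parameters(), src.parameters()):
            pt.data.mul_(1.0 - tau).add_(tau * p.data)

# During exploration, noise from the OU process perturbs the policy's action:
noise = OUNoise(action_dim)
exploratory_action = actor(torch.randn(1, state_dim)).detach().numpy()[0] + noise.sample()

# Exercise the update with a random dummy batch:
B = 32
ddpg_update(torch.randn(B, state_dim), torch.rand(B, action_dim) * 2 - 1,
            torch.randn(B, 1), torch.randn(B, state_dim), torch.zeros(B, 1))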

To run this method, launch the racing environment as usual with 2 cars, then, in a separate terminal, run:

$ rosparam load config_ddpg.yaml
$ rosrun rl ddpg.py

PPO

Proximal Policy Optimization (PPO) is explained in Schulman et al., 2017 as a simplified improvement to Trust Region Policy Optimization (TRPO). The method uses model-free, on-policy learning to determine an optimal stochastic policy for the agent to follow. This means that, during training, the agent executes actions randomly sampled from the output policy distribution. The policy is followed over the course of a horizon. After the horizon is completed, the Advantages are computed and used to determine the effectiveness of the policy. The Advantage values are used along with the probability of the action being taken to compute a loss function, which is then clipped to prevent large changes in the policy (a sketch of this clipped loss follows the file list below). The clipped loss is used to update the policy and improve future Advantage estimation. The folder containing this method, PPO, is made up of the following scripts:

  • class_nn.py: The neural network architecture described in Lillicrap et al., 2015 is implemented here using PyTorch. Forward passes and initialization are handled in this class as well. We are currently working on changing the architecture to match that used in Schulman et al., 2017.
  • ppo.py: The main file for running this method. Every step of the PPO algorithm is implemented in this file.
  • ppo_config.yaml: YAML file with all of the configuration information including the hyperparameters used for training.
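As a reference for the clipping step described above, the snippet below computes PPO's clipped surrogate loss from the probability ratio and the Advantage estimates. The tensor names and the clip range of 0.2 are illustrative, not values taken from ppo.py or ppo_config.yaml.

import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Clipped surrogate policy loss over one horizon of collected transitions."""
    ratio = torch.exp(log_probs_new - log_probs_old)   # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic minimum of the two terms, negated for gradient descent.
    return -torch.min(unclipped, clipped).mean()

# Exercise the loss with dummy data:
n = 16
loss = ppo_clip_loss(torch.randn(n), torch.randn(n), torch.randn(n))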

To run this method, launch the racing environment as usual with 2 cars, then, in a separate terminal, run:

$ rosparam load ppo_config.yaml
$ rosrun rl ppo.py

Reset the Environment

If the car crashes or you want to start the experiment again, simply run:

$ rosrun race reset_world.py


to restart the experiment.
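The reset script lives in the race package and is not shown here. As a hedged sketch, if the simulation backend is Gazebo, resetting the world typically amounts to calling Gazebo's standard /gazebo/reset_world service (type std_srvs/Empty), roughly as follows; the actual reset_world.py may do more, such as re-seeding the cars' poses.

#!/usr/bin/env python
import rospy
from std_srvs.srv import Empty

# Hypothetical illustration of a Gazebo world reset; not the package's script.
rospy.init_node('reset_world_demo')
rospy.wait_for_service('/gazebo/reset_world')
reset_world = rospy.ServiceProxy('/gazebo/reset_world', Empty)
reset_world()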

Running teleoperation nodes

To run a node to tele-operate the car via the keyboard, run the following in a new terminal:

$ rosrun race keyboard_gen.py racecar

‘racecar’ can be replaced with ‘racecar1’ or ‘racecar2’ if there are multiple cars.

Additionally, if using the f1_tenth_devel.launch file, simply type the following:

$ roslaunch race f1_tenth_devel.launch enable_keyboard:=true

CHANGELOG
No CHANGELOG found.

Wiki Tutorials

This package does not provide any links to tutorials in its rosindex metadata. You can check on the ROS Wiki Tutorials page for the package.

Dependent Packages

No known dependents.

Launch files

No launch files found.

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.
