
Package Summary

Tags: No category tags.
Version: 0.0.0
License: MIT
Build type: CATKIN
Use: RECOMMENDED

Repository Summary

Description: F1Tenth Simulation Code: Platooning, Computer Vision, Reinforcement Learning, Path Planning
Checkout URI: https://github.com/pmusau17/platooning-f1tenth.git
VCS Type: git
VCS Version: noetic-port
Last Updated: 2022-05-20
Dev Status: UNKNOWN
CI status: No Continuous Integration
Released: UNRELEASED
Tags: No category tags.
Contributing: Help Wanted (0), Good First Issues (0), Pull Requests to Review (0)

Package Description

The reinforcement learning package

Additional Links

No additional links.

Maintainers

  • Nathaniel Hamilton

Authors

No additional authors.

Reinforcement Learning

These methods are designed for training a vehicle follower, which we plan to expand to platooning. We may also extend them to train a single racer in the future.

Both methods require that a parameter file be loaded before running. These files are explained in more detail below; they contain all of the hyperparameters for the learning algorithm. Modifying them can produce different, possibly better, results. We use the nominal values discussed in the original papers, opting for consistent results instead of peak performance.
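As a rough illustration of how a training node can pick up values that rosparam load has placed on the parameter server, the snippet below reads a few common hyperparameters. The parameter names here are placeholders, not necessarily the keys defined in config_ddpg.yaml or ppo_config.yaml.

#!/usr/bin/env python
import rospy

# Placeholder parameter names for illustration only; the real keys live in
# the package's YAML files and may differ.
rospy.init_node('rl_hyperparameter_demo')
gamma = rospy.get_param('gamma', 0.99)                  # discount factor
learning_rate = rospy.get_param('learning_rate', 1e-3)  # optimizer step size
batch_size = rospy.get_param('batch_size', 64)          # replay/update batch size
rospy.loginfo("gamma=%s lr=%s batch_size=%s", gamma, learning_rate, batch_size)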

DDPG

Deep Deterministic Policy Gradient (DDPG) is explained in Lillicrap et al., 2015. The method uses model-free, off-policy learning to determine an optimal deterministic policy for the agent to follow. During training, the agent explores by executing noisy actions rather than following the learned policy exactly. These exploratory transitions are stored in a replay buffer, which is sampled after each step to learn an estimate of the state-action value (a.k.a. Q-value) function. This Q-function is used to train the policy to take actions that result in the largest Q-value (a minimal sketch of these updates follows the file list below). The folder containing this method, DDPG, is made up of the following parts:

  • config_ddpg.yaml: YAML file with all of the configuration information including the hyperparameters used for training.
  • ddpg.py: The main file for running this method. Every step of the DDPG algorithm is implemented in this file.
  • nn_ddpg.py: The neural network architecture described in Lillicrap et al., 2015 is implemented here using PyTorch. Forward passes and initialization are handled in this class as well.
  • noise_OU.py: Ornstein-Uhlenbeck process noise is implemented in this file. This noise is applied to create the random exploration actions. It is the same noise used in Lillicrap et al., 2015.
  • replay_buffer.py: Implements the replay buffer that stores the agent's transitions so they can be sampled for off-policy updates.
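The sketch below shows the ingredients described above: OU exploration noise, a critic regressed onto a bootstrapped Q-target computed from a replay-buffer sample, an actor trained to maximize the critic, and Polyak-averaged target networks. It is a minimal PyTorch illustration, not the code in nn_ddpg.py or ddpg.py; the network sizes, variable names, and hyperparameters are assumptions.

import copy
import numpy as np
import torch
import torch.nn as nn

class OUNoise:
    """Ornstein-Uhlenbeck process: temporally correlated exploration noise."""
    def __init__(self, dim, mu=0.0, theta=0.15, sigma=0.2):
        self.mu, self.theta, self.sigma = mu, theta, sigma
        self.state = np.ones(dim) * mu

    def sample(self):
        self.state += self.theta * (self.mu - self.state) + self.sigma * np.random.randn(*self.state.shape)
        return self.state

state_dim, action_dim = 8, 2   # illustrative sizes, not the real observation/action spaces
actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
actor_target, critic_target = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
gamma, tau = 0.99, 0.005

def ddpg_update(s, a, r, s2, done):
    """One DDPG step on a replay-buffer sample; r and done are (batch, 1) tensors."""
    # Critic: regress Q(s, a) toward r + gamma * Q_target(s', pi_target(s')).
    with torch.no_grad():
        q_next = critic_target(torch.cat([s2, actor_target(s2)], dim=1))
        q_target = r + gamma * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), q_target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()
    # Actor: maximize the critic's value of the actor's own actions.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
    # Polyak-average the target networks toward the learned networks.
    for tgt, src in ((actor_target, actor), (critic_target, critic)):
        for pt, p in zip(tgt.parameters(), src.parameters()):
            pt.data.mul_(1.0 - tau).add_(tau * p.data)

# During exploration, noise from the OU process perturbs the policy's action:
noise = OUNoise(action_dim)
exploratory_action = actor(torch.randn(1, state_dim)).detach().numpy()[0] + noise.sample()

# Exercise the update with a random dummy batch:
B = 32
ddpg_update(torch.randn(B, state_dim), torch.rand(B, action_dim) * 2 - 1,
            torch.randn(B, 1), torch.randn(B, state_dim), torch.zeros(B, 1))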

To run this method, launch the racing environment as usual with 2 cars, then, in a separate terminal, run:

$ rosparam load config_ddpg.yaml
$ rosrun rl ddpg.py

PPO

Proximal Policy Optimization (PPO) is explained in Schulman et al., 2017 as a simplified improvement to Trust Region Policy Optimization (TRPO). The method uses model-free, on-policy learning to determine an optimal stochastic policy for the agent to follow. This means that, during training, the agent executes actions randomly sampled from the output policy distribution. The policy is followed over the course of a horizon. After the horizon is completed, the Advantages are computed and used to determine the effectiveness of the policy. The Advantage values are used along with the probability of the action being taken to compute a loss function, which is then clipped to prevent large changes in the policy (a sketch of this clipped loss follows the file list below). The clipped loss is used to update the policy and improve future Advantage estimation. The folder containing this method, PPO, is made up of the following scripts:

  • class_nn.py: The neural network architecture described in Lillicrap et al., 2015 is implemented here using PyTorch. Forward passes and initialization are handled in this class as well. We are currently working on changing the architecture to match that used in Schulman et al., 2017.
  • ppo.py: The main file for running this method. Every step of the PPO algorithm is implemented in this file.
  • ppo_config.yaml: YAML file with all of the configuration information including the hyperparameters used for training.
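As a reference for the clipping step described above, the snippet below computes PPO's clipped surrogate loss from the probability ratio and the Advantage estimates. The tensor names and the clip range of 0.2 are illustrative, not values taken from ppo.py or ppo_config.yaml.

import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    """Clipped surrogate policy loss over one horizon of collected transitions."""
    ratio = torch.exp(log_probs_new - log_probs_old)   # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic minimum of the two terms, negated for gradient descent.
    return -torch.min(unclipped, clipped).mean()

# Exercise the loss with dummy data:
n = 16
loss = ppo_clip_loss(torch.randn(n), torch.randn(n), torch.randn(n))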

To run this method, launch the racing environment as usual with 2 cars, then, in a separate terminal, run:

$ rosparam load ppo_config.yaml
$ rosrun rl ppo.py

Reset the Environment

If the car crashes or you want to start the experiment again, simply run:

$ rosrun race reset_world.py


to restart the experiment.
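The reset script lives in the race package and is not shown here. As a hedged sketch, if the simulation backend is Gazebo, resetting the world typically amounts to calling Gazebo's standard /gazebo/reset_world service (type std_srvs/Empty), roughly as follows; the actual reset_world.py may do more, such as re-seeding the cars' poses.

#!/usr/bin/env python
import rospy
from std_srvs.srv import Empty

# Hypothetical illustration of a Gazebo world reset; not the package's script.
rospy.init_node('reset_world_demo')
rospy.wait_for_service('/gazebo/reset_world')
reset_world = rospy.ServiceProxy('/gazebo/reset_world', Empty)
reset_world()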

Running teleoperation nodes

To run a node to tele-operate the car via the keyboard, run the following in a new terminal:

$ rosrun race keyboard_gen.py racecar

‘racecar’ can be replaced with ‘racecar1’ or ‘racecar2’ if there are multiple cars.

Additionally, if using the f1_tenth_devel.launch file, simply type the following:

$ roslaunch race f1_tenth_devel.launch enable_keyboard:=true

CHANGELOG
No CHANGELOG found.

Wiki Tutorials

This package does not provide any links to tutorials in its rosindex metadata. You can check on the ROS Wiki Tutorials page for the package.

Dependent Packages

No known dependents.

Launch files

No launch files found.

Messages

No message files found.

Services

No service files found.

Plugins

No plugins found.
