Force-Based Hindsight Experience Prioritization

Erdi Sayar (Technical University of Munich)
Zhenshan Bing (Technical University of Munich)
Carlo D'Eramo (University of Würzburg)
Ozgur S. Oguz (Bilkent University)
Alois Knoll (Technical University of Munich)
ICRA 2024
[Download Paper]
[GitHub Code]


DETAILS ARE COMING SOON

How to run benchmarks

Using docker containers

#!/bin/bash

# Clone the repository and enter the docker directory
git clone https://github.com/erdiphd/HER_force.git
cd HER_force/docker

# Train our method (contact-energy prioritization)
docker-compose run --rm -e mujoco_env=FetchPickAndPlace-v1 -e log_tag=log/t1_contact_energy_pick -e n_epochs=50 -e num_cpu=20 -e prioritization=contact_energy -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchPush-v1 -e log_tag=log/t1_contact_energy_push -e n_epochs=50 -e num_cpu=20 -e prioritization=contact_energy -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchSlide-v1 -e log_tag=log/t1_contact_energy_slide -e n_epochs=50 -e num_cpu=20 -e prioritization=contact_energy -e reward_type=sparse her_tactile

# Train the CPER baseline (note: reward_type=intrinsic)
docker-compose run --rm -e mujoco_env=FetchPickAndPlace-v1 -e log_tag=log/t1_cper_pick -e n_epochs=50 -e num_cpu=20 -e prioritization=cper -e reward_type=intrinsic her_tactile
docker-compose run --rm -e mujoco_env=FetchPush-v1 -e log_tag=log/t1_cper_push -e n_epochs=50 -e num_cpu=20 -e prioritization=cper -e reward_type=intrinsic her_tactile
docker-compose run --rm -e mujoco_env=FetchSlide-v1 -e log_tag=log/t1_cper_slide -e n_epochs=50 -e num_cpu=20 -e prioritization=cper -e reward_type=intrinsic her_tactile

# Train the MEP baseline (entropy prioritization)
docker-compose run --rm -e mujoco_env=FetchPickAndPlace-v1 -e log_tag=log/t1_mep_pick -e n_epochs=50 -e num_cpu=20 -e prioritization=entropy -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchPush-v1 -e log_tag=log/t1_mep_push -e n_epochs=50 -e num_cpu=20 -e prioritization=entropy -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchSlide-v1 -e log_tag=log/t1_mep_slide -e n_epochs=50 -e num_cpu=20 -e prioritization=entropy -e reward_type=sparse her_tactile

# Train the PER baseline (TD-error prioritization)
docker-compose run --rm -e mujoco_env=FetchPickAndPlace-v1 -e log_tag=log/t1_per_pick -e n_epochs=50 -e num_cpu=20 -e prioritization=tderror -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchPush-v1 -e log_tag=log/t1_per_push -e n_epochs=50 -e num_cpu=20 -e prioritization=tderror -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchSlide-v1 -e log_tag=log/t1_per_slide -e n_epochs=50 -e num_cpu=20 -e prioritization=tderror -e reward_type=sparse her_tactile

# Train plain HER (no prioritization)
docker-compose run --rm -e mujoco_env=FetchPickAndPlace-v1 -e log_tag=log/t1_her_pick -e n_epochs=50 -e num_cpu=20 -e prioritization=none -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchPush-v1 -e log_tag=log/t1_her_push -e n_epochs=50 -e num_cpu=20 -e prioritization=none -e reward_type=sparse her_tactile
docker-compose run --rm -e mujoco_env=FetchSlide-v1 -e log_tag=log/t1_her_slide -e n_epochs=50 -e num_cpu=20 -e prioritization=none -e reward_type=sparse her_tactile

# Train the EBP baseline (separate repository)
git clone https://github.com/ruizhaogit/EnergyBasedPrioritization
# Please use our Gym environment when running EBP
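
The docker commands above differ only in the environment, the log tag, the prioritization, and the reward type, so they can be generated in a loop. The following is a minimal sketch, not part of the repository: the function name `gen_commands` and its three arguments are hypothetical, while the option names and values are copied from the commands above. It only prints the commands (a dry run); pipe the output to `bash` from inside `HER_force/docker` to execute them.

```shell
#!/bin/bash
# Hypothetical helper (not in the repository): prints one docker-compose
# command per Fetch environment for a given prioritization method.
# Arguments: <prioritization> <reward_type> <log tag prefix>
gen_commands() {
  local prioritization="$1" reward_type="$2" tag="$3"
  local pair env short
  for pair in FetchPickAndPlace-v1:pick FetchPush-v1:push FetchSlide-v1:slide; do
    env="${pair%%:*}"     # part before the colon: Gym environment id
    short="${pair##*:}"   # part after the colon: suffix used in the log tag
    echo "docker-compose run --rm -e mujoco_env=${env} -e log_tag=log/t1_${tag}_${short} -e n_epochs=50 -e num_cpu=20 -e prioritization=${prioritization} -e reward_type=${reward_type} her_tactile"
  done
}

# Dry run: print the three commands for the contact-energy method.
gen_commands contact_energy sparse contact_energy
```

For example, `gen_commands cper intrinsic cper | bash` would launch the three CPER runs one after another.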

Creating virtual anaconda python environment

#!/bin/bash

# Adjust this path to your own conda installation
source /home/user/conda/bin/activate
conda create --name her python=3.7 -y
conda activate her
pip install numpy && pip install cffi lockfile imageio glfw tensorflow==1.14 cython && pip install mujoco_py && pip install beautifultable==0.7.0
pip install numpy-quaternion gym==0.15.4 click joblib mpi4py scipy protobuf==3.19 scikit-learn pyyaml pyquaternion
# Clone the repository
git clone https://github.com/erdiphd/HER_force.git
# Train the robot
cd HER_force/code/Algorithm
conda activate her
python baselines/her/experiment/train.py --env_name FetchPickAndPlace-v1 --logdir=log/t1_contact_energy --n_epochs=50 --num_cpu=20 --prioritization=contact_energy --reward_type sparse
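
To switch environments or prioritization methods without editing the long command, it can be built from shell variables. The sketch below is an illustration under assumptions: the function `build_train_cmd` is hypothetical, and the variable names simply mirror the `-e` options of the docker commands; whether the container maps them to these flags in the same way is not confirmed here. The flags and default values are taken from the command above. The function prints the command rather than running it.

```shell
#!/bin/bash
# Hypothetical wrapper (not in the repository): builds the train.py command
# from environment variables, falling back to the values of the example above.
build_train_cmd() {
  local env_name="${mujoco_env:-FetchPickAndPlace-v1}"
  local logdir="${log_tag:-log/t1_contact_energy}"
  local epochs="${n_epochs:-50}"
  local cpus="${num_cpu:-20}"
  local prio="${prioritization:-contact_energy}"
  local reward="${reward_type:-sparse}"
  echo "python baselines/her/experiment/train.py --env_name ${env_name} --logdir=${logdir} --n_epochs=${epochs} --num_cpu=${cpus} --prioritization=${prio} --reward_type ${reward}"
}

# Dry run with the defaults: prints the same command as shown above.
build_train_cmd
```

For example, `mujoco_env=FetchPush-v1 log_tag=log/t1_her_push prioritization=none build_train_cmd` prints the plain HER command for the push task; run it from `HER_force/code/Algorithm` with the `her` environment active.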


Source Code and Environment

[GitHub]