Gym Skill

The AEA gym skill demonstrates how a custom Reinforcement Learning agent that uses OpenAI's `gym` library may be embedded into an AEA skill and connection.

Discussion

The gym skill demonstrates how to wrap a Reinforcement Learning agent in a skill. The example decouples the RL agent from the `gym.Env`, allowing them to run in separate execution environments, potentially owned by different entities.

Preparation Instructions

Dependencies

Follow the Preliminaries and Installation sections from the AEA quick start.

Download the necessary directories into your working directory:

mkdir gym_skill_agent
svn export https://github.com/fetchai/agents-aea.git/trunk/examples

Install the `gym` and `numpy` libraries.

pip install numpy gym

Demo Instructions

Create the AEA

First, fetch the gym AEA:

aea fetch fetchai/gym_aea:0.26.5 --alias my_gym_aea
cd my_gym_aea
aea install

??? note "Alternatively, create from scratch:"

### Create the AEA

In the root directory, create the gym AEA and enter the project.

``` bash
aea create my_gym_aea
cd my_gym_aea
```

### Add the gym skill

``` bash
aea add skill fetchai/gym:0.21.6
```

### Set gym connection as default

``` bash
aea config set agent.default_connection fetchai/gym:0.20.6
```

### Install the skill dependencies

To install the `gym` package, a dependency of the gym skill, from PyPI, run:

``` bash
aea install
```

Set up the Training Environment

Copy the Gym Environment to the AEA Directory

mkdir gyms
cp -a ../examples/gym_ex/gyms/. gyms/

Update the Connection Configuration

aea config set vendor.fetchai.connections.gym.config.env 'gyms.env.BanditNArmedRandom'
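
The value set above is a dotted path to an environment class. As a rough illustration of how such a path can be turned into a class in Python, here is a minimal sketch. It shows the general technique only and is an assumption, not a description of how the gym connection loads the environment; `load_class` is a hypothetical helper, and a standard-library path stands in for `gyms.env.BanditNArmedRandom` so the snippet runs anywhere.

```python
import importlib


def load_class(dotted_path: str) -> type:
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    # Import the module part, then look the class up by name.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Stand-in path from the standard library; in the demo the configured
# path is 'gyms.env.BanditNArmedRandom'.
ordered_dict_cls = load_class("collections.OrderedDict")
```

A path resolved this way only works if its top-level package can be imported from the working directory, which is presumably why the `gyms` directory is copied into the AEA directory first.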

Create and Add a Private Key

aea generate-key fetchai
aea add-key fetchai

Run the AEA with the Gym Connection

aea run

You will see the gym training logs.

Delete the AEA

When you're done, you can go up a level and delete the AEA.

cd ..
aea delete my_gym_aea

Communication

This diagram shows the communication between the AEA and the gym environment:

    sequenceDiagram
        participant AEA
        participant Environment

        activate AEA
        activate Environment
        AEA->>Environment: reset
        loop learn
            AEA->>Environment: act
            Environment->>AEA: percept
        end
        AEA->>Environment: close

        deactivate AEA
        deactivate Environment

Skill Architecture

The skill consists of two core components: `GymHandler` and `GymTask`.

In the `setup` method of the `GymHandler`, the `GymTask` is initialized and its `setup` and `execute` methods are called. The handler, which is registered against the `GymMessage.protocol_id`, then filters for messages of that protocol with the performative `GymMessage.Performative.PERCEPT`. These messages are passed to the `proxy_env_queue` of the task.
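
The filtering step can be sketched without the AEA framework. The classes below are simplified stand-ins, not the real `GymHandler` or `GymMessage`: `Performative`, `Message`, and `SketchHandler` are hypothetical names, and only the "forward `PERCEPT` messages to the task's queue" behaviour is modelled.

```python
import queue
from dataclasses import dataclass
from enum import Enum


class Performative(Enum):
    """Stand-in for GymMessage.Performative (illustrative subset)."""

    ACT = "act"
    PERCEPT = "percept"


@dataclass
class Message:
    """Stand-in for a GymMessage; the field names are illustrative."""

    performative: Performative
    payload: dict


class SketchHandler:
    """Forwards PERCEPT messages to the task's queue and ignores everything else."""

    def __init__(self, proxy_env_queue: "queue.Queue[Message]") -> None:
        self.proxy_env_queue = proxy_env_queue

    def handle(self, message: Message) -> None:
        if message.performative is Performative.PERCEPT:
            self.proxy_env_queue.put(message)


task_queue: "queue.Queue[Message]" = queue.Queue()
handler = SketchHandler(task_queue)
handler.handle(Message(Performative.ACT, {"action": 3}))        # dropped
handler.handle(Message(Performative.PERCEPT, {"reward": 1.0}))  # forwarded
```

Handing messages over through a queue lets the handler return immediately while the task consumes percepts at its own pace.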

The `GymTask` is responsible for training the RL agent. In particular, `MyRLAgent` is initialized and trained against `ProxyEnv`. The `ProxyEnv` is itself a `gym.Env` class and therefore implements its API. This means the proxy environment is compatible with any `gym`-compatible RL agent. However, unlike other environments, it only acts as a proxy and does not implement an environment of its own. It decouples the process environment of the `gym.Env` from the process environment of the RL agent. The actual `gym.Env` against which the agent is trained is wrapped by the `gym` connection. The proxy environment and the gym connection communicate via the `gym` protocol. Note that it would be straightforward to implement the `gym` environment in another AEA; this way one AEA could provide `gym` environments as a service. Naturally, the overhead of the extra layers increases latency when training the RL agent.
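
To make the proxy idea concrete, here is a minimal sketch that does not depend on `gym` or the AEA framework. `SketchProxyEnv` and `remote_environment` are hypothetical names, a pair of in-process queues stands in for the gym protocol, and a thread plays the role of the gym connection. The real `ProxyEnv` will differ in its details.

```python
import queue
import threading


class SketchProxyEnv:
    """Exposes reset/step/close but forwards every call over a pair of queues."""

    def __init__(self, outbox: queue.Queue, inbox: queue.Queue) -> None:
        self._outbox = outbox  # messages sent to the remote environment
        self._inbox = inbox    # percepts coming back from it

    def reset(self) -> None:
        self._outbox.put(("reset", None))

    def step(self, action):
        self._outbox.put(("act", action))
        # Block until the percept arrives; this wait is the added latency.
        return self._inbox.get(timeout=5)

    def close(self) -> None:
        self._outbox.put(("close", None))


def remote_environment(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Toy environment on the other side: reward 1.0 only for action 7."""
    while True:
        kind, action = inbox.get()
        if kind == "close":
            return
        if kind == "act":
            reward = 1.0 if action == 7 else 0.0
            outbox.put((None, reward, False, {}))  # (observation, reward, done, info)


to_env, from_env = queue.Queue(), queue.Queue()
worker = threading.Thread(target=remote_environment, args=(to_env, from_env))
worker.start()

env = SketchProxyEnv(outbox=to_env, inbox=from_env)
env.reset()
rewards = [env.step(action)[1] for action in (3, 7)]
env.close()
worker.join()
```

The agent only ever calls `reset`, `step`, and `close`, so it cannot tell whether the environment lives in the same process, in another thread, or behind another AEA.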

In this particular skill, which chiefly serves for demonstration purposes, we implement a very basic RL agent. The agent trains a model of the price of goods: it aims to discover the most likely price of each good. To this end, the agent randomly selects one of the goods on each training step and then chooses as its `action` the price it deems most likely to be accepted. Each good is represented by an id, and the possible price range `[1,100]` is divided into 100 integer bins. For each price bin, a `PriceBandit` is created which models the likelihood of this price. In particular, a price bandit maintains a beta distribution, which is initialized to the uniform distribution. Each time the price associated with a given `PriceBandit` is accepted or rejected, the distribution maintained by that `PriceBandit` is updated. Over time, the agent can therefore learn which price is most likely for each good.
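
The bandit update can be sketched in a few lines for a single good. This is an illustration under stated assumptions rather than the skill's own code: `SketchPriceBandit` is a hypothetical name, the acceptance rule is a toy stand-in for the environment, and picking the price with the highest sampled value (Thompson sampling) is one common selection rule assumed here.

```python
import random


class SketchPriceBandit:
    """Beta-distributed belief that a given price is accepted."""

    def __init__(self, price: int, beta_a: float = 1.0, beta_b: float = 1.0) -> None:
        self.price = price
        self.beta_a = beta_a  # 1 + number of acceptances
        self.beta_b = beta_b  # 1 + number of rejections

    @property
    def mean(self) -> float:
        """Expected acceptance probability under the current belief."""
        return self.beta_a / (self.beta_a + self.beta_b)

    def sample(self, rng: random.Random) -> float:
        """Draw one plausible acceptance probability from the belief."""
        return rng.betavariate(self.beta_a, self.beta_b)

    def update(self, accepted: bool) -> None:
        """Shift the belief towards acceptance or rejection."""
        if accepted:
            self.beta_a += 1
        else:
            self.beta_b += 1


rng = random.Random(0)
# One bandit per integer price bin in [1, 100]; Beta(1, 1) is the uniform prior.
bandits = {price: SketchPriceBandit(price) for price in range(1, 101)}

for _ in range(500):
    # Thompson sampling: sample every belief and offer the best-looking price.
    price = max(bandits, key=lambda p: bandits[p].sample(rng))
    # Toy acceptance rule (assumption): only prices up to 40 are accepted.
    bandits[price].update(accepted=(price <= 40))
```

Starting from `Beta(1, 1)` means every price is initially considered equally plausible; each acceptance or rejection then adds one pseudo-count to the corresponding parameter.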

The illustration shows how the RL agent only interacts with the proxy environment by sending it `action (A)` and receiving `observation (O)`, `reward (R)`, `done (D)` and `info (I)`.
