fishyrl.buffers module#

Buffer classes for storing and managing experiences.

class fishyrl.buffers.Buffer#

Bases: object

Class template for buffers.

abstractmethod add(experience: dict[str, ndarray]) → None#

Add an experience to the buffer.

Parameters:: experience (dict[str, np.ndarray]) – The experience to add, represented as a dictionary of numpy arrays.

abstractmethod load_state_dict(state_dict: dict[str, Any]) → None#

Load the state of the buffer from a dictionary.

Parameters:: state_dict (dict[str, Any]) – A dictionary containing the state of the buffer, usually obtained from state_dict().

abstractmethod reset() → None#

Reset/initialize the buffer.

abstractmethod sample(batch_size: int) → dict[str, ndarray]#

Sample a batch of experiences from the buffer.

Parameters:: batch_size (int) – The number of experiences to sample.
Returns:: A batch of experiences, represented as a dictionary of numpy arrays.
Return type:: dict[str, np.ndarray]

abstractmethod state_dict() → dict[str, Any]#

Return a dictionary containing the state of the buffer for saving.

Returns:: A dictionary containing the state of the buffer.
Return type:: dict[str, Any]

abstract property size: int#

The current number of experiences stored in the buffer.

Type:: int
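The interface above can be illustrated with a minimal sketch of a concrete subclass. So that the example runs without fishyrl installed, it defines a local stand-in `Buffer` that mirrors the documented abstract methods; `ListBuffer`, its unbounded list storage, and its uniform sampling with replacement are all hypothetical and are not part of the library.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional

import numpy as np

Experience = Dict[str, np.ndarray]


class Buffer(ABC):
    """Local stand-in mirroring the documented interface (not the fishyrl class)."""

    @abstractmethod
    def add(self, experience: Experience) -> None: ...

    @abstractmethod
    def load_state_dict(self, state_dict: Dict[str, Any]) -> None: ...

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def sample(self, batch_size: int) -> Experience: ...

    @abstractmethod
    def state_dict(self) -> Dict[str, Any]: ...

    @property
    @abstractmethod
    def size(self) -> int: ...


class ListBuffer(Buffer):
    """Hypothetical unbounded buffer that samples uniformly with replacement."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = np.random.default_rng(seed)
        self._items: List[Experience] = []

    def add(self, experience: Experience) -> None:
        self._items.append({k: np.asarray(v) for k, v in experience.items()})

    def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
        self._items = [dict(item) for item in state_dict["items"]]

    def reset(self) -> None:
        self._items = []

    def sample(self, batch_size: int) -> Experience:
        if not self._items:
            raise ValueError("cannot sample from an empty buffer")
        idx = self._rng.integers(0, len(self._items), size=batch_size)
        # Stack the chosen experiences along a new leading batch axis.
        return {k: np.stack([self._items[i][k] for i in idx]) for k in self._items[0]}

    def state_dict(self) -> Dict[str, Any]:
        return {"items": [dict(item) for item in self._items]}

    @property
    def size(self) -> int:
        return len(self._items)


buffer = ListBuffer(seed=0)
for step in range(5):
    buffer.add({"obs": np.full(3, step, dtype=np.float32), "reward": np.array(float(step))})
batch = buffer.sample(batch_size=4)

# Round-trip the contents through state_dict() / load_state_dict().
restored = ListBuffer()
restored.load_state_dict(buffer.state_dict())
```

The round trip at the end shows the intended pairing of `state_dict()` and `load_state_dict()`: whatever the former returns should be enough for the latter to rebuild the buffer contents.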

class fishyrl.buffers.IndependentVectorizedBuffer(num_buffers: int, *buffer_args: list, buffer_class: Buffer = <class 'fishyrl.buffers.SequentialBuffer'>, seed: int = None, **buffer_kwargs: dict)#

Bases: Buffer

Class for group manipulation of independent buffers for vectorized environments.

__init__(num_buffers: int, *buffer_args: list, buffer_class: Buffer = <class 'fishyrl.buffers.SequentialBuffer'>, seed: int = None, **buffer_kwargs: dict) → None#

Initialize the buffer group.

Parameters:

num_buffers (int) – The number of buffers to create.
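capacity (int) – The maximum number of experiences each buffer can store. This is not a named parameter in the signature above, so it would be supplied through buffer_args or buffer_kwargs.
validate_keys (bool) – Whether to validate that all experiences have the same keys across buffers. This is also not a named parameter in the signature above, so it would be supplied through buffer_args or buffer_kwargs.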
buffer_class (Buffer) – The class of the buffers to create.
seed (int) – The seed for the random number generator used for sampling.
buffer_args (list) – Positional arguments to pass to the buffer class.
buffer_kwargs (dict) – Keyword arguments to pass to the buffer class.

add(experience: dict[str, ndarray]) → None#

Add an experience to all buffers. Note that the buffers will desync if an error occurs while adding an experience to any one of them.

Parameters:: experience (dict[str, np.ndarray]) – Vectorized experiences to add, represented as a dictionary of 2-dimensional numpy arrays.
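As a rough sketch of what adding a vectorized experience might involve, the hypothetical helper below splits a dictionary of arrays into one dictionary per buffer. It assumes the leading axis indexes the buffers (environments), which the documentation does not state. Checking every key before anything is added is one way to reduce the chance of the partial add, and resulting desync, mentioned above.

```python
from typing import Dict, List

import numpy as np


def split_vectorized(
    experience: Dict[str, np.ndarray], num_buffers: int
) -> List[Dict[str, np.ndarray]]:
    """Split a vectorized experience into one experience per buffer.

    Assumption: axis 0 of every array has length num_buffers.
    """
    # Validate everything first so a bad key is caught before any buffer is modified.
    for key, value in experience.items():
        if value.ndim < 1 or value.shape[0] != num_buffers:
            raise ValueError(
                f"{key!r} has shape {value.shape}; expected a leading axis of {num_buffers}"
            )
    return [
        {key: value[i] for key, value in experience.items()}
        for i in range(num_buffers)
    ]


vectorized = {
    "obs": np.arange(12, dtype=np.float32).reshape(3, 4),
    "reward": np.array([[0.0], [1.0], [2.0]]),
}
per_buffer = split_vectorized(vectorized, num_buffers=3)
```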

load_state_dict(state_dict: dict[str, Any]) → None#

Load the state of all buffers from a dictionary.

Parameters:: state_dict (dict[str, Any]) – A dictionary containing the state of all buffers, usually obtained from state_dict().

reset() → None#

Reset/initialize all buffers.

sample(batch_size: int, **sample_kwargs: dict[str, Any]) → dict[str, ndarray]#

Sample a batch of experiences from each buffer and concatenate them.

Parameters:

batch_size (int) – The number of experiences to sample, divided randomly among buffers.
sample_kwargs (dict[str, Any]) – Additional keyword arguments to pass to each buffer’s sample method.

Returns:

A batch of experiences, represented as a dictionary of numpy arrays.

Return type:

dict[str, np.ndarray]
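One way to divide a batch size randomly among buffers and concatenate the results is sketched below. The multinomial split with equal probabilities and the concatenation along axis 0 are assumptions made for illustration; the helpers `split_batch_size` and `concat_batches` are hypothetical and the library may do this differently.

```python
from typing import Dict, List

import numpy as np


def split_batch_size(
    batch_size: int, num_buffers: int, rng: np.random.Generator
) -> np.ndarray:
    """Draw non-negative per-buffer counts that sum to batch_size."""
    return rng.multinomial(batch_size, np.full(num_buffers, 1.0 / num_buffers))


def concat_batches(batches: List[Dict[str, np.ndarray]]) -> Dict[str, np.ndarray]:
    """Concatenate per-buffer batches key by key along the batch axis."""
    return {
        key: np.concatenate([batch[key] for batch in batches], axis=0)
        for key in batches[0]
    }


rng = np.random.default_rng(0)
counts = split_batch_size(16, 4, rng)

# Stand-in for each buffer's sample(count): rows filled with the buffer index.
sub_batches = [
    {"obs": np.full((int(count), 2), index, dtype=np.float32)}
    for index, count in enumerate(counts)
]
batch = concat_batches(sub_batches)
```

With a random split, an individual buffer may be asked for zero experiences in a given call, so each buffer's sample method would need to tolerate that case under this scheme.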

state_dict() → dict[str, Any]#

Return a dictionary containing the state of all buffers for saving.

Returns:: A dictionary containing the state of all buffers.
Return type:: dict[str, Any]

property size: int#

The current number of experiences stored in all buffers.

Type:: int

class fishyrl.buffers.SequentialBuffer(capacity: int, validate_keys: bool = True, seed: int = None)#

Bases: Buffer

A buffer that stores experiences sequentially.

__init__(capacity: int, validate_keys: bool = True, seed: int = None) → None#

Initialize the buffer.

Parameters:

capacity (int) – The maximum number of experiences to store.
validate_keys (bool) – Whether to validate that all experiences have the same keys.
seed (int) – The seed for the random number generator used for sampling.

add(experience: dict[str, ndarray]) → None#

Add an experience to the buffer.

Parameters:: experience (dict[str, np.ndarray]) – The experience to add, represented as a dictionary of numpy arrays.

load_state_dict(state_dict: dict[str, Any]) → None#

Load the state of the buffer from a dictionary.

Parameters:: state_dict (dict[str, Any]) – A dictionary containing the state of the buffer, usually obtained from state_dict().

reset() → None#

Reset/initialize the buffer.

sample(batch_size: int, sequence_length: int) → dict[str, ndarray]#

Sample a batch of experiences from the buffer.

Parameters:

batch_size (int) – The number of sequences to sample.
sequence_length (int) – The number of consecutive experiences in each sampled sequence.

Returns:

A batch of experiences, represented as a dictionary of numpy arrays.

Return type:

dict[str, np.ndarray]
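The sketch below shows one way sequence sampling over sequentially stored experiences could work. It assumes that each sample is a run of consecutive steps, that start indices are drawn uniformly, and that the output is shaped (batch_size, sequence_length, ...); none of these details are confirmed by the documentation, and `sample_sequences` is a hypothetical helper.

```python
from typing import Dict

import numpy as np


def sample_sequences(
    storage: Dict[str, np.ndarray],
    size: int,
    batch_size: int,
    sequence_length: int,
    rng: np.random.Generator,
) -> Dict[str, np.ndarray]:
    """Sample batch_size windows of sequence_length consecutive steps."""
    if sequence_length > size:
        raise ValueError("sequence_length exceeds the number of stored experiences")
    # Valid start positions leave room for a full window before the end.
    starts = rng.integers(0, size - sequence_length + 1, size=batch_size)
    # Broadcasting builds a (batch_size, sequence_length) grid of step indices.
    index = starts[:, None] + np.arange(sequence_length)[None, :]
    return {key: value[index] for key, value in storage.items()}


storage = {
    "step": np.arange(10),
    "obs": np.arange(20, dtype=np.float32).reshape(10, 2),
}
sequences = sample_sequences(
    storage, size=10, batch_size=4, sequence_length=3, rng=np.random.default_rng(0)
)
```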

state_dict() → dict[str, Any]#

Return a dictionary containing the state of the buffer for saving.

Returns:: A dictionary containing the state of the buffer.
Return type:: dict[str, Any]

property capacity: int#

The maximum number of experiences the buffer can store.

Type:: int

property is_full: bool#

Whether the buffer is full.

Type:: bool

property size: int#

The current number of experiences stored in the buffer.

Type:: int
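To show how the capacity, is_full, and size properties might relate, here is a minimal sketch of a fixed-capacity store. It assumes that the oldest entry is overwritten once the store is full, which the documentation does not state; `RingStore` is a hypothetical class, not the library's implementation.

```python
from typing import Dict

import numpy as np


class RingStore:
    """Hypothetical fixed-capacity store that overwrites its oldest entry when full."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._data: Dict[str, np.ndarray] = {}
        self._next = 0  # position of the next write
        self._size = 0  # number of valid entries

    def add(self, experience: Dict[str, np.ndarray]) -> None:
        for key, value in experience.items():
            value = np.asarray(value)
            if key not in self._data:
                # Allocate storage lazily from the first experience's shape and dtype.
                self._data[key] = np.zeros(
                    (self._capacity, *value.shape), dtype=value.dtype
                )
            self._data[key][self._next] = value
        self._next = (self._next + 1) % self._capacity
        self._size = min(self._size + 1, self._capacity)

    @property
    def capacity(self) -> int:
        return self._capacity

    @property
    def size(self) -> int:
        return self._size

    @property
    def is_full(self) -> bool:
        return self._size == self._capacity


store = RingStore(capacity=3)
for step in range(5):
    store.add({"step": np.array(step)})
kept_steps = sorted(store._data["step"].tolist())
```

In this sketch, size grows until it reaches capacity and then stays there, at which point is_full is true and each new experience replaces the oldest one.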

fishyrl.buffers.convert_samples_to_tensors(samples: dict[str, ndarray], **tensor_kwargs: dict[str, Any]) → dict[str, Tensor]#

Convert sampled experiences from numpy arrays to tensors.

Parameters:

samples (dict[str, np.ndarray]) – A batch of experiences, represented as a dictionary of numpy arrays.
tensor_kwargs (dict[str, Any]) – Keyword arguments to pass to the torch.tensor constructor, usually for specifying device.

Returns:

A batch of experiences, represented as a dictionary of tensors.

Return type:

dict[str, torch.Tensor]
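A conversion of this kind can be sketched in a few lines. The hypothetical `to_tensors` below forwards its keyword arguments to `torch.tensor`, as the parameter description suggests; it is an illustration of the idea and may differ from the library's implementation.

```python
from typing import Any, Dict

import numpy as np
import torch


def to_tensors(
    samples: Dict[str, np.ndarray], **tensor_kwargs: Any
) -> Dict[str, torch.Tensor]:
    """Convert each numpy array in a sampled batch to a torch tensor."""
    # torch.tensor copies the data; dtype and device can be passed as keyword arguments.
    return {key: torch.tensor(value, **tensor_kwargs) for key, value in samples.items()}


samples = {
    "obs": np.zeros((2, 3), dtype=np.float64),
    "reward": np.array([1.0, 2.0]),
}
tensors = to_tensors(samples, dtype=torch.float32, device="cpu")
```

Passing `device="cuda"` instead would place the tensors on a GPU, provided one is available.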
