fishyrl.buffers module

Buffer classes for storing and managing experiences.

class fishyrl.buffers.Buffer

Bases: object

Class template for buffers.

abstractmethod add(experience: dict[str, ndarray]) → None

Add an experience to the buffer.

Parameters:

experience (dict[str, np.ndarray]) – The experience to add, represented as a dictionary of numpy arrays.

abstractmethod load_state_dict(state_dict: dict[str, Any]) → None

Load the state of the buffer from a dictionary.

Parameters:

state_dict (dict[str, Any]) – A dictionary containing the state of the buffer, usually obtained from state_dict().

abstractmethod reset() → None

Reset/initialize the buffer.

abstractmethod sample(batch_size: int) → dict[str, ndarray]

Sample a batch of experiences from the buffer.

Parameters:

batch_size (int) – The number of experiences to sample.

Returns:

A batch of experiences, represented as a dictionary of numpy arrays.

Return type:

dict[str, np.ndarray]

abstractmethod state_dict() → dict[str, Any]

Return a dictionary containing the state of the buffer for saving.

Returns:

A dictionary containing the state of the buffer.

Return type:

dict[str, Any]

abstract property size: int

The current number of experiences stored in the buffer.

Type:

int
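To make the interface above concrete, the following is a minimal sketch of a class that implements every documented member. It does not import fishyrl: BufferInterface is a hypothetical stand-in that mirrors the documented abstract methods, and ListBuffer, its list-based storage, and the layout of its state dictionary are assumptions made for illustration only, not the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional

import numpy as np


class BufferInterface(ABC):
    """Hypothetical stand-in mirroring the documented Buffer interface."""

    @abstractmethod
    def add(self, experience: dict[str, np.ndarray]) -> None: ...

    @abstractmethod
    def sample(self, batch_size: int) -> dict[str, np.ndarray]: ...

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def state_dict(self) -> dict[str, Any]: ...

    @abstractmethod
    def load_state_dict(self, state_dict: dict[str, Any]) -> None: ...

    @property
    @abstractmethod
    def size(self) -> int: ...


class ListBuffer(BufferInterface):
    """Unbounded buffer backed by a Python list (illustration only)."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = np.random.default_rng(seed)
        self.reset()

    def add(self, experience):
        # Copy each array so later mutation by the caller cannot alter stored data.
        self._items.append({k: np.array(v) for k, v in experience.items()})

    def sample(self, batch_size):
        # Uniform sampling with replacement, stacked along a new leading batch axis.
        idx = self._rng.integers(0, len(self._items), size=batch_size)
        keys = self._items[0].keys()
        return {k: np.stack([self._items[i][k] for i in idx]) for k in keys}

    def reset(self):
        self._items = []

    def state_dict(self):
        return {"items": list(self._items)}

    def load_state_dict(self, state_dict):
        self._items = list(state_dict["items"])

    @property
    def size(self):
        return len(self._items)


buf = ListBuffer(seed=0)
for t in range(5):
    buf.add({"obs": np.full(3, t, dtype=np.float32), "reward": np.array([float(t)])})
batch = buf.sample(4)

# Round trip through state_dict() / load_state_dict().
restored = ListBuffer()
restored.load_state_dict(buf.state_dict())
```

In this sketch each sampled array gains a leading batch axis, so batch["obs"] has shape (4, 3). Whether the library's own buffers use the same layout is not stated in this reference.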

class fishyrl.buffers.IndependentVectorizedBuffer(num_buffers: int, *buffer_args: list, buffer_class: Buffer = <class 'fishyrl.buffers.SequentialBuffer'>, seed: int = None, **buffer_kwargs: dict)

Bases: Buffer

Class for group manipulation of independent buffers for vectorized environments.

__init__(num_buffers: int, *buffer_args: list, buffer_class: Buffer = <class 'fishyrl.buffers.SequentialBuffer'>, seed: int = None, **buffer_kwargs: dict) → None

Initialize the buffer group.

Parameters:
  • num_buffers (int) – The number of buffers to create.

  • capacity (int) – The maximum number of experiences each buffer can store.

  • validate_keys (bool) – Whether to validate that all experiences have the same keys across buffers.

  • buffer_class (Buffer) – The class of the buffers to create.

  • seed (int) – The seed for the random number generator used for sampling.

  • buffer_args (list) – Positional arguments to pass to the buffer class.

  • buffer_kwargs (dict) – Keyword arguments to pass to the buffer class.

add(experience: dict[str, ndarray]) → None

Add an experience to all buffers. Note that the buffers will fall out of sync if an error occurs while adding an experience to any one of them.

Parameters:

experience (dict[str, np.ndarray]) – Vectorized experiences to add, represented as a dictionary of 2-dimensional numpy arrays.
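As a rough illustration of what adding a vectorized experience involves, the sketch below splits a dictionary of 2-dimensional arrays into one dictionary per buffer. It assumes that the first axis indexes the buffer (one row per environment), which this reference does not state explicitly. The helper split_vectorized is hypothetical and is not part of fishyrl.

```python
import numpy as np


def split_vectorized(
    experience: dict[str, np.ndarray], num_buffers: int
) -> list[dict[str, np.ndarray]]:
    """Split a dict of (num_buffers, feature_dim) arrays into one dict per buffer."""
    for key, value in experience.items():
        # Assumption: axis 0 indexes the environment/buffer.
        if value.ndim != 2 or value.shape[0] != num_buffers:
            raise ValueError(
                f"{key!r} has shape {value.shape}, expected ({num_buffers}, feature_dim)"
            )
    # Row i of every array becomes the experience for buffer i.
    return [
        {key: value[i] for key, value in experience.items()}
        for i in range(num_buffers)
    ]


vectorized = {
    "obs": np.arange(12, dtype=np.float32).reshape(3, 4),
    "reward": np.array([[0.0], [1.0], [2.0]]),
}
per_buffer = split_vectorized(vectorized, num_buffers=3)
```

Checking every key before any buffer receives data is one possible way to reduce the risk of the buffers going out of sync when a malformed experience is passed in.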

load_state_dict(state_dict: dict[str, Any]) → None

Load the state of all buffers from a dictionary.

Parameters:

state_dict (dict[str, Any]) – A dictionary containing the state of all buffers, usually obtained from state_dict().

reset() → None

Reset/initialize all buffers.

sample(batch_size: int, **sample_kwargs: dict[str, Any]) → dict[str, ndarray]

Sample a batch of experiences from each buffer and concatenate them.

Parameters:
  • batch_size (int) – The number of experiences to sample, divided randomly among buffers.

  • sample_kwargs (dict[str, Any]) – Additional keyword arguments to pass to each buffer’s sample method.

Returns:

A batch of experiences, represented as a dictionary of numpy arrays.

Return type:

dict[str, np.ndarray]
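One way to picture "divided randomly among buffers" followed by concatenation is sketched below. The use of a uniform multinomial draw and of concatenation along the leading axis are both assumptions; the library may split and merge batches differently. The helpers split_batch and concat_batches are hypothetical names.

```python
import numpy as np


def split_batch(batch_size: int, num_buffers: int, rng: np.random.Generator) -> np.ndarray:
    """Randomly divide batch_size into num_buffers non-negative counts summing to batch_size."""
    return rng.multinomial(batch_size, np.full(num_buffers, 1.0 / num_buffers))


def concat_batches(batches: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Concatenate per-buffer batches along the leading (batch) axis."""
    return {key: np.concatenate([b[key] for b in batches], axis=0) for key in batches[0]}


rng = np.random.default_rng(0)
counts = split_batch(batch_size=32, num_buffers=4, rng=rng)

# Stand-in for calling each buffer's sample(n): zero-filled arrays of the right size.
batches = [
    {"obs": np.zeros((int(n), 5)), "reward": np.zeros((int(n), 1))}
    for n in counts
]
merged = concat_batches(batches)
```

With this scheme a buffer may occasionally be assigned zero samples, so each buffer's sample method would need to tolerate an empty request.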

state_dict() → dict[str, Any]

Return a dictionary containing the state of all buffers for saving.

Returns:

A dictionary containing the state of all buffers.

Return type:

dict[str, Any]

property size: int

The current number of experiences stored in all buffers.

Type:

int

class fishyrl.buffers.SequentialBuffer(capacity: int, validate_keys: bool = True, seed: int = None)

Bases: Buffer

A buffer that stores experiences sequentially.

__init__(capacity: int, validate_keys: bool = True, seed: int = None) → None

Initialize the buffer.

Parameters:
  • capacity (int) – The maximum number of experiences to store.

  • validate_keys (bool) – Whether to validate that all experiences have the same keys.

  • seed (int) – The seed for the random number generator used for sampling.

add(experience: dict[str, ndarray]) → None

Add an experience to the buffer.

Parameters:

experience (dict[str, np.ndarray]) – The experience to add, represented as a dictionary of numpy arrays.

load_state_dict(state_dict: dict[str, Any]) → None

Load the state of the buffer from a dictionary.

Parameters:

state_dict (dict[str, Any]) – A dictionary containing the state of the buffer, usually obtained from state_dict().

reset() → None

Reset/initialize the buffer.

sample(batch_size: int, sequence_length: int) → dict[str, ndarray]

Sample a batch of experiences from the buffer.

Parameters:
  • batch_size (int) – The number of experiences to sample.

  • sequence_length (int) – The length, in steps, of each sampled sequence of experiences.

Returns:

A batch of experiences, represented as a dictionary of numpy arrays.

Return type:

dict[str, np.ndarray]
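Sampling with a sequence length can be sketched as drawing random start indices and gathering windows of consecutive steps. The example below is an assumption-laden illustration rather than the library's method: sample_sequences is a hypothetical helper, the storage is a plain dictionary of preallocated arrays, wrap-around of a full buffer is ignored, and the (batch_size, sequence_length, ...) output layout is a guess.

```python
import numpy as np


def sample_sequences(storage, size, batch_size, sequence_length, rng):
    """Draw batch_size windows of sequence_length consecutive steps.

    storage maps each key to an array of shape (capacity, ...). Only the first
    `size` rows are treated as valid, oldest first (no wrap-around handling).
    """
    if sequence_length > size:
        raise ValueError("not enough stored experiences for one sequence")
    # Start indices are chosen so that every window stays inside the valid region.
    starts = rng.integers(0, size - sequence_length + 1, size=batch_size)
    # Broadcasting builds a (batch_size, sequence_length) grid of row indices.
    idx = starts[:, None] + np.arange(sequence_length)[None, :]
    return {key: value[idx] for key, value in storage.items()}


rng = np.random.default_rng(0)
storage = {
    "obs": np.arange(20, dtype=np.float32).reshape(10, 2),
    "reward": np.arange(10.0),
}
batch = sample_sequences(storage, size=10, batch_size=4, sequence_length=3, rng=rng)
```

Here batch["obs"] has shape (4, 3, 2) and each row of batch["reward"] holds three consecutive values. Some sequence models expect a time-first layout instead, in which case the first two axes would be swapped.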

state_dict() → dict[str, Any]

Return a dictionary containing the state of the buffer for saving.

Returns:

A dictionary containing the state of the buffer.

Return type:

dict[str, Any]

property capacity: int

The maximum number of experiences the buffer can store.

Type:

int

property is_full: bool

Whether the buffer is full.

Type:

bool

property size: int

The current number of experiences stored in the buffer.

Type:

int

fishyrl.buffers.convert_samples_to_tensors(samples: dict[str, ndarray], **tensor_kwargs: dict[str, Any]) → dict[str, Tensor]

Convert sampled experiences from numpy arrays to tensors.

Parameters:
  • samples (dict[str, np.ndarray]) – A batch of experiences, represented as a dictionary of numpy arrays.

  • tensor_kwargs (dict[str, Any]) – Keyword arguments to pass to the torch.tensor constructor, usually for specifying device.

Returns:

A batch of experiences, represented as a dictionary of tensors.

Return type:

dict[str, torch.Tensor]