fishyrl.buffers module
Buffer classes for storing and managing experiences.
- class fishyrl.buffers.Buffer
Bases: object
Class template for buffers.
- abstractmethod add(experience: dict[str, ndarray]) -> None
Add an experience to the buffer.
- Parameters:
experience (dict[str, np.ndarray]) – The experience to add, represented as a dictionary of numpy arrays.
- abstractmethod load_state_dict(state_dict: dict[str, Any]) -> None
Load the state of the buffer from a dictionary.
- Parameters:
state_dict (dict[str, Any]) – A dictionary containing the state of the buffer, usually obtained from
state_dict().
- abstractmethod reset() -> None
Reset/initialize the buffer.
- abstractmethod sample(batch_size: int) -> dict[str, ndarray]
Sample a batch of experiences from the buffer.
- Parameters:
batch_size (int) – The number of experiences to sample.
- Returns:
A batch of experiences, represented as a dictionary of numpy arrays.
- Return type:
dict[str, np.ndarray]
- abstractmethod state_dict() -> dict[str, Any]
Return a dictionary containing the state of the buffer for saving.
- Returns:
A dictionary containing the state of the buffer.
- Return type:
dict[str, Any]
- abstract property size: int
The current number of experiences stored in the buffer.
- Type:
int
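The abstract interface above can be illustrated with a small concrete subclass. The sketch below is self-contained, so it defines a local `Buffer` stand-in that mirrors the documented methods rather than importing `fishyrl.buffers.Buffer`. `ListBuffer` and its uniform sampling with replacement are hypothetical and are not part of fishyrl.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional

import numpy as np


class Buffer(ABC):
    """Local stand-in mirroring the documented interface (not the fishyrl class)."""

    @abstractmethod
    def add(self, experience: dict[str, np.ndarray]) -> None: ...

    @abstractmethod
    def sample(self, batch_size: int) -> dict[str, np.ndarray]: ...

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def state_dict(self) -> dict[str, Any]: ...

    @abstractmethod
    def load_state_dict(self, state_dict: dict[str, Any]) -> None: ...

    @property
    @abstractmethod
    def size(self) -> int: ...


class ListBuffer(Buffer):
    """Hypothetical unbounded buffer that samples uniformly with replacement."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = np.random.default_rng(seed)
        self.reset()

    def reset(self) -> None:
        self._items: list[dict[str, np.ndarray]] = []

    def add(self, experience: dict[str, np.ndarray]) -> None:
        self._items.append({k: np.asarray(v) for k, v in experience.items()})

    def sample(self, batch_size: int) -> dict[str, np.ndarray]:
        # Pick random stored experiences and stack them along a new batch axis.
        idx = self._rng.integers(0, len(self._items), size=batch_size)
        keys = self._items[0].keys()
        return {k: np.stack([self._items[i][k] for i in idx]) for k in keys}

    def state_dict(self) -> dict[str, Any]:
        return {"items": [dict(item) for item in self._items]}

    def load_state_dict(self, state_dict: dict[str, Any]) -> None:
        self._items = [dict(item) for item in state_dict["items"]]

    @property
    def size(self) -> int:
        return len(self._items)


buf = ListBuffer(seed=0)
for t in range(5):
    buf.add({"obs": np.full(3, t, dtype=np.float32), "reward": np.array(float(t))})
batch = buf.sample(batch_size=4)

# Round-trip the state into a fresh buffer, as state_dict()/load_state_dict() suggest.
restored = ListBuffer()
restored.load_state_dict(buf.state_dict())
```

Here `batch["obs"]` has one row per sampled experience, and `restored` ends up holding the same number of experiences as `buf`.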
- class fishyrl.buffers.IndependentVectorizedBuffer(num_buffers: int, *buffer_args: list, buffer_class: Buffer = <class 'fishyrl.buffers.SequentialBuffer'>, seed: int = None, **buffer_kwargs: dict)
Bases: Buffer
Class for group manipulation of independent buffers for vectorized environments.
- __init__(num_buffers: int, *buffer_args: list, buffer_class: Buffer = <class 'fishyrl.buffers.SequentialBuffer'>, seed: int = None, **buffer_kwargs: dict) -> None
Initialize the buffer group.
- Parameters:
num_buffers (int) – The number of buffers to create.
capacity (int) – The maximum number of experiences each buffer can store.
validate_keys (bool) – Whether to validate that all experiences have the same keys across buffers.
buffer_class (Buffer) – The class of the buffers to create.
seed (int) – The seed for the random number generator used for sampling.
buffer_args (list) – Positional arguments to pass to the buffer class.
buffer_kwargs (dict) – Keyword arguments to pass to the buffer class.
- add(experience: dict[str, ndarray]) -> None
Add experience to all buffers. Note that the buffers will desync if an error occurs while adding an experience to any one of them.
- Parameters:
experience (dict[str, np.ndarray]) – Vectorized experiences to add, represented as a dictionary of 2-dimensional numpy arrays.
- load_state_dict(state_dict: dict[str, Any]) -> None
Load the state of all buffers from a dictionary.
- Parameters:
state_dict (dict[str, Any]) – A dictionary containing the state of all buffers, usually obtained from
state_dict().
- reset() -> None
Reset/initialize all buffers.
- sample(batch_size: int, **sample_kwargs: dict[str, Any]) -> dict[str, ndarray]
Sample a batch of experiences from each buffer and concatenate them.
- Parameters:
batch_size (int) – The number of experiences to sample, divided randomly among buffers.
sample_kwargs (dict[str, Any]) – Additional keyword arguments to pass to each buffer’s sample method.
- Returns:
A batch of experiences, represented as a dictionary of numpy arrays.
- Return type:
dict[str, np.ndarray]
- state_dict() -> dict[str, Any]
Return a dictionary containing the state of all buffers for saving.
- Returns:
A dictionary containing the state of all buffers.
- Return type:
dict[str, Any]
- property size: int
The current number of experiences stored in all buffers.
- Type:
int
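The two vectorized operations described above, splitting a 2-dimensional experience across buffers and dividing `batch_size` randomly among them, might look roughly like the following. This is a sketch under stated assumptions, not the fishyrl implementation: it assumes axis 0 of each array indexes the environment, and it uses a uniform multinomial draw as one plausible way to divide the batch.

```python
import numpy as np

num_buffers = 4
rng = np.random.default_rng(0)

# One row per environment: shape (num_buffers, feature_dim).
# Treating axis 0 as the environment axis is an assumption of this sketch.
experience = {
    "obs": np.arange(num_buffers * 3, dtype=np.float32).reshape(num_buffers, 3),
    "reward": np.arange(num_buffers, dtype=np.float32).reshape(num_buffers, 1),
}

# Slice out the row that would go to each independent buffer.
per_buffer = [{k: v[i] for k, v in experience.items()} for i in range(num_buffers)]

# Divide a batch at random so the per-buffer counts always sum to batch_size.
# The uniform multinomial split is an assumed strategy, chosen for illustration.
batch_size = 32
counts = rng.multinomial(batch_size, np.full(num_buffers, 1.0 / num_buffers))
```

Each entry of `counts` would be the number of experiences requested from the corresponding buffer before the results are concatenated.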
- class fishyrl.buffers.SequentialBuffer(capacity: int, validate_keys: bool = True, seed: int = None)
Bases: Buffer
A buffer that stores experiences sequentially.
- __init__(capacity: int, validate_keys: bool = True, seed: int = None) -> None
Initialize the buffer.
- Parameters:
capacity (int) – The maximum number of experiences to store.
validate_keys (bool) – Whether to validate that all experiences have the same keys.
seed (int) – The seed for the random number generator used for sampling.
- add(experience: dict[str, ndarray]) -> None
Add an experience to the buffer.
- Parameters:
experience (dict[str, np.ndarray]) – The experience to add, represented as a dictionary of numpy arrays.
- load_state_dict(state_dict: dict[str, Any]) -> None
Load the state of the buffer from a dictionary.
- Parameters:
state_dict (dict[str, Any]) – A dictionary containing the state of the buffer, usually obtained from
state_dict().
- reset() -> None
Reset/initialize the buffer.
- sample(batch_size: int, sequence_length: int) -> dict[str, ndarray]
Sample a batch of experiences from the buffer.
- Parameters:
batch_size (int) – The number of experiences to sample.
sequence_length (int) – The length of each sampled experience.
- Returns:
A batch of experiences, represented as a dictionary of numpy arrays.
- Return type:
dict[str, np.ndarray]
- state_dict() -> dict[str, Any]
Return a dictionary containing the state of the buffer for saving.
- Returns:
A dictionary containing the state of the buffer.
- Return type:
dict[str, Any]
- property capacity: int
The maximum number of experiences the buffer can store.
- Type:
int
- property is_full: bool
Whether the buffer is full.
- Type:
bool
- property size: int
The current number of experiences stored in the buffer.
- Type:
int
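Sampling with a `sequence_length` can be pictured as drawing random start positions and gathering a contiguous window from each. The sketch below shows that idea on plain numpy storage. The helper `sample_sequences` is hypothetical, the `(batch_size, sequence_length, ...)` output layout is an assumption, and it ignores the wrap-around a full ring buffer would need; the actual `SequentialBuffer.sample` may differ on all three points.

```python
import numpy as np


def sample_sequences(storage, size, batch_size, sequence_length, rng):
    """Draw batch_size contiguous windows of sequence_length steps (hypothetical helper)."""
    if sequence_length > size:
        raise ValueError("not enough stored experiences for one sequence")
    # Valid start positions leave room for a full window inside [0, size).
    starts = rng.integers(0, size - sequence_length + 1, size=batch_size)
    # Broadcasting (batch_size, 1) + (sequence_length,) gives an index grid
    # of shape (batch_size, sequence_length).
    idx = starts[:, None] + np.arange(sequence_length)
    return {k: v[idx] for k, v in storage.items()}


rng = np.random.default_rng(0)
storage = {
    "obs": np.arange(20, dtype=np.float32).reshape(10, 2),  # 10 steps, 2 features
    "step": np.arange(10),
}
batch = sample_sequences(storage, size=10, batch_size=4, sequence_length=3, rng=rng)
```

With this layout, each row of `batch["step"]` holds three consecutive step indices, which makes it easy to check that sampled windows are contiguous.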
- fishyrl.buffers.convert_samples_to_tensors(samples: dict[str, ndarray], **tensor_kwargs: dict[str, Any]) -> dict[str, Tensor]
Convert sampled experiences from numpy arrays to tensors.
- Parameters:
samples (dict[str, np.ndarray]) – A batch of experiences, represented as a dictionary of numpy arrays.
tensor_kwargs (dict[str, Any]) – Keyword arguments to pass to the torch.tensor constructor, usually for specifying device.
- Returns:
A batch of experiences, represented as a dictionary of tensors.
- Return type:
dict[str, torch.Tensor]