Dataset

class PinaDataset(conditions_dict, max_conditions_lengths, automatic_batching)

Bases: Dataset, ABC

Abstract class for the PINA dataset which extends the PyTorch Dataset class. It defines the common interface for PinaTensorDataset and PinaGraphDataset classes.

Initialize the instance by storing the conditions dictionary, the maximum number of items to consider per condition, and the automatic batching flag.

Parameters:
  • conditions_dict (dict) – A dictionary mapping condition names to their respective data. Each key represents a condition name, and the corresponding value is a dictionary containing the associated data.

  • max_conditions_lengths (dict) – Maximum number of data points that can be included in a single batch per condition.

  • automatic_batching (bool) – Indicates whether PyTorch automatic batching is enabled in PinaDataModule.
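To make the shape of these arguments concrete, here is a minimal sketch that uses plain Python lists in place of tensors. The condition names ("boundary", "interior") and the inner keys ("input", "target") are assumptions chosen for illustration; the source only states that each value is a dictionary containing the associated data.

```python
# Hypothetical constructor arguments, with plain lists standing in for tensors.
# The condition names and the inner keys "input"/"target" are assumptions.
conditions_dict = {
    "boundary": {"input": [[0.0], [1.0]], "target": [[0.0], [0.0]]},
    "interior": {"input": [[0.25], [0.5], [0.75]]},
}

# One cap per condition name: the maximum number of points per batch.
max_conditions_lengths = {"boundary": 2, "interior": 3}

# Whether PyTorch automatic batching is enabled in the data module.
automatic_batching = False

# Both dictionaries are keyed by the same condition names.
same_keys = set(conditions_dict) == set(max_conditions_lengths)
print(same_keys)
```

Keeping the caps in a separate dictionary keyed by condition name lets each condition contribute a different number of points to a batch.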

get_all_data()

Return all data in the dataset.

Returns:

A dictionary containing all the data in the dataset.

Return type:

dict

fetch_from_idx_list(idx)

Return data from the dataset given a list of indices.

Parameters:

idx (list[int]) – List of indices.

Returns:

A dictionary containing the data at the given indices.

Return type:

dict
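The index-based lookup can be pictured with the following sketch, which is not the library's implementation. It assumes that the index list is truncated to each condition's cap from max_conditions_lengths and that every stored field is gathered at the remaining positions. The function name and the sample data are hypothetical.

```python
def fetch_from_idx_list_sketch(conditions, max_lengths, idx):
    """Gather the items at positions `idx` for every condition.

    Assumption: at most `max_lengths[name]` indices are used per condition.
    """
    out = {}
    for name, data in conditions.items():
        kept = idx[: max_lengths[name]]  # respect the per-condition cap
        out[name] = {
            field: [rows[i] for i in kept] for field, rows in data.items()
        }
    return out


# Hypothetical data: plain lists stand in for tensors.
conditions = {
    "a": {"input": [10, 11, 12, 13]},
    "b": {"input": [20, 21]},
}
max_lengths = {"a": 3, "b": 2}

batch = fetch_from_idx_list_sketch(conditions, max_lengths, [1, 0])
print(batch)
```

The result keeps the outer dictionary keyed by condition name, matching the documented dict return type.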

class PinaDatasetFactory(conditions_dict, **kwargs)

Bases: object

Factory class for the PINA dataset.

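Depending on the data type inside the conditions, it instantiates an object belonging to the appropriate subclass of PinaDataset. The possible subclasses are:

  • PinaTensorDataset, for torch.Tensor and LabelTensor data.

  • PinaGraphDataset, for Data and Graph data.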

Instantiate the appropriate subclass of PinaDataset.

If a graph is present in the conditions, a PinaGraphDataset is returned; otherwise, a PinaTensorDataset is returned.

Parameters:

conditions_dict (dict) – Dictionary containing all the conditions to be included in the dataset instance.

Returns:

An instance of a subclass of PinaDataset.

Return type:

PinaTensorDataset | PinaGraphDataset

Raises:

ValueError – If an empty dictionary is provided.
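The dispatch described above can be sketched as follows. This is a sketch of the factory pattern under stated assumptions, not PINA's code: FakeGraph, TensorDatasetSketch, GraphDatasetSketch, and DatasetFactorySketch are hypothetical stand-ins, and the graph check (any value that is a FakeGraph instance) is an assumed detection rule.

```python
class FakeGraph:
    """Hypothetical stand-in for a graph object."""


class TensorDatasetSketch:
    def __init__(self, conditions_dict, **kwargs):
        self.conditions_dict = conditions_dict
        self.options = kwargs


class GraphDatasetSketch:
    def __init__(self, conditions_dict, **kwargs):
        self.conditions_dict = conditions_dict
        self.options = kwargs


class DatasetFactorySketch:
    """Returns an instance of another class from __new__."""

    def __new__(cls, conditions_dict, **kwargs):
        # An empty dictionary is rejected, as in the documented behaviour.
        if not conditions_dict:
            raise ValueError("conditions_dict must not be empty")
        # Assumed rule: one graph anywhere selects the graph dataset.
        has_graph = any(
            isinstance(value, FakeGraph)
            for data in conditions_dict.values()
            for value in data.values()
        )
        dataset_cls = GraphDatasetSketch if has_graph else TensorDatasetSketch
        return dataset_cls(conditions_dict, **kwargs)


tensor_ds = DatasetFactorySketch({"interior": {"input": [[0.5]]}})
graph_ds = DatasetFactorySketch({"mesh": {"input": FakeGraph()}})
print(type(tensor_ds).__name__, type(graph_ds).__name__)
```

Because __new__ returns an object that is not an instance of the factory class, Python does not call the factory's own __init__, so the caller receives a fully constructed dataset from what looks like a plain constructor call.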

class PinaGraphDataset(conditions_dict, max_conditions_lengths, automatic_batching)

Bases: PinaDataset

Dataset class for the PINA dataset with Data and Graph data.

Initialize the instance by storing the conditions dictionary, the maximum number of items to consider per condition, and the automatic batching flag.

Parameters:
  • conditions_dict (dict) – A dictionary mapping condition names to their respective data. Each key represents a condition name, and the corresponding value is a dictionary containing the associated data.

  • max_conditions_lengths (dict) – Maximum number of data points that can be included in a single batch per condition.

  • automatic_batching (bool) – Indicates whether PyTorch automatic batching is enabled in PinaDataModule.

create_batch(data)

Create a Batch object from a list of Data or Graph objects.

Parameters:

data (list[Data] | list[Graph]) – List of items to collate in a single batch.

Returns:

Batch object.

Return type:

Batch | LabelBatch
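As a rough picture of what collating graphs into one batch involves, the sketch below merges small graphs stored as plain dictionaries. It illustrates the general technique (concatenate node features, shift edge indices by a running node offset, record a graph id per node) and is not the library's implementation. The function name and the dictionary layout with "x" and "edges" are assumptions.

```python
def collate_graphs_sketch(graphs):
    """Merge graphs into one disjoint graph with a per-node graph id."""
    x, edges, batch = [], [], []
    offset = 0  # number of nodes already placed in the merged graph
    for graph_id, graph in enumerate(graphs):
        x.extend(graph["x"])
        # Shift edge endpoints so they point at the merged node positions.
        edges.extend((s + offset, d + offset) for s, d in graph["edges"])
        # Remember which original graph each node came from.
        batch.extend([graph_id] * len(graph["x"]))
        offset += len(graph["x"])
    return {"x": x, "edges": edges, "batch": batch}


# Hypothetical graphs: node features in "x", edges as (source, target) pairs.
g1 = {"x": [[0.0], [1.0]], "edges": [(0, 1)]}
g2 = {"x": [[2.0], [3.0], [4.0]], "edges": [(0, 2), (1, 2)]}

merged = collate_graphs_sketch([g1, g2])
print(merged["edges"], merged["batch"])
```

The per-node graph id is what allows a model to pool or split results per graph after the batch has been processed as a single large graph.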

class PinaTensorDataset(conditions_dict, max_conditions_lengths, automatic_batching)

Bases: PinaDataset

Dataset class for the PINA dataset with torch.Tensor and LabelTensor data.

Initialize the instance by storing the conditions dictionary, the maximum number of items to consider per condition, and the automatic batching flag.

Parameters:
  • conditions_dict (dict) – A dictionary mapping condition names to their respective data. Each key represents a condition name, and the corresponding value is a dictionary containing the associated data.

  • max_conditions_lengths (dict) – Maximum number of data points that can be included in a single batch per condition.

  • automatic_batching (bool) – Indicates whether PyTorch automatic batching is enabled in PinaDataModule.

property input

Return the input data for the dataset.

Returns:

Dictionary containing the input points.

Return type:

dict
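A read-only property of this kind can be sketched as a per-condition view over the stored data. The class name InputViewSketch and the inner key "input" are assumptions made for illustration; the source does not show how the dictionary is assembled.

```python
class InputViewSketch:
    """Hypothetical holder exposing the input points of each condition."""

    def __init__(self, conditions_dict):
        self.conditions_dict = conditions_dict

    @property
    def input(self):
        # Assumed layout: each condition stores its points under "input".
        return {
            name: data["input"] for name, data in self.conditions_dict.items()
        }


view = InputViewSketch(
    {
        "boundary": {"input": [[0.0], [1.0]], "target": [[0.0], [0.0]]},
        "interior": {"input": [[0.5]]},
    }
)
inputs = view.input
print(inputs)
```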