abaco.dataloader module

abaco.dataloader.ABaCoDataLoader(data, device=torch.device, batch_label='batch', exp_label='tissue', batch_size=32, total_size=1024, total_batch=10)

abaco.dataloader.DataPreprocess(path: str, factors: list = ['sample', 'batch', 'tissue'], delimiter: str = ',')

Reads a CSV file and preprocesses the data by converting the specified columns to categorical type.

Parameters:
  • path (str) – The path to the CSV file containing the data.

  • factors (list, optional) – List of factor columns to convert to categorical type. Default is [“sample”, “batch”, “tissue”].

  • delimiter (str, optional) – The field delimiter used in the file. Default is “,”.

Returns:

The preprocessed DataFrame with specified factor columns converted to categorical type.

Return type:

pd.DataFrame
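
As a rough illustration of the behaviour described above, the following sketch performs the same two steps (read a delimited file, cast the factor columns to categorical) with plain pandas. It is not the ABaCo implementation: the helper name `preprocess_sketch` and the inline sample data are hypothetical and exist only for this example.

```python
import io

import pandas as pd


def preprocess_sketch(source, factors=("sample", "batch", "tissue"), delimiter=","):
    """Hypothetical stand-in for the documented behaviour, not ABaCo's code."""
    df = pd.read_csv(source, delimiter=delimiter)
    # Cast each factor column to pandas' categorical dtype.
    for col in factors:
        df[col] = df[col].astype("category")
    return df


# Inline CSV text stands in for a file on disk (made-up values).
csv_text = "sample,batch,tissue,OTU1,OTU2\ns1,b1,gut,10,0\ns2,b2,skin,3,7\n"
df = preprocess_sketch(io.StringIO(csv_text))
print(df.dtypes)
```

The factor columns come back with dtype `category`, while the remaining columns keep the numeric types inferred by `pd.read_csv`.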

abaco.dataloader.DataReverseTransform(data, original_data, factors=['sample', 'batch', 'tissue'], transformation='CLR', count=False)

abaco.dataloader.DataTransform(data, factors=['sample', 'batch', 'tissue'], transformation='CLR', count=False)

Transforms the data based on the specified transformation method.

Parameters:
  • data (pd.DataFrame) – The input data containing OTU counts and factors.

  • factors (list, optional) – List of factor columns to retain in the transformed data.

  • transformation (str, optional) – The transformation method to apply. Options are “CLR”, “Sqrt”, “ILR”, “ALR”. Default is “CLR”.

  • count (bool, optional) – If True, the data is treated as count data; otherwise, a small offset is added to avoid log(0) issues.

Returns:

The transformed data with the specified factors and transformed OTU counts.

Return type:

pd.DataFrame
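
To make the “CLR” option concrete, here is a minimal sketch of a centered log-ratio transform and its inverse, written with NumPy and pandas. It shows the textbook definition (the log of each value minus the row mean of the logs), not ABaCo's internal code. The function names `clr_sketch` and `clr_inverse_sketch`, the offset of 0.5, the choice to always add it, and the sample data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def clr_sketch(counts: pd.DataFrame, offset: float = 0.5) -> pd.DataFrame:
    """Centered log-ratio per row: log(x) minus the row mean of log(x).

    The offset (pseudocount) avoids log(0); its value is an assumption.
    """
    logged = np.log(counts.astype(float) + offset)
    return logged.sub(logged.mean(axis=1), axis=0)


def clr_inverse_sketch(clr: pd.DataFrame) -> pd.DataFrame:
    """Map CLR values back to proportions that sum to 1 in each row."""
    expd = np.exp(clr)
    return expd.div(expd.sum(axis=1), axis=0)


# Made-up data: three factor columns plus three OTU count columns.
data = pd.DataFrame(
    {
        "sample": ["s1", "s2"],
        "batch": ["b1", "b2"],
        "tissue": ["gut", "skin"],
        "OTU1": [10, 3],
        "OTU2": [0, 7],
        "OTU3": [5, 5],
    }
)
factors = ["sample", "batch", "tissue"]
otus = data.drop(columns=factors)

# Keep the factor columns and append the transformed OTU columns.
transformed = pd.concat([data[factors], clr_sketch(otus)], axis=1)
recovered = clr_inverse_sketch(transformed.drop(columns=factors))
```

Each transformed row sums to zero, and the inverse recovers the offset-adjusted proportions rather than the raw counts. That loss of scale is presumably why a reverse transform would need the original data as an extra argument, though this page does not confirm it.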

abaco.dataloader.class_to_int(labels)

abaco.dataloader.one_hot_encoding(labels: pandas.Series, dtype: torch.dtype = torch.float32) → tuple

Converts a series of labels into a one-hot encoded matrix.

Parameters:
  • labels (pd.Series) – The input labels to be one-hot encoded.

  • dtype (torch.dtype, optional) – The data type of the output tensor. Default is torch.float32.

Returns:

A tuple containing:
  • torch.Tensor – A one-hot encoded matrix where each row corresponds to a label.

  • list – The categories (unique labels) encoded in the matrix.

Return type:

tuple

Example

>>> import pandas as pd
>>> import torch
>>> from abaco.dataloader import one_hot_encoding
>>> labels = pd.Series(['A', 'B', 'A', 'C'])
>>> one_hot_matrix, categories = one_hot_encoding(labels, dtype=torch.int32)
>>> print(one_hot_matrix)
tensor([[1, 0, 0],
        [0, 1, 0],
        [1, 0, 0],
        [0, 0, 1]], dtype=torch.int32)
>>> print(categories)
['A', 'B', 'C']
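
The returned categories list is what makes the encoding reversible: the position of the 1 in each row indexes into it. The sketch below illustrates that idea with NumPy and pandas rather than torch, so it is a stand-in and not ABaCo's implementation. It assumes the columns follow the sorted unique labels, which agrees with the output above but is not stated on this page.

```python
import numpy as np
import pandas as pd

labels = pd.Series(["A", "B", "A", "C"])

# Assumption: columns are ordered by sorted unique label, as pd.Categorical does.
cat = pd.Categorical(labels)
categories = list(cat.categories)

# Row i of the identity matrix is the one-hot vector for category code i.
one_hot = np.eye(len(categories), dtype=np.int32)[cat.codes]

# Decode: the column index of the 1 in each row selects the category.
decoded = [categories[i] for i in one_hot.argmax(axis=1)]
print(decoded)
```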