Dataset

Module superwise.models.dataset

This module implements the Dataset model.

Classes

Dataset(name: str, files: Union[str, List[str]], project_id: int, type: Union[superwise.resources.superwise_enums.DatasetType, str] = DatasetType.TRAIN, dtypes: Dict[str, str] = None, roles: Dict[str, str] = None, **kwargs) DataSet

Description:

Constructor for the Dataset class.

Args:

name: The name of the dataset.

files: The raw data of the dataset. Can be provided as local or cloud (GCS or S3) file paths.

project_id: The ID of the project which this dataset will be assigned to.

type: The type of the dataset (See 'superwise.resources.superwise_enums.DatasetType' enum). Default 'TRAIN'.

dtypes: An optional mapping between columns and their dtypes. If not provided, they will be inferred.

roles: An optional mapping between columns and their roles. If not provided, will be inferred.
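
The constructor arguments above can be sketched as follows. This is a minimal illustration, not the library's canonical usage: the helper names `build_dataset_args` and `make_dataset`, the bucket path, and the project ID are hypothetical, and only the import paths and parameter names are taken from the signature documented here. The `superwise` import is deferred so the sketch loads without the package installed.

```python
from typing import Any, Dict, List, Union


def build_dataset_args(
    name: str, files: Union[str, List[str]], project_id: int
) -> Dict[str, Any]:
    """Collect the required constructor arguments into a kwargs dict."""
    if not name:
        raise ValueError("name must be a non-empty string")
    # The signature accepts one path or a list of paths; pass it through as-is.
    return {"name": name, "files": files, "project_id": project_id}


def make_dataset(args: Dict[str, Any]):
    """Instantiate the documented class; requires the superwise package."""
    # Imported lazily so this sketch can be loaded without superwise installed.
    from superwise.models.dataset import Dataset
    from superwise.resources.superwise_enums import DatasetType

    # dtypes and roles are omitted, so per the docs they will be inferred.
    return Dataset(type=DatasetType.TRAIN, **args)


# Hypothetical bucket path and project ID, for illustration only.
args = build_dataset_args(
    name="churn-train",
    files=["gs://example-bucket/churn/train.csv"],
    project_id=1,
)
```

Calling `make_dataset(args)` would then construct the object. Passing `dtypes` or `roles` explicitly is only needed when the inferred values are not the ones you want.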

Ancestors (in MRO)

  • superwise.models.base.BaseModel

Static methods

generate_dataset_from_dataframe(name: str, project_id, dataframe: pandas.core.frame.DataFrame, **kwargs)

Description:

Construct a Dataset from a dataframe.

Args:

name: The name of the dataset.

project_id: The ID of the project which this dataset will be assigned to.

dataframe: The in-memory dataframe object holding the data of the dataset.

type: The type of the dataset (See 'superwise.resources.superwise_enums.DatasetType' enum). Default 'TRAIN'.

dtypes: An optional mapping between columns and their dtypes. If not provided, they will be inferred.

roles: An optional mapping between columns and their roles. If not provided, will be inferred.
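
As a rough sketch of the static method above, the snippet below builds a small dataframe and passes it to `generate_dataset_from_dataframe`. The helper `make_dataset_from_rows`, the column names, and the row values are hypothetical. Only the method name and its `name`, `project_id`, and `dataframe` parameters come from the signature documented here. Both `pandas` and `superwise` are imported lazily, so they are needed only when the helper is actually called.

```python
from typing import Any, Dict, List

# Hypothetical rows; the column names are illustrative only.
rows: List[Dict[str, Any]] = [
    {"id": 1, "age": 34, "churned": 0},
    {"id": 2, "age": 51, "churned": 1},
]


def make_dataset_from_rows(name: str, project_id: int, rows: List[Dict[str, Any]]):
    """Build a DataFrame from rows and pass it to the documented static method."""
    # Lazy imports: calling this function requires pandas and superwise.
    import pandas as pd
    from superwise.models.dataset import Dataset

    dataframe = pd.DataFrame(rows)
    # type, dtypes and roles are left out, so the documented defaults apply.
    return Dataset.generate_dataset_from_dataframe(
        name=name, project_id=project_id, dataframe=dataframe
    )
```

A call such as `make_dataset_from_rows("churn-train", 1, rows)` would return the constructed object. The optional `type`, `dtypes`, and `roles` arguments listed above can be forwarded as keyword arguments in the same call.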