RegularDataProcessor
Bases: BaseProcessor
Main class for regular (tabular) data preprocessing. It works like any other scikit-learn transformer, exposing the methods fit, transform and inverse_transform.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
num_cols | list of strings | List of names of numerical columns. | None |
cat_cols | list of strings | List of names of categorical columns. | None |
Source code in /opt/hostedtoolcache/Python/3.10.12/x64/lib/python3.10/site-packages/ydata_synthetic/preprocessing/regular/processor.py
fit(X)
Fits the DataProcessor to a passed DataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | DataFrame | DataFrame used to fit the processor parameters. Should be aligned with the num/cat columns defined in initialization. | required |
Returns:
Name | Type | Description |
---|---|---|
self | RegularDataProcessor | The fitted data processor. |
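The alignment requirement on X can be checked before calling fit. The sketch below uses a hypothetical helper, `missing_columns`, which is not part of ydata-synthetic. It only assumes that the column names are available as a list of strings (for a pandas DataFrame, `list(X.columns)`).

```python
def missing_columns(columns, num_cols=None, cat_cols=None):
    """Return the declared column names that are absent from `columns`.

    Hypothetical helper for illustration only, not a library function.
    """
    declared = list(num_cols or []) + list(cat_cols or [])
    present = set(columns)
    return [name for name in declared if name not in present]


# Column names as they might come from list(X.columns) on a DataFrame.
columns = ["age", "income", "city"]

aligned = missing_columns(columns, num_cols=["age", "income"], cat_cols=["city"])
misaligned = missing_columns(columns, num_cols=["age"], cat_cols=["country"])

print(aligned)     # empty when every declared column is present
print(misaligned)  # lists the declared names that X does not contain
```

An empty result suggests the data matches the num_cols and cat_cols passed at initialization. Anything else names the columns to fix first.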
inverse_transform(X)
Inverts the data transformation pipelines on a passed DataFrame.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Numpy array to be brought back to the original data format. Should share the schema of data transformed by this DataProcessor. Can be used to revert transformations of training data or for synthetic samples. | required |
Returns:
Name | Type | Description |
---|---|---|
result | DataFrame | DataFrame with all performed transformations inverted. |
transform(X)
Transforms the passed DataFrame with the fitted DataProcessor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | DataFrame | DataFrame to be transformed. Should be aligned with the column types defined in initialization. | required |
Returns:
Name | Type | Description |
---|---|---|
transformed | ndarray | Processed version of the passed DataFrame. |
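The fit, transform and inverse_transform contract described on this page can be illustrated with a toy stand-in. The `ToyProcessor` class below is hypothetical and is not the library implementation. It assumes, purely for illustration, that rows are plain dicts rather than a DataFrame, that the transformed output is a list of lists rather than an ndarray, that numerical columns are min-max scaled and that categorical columns are one-hot encoded. None of these choices is stated in this reference.

```python
class ToyProcessor:
    """Toy stand-in for the fit/transform/inverse_transform contract.

    Assumptions (not taken from the library): dict rows, list-of-lists
    output, min-max scaling for numerical columns and one-hot encoding
    for categorical columns.
    """

    def __init__(self, num_cols=None, cat_cols=None):
        self.num_cols = list(num_cols or [])
        self.cat_cols = list(cat_cols or [])

    def fit(self, rows):
        # Learn the value range of each numerical column.
        self.ranges_ = {}
        for col in self.num_cols:
            values = [row[col] for row in rows]
            self.ranges_[col] = (min(values), max(values))
        # Learn the sorted set of categories of each categorical column.
        self.categories_ = {
            col: sorted({row[col] for row in rows}) for col in self.cat_cols
        }
        return self  # returning self allows fit(...).transform(...)

    def transform(self, rows):
        matrix = []
        for row in rows:
            vec = []
            for col in self.num_cols:
                lo, hi = self.ranges_[col]
                span = (hi - lo) or 1.0  # constant columns: avoid dividing by zero
                vec.append((row[col] - lo) / span)
            for col in self.cat_cols:
                vec.extend(
                    1.0 if row[col] == cat else 0.0 for cat in self.categories_[col]
                )
            matrix.append(vec)
        return matrix

    def inverse_transform(self, matrix):
        rows = []
        for vec in matrix:
            row, i = {}, 0
            for col in self.num_cols:
                lo, hi = self.ranges_[col]
                row[col] = vec[i] * (hi - lo) + lo
                i += 1
            for col in self.cat_cols:
                cats = self.categories_[col]
                block = vec[i:i + len(cats)]
                # Pick the category with the largest score in its one-hot block.
                row[col] = cats[block.index(max(block))]
                i += len(cats)
            rows.append(row)
        return rows


data = [
    {"age": 20, "city": "Lisbon"},
    {"age": 40, "city": "Porto"},
    {"age": 30, "city": "Lisbon"},
]
proc = ToyProcessor(num_cols=["age"], cat_cols=["city"]).fit(data)
encoded = proc.transform(data)
restored = proc.inverse_transform(encoded)
print(encoded)
print(restored)
```

Taking the largest score in each one-hot block lets the same inverse step accept synthetic samples whose categorical values are soft scores rather than exact zeros and ones.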