
4. Transforms


Data does not always come in its final processed form that is required for training machine learning algorithms. We use transforms to perform some manipulation of the data and make it suitable for training.

All TorchVision datasets have two parameters - transform to modify the features and target_transform to modify the labels - that accept callables containing the transformation logic. The torchvision.transforms module offers several commonly used transforms out of the box.

The FashionMNIST features are in PIL Image format, and the labels are integers. For training, we need the features as normalized tensors, and the labels as one-hot encoded tensors. To make these transformations, we use ToTensor and Lambda.

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda

ds = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    # Convert each PIL image into a normalized FloatTensor.
    transform=ToTensor(),
    # Convert each integer label into a one-hot encoded tensor of size 10.
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
100%|##########| 26421880/26421880 [00:01<00:00, 16118381.05it/s]
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
100%|##########| 29515/29515 [00:00<00:00, 268364.97it/s]
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
100%|##########| 4422102/4422102 [00:00<00:00, 5066844.45it/s]
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
100%|##########| 5148/5148 [00:00<00:00, 25402678.81it/s]
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw
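
With these transforms in place, every sample fetched from ds is already converted: the image is a float tensor and the label is a one-hot tensor. A quick inspection (this snippet is illustrative and not part of the original tutorial):

img, label = ds[0]
print(img.shape, img.dtype)  # torch.Size([1, 28, 28]) torch.float32
print(label)                 # a one-hot tensor of size 10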

ToTensor()

ToTensor converts a PIL image or NumPy ndarray into a FloatTensor and scales the image’s pixel intensity values to the range [0., 1.].
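
For instance, the following minimal sketch (assuming NumPy and PIL are available; the image here is synthetic and purely illustrative) shows both effects:

import numpy as np
from PIL import Image
from torchvision.transforms import ToTensor

# An 8-bit grayscale image with pixel values in [0, 255].
img = Image.fromarray(np.arange(256, dtype=np.uint8).reshape(16, 16))

tensor = ToTensor()(img)
print(tensor.shape)                # torch.Size([1, 16, 16]) - channels-first layout
print(tensor.min(), tensor.max())  # tensor(0.) tensor(1.) - values scaled to [0., 1.]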

Lambda Transforms

Lambda transforms apply any user-defined lambda function. Here, we define a function to turn the integer label into a one-hot encoded tensor. It first creates a zero tensor of size 10 (the number of labels in our dataset) and then calls scatter_, which assigns value=1 at the index given by the label y.

target_transform = Lambda(lambda y: torch.zeros(
    10, dtype=torch.float).scatter_(dim=0, index=torch.tensor(y), value=1))
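
As a quick sanity check (illustrative, not part of the original tutorial), applying the transform to the integer label 3 produces a tensor with a 1 at index 3:

print(target_transform(3))
# tensor([0., 0., 0., 1., 0., 0., 0., 0., 0., 0.])

An equivalent built-in alternative is torch.nn.functional.one_hot(torch.tensor(y), num_classes=10).float(), which yields the same result.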
