1. Quickstart
This section runs through the API for common tasks in machine learning.
Working with data
PyTorch has two primitives to work with data: torch.utils.data.DataLoader and torch.utils.data.Dataset. Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
PyTorch offers domain-specific libraries such as TorchText, TorchVision, and TorchAudio, all of which include datasets. For this tutorial, we will be using a TorchVision dataset.
The torchvision.datasets module contains Dataset objects for many real-world vision datasets such as CIFAR and COCO. In this tutorial, we use the FashionMNIST dataset. Every TorchVision Dataset includes two arguments, transform and target_transform, to modify the samples and labels respectively.
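The code that creates the datasets is missing from this copy; a minimal sketch that matches the download output below, using datasets.FashionMNIST with ToTensor() (root="data" follows from the data/FashionMNIST/raw paths in the output):

# Download training data from open datasets.
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
)

# Download test data from open datasets.
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)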
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz
100%|##########| 26421880/26421880 [00:01<00:00, 16015379.45it/s]
Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz
100%|##########| 29515/29515 [00:00<00:00, 270275.62it/s]
Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz
100%|##########| 4422102/4422102 [00:00<00:00, 5065928.30it/s]
Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz
100%|##########| 5148/5148 [00:00<00:00, 23911713.17it/s]
Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw
We pass the Dataset as an argument to DataLoader. This wraps an iterable over our dataset, and supports automatic batching, sampling, shuffling, and multiprocess data loading. Here we define a batch size of 64, i.e. each element in the dataloader iterable will return a batch of 64 features and labels.
Creating Models
To define a neural network in PyTorch, we create a class that inherits from nn.Module. We define the layers of the network in the __init__ function and specify how data will pass through the network in the forward function. To accelerate operations in the neural network, we move it to the GPU if available.
# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)
print(model)
Using cuda device
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
Optimizing the Model Parameters
To train a model, we need a loss function and an optimizer.
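The code defining them is missing here; a minimal sketch using a standard choice for multi-class classification (the SGD learning rate of 1e-3 is an assumed value):

# Cross-entropy loss for multi-class classification.
loss_fn = nn.CrossEntropyLoss()
# Plain SGD over the model parameters; lr=1e-3 is an assumption.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)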
In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and backpropagates the prediction error to adjust the model’s parameters.
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)

        # Compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backpropagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
We also check the model’s performance against the test dataset to ensure it is learning.
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")
The training process is conducted over several iterations (epochs). During each epoch, the model learns parameters to make better predictions. We print the model’s accuracy and loss at each epoch; we’d like to see the accuracy increase and the loss decrease with every epoch.
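The driver loop itself is missing from this copy; a minimal sketch consistent with the five epochs shown in the output below:

epochs = 5  # matches the five epochs in the output below
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train(train_dataloader, model, loss_fn, optimizer)
    test(test_dataloader, model, loss_fn)
print("Done!")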
Epoch 1
-------------------------------
loss: 2.306185 [ 0/60000]
loss: 2.298008 [ 6400/60000]
loss: 2.280848 [12800/60000]
loss: 2.273232 [19200/60000]
loss: 2.248048 [25600/60000]
loss: 2.229527 [32000/60000]
loss: 2.229292 [38400/60000]
loss: 2.199877 [44800/60000]
loss: 2.196356 [51200/60000]
loss: 2.163561 [57600/60000]
Test Error:
Accuracy: 48.9%, Avg loss: 2.162945
Epoch 2
-------------------------------
loss: 2.173212 [ 0/60000]
loss: 2.172224 [ 6400/60000]
loss: 2.117870 [12800/60000]
loss: 2.126270 [19200/60000]
loss: 2.074794 [25600/60000]
loss: 2.025475 [32000/60000]
loss: 2.045674 [38400/60000]
loss: 1.976126 [44800/60000]
loss: 1.975976 [51200/60000]
loss: 1.904687 [57600/60000]
Test Error:
Accuracy: 57.0%, Avg loss: 1.905852
Epoch 3
-------------------------------
loss: 1.938858 [ 0/60000]
loss: 1.918352 [ 6400/60000]
loss: 1.801787 [12800/60000]
loss: 1.828913 [19200/60000]
loss: 1.723467 [25600/60000]
loss: 1.677218 [32000/60000]
loss: 1.690066 [38400/60000]
loss: 1.593920 [44800/60000]
loss: 1.611946 [51200/60000]
loss: 1.502581 [57600/60000]
Test Error:
Accuracy: 60.7%, Avg loss: 1.522618
Epoch 4
-------------------------------
loss: 1.585996 [ 0/60000]
loss: 1.556208 [ 6400/60000]
loss: 1.398514 [12800/60000]
loss: 1.468179 [19200/60000]
loss: 1.347427 [25600/60000]
loss: 1.343468 [32000/60000]
loss: 1.356795 [38400/60000]
loss: 1.279234 [44800/60000]
loss: 1.317853 [51200/60000]
loss: 1.213202 [57600/60000]
Test Error:
Accuracy: 63.2%, Avg loss: 1.242747
Epoch 5
-------------------------------
loss: 1.314441 [ 0/60000]
loss: 1.301381 [ 6400/60000]
loss: 1.129021 [12800/60000]
loss: 1.238399 [19200/60000]
loss: 1.114024 [25600/60000]
loss: 1.140957 [32000/60000]
loss: 1.164962 [38400/60000]
loss: 1.096341 [44800/60000]
loss: 1.142519 [51200/60000]
loss: 1.056790 [57600/60000]
Test Error:
Accuracy: 64.6%, Avg loss: 1.078601
Done!
Saving Models
A common way to save a model is to serialize the internal state dictionary (containing the model parameters).
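The saving code is missing here; a one-line sketch (the file name "model.pth" is an assumption):

# Serialize the model's state dictionary to disk; "model.pth" is an assumed file name.
torch.save(model.state_dict(), "model.pth")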
Loading Models
The process for loading a model includes re-creating the model structure and loading the state dictionary into it.
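The loading code is missing here; a minimal sketch, assuming the model was saved to "model.pth" as above:

# Re-create the model structure, then load the saved state dictionary into it.
model = NeuralNetwork().to(device)
model.load_state_dict(torch.load("model.pth"))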
This model can now be used to make predictions.
classes = [
    "T-shirt/top",
    "Trouser",
    "Pullover",
    "Dress",
    "Coat",
    "Sandal",
    "Shirt",
    "Sneaker",
    "Bag",
    "Ankle boot",
]

model.eval()
x, y = test_data[0][0], test_data[0][1]
with torch.no_grad():
    x = x.to(device)  # move the input to the same device as the model
    pred = model(x)
    predicted, actual = classes[pred[0].argmax(0)], classes[y]
    print(f'Predicted: "{predicted}", Actual: "{actual}"')