์ƒˆ์†Œ์‹

๋”ฅ๋Ÿฌ๋‹

Pytorch Tutorials


1. tensor

๐Ÿ”ธ ๋žœ๋คํ•œ ๊ฐ’์„ ๊ฐ€์ง€๋Š” ํ…์„œ ์ƒ์„ฑ

  1. torch.rand() : generates numbers uniformly distributed between 0 and 1
  2. torch.rand_like() : takes an existing tensor instead of a size tuple and matches its shape
  3. torch.randn() : generates values from a standard normal (Gaussian) distribution with mean 0 and standard deviation 1
  4. torch.randn_like() : takes an existing tensor instead of a size tuple and matches its shape
  5. torch.randint() : generates integers uniformly within a given range; the default dtype is torch.int64
  6. torch.randint_like() : takes an existing tensor instead of a size tuple and matches its shape
  7. torch.randperm() : returns a random permutation of the integers from 0 to n-1
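
A minimal sketch of how these factory functions can be called (the shapes and ranges here are arbitrary examples):

import torch

torch.rand(2, 3)              # uniform samples in [0, 1), shape (2, 3)
torch.randn(2, 3)             # samples from N(0, 1)
torch.randint(0, 10, (2, 3))  # integers drawn uniformly from [0, 10)
torch.randperm(5)             # random permutation of 0..4, e.g. tensor([3, 0, 4, 1, 2])

x = torch.zeros(2, 3)
torch.rand_like(x)            # same shape and dtype as x, values in [0, 1)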

๐Ÿ”ธ ํŠน์ •ํ•œ ๊ฐ’์„ ๊ฐ€์ง€๋Š” ํ…์„œ ์ƒ์„ฑ

  1. torch.arange() : generates the integers in a given range, in order
  2. torch.ones() : creates a tensor of the given size filled with ones
  3. torch.zeros() : creates a tensor of the given size filled with zeros
  4. torch.ones_like() : takes an existing tensor instead of a size tuple and matches its shape
  5. torch.zeros_like() : takes an existing tensor instead of a size tuple and matches its shape
  6. torch.linspace() : returns a 1-D tensor of points evenly spaced between a start and an end point, with the given number of steps
  7. torch.logspace() : returns a 1-D tensor of points logarithmically spaced between a start and an end point, with the given number of steps
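
Similarly, a short sketch of these functions (again assuming torch is imported; the start/end points and step counts are arbitrary examples):

torch.arange(0, 10, 2)           # tensor([0, 2, 4, 6, 8])
torch.ones(2, 3)                 # 2x3 tensor of ones
torch.zeros(2, 3)                # 2x3 tensor of zeros
torch.linspace(0, 1, steps=5)    # tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])
torch.logspace(0, 2, steps=3)    # tensor([  1.,  10., 100.])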




2. Dataset๊ณผ DataLoader

2.1 Dataset

PyTorch๋Š” torch.utils.data.DataLoader ์™€ torch.utils.data.Dataset ์˜ ๋‘ ๊ฐ€์ง€ ๋ฐ์ดํ„ฐ ๊ธฐ๋ณธ ์š”์†Œ๋ฅผ ์ œ๊ณตํ•˜์—ฌ ๋ฏธ๋ฆฌ ์ค€๋น„ํ•ด๋œ(pre-loaded) ๋ฐ์ดํ„ฐ์…‹ ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ๊ฐ€์ง€๊ณ  ์žˆ๋Š” ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•ฉ๋‹ˆ๋‹ค.
Dataset ์€ ์ƒ˜ํ”Œ๊ณผ ์ •๋‹ต(label)์„ ์ €์žฅํ•˜๊ณ ,
DataLoader ๋Š” Dataset ์„ ์ƒ˜ํ”Œ์— ์‰ฝ๊ฒŒ ์ ‘๊ทผํ•  ์ˆ˜ ์žˆ๋„๋ก ์ˆœํšŒ ๊ฐ€๋Šฅํ•œ ๊ฐ์ฒด(iterable)๋กœ ๊ฐ์Œ‰๋‹ˆ๋‹ค.

2.1.1 ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋กœ ๋ถ€ํ„ฐ Dataset ๋กœ๋“œ

import torch
from torchvision import datasets
from torchvision.transforms import ToTensor, Lambda
import matplotlib.pyplot as plt

torchvision์€ datasets, models, transforms๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค.
     - torchvision.datasets์—๋Š” MNIST, Fashion-MNIST๋“ฑ ๋‹ค์–‘ํ•œ ๋ฐ์ดํ„ฐ์…‹ ์ œ๊ณต
     - torchvision.models์—๋Š” Alesnet, VGG, ResNet๋“ฑ์˜ ๋ชจ๋ธ ์ œ๊ณต
     - torchvision.transforms๋Š” ๋‹ค์–‘ํ•œ ์ด๋ฏธ์ง€ ๋ณ€ํ™˜ ๊ธฐ๋Šฅ๋“ค์„ ์ œ๊ณต
          - torchvision.transform.ToTensor์€ PIL Image๋‚˜ NumPy ndarray ๋ฅผ FloatTensor ๋กœ ๋ณ€ํ™˜ํ•˜๊ณ , ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€์˜ ํฌ๊ธฐ(intensity) ๊ฐ’์„ [0., 1.] ๋ฒ”์œ„๋กœ ๋น„๋ก€ํ•˜์—ฌ ์กฐ์ •(scale)
          - from torchvision.transforms.Lambda๋Š” ์‚ฌ์šฉ์ž ์ •์˜ ๋žŒ๋‹ค(lambda) ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•œ๋‹ค.


training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor(),
    target_transform=Lambda(lambda y: torch.zeros(10, dtype=torch.float).scatter_(0, torch.tensor(y), value=1))
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

๋‹ค์Œ ๋งค๊ฐœ๋ณ€์ˆ˜๋“ค์„ ์‚ฌ์šฉํ•˜์—ฌ FashionMNIST ๋ฐ์ดํ„ฐ์…‹์„ ๋ถˆ๋Ÿฌ์˜ค๋Š” ๊ณผ์ •์ด๋‹ค.
root ๋Š” ํ•™์Šต/ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ๊ฐ€ ์ €์žฅ๋˜๋Š” ๊ฒฝ๋กœ๋ฅผ ์ง€์ •
train ์€ ํ•™์Šต์šฉ ๋˜๋Š” ํ…Œ์ŠคํŠธ์šฉ ๋ฐ์ดํ„ฐ์…‹ ์—ฌ๋ถ€๋ฅผ ์ง€์ •
download=True ๋Š” root์— ๋ฐ์ดํ„ฐ๊ฐ€ ์—†๋Š” ๊ฒฝ์šฐ ์ธํ„ฐ๋„ท์—์„œ ๋‹ค์šด๋กœ๋“œ
transform ๊ณผ target_transform ์€ ํŠน์ง•(feature)๊ณผ ์ •๋‹ต(label) ๋ณ€ํ˜•(transform)์„ ์ง€์ •
Lambda์˜ lambda๋Š” ์—ฌ๊ธฐ์—์„œ๋Š” ์ •์ˆ˜๋ฅผ ์›-ํ•ซ์œผ๋กœ ๋ถ€ํ˜ธํ™”๋œ ํ…์„œ๋กœ ๋ฐ”๊พธ๋Š” ํ•จ์ˆ˜๋ฅผ ์ •์˜ํ•œ๋‹ค. ์ด ํ•จ์ˆ˜๋Š” ๋จผ์ € (๋ฐ์ดํ„ฐ์…‹ ์ •๋‹ต์˜ ๊ฐœ์ˆ˜์ธ) ํฌ๊ธฐ 10์งœ๋ฆฌ ์˜ ํ…์„œ(zero tensor)๋ฅผ ๋งŒ๋“ค๊ณ , scatter_ ๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ ์ฃผ์–ด์ง„ ์ •๋‹ต y ์— ํ•ด๋‹นํ•˜๋Š” ์ธ๋ฑ์Šค์— value=1 ์„ ํ• ๋‹นํ•œ๋‹ค.


2.1.2 Dataset์„ ์ˆœํšŒํ•˜์—ฌ ์‹œ๊ฐํ™”

Dataset ์— ๋ฆฌ์ŠคํŠธ(list)์ฒ˜๋Ÿผ ์ง์ ‘ ์ ‘๊ทผ(index)ํ•  ์ˆ˜ ์žˆ๋‹ค.

labels_map = {
    0: "T-Shirt",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle Boot",
}

figure = plt.figure(figsize=(8,8))
cols, rows= 3,3
for i in range(1, cols*rows+1):
  sample_idx = torch.randint(len(training_data), size=(1,)).item()
  img, label = training_data[sample_idx]
  figure.add_subplot(rows, cols, i)
  plt.title(labels_map[label])
  plt.imshow(img.squeeze(), cmap='gray')
plt.show()



2.2 ํŒŒ์ผ์—์„œ ์‚ฌ์šฉ์ž ์ •์˜ Dataset ์ƒ์„ฑ

์‚ฌ์šฉ์ž ์ •์˜ Dataset ํด๋ž˜์Šค๋Š” ๋ฐ˜๋“œ์‹œ 3๊ฐœ ํ•จ์ˆ˜๋ฅผ ๊ตฌํ˜„ํ•ด์•ผ ํ•œ๋‹ค
: __init__, __len__, and __getitem__

from torch.utils.data import Dataset
import os
import pandas as pd
from torchvision.io import read_image

class CustomImageDataset(Dataset):
    def __init__(self, annotation_file, img_dir, transform=None, target_transform=None):
      self.img_labels = pd.read_csv(annotation_file, names=['file_name', 'label'])
      self.img_dir = img_dir
      self.transform = transform
      self.target_transform = target_transform

    def __len__(self):
      return len(self.img_labels)

    def __getitem__(self, idx):
      img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
      image = read_image(img_path)
      label = self.img_labels.iloc[idx, 1]
      if self.transform:
        image = self.transform(image)
      if self.target_transform:
        label = self.target_transform(label)
      sample = {"image": image, "label": label}
      return sample

  • __init__
    The __init__ function runs once when the Dataset object is instantiated.
    Here it initializes the directory containing the images, the annotation file (annotation_file), and the transforms.

MNIST๋ฅผ ํŒŒ์ผ์—์„œ ๋ถˆ๋Ÿฌ์˜ฌ ๊ฒฝ์šฐ MNIST.csv๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

tshirt1.jpg, 0
tshirt2.jpg, 0
......
ankleboot999.jpg, 9


def __len__(self):
      return len(self.img_labels)
  • __len__
    The __len__ function returns the number of samples in the dataset.

def __getitem__(self, idx):
      img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
      image = read_image(img_path)
      label = self.img_labels.iloc[idx, 1]
      if self.transform:
        image = self.transform(image)
      if self.target_transform:
        label = self.target_transform(label)
      sample = {"image": image, "label": label}
      return sample
  • __getitem__
    The __getitem__ function loads and returns the sample of the dataset at the given index idx.
    Based on the index, it identifies the image's location on disk,
    converts the image to a tensor using read_image,
    retrieves the corresponding label from the csv data in self.img_labels,
    calls the transform functions on them (if applicable),
    and returns the tensor image and the label as a Python dict.

2.2.1 ์˜ˆ์‹œ

์ƒ˜ํ”Œ ๋ฐ์ดํ„ฐ

import numpy as np

train_images = np.random.randint(256, size=(20,32,32,3))
train_labels = np.random.randint(2, size=(20,1))

## Usually a preprocessing module is imported and preprocessing is applied before converting to tensors
# import preprocessing
# train_images, train_labels = preprocessing(train_images, train_labels)

print(train_images.shape, train_labels.shape)
>>> (20, 32, 32, 3) (20, 1)

custom dataset

class CustomDataset(Dataset):

    def __init__(self, x_data, y_data, transform=None):
        self.x_data = x_data
        self.y_data = y_data
        self.transform = transform
        self.len = len(y_data)

    def __getitem__(self, index):
        sample = self.x_data[index], self.y_data[index]

        if self.transform:
            sample = self.transform(sample)
        return sample

    def __len__(self):
        return self.len

transform์— ๋Œ€ํ•ด custom class๋ฅผ ๋งŒ๋“ค๊ธฐ

class ToTensor:
    def __call__(self, sample):
        inputs, labels = sample
        inputs = torch.FloatTensor(inputs)
        inputs = inputs.permute(2, 0, 1)
        return inputs, torch.LongTensor(labels)


class LinearTensor:
    def __init__(self, slope, bias=0):
        self.slope = slope
        self.bias = bias

    def __call__(self, sample):
        inputs, labels = sample
        inputs = self.slope * inputs + self.bias
        return inputs, labels

๋งŒ๋“  class ์‚ฌ์šฉํ•˜๊ธฐ

import torchvision.transforms as tr
from torch.utils.data import DataLoader

trans = tr.Compose([ToTensor(), LinearTensor(2, 5)])
dataset = CustomDataset(train_images, train_labels, transform=trans)
train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

custom class์ธ ToTensor()๊ฐ€ ์•„๋‹Œ torchvision์˜ tr.ToTensor()์„ ์“ฐ๊ณ  ์‹ถ์€ ๊ฒฝ์šฐ

class MyTransform:
    def __call__(self, sample):
        inputs, labels = sample
        inputs = torch.FloatTensor(inputs)
        inputs = inputs.permute(2, 0, 1)
        labels = torch.FloatTensor(labels)

        transf = tr.Compose([tr.ToPILImage(), tr.Resize(128), tr.ToTensor(),
                             tr.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
        final_output = transf(inputs)

        return final_output, labels

2.3 ๊ฐ™์€ ํด๋ž˜์Šค ๋ณ„ ํด๋” ์ด๋ฏธ์ง€ ๋ฐ์ดํ„ฐ๋ฅผ ์ด์šฉํ•  ๋•Œ

When the images are organized into folders such as ./class/cat and ./class/lion,
torchvision.datasets.ImageFolder loads the entire dataset, assigns labels automatically based on the folder names, and can apply preprocessing at the same time.

import torchvision

transf = tr.Compose([tr.Resize(16), tr.ToTensor()])
train_data = torchvision.datasets.ImageFolder(root='./class', transform=transf)
train_dataloader = DataLoader(train_data, batch_size=10, shuffle=True, num_workers=2)

3. DataLoader

3.1 DataLodaer๋กœ ํ•™์Šต์šฉ ๋ฐ์ดํ„ฐ ์ค€๋น„

Dataset์€ ๋ฐ์ดํ„ฐ์…‹์˜ feature์„ ๊ฐ€์ ธ์˜ค๊ณ  ํ•˜๋‚˜์˜ ์ƒ˜ํ”Œ์— label์„ ์ง€์ •ํ•˜๋Š” ์ผ์„ ํ•œ ๋ฒˆ์— ํ•œ๋‹ค.

๋ณดํ†ต ๋ชจ๋ธ์„ ํ•™์Šตํ•  ๋•Œ, ์ผ๋ฐ˜์ ์œผ๋กœ ์ƒ˜ํ”Œ๋“ค์„ “๋ฏธ๋‹ˆ๋ฐฐ์น˜(minibatch)”๋กœ ์ „๋‹ฌํ•˜๊ณ , ๋งค ์—ํญ(epoch)๋งˆ๋‹ค ๋ฐ์ดํ„ฐ๋ฅผ ๋‹ค์‹œ ์„ž์–ด์„œ ๊ณผ์ ํ•ฉ(overfit)์„ ๋ง‰๊ณ , Python์˜ multiprocessing ์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ฐ์ดํ„ฐ ๊ฒ€์ƒ‰ ์†๋„๋ฅผ ๋†’์ด๋ ค๊ณ  ํ•œ๋‹ค.

DataLoader ๋Š” ๊ฐ„๋‹จํ•œ API๋กœ ์ด๋Ÿฌํ•œ ๋ณต์žกํ•œ ๊ณผ์ •๋“ค์„ ์ถ”์ƒํ™”ํ•œ ์ˆœํšŒ ๊ฐ€๋Šฅํ•œ ๊ฐ์ฒด(iterable)์ด๋‹ค.

from torch.utils.data import DataLoader

train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)

3.2 DataLoader์„ ํ†ตํ•ด ์ˆœํšŒ

DataLoader ์— ๋ฐ์ดํ„ฐ์…‹์„ ๋ถˆ๋Ÿฌ์˜จ ๋’ค์—๋Š” ํ•„์š”์— ๋”ฐ๋ผ ๋ฐ์ดํ„ฐ์…‹์„ ์ˆœํšŒ(iterate)ํ•  ์ˆ˜ ์žˆ๋‹ค.
์•„๋ž˜์˜ ๊ฐ ์ˆœํšŒ(iter)๋Š” (๊ฐ๊ฐ batch_size=64 ์˜ ํŠน์ง•(feature)๊ณผ ์ •๋‹ต(label)์„ ํฌํ•จํ•˜๋Š”) train_features ์™€ train_labels ์˜ ๋ฌถ์Œ(batch)์„ ๋ฐ˜ํ™˜ํ•œ๋‹ค.
shuffle=True ๋กœ ์ง€์ •ํ–ˆ์œผ๋ฏ€๋กœ, ๋ชจ๋“  ๋ฐฐ์น˜๋ฅผ ์ˆœํšŒํ•œ ๋’ค ๋ฐ์ดํ„ฐ๊ฐ€ ์„ž์ธ๋‹ค.

ํŒŒ์ด์ฌ ๋‚ด์žฅํ•จ์ˆ˜ next( ), iter( )
iter(ํ˜ธ์ถœ๊ฐ€๋Šฅํ•œ๊ฐ์ฒด, ๋ฐ˜๋ณต์„๋๋‚ผ๊ฐ’)
next(๋ฐ˜๋ณต๊ฐ€๋Šฅํ•œ๊ฐ์ฒด, ๊ธฐ๋ณธ๊ฐ’)

>> it = iter(range(3))
>> next(it, 10)
0
>> next(it, 10)
1
>> next(it, 10)
2
>> next(it, 10)
10

train_features, train_labels = next(iter(train_dataloader))
print(f"Feature batch shape: {train_features.size()}")
print(f"Labels batch shape: {train_labels.size()}")
img = train_features[0].squeeze()
label = train_labels[0]
plt.imshow(img, cmap="gray")
plt.show()
print(f"Label: {label}")

Feature batch shape: torch.Size([64, 1, 28, 28])
Labels batch shape: torch.Size([64])
Label: tensor([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.])



4. ํ•™์Šต ํ™˜๊ฒฝ ์„ค์ •

torch.cuda ๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š”์ง€ ํ™•์ธํ•˜๊ณ  ๊ทธ๋ ‡์ง€ ์•Š์œผ๋ฉด CPU๋ฅผ ๊ณ„์† ์‚ฌ์šฉํ•œ๋‹ค.

device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using {device} device')


5. ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ ๊ตฌ์ถ•

์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์€ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๊ณ„์ธต(layer)/๋ชจ๋“ˆ(module)๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ๋‹ค.

torch.nn๋Š” ์‹ ๊ฒฝ๋ง์„ ๊ตฌ์„ฑํ•˜๋Š”๋ฐ ํ•„์š”ํ•œ ๋ชจ๋“  ๊ตฌ์„ฑ ์š”์†Œ๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.
PyTorch์˜ ๋ชจ๋“  ๋ชจ๋“ˆ์€ nn.Module ์˜ ํ•˜์œ„ ํด๋ž˜์Šค(subclass)์ด๋‹ค.

5.1 ์‹ ๊ฒฝ๋ง ๋ชจ๋ธ ํด๋ž˜์Šค ์ •์˜

์‹ ๊ฒฝ๋ง ๋ชจ๋ธ์„ nn.Module ์˜ ํ•˜์œ„ํด๋ž˜์Šค๋กœ ์ •์˜ํ•˜๊ณ ,
init ์—์„œ ์‹ ๊ฒฝ๋ง ๊ณ„์ธต๋“ค์„ ์ดˆ๊ธฐํ™”ํ•œ๋‹ค.
nn.Module ์„ ์ƒ์†๋ฐ›์€ ๋ชจ๋“  ํด๋ž˜์Šค๋Š” forward ๋ฉ”์†Œ๋“œ์— ์ž…๋ ฅ ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์—ฐ์‚ฐ๋“ค์„ ๊ตฌํ˜„ํ•œ๋‹ค.

from torch import nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

NeuralNetwork ์˜ ์ธ์Šคํ„ด์Šค(instance)๋ฅผ ์ƒ์„ฑํ•˜๊ณ  ์ด๋ฅผ device ๋กœ ์ด๋™์‹œํ‚จ๋‹ค.

model = NeuralNetwork().to(device)
print(model)
NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)

๋ชจ๋ธ์— ์ž…๋ ฅ์„ ํ˜ธ์ถœํ•˜๋ฉด ๊ฐ ๋ถ„๋ฅ˜(class)์— ๋Œ€ํ•œ ์›์‹œ(raw) ์˜ˆ์ธก๊ฐ’์ด ์žˆ๋Š” 10-์ฐจ์› ํ…์„œ๊ฐ€ ๋ฐ˜ํ™˜๋œ๋‹ค.
์›์‹œ ์˜ˆ์ธก๊ฐ’์„ nn.Softmax ๋ชจ๋“ˆ์˜ ์ธ์Šคํ„ด์Šค์— ํ†ต๊ณผ์‹œ์ผœ ์˜ˆ์ธก ํ™•๋ฅ ์„ ์–ป๋Š”๋‹ค.

X = torch.rand(1, 28, 28, device=device)
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")

>>> Predicted class: tensor([6])

5.1.1 Layer

Layer๋“ค์—์„œ ์–ด๋–ค ์ผ์ด ๋ฐœ์ƒํ•˜๋Š”์ง€ ํ™•์ธํ•ด ๋ณด๊ธฐ ์œ„ํ•ด 28X28ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€ 3๊ฐœ๋กœ ๊ตฌ์„ฑ๋œ ๋ฏธ๋‹ˆ๋ฐฐ์น˜๋ฅผ ์ด์šฉํ•˜๊ฒ ๋‹ค.

input_image = torch.rand(3,28,28)
print(input_image.size())

>>> torch.Size([3, 28, 28])

5.1.2 nn.Flatten

Flatten์€ ๊ณ„์ธต์„ ์ดˆ๊ธฐํ™”ํ•˜์—ฌ ๊ฐ 28x28์˜ 2D ์ด๋ฏธ์ง€๋ฅผ 784 ํ”ฝ์…€ ๊ฐ’์„ ๊ฐ–๋Š” ์—ฐ์†๋œ ๋ฐฐ์—ด๋กœ ๋ณ€ํ™˜ํ•œ๋‹ค. (dim=0์˜ ๋ฏธ๋‹ˆ๋ฐฐ์น˜ ์ฐจ์›์€ ์œ ์ง€)

flatten = nn.Flatten()
flat_image = flatten(input_image)
print(flat_image.size())

>>> torch.Size([3, 784])

5.1.3 nn.Linear

Linear์€ weight์™€ bias๋ฅผ ์ด์šฉํ•ด ์ž…๋ ฅ์— ์„ ํ˜• ๋ณ€ํ™˜์„ ์ ์šฉํ•˜๋Š” ๋ชจ๋“ˆ์ด๋‹ค.

layer1 = nn.Linear(in_features=28*28, out_features=20)
hidden1 = layer1(flat_image)
print(hidden1.size())

>>> torch.Size([3, 20])

5.1.4 nn.ReLU

Activation function์€ ์„ ํ˜• ์ƒํƒœ์— ๋น„์„ ํ˜•์„ฑ์„ ๋„์ž…ํ•˜์—ฌ ์‹ ๊ฒฝ๋ง์ด ๋‹ค์–‘ํ•œ ํ˜„์ƒ์„ ํ•™์Šตํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•œ๋‹ค.

hidden1 = nn.ReLU()(hidden1)

5.1.5 nn.Sequential

nn.Sequential ์€ ์ˆœ์„œ๋ฅผ ๊ฐ–๋Š” ๋ชจ๋“ˆ์˜ ์ปจํ…Œ์ด๋„ˆ์ด๋‹ค.
sequential container๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์•„๋ž˜์˜ seq_modules ์™€ ๊ฐ™์€ ์‹ ๊ฒฝ๋ง์„ ๋น ๋ฅด๊ฒŒ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

seq_modules = nn.Sequential(
    flatten,
    layer1,
    nn.ReLU(),
    nn.Linear(20, 10)
)
input_image = torch.rand(3,28,28)
logits = seq_modules(input_image)

5.1.6 nn.Softmax

์‹ ๊ฒฝ๋ง์˜ ๋งˆ์ง€๋ง‰ ์„ ํ˜• ๊ณ„์ธต์€ [-infty, infty] ๋ฒ”์œ„์˜ ๊ฐ’(raw value)์ธ logits๋ฅผ ๋ฐ˜ํ™˜ํ•œ๋‹ค.
nn.Softmax ๋ชจ๋“ˆ์€ logits๋Š” ๋ชจ๋ธ์˜ ๊ฐ ๋ถ„๋ฅ˜(class)์— ๋Œ€ํ•œ ์˜ˆ์ธก ํ™•๋ฅ ์„ ๋‚˜ํƒ€๋‚ด๋„๋ก [0, 1] ๋ฒ”์œ„๋กœ ๋น„๋ก€ํ•˜์—ฌ ์กฐ์ •ํ•œ๋‹ค.
dim ๋งค๊ฐœ๋ณ€์ˆ˜๋Š” ๊ฐ’์˜ ํ•ฉ์ด 1์ด ๋˜๋Š” ์ฐจ์›์„ ๋‚˜ํƒ€๋‚ธ๋‹ค.

softmax = nn.Softmax(dim=1)
pred_probab = softmax(logits)
logits
>>> tensor([[-0.1024, -0.0443, -0.0061, -0.0646,  0.0962, -0.0137,  0.0917, -0.1101,
         -0.0819,  0.0465]], grad_fn=<AddmmBackward0>)

pred_probab
>>> tensor([[0.0917, 0.0972, 0.1010, 0.0953, 0.1119, 0.1003, 0.1114, 0.0910, 0.0936,
         0.1065]], grad_fn=<SoftmaxBackward0>)


6. Autograd

์‹ ๊ฒฝ๋ง์˜ ํ•ต์‹ฌ์€ backpropagation์ด๋‹ค.


forward propagation์€ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ง„ํ–‰๋œ๋‹ค.
์œ„ ์‹ ๊ฒฝ๋ง์—์„œ Weight์™€ bias๊ฐ€ ์ตœ์ ํ™”๋ฅผ ํ•ด์•ผํ•˜๋Š” ๋งค๊ฐœ๋ณ€์ˆ˜์ด๋‹ค.

backpropagation์€ Weight์™€ bias๋ฅผ Loss์— ๋Œ€ํ•œ weight์™€ bias์˜ derivative๋ฅผ ์ด์šฉํ•ด updateํ•ด ๋‚˜๊ฐ€๋Š” ๊ฒƒ์ด๋‹ค.

๊ทธ๋ ‡๊ฒŒ ํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋ณ€์ˆ˜๋“ค์— ๋Œ€ํ•œ loss์˜ derivative๋ฅผ ๊ณ„์‚ฐํ•˜๊ณ  chain rule์„ ์ด์šฉํ•ด ์ตœ์ข…์ ์œผ๋กœ ∂loss/∂w์™€ ∂loss/∂b๋ฅผ ๊ตฌํ•ด์•ผ ํ•œ๋‹ค.

Pytorch์—์„œ๋Š” ์ด๋Ÿฌํ•œ gradient์˜ ๊ณ„์‚ฐ์„ ์ž๋™ ์ง€์›ํ•˜๋Š” torch.autograd๊ฐ€ ์กด์žฌํ•œ๋‹ค.

์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ์•„๋ž˜ ์ฝ”๋“œ์˜ backpropagation ๋ถ€๋ถ„์—์„œ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค.



7. ๋ชจ๋ธ Hyperparameter ์ตœ์ ํ™”, ํ•™์Šต, ๊ฒ€์ฆ, ํ…Œ์ŠคํŠธ

๋ชจ๋ธ๊ณผ ๋ฐ์ดํ„ฐ๊ฐ€ ์ค€๋น„๋œ ํ›„์—๋Š”, ๋ฐ์ดํ„ฐ์— ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ตœ์ ํ™”ํ•˜์—ฌ ๋ชจ๋ธ์„ ํ•™์Šตํ•˜๊ณ , ๊ฒ€์ฆํ•˜๊ณ , ํ…Œ์ŠคํŠธํ•  ์ฐจ๋ก€์ด๋‹ค.
๋ชจ๋ธ์„ ํ•™์Šตํ•˜๋Š” ๊ณผ์ •์€ ๊ฐ epoch๋งˆ๋‹ค output์„ ์˜ˆ์ธกํ•˜๊ณ , predict์™€ ์ •๋‹ต ์‚ฌ์ด์˜ ์˜ค๋ฅ˜(์†์‹ค(loss))๋ฅผ ๊ณ„์‚ฐํ•˜๊ณ , ๋งค๊ฐœ๋ณ€์ˆ˜์— ๋Œ€ํ•œ ์˜ค๋ฅ˜์˜ ๋„ํ•จ์ˆ˜(derivative)๋ฅผ ์ˆ˜์ง‘ํ•œ ๋’ค, ๊ฒฝ์‚ฌํ•˜๊ฐ•๋ฒ•์„ ์‚ฌ์šฉํ•˜์—ฌ ์ด ํŒŒ๋ผ๋ฏธํ„ฐ๋“ค์„ ์ตœ์ ํ™”(optimize)ํ•˜๋Š” ๊ฒƒ์ด๋‹ค.

7.1 Hyperparameter

Hyperparameter๋Š” ๋ชจ๋ธ ์ตœ์ ํ™” ๊ณผ์ •์„ ์ œ์–ดํ•  ์ˆ˜ ์žˆ๋Š” ์กฐ์ ˆ ๊ฐ€๋Šฅํ•œ ๋งค๊ฐœ๋ณ€์ˆ˜์ด๋‹ค.
์„œ๋กœ ๋‹ค๋ฅธ ํ•˜์ดํผํŒŒ๋ผ๋งคํ„ฐ ๊ฐ’์€ ๋ชจ๋ธ ํ•™์Šต๊ณผ ์ˆ˜๋ ด์œจ(convergence rate)์— ์˜ํ–ฅ์„ ๋ฏธ์น  ์ˆ˜ ์žˆ๋‹ค.

epoch - the number of times to iterate over the dataset
batch size - the number of data samples propagated through the network before the parameters are updated
learning rate - how much to adjust the model's parameters at each batch/epoch. Smaller values slow down learning, while large values may cause unpredictable behavior during training.

7.2 Loss Function

predict์™€ ์‹ค์ œ ๊ฐ’ ์‚ฌ์ด์˜ ์˜ค์ฐจ๋ฅผ ์ธก์ •ํ•˜๋ฉฐ, ํ•™์Šต ์ค‘์— ์ด ๊ฐ’์„ ์ตœ์†Œํ™”ํ•˜๊ณ ์ž ํ•œ๋‹ค.
์ฃผ์–ด์ง„ ๋ฐ์ดํ„ฐ ์ƒ˜ํ”Œ์„ ์ž…๋ ฅ์œผ๋กœ ๊ณ„์‚ฐํ•œ ์˜ˆ์ธก๊ณผ ์ •๋‹ต(label)์„ ๋น„๊ตํ•˜์—ฌ ์†์‹ค(loss)์„ ๊ณ„์‚ฐํ•œ๋‹ค.

์ผ๋ฐ˜์ ์ธ ์†์‹คํ•จ์ˆ˜์—๋Š” ํšŒ๊ท€ ๋ฌธ์ œ์— ์‚ฌ์šฉํ•˜๋Š” nn.MSELoss(ํ‰๊ท  ์ œ๊ณฑ ์˜ค์ฐจ(MSE; Mean Square Error))๋‚˜ ๋ถ„๋ฅ˜(classification)์— ์‚ฌ์šฉํ•˜๋Š” nn.LogSoftmax์™€ nn.CrossEntropyLoss ๋“ฑ์ด ์žˆ๋‹ค.

7.3 Optimizer

๊ฐ ํ•™์Šต ๋‹จ๊ณ„์—์„œ ๋ชจ๋ธ์˜ ์˜ค๋ฅ˜๋ฅผ ์ค„์ด๊ธฐ ์œ„ํ•ด ๋ชจ๋ธ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์กฐ์ •ํ•˜๋Š” ๊ณผ์ •์ด๋‹ค.
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

ํ•™์Šต ๋‹จ๊ณ„(loop)์—์„œ ์ตœ์ ํ™”๋Š” ์„ธ๋‹จ๊ณ„๋กœ ์ด๋ค„์ง„๋‹ค.

  1. optimizer.zero_grad()๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ ๋ชจ๋ธ ๋งค๊ฐœ๋ณ€์ˆ˜์˜ ๋ณ€ํ™”๋„๋ฅผ ์žฌ์„ค์ •ํ•œ๋‹ค.
    ๊ธฐ๋ณธ์ ์œผ๋กœ gradient๋Š” ๋”ํ•ด์ง€๊ธฐ(add up) ๋•Œ๋ฌธ์— ์ค‘๋ณต ๊ณ„์‚ฐ์„ ๋ง‰๊ธฐ ์œ„ํ•ด ๋ฐ˜๋ณตํ•  ๋•Œ๋งˆ๋‹ค ๋ช…์‹œ์ ์œผ๋กœ 0์œผ๋กœ ์„ค์ •ํ•œ๋‹ค.
  1. loss.backwards()๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ prediction loss๋ฅผ backpropagateํ•œ๋‹ค. PyTorch๋Š” ๊ฐ ๋งค๊ฐœ๋ณ€์ˆ˜์— ๋Œ€ํ•œ ์†์‹ค์˜ ๋ณ€ํ™”๋„๋ฅผ ์ €์žฅํ•œ๋‹ค.
  1. ๋ณ€ํ™”๋„๋ฅผ ๊ณ„์‚ฐํ•œ ๋’ค์—๋Š” optimizer.step()์„ ํ˜ธ์ถœํ•˜์—ฌ backpropagation ๋‹จ๊ณ„์—์„œ ์ˆ˜์ง‘๋œ ๋ณ€ํ™”๋„๋กœ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค.

7.4 train_loop / test_loop

def train_loop(dataloader, model, loss_fn, optimizer):
  size = len(dataloader.dataset)
  for batch, (X, y) in enumerate(dataloader):
  # forward propagation
    pred = model(X)
    loss = loss_fn(pred, y)

  # backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if batch % 100 == 0:
      loss, current = loss.item(), batch * len(X)
      print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")


def test_loop(dataloader, model, loss_fn):
  size = len(dataloader.dataset)
  num_batches = len(dataloader)
  test_loss, correct = 0, 0

  with torch.no_grad():
    for X, y in dataloader:
      pred = model(X)
      test_loss += loss_fn(pred, y).item()
      correct += (pred.argmax(1) == y).type(torch.float).sum().item()
  
  test_loss /= num_batches
  correct /= size
  print(f"Test Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \n")

learning_rate = 1e-3
batch_size = 64
epochs = 5

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

for i in range(epochs):
  print(f"Epoch {i+1}\n----------------")
  train_loop(train_dataloader, model, loss_fn, optimizer)
  test_loop(test_dataloader, model, loss_fn)
print("Done")

>>> Epoch 1
----------------
loss: 2.143001  [    0/60000]
loss: 2.136818  [ 6400/60000]
loss: 2.114318  [12800/60000]
loss: 2.030327  [19200/60000]
loss: 2.062488  [25600/60000]
loss: 2.009769  [32000/60000]
loss: 1.964789  [38400/60000]
loss: 1.966160  [44800/60000]
loss: 1.953037  [51200/60000]
loss: 1.913669  [57600/60000]
Test Error: 
 Accuracy: 53.5%, Avg loss: 1.870262 


8. ๋ชจ๋ธ ์ €์žฅํ•˜๊ณ  ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

import torch
import torchvision.models as models

8.1 ๋ชจ๋ธ ๊ฐ€์ค‘์น˜ ์ €์žฅ/๋ถˆ๋Ÿฌ์˜ค๊ธฐ

PyTorch ๋ชจ๋ธ์€ ํ•™์Šตํ•œ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ state_dict๋ผ๊ณ  ๋ถˆ๋ฆฌ๋Š” ๋‚ด๋ถ€ ์ƒํƒœ ์‚ฌ์ „(internal state dictionary)์— torch.save๋ฉ”์†Œ๋“œ๋ฅผ ์‚ฌ์šฉํ•ด ์ €์žฅ ํ•  ์ˆ˜ ์žˆ๋‹ค.

vgg๋ชจ๋ธ ๊ฐ€์ค‘์น˜ ์ €์žฅ

model = models.vgg16(pretrained=True)
torch.save(model.state_dict(), 'model_weights.pth')

๋ชจ๋ธ ๊ฐ€์ค‘์น˜ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

model = models.vgg16() # we are not loading the default weights, so pretrained=True is not specified
model.load_state_dict(torch.load('model_weights.pth'))
model.eval()

8.2 ๋ชจ๋ธ ๊ตฌ์กฐ๊นŒ์ง€ ์ €์žฅ/๋ถˆ๋Ÿฌ์˜ค๊ธฐ

์ €์žฅ

torch.save(model, 'model.pth')

๋ถˆ๋Ÿฌ์˜ค๊ธฐ

model = torch.load('model.pth')
 