https://github.com/pytorch/examples/tree/master/mnist

PyTorch Data Preprocessing

import torch

from torchvision import datasets, transforms

Creating the Data Loader

In PyTorch, data is fed to the model through a DataLoader, which wraps a dataset and yields it one batch at a time.
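To make the batching behaviour concrete without downloading anything, here is a minimal sketch that wraps synthetic tensors in a DataLoader. The names `fake_images` and `fake_labels` and the sample count of 100 are assumptions made for illustration only; they are not part of the MNIST example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 100 fake grayscale "images" and integer labels.
fake_images = torch.randn(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))

dataset = TensorDataset(fake_images, fake_labels)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

# len(loader) is the number of batches: ceil(100 / 32) = 4.
print(len(loader))  # 4

# Each iteration yields one (inputs, targets) batch. The last batch is smaller
# because 100 is not a multiple of 32 and drop_last defaults to False.
batch_sizes = [inputs.shape[0] for inputs, targets in loader]
print(batch_sizes)  # [32, 32, 32, 4]
```

Passing `drop_last=True` would discard the final short batch, which can be convenient when a model assumes a fixed batch size.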

batch_size = 32
test_batch_size = 32

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('dataset/',train = True,download = True,
                 transform = transforms.Compose([
                     transforms.ToTensor(),
                     transforms.Normalize(mean = (0.5,), std = (0.5,))
                 ])),
    batch_size = batch_size,
    shuffle = True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to dataset/MNIST\raw\train-images-idx3-ubyte.gz

Extracting dataset/MNIST\raw\train-images-idx3-ubyte.gz to dataset/MNIST\raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to dataset/MNIST\raw\train-labels-idx1-ubyte.gz

Extracting dataset/MNIST\raw\train-labels-idx1-ubyte.gz to dataset/MNIST\raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to dataset/MNIST\raw\t10k-images-idx3-ubyte.gz

Extracting dataset/MNIST\raw\t10k-images-idx3-ubyte.gz to dataset/MNIST\raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to dataset/MNIST\raw\t10k-labels-idx1-ubyte.gz

Extracting dataset/MNIST\raw\t10k-labels-idx1-ubyte.gz to dataset/MNIST\raw
Processing...
Done!

C:\Users\user\Anaconda3\lib\site-packages\torchvision\datasets\mnist.py:469: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at  ..\torch\csrc\utils\tensor_numpy.cpp:141.)
  return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST('dataset',train=False,
                  transform = transforms.Compose([
                      transforms.ToTensor(),
                      transforms.Normalize((0.5,), (0.5,))
                  ])),
    batch_size = test_batch_size,
    shuffle = True)
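The effect of normalizing with a mean of 0.5 and a standard deviation of 0.5 can be checked by hand. `transforms.Normalize` computes `(x - mean) / std` per channel, so values in [0, 1] produced by `ToTensor()` end up in [-1, 1]. The sketch below reproduces that arithmetic with plain tensors; the three sample pixel values are an assumption chosen for illustration.

```python
import torch

# ToTensor() scales pixel values to [0, 1]; these three values stand in for
# a black, a mid-gray, and a white pixel.
pixels = torch.tensor([0.0, 0.5, 1.0])

mean, std = 0.5, 0.5

# Same arithmetic that Normalize applies with mean 0.5 and std 0.5.
normalized = (pixels - mean) / std
print(normalized)  # values -1, 0, 1

# The inverse operation recovers the original [0, 1] values.
recovered = normalized * std + mean
print(recovered)   # values 0, 0.5, 1
```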

Inspecting the data from the first iteration

images, labels = next(iter(train_loader))

images.shape 
# TensorFlow layout: (batch_size, height, width, channel), i.e. (32, 28, 28, 1)
# PyTorch layout:    (batch_size, channel, height, width), i.e. (32, 1, 28, 28)
# For RGB images the channel dimension would be 3 instead of 1
# This is a key difference between PyTorch and TensorFlow!

torch.Size([32, 1, 28, 28])

labels.shape

torch.Size([32])

Unlike TensorFlow, whose default layout is channels-last, PyTorch expects image batches in [Batch Size, Channel, Height, Width] order.
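When data has to move between the two layouts, `permute` reorders the dimensions. The following is a small sketch using a zero-filled stand-in tensor (an assumption, in place of a real MNIST batch) with the same shape as `images`.

```python
import torch

# Stand-in for one MNIST batch in PyTorch's channels-first layout (N, C, H, W).
nchw = torch.zeros(32, 1, 28, 28)

# permute reorders dimensions: (N, C, H, W) -> (N, H, W, C).
nhwc = nchw.permute(0, 2, 3, 1)
print(nhwc.shape)  # torch.Size([32, 28, 28, 1])

# And back to channels-first: (N, H, W, C) -> (N, C, H, W).
back = nhwc.permute(0, 3, 1, 2)
print(back.shape)  # torch.Size([32, 1, 28, 28])
```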

Visualizing the data

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

torch_image = torch.squeeze(images[0])
torch_image.shape

torch.Size([28, 28])
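One detail worth keeping in mind: called without a `dim` argument, `squeeze` removes every dimension of size 1, not only the channel axis. The sketch below uses zero-filled stand-in tensors (an assumption for illustration) to show the difference.

```python
import torch

# One grayscale image: (C, H, W) with C == 1.
single = torch.zeros(1, 28, 28)
print(torch.squeeze(single).shape)    # torch.Size([28, 28])

# A batch containing a single image: (N, C, H, W) with N == 1 and C == 1.
batch_of_one = torch.zeros(1, 1, 28, 28)

# With no dim argument, both size-1 axes are removed, including the batch axis.
print(batch_of_one.squeeze().shape)   # torch.Size([28, 28])

# Naming the axis removes only the channel dimension.
print(batch_of_one.squeeze(1).shape)  # torch.Size([1, 28, 28])
```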

image = torch_image.numpy()
image.shape

(28, 28)

label = labels[0].numpy()

label.shape

()

label

array(8, dtype=int64)

plt.title(label)
plt.imshow(image,'gray')
plt.show()
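To look at several digits at once, the images can be tiled into a single 2-D array before plotting. The helper below is a hypothetical sketch, not part of the original notebook: `make_grid_image` is a name introduced here, and it assumes a single-channel batch normalized with mean 0.5 and std 0.5. A random stand-in batch is used so the block runs without the dataset.

```python
import torch

def make_grid_image(batch, rows, cols, mean=0.5, std=0.5):
    """Tile the first rows*cols images of a (N, 1, H, W) batch into one 2-D tensor."""
    n, c, h, w = batch.shape
    assert c == 1 and rows * cols <= n
    # Undo the normalization so values return to the [0, 1] range.
    imgs = batch[: rows * cols, 0] * std + mean                # (rows*cols, H, W)
    # View as (rows, cols, H, W), then move H next to rows so that
    # each row of the grid holds `cols` images side by side.
    grid = imgs.reshape(rows, cols, h, w).permute(0, 2, 1, 3)  # (rows, H, cols, W)
    return grid.reshape(rows * h, cols * w)

# Stand-in batch with the same shape and value range as `images`.
fake_batch = torch.rand(32, 1, 28, 28) * 2 - 1
grid = make_grid_image(fake_batch, rows=2, cols=4)
print(grid.shape)  # torch.Size([56, 112])
```

With a real batch, `plt.imshow(make_grid_image(images, 2, 4).numpy(), 'gray')` would show the first eight digits in a 2 x 4 grid.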
