pytorch-handbook/chapter2/2.1.3-pytorch-basics-nerual-network.ipynb

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# PyTorch 基础 : 神经网络包nn和优化器optm\n",
"torch.nn是专门为神经网络设计的模块化接口。nn构建于 Autograd之上可用来定义和运行神经网络。\n",
"这里我们主要介绍几个一些常用的类"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**约定torch.nn 我们为了方便使用会为他设置别名为nn本章除nn以外还有其他的命名约定**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'1.0.0'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 首先要引入相关的包\n",
"import torch\n",
"# 引入torch.nn并指定别名\n",
"import torch.nn as nn\n",
"#打印一下版本\n",
"torch.__version__"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"除了nn别名以外我们还引用了nn.functional这个包中包含了神经网络中使用的一些常用函数这些函数的特点是不具有可学习的参数(如ReLUpoolDropOut等),这些函数可以放在构造函数中,也可以不放,但是这里建议不放。\n",
"\n",
"一般情况下我们会**将nn.functional 设置为大写的F**,这样缩写方便调用"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import torch.nn.functional as F"
]
},
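{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick comparison of the two styles (a minimal sketch): nn.ReLU is a Module that is instantiated and then called, while F.relu is a plain function. For parameter-free operations the two produce the same result."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x = torch.randn(2, 3)\n",
"# Module style: construct the layer object first, then call it\n",
"relu_module = nn.ReLU()\n",
"# Functional style: call F.relu directly, nothing to construct\n",
"print(torch.equal(relu_module(x), F.relu(x)))  # True"
]
},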
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 定义一个网络\n",
"PyTorch中已经为我们准备好了现成的网络模型只要继承nn.Module并实现它的forward方法PyTorch会根据autograd自动实现backward函数在forward函数中可使用任何tensor支持的函数还可以使用if、for循环、print、log等Python语法写法和标准的Python写法一致。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Net(\n",
" (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))\n",
" (fc1): Linear(in_features=1350, out_features=10, bias=True)\n",
")\n"
]
}
],
"source": [
"class Net(nn.Module):\n",
" def __init__(self):\n",
" # nn.Module子类的函数必须在构造函数中执行父类的构造函数\n",
" super(Net, self).__init__()\n",
" \n",
" # 卷积层 '1'表示输入图片为单通道, '6'表示输出通道数,'3'表示卷积核为3*3\n",
" self.conv1 = nn.Conv2d(1, 6, 3) \n",
" #线性层输入1350个特征输出10个特征\n",
" self.fc1 = nn.Linear(1350, 10) #这里的1350是如何计算的呢这就要看后面的forward函数\n",
" #正向传播 \n",
" def forward(self, x): \n",
" print(x.size()) # 结果:[1, 1, 32, 32]\n",
" # 卷积 -> 激活 -> 池化 \n",
" x = self.conv1(x) #根据卷积的尺寸计算公式计算结果是30具体计算公式后面第二章第四节 卷积神经网络 有详细介绍。\n",
" x = F.relu(x)\n",
" print(x.size()) # 结果:[1, 6, 30, 30]\n",
" x = F.max_pool2d(x, (2, 2)) #我们使用池化层计算结果是15\n",
" x = F.relu(x)\n",
" print(x.size()) # 结果:[1, 6, 15, 15]\n",
" # reshape-1表示自适应\n",
" #这里做的就是压扁的操作 就是把后面的[1, 6, 15, 15]压扁,变为 [1, 1350]\n",
" x = x.view(x.size()[0], -1) \n",
" print(x.size()) # 这里就是fc1层的的输入1350 \n",
" x = self.fc1(x) \n",
" return x\n",
"\n",
"net = Net()\n",
"print(net)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"网络的可学习参数通过net.parameters()返回"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Parameter containing:\n",
"tensor([[[[ 0.2745, 0.2594, 0.0171],\n",
" [ 0.0429, 0.3013, -0.0208],\n",
" [ 0.1459, -0.3223, 0.1797]]],\n",
"\n",
"\n",
" [[[ 0.1847, 0.0227, -0.1919],\n",
" [-0.0210, -0.1336, -0.2176],\n",
" [-0.2164, -0.1244, -0.2428]]],\n",
"\n",
"\n",
" [[[ 0.1042, -0.0055, -0.2171],\n",
" [ 0.3306, -0.2808, 0.2058],\n",
" [ 0.2492, 0.2971, 0.2277]]],\n",
"\n",
"\n",
" [[[ 0.2134, -0.0644, -0.3044],\n",
" [ 0.0040, 0.0828, -0.2093],\n",
" [ 0.0204, 0.1065, 0.1168]]],\n",
"\n",
"\n",
" [[[ 0.1651, -0.2244, 0.3072],\n",
" [-0.2301, 0.2443, -0.2340],\n",
" [ 0.0685, 0.1026, 0.1754]]],\n",
"\n",
"\n",
" [[[ 0.1691, -0.0790, 0.2617],\n",
" [ 0.1956, 0.1477, 0.0877],\n",
" [ 0.0538, -0.3091, 0.2030]]]], requires_grad=True)\n",
"Parameter containing:\n",
"tensor([ 0.2355, 0.2949, -0.1283, -0.0848, 0.2027, -0.3331],\n",
" requires_grad=True)\n",
"Parameter containing:\n",
"tensor([[ 2.0555e-02, -2.1445e-02, -1.7981e-02, ..., -2.3864e-02,\n",
" 8.5149e-03, -6.2071e-04],\n",
" [-1.1755e-02, 1.0010e-02, 2.1978e-02, ..., 1.8433e-02,\n",
" 7.1362e-03, -4.0951e-03],\n",
" [ 1.6187e-02, 2.1623e-02, 1.1840e-02, ..., 5.7059e-03,\n",
" -2.7165e-02, 1.3463e-03],\n",
" ...,\n",
" [-3.2552e-03, 1.7277e-02, -1.4907e-02, ..., 7.4232e-03,\n",
" -2.7188e-02, -4.6431e-03],\n",
" [-1.9786e-02, -3.7382e-03, 1.2259e-02, ..., 3.2471e-03,\n",
" -1.2375e-02, -1.6372e-02],\n",
" [-8.2350e-03, 4.1301e-03, -1.9192e-03, ..., -2.3119e-05,\n",
" 2.0167e-03, 1.9528e-02]], requires_grad=True)\n",
"Parameter containing:\n",
"tensor([ 0.0162, -0.0146, -0.0218, 0.0212, -0.0119, -0.0142, -0.0079, 0.0171,\n",
" 0.0205, 0.0164], requires_grad=True)\n"
]
}
],
"source": [
"for parameters in net.parameters():\n",
" print(parameters)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"net.named_parameters可同时返回可学习的参数及名称。"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"conv1.weight : torch.Size([6, 1, 3, 3])\n",
"conv1.bias : torch.Size([6])\n",
"fc1.weight : torch.Size([10, 1350])\n",
"fc1.bias : torch.Size([10])\n"
]
}
],
"source": [
"for name,parameters in net.named_parameters():\n",
" print(name,':',parameters.size())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"forward函数的输入和输出都是Tensor"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.Size([1, 1, 32, 32])\n",
"torch.Size([1, 6, 30, 30])\n",
"torch.Size([1, 6, 15, 15])\n",
"torch.Size([1, 1350])\n"
]
},
{
"data": {
"text/plain": [
"torch.Size([1, 10])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"input = torch.randn(1, 1, 32, 32) # 这里的对应前面fforward的输入是32\n",
"out = net(input)\n",
"out.size()"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"torch.Size([1, 1, 32, 32])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"input.size()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"在反向传播前,先要将所有参数的梯度清零"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"net.zero_grad() \n",
"out.backward(torch.ones(1,10)) # 反向传播的实现是PyTorch自动实现的我们只要调用这个函数即可"
]
},
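{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why zero the gradients first? Because gradients in PyTorch accumulate across backward calls. A minimal sketch of the effect: run backward twice without zeroing in between and watch conv1's bias gradient grow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"net.zero_grad()\n",
"out = net(input)\n",
"out.backward(torch.ones(1, 10), retain_graph=True)  # keep the graph so we can call backward again\n",
"print(net.conv1.bias.grad)\n",
"# A second backward WITHOUT zeroing: the new gradients are added on top\n",
"out.backward(torch.ones(1, 10))\n",
"print(net.conv1.bias.grad)  # exactly double the values above"
]
},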
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**注意**:torch.nn只支持mini-batches不支持一次只输入一个样本即一次必须是一个batch。\n",
"\n",
"也就是说就算我们输入一个样本也会对样本进行分批所以所有的输入都会增加一个维度我们对比下刚才的inputnn中定义为3维但是我们人工创建时多增加了一个维度变为了4维最前面的1即为batch-size"
]
},
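{
"cell_type": "markdown",
"metadata": {},
"source": [
"A handy way to add that batch dimension (a minimal sketch): unsqueeze(0) inserts a new leading dimension, turning a single 3-dimensional sample into a 4-dimensional batch of one."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A single sample: 1 channel, 32x32, i.e. 3 dimensions (C, H, W)\n",
"sample = torch.randn(1, 32, 32)\n",
"# unsqueeze(0) adds the batch dimension in front, giving (1, C, H, W)\n",
"print(sample.unsqueeze(0).size())  # torch.Size([1, 1, 32, 32])"
]
},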
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 损失函数\n",
"在nn中PyTorch还预制了常用的损失函数下面我们用MSELoss用来计算均方误差"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"28.92203712463379\n"
]
}
],
"source": [
"y = torch.arange(0,10).view(1,10).float()\n",
"criterion = nn.MSELoss()\n",
"loss = criterion(out, y)\n",
"#loss是个scalar我们可以直接用item获取到他的python类型的数值\n",
"print(loss.item()) "
]
},
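{
"cell_type": "markdown",
"metadata": {},
"source": [
"Another common prebuilt loss is nn.CrossEntropyLoss, used for classification; it takes raw (unnormalized) scores and a class-index target. A minimal sketch follows; the target class index 3 is arbitrary, chosen only for the demo."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# A hypothetical target: our single sample belongs to class 3\n",
"target = torch.tensor([3])\n",
"ce = nn.CrossEntropyLoss()\n",
"# CrossEntropyLoss expects scores of shape [batch, classes] and targets of shape [batch]\n",
"print(ce(out, target).item())"
]
},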
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 优化器\n",
"在反向传播计算完所有参数的梯度后,还需要使用优化方法来更新网络的权重和参数,例如随机梯度下降法(SGD)的更新策略如下:\n",
"\n",
"weight = weight - learning_rate * gradient\n",
"\n",
"在torch.optim中实现大多数的优化方法例如RMSProp、Adam、SGD等下面我们使用SGD做个简单的样例"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"import torch.optim"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.Size([1, 1, 32, 32])\n",
"torch.Size([1, 6, 30, 30])\n",
"torch.Size([1, 6, 15, 15])\n",
"torch.Size([1, 1350])\n"
]
}
],
"source": [
"out = net(input) # 这里调用的时候会打印出我们在forword函数中打印的x的大小\n",
"criterion = nn.MSELoss()\n",
"loss = criterion(out, y)\n",
"#新建一个优化器SGD只需要要调整的参数和学习率\n",
"optimizer = torch.optim.SGD(net.parameters(), lr = 0.01)\n",
"# 先梯度清零(与net.zero_grad()效果一样)\n",
"optimizer.zero_grad() \n",
"loss.backward()\n",
"\n",
"#更新参数\n",
"optimizer.step()"
]
},
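{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the update rule concrete, here is a minimal sketch of performing weight = weight - learning_rate * gradient by hand. For plain SGD (no momentum or weight decay) this is what optimizer.step() amounts to; the 0.01 learning rate mirrors the lr used above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learning_rate = 0.01  # assumed to match the lr passed to SGD above\n",
"out = net(input)\n",
"loss = criterion(out, y)\n",
"net.zero_grad()\n",
"loss.backward()\n",
"# Manual SGD step: subtract learning_rate * gradient from every parameter,\n",
"# inside no_grad so the update itself is not tracked by autograd\n",
"with torch.no_grad():\n",
"    for p in net.parameters():\n",
"        p -= learning_rate * p.grad"
]
},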
{
"cell_type": "markdown",
"metadata": {},
"source": [
"这样神经网络的数据的一个完整的传播就已经通过PyTorch实现了下面一章将介绍PyTorch提供的数据加载和处理工具使用这些工具可以方便的处理所需要的数据。\n",
"\n",
"看完这节,大家可能对神经网络模型里面的一些参数的计算方式还有疑惑,这部分会在第二章 第四节 卷积神经网络有详细介绍,并且在第三章 第二节 MNIST数据集手写数字识别的实践代码中有详细的注释说明。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}