{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# PyTorch Basics: the neural network package nn and the optimizer package optim\n",
    "torch.nn is a modular interface designed specifically for neural networks. nn is built on top of Autograd and can be used to define and run neural networks.\n",
    "Here we mainly introduce a few commonly used classes."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Convention: for convenience we alias torch.nn as nn. Besides nn, this chapter uses a few other naming conventions as well.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'1.0.0'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# First import the relevant packages\n",
    "import torch\n",
    "# Import torch.nn and give it an alias\n",
    "import torch.nn as nn\n",
    "# Print the version\n",
    "torch.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Besides the nn alias, we also import nn.functional. This module contains common functions used in neural networks whose defining trait is that they have no learnable parameters (e.g. ReLU, pooling, dropout). These functions can be declared in the constructor or not; here we recommend not doing so.\n",
    "\n",
    "By convention we **alias nn.functional as the capital letter F**, which keeps calls short."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch.nn.functional as F"
   ]
  },
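  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the module-vs-function distinction concrete, here is a minimal sketch (an added illustration, assuming only the imports above): nn.ReLU is a module that must be constructed first, F.relu is a plain function, and because ReLU has no learnable parameters the two compute exactly the same thing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = torch.tensor([-1.0, 0.0, 2.0])\n",
    "# Module form: constructed first; this is the variant you could keep in a constructor\n",
    "print(nn.ReLU()(x))\n",
    "# Functional form: a plain stateless call, no construction needed\n",
    "print(F.relu(x))"
   ]
  },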
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Defining a network\n",
    "PyTorch already provides ready-made network modules for us. You only need to subclass nn.Module and implement its forward method; PyTorch will automatically implement the backward function via autograd. Inside forward you can use any function that tensors support, as well as Python syntax such as if, for loops, print, and logging; the code is written just like standard Python."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Net(\n",
      "  (conv1): Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))\n",
      "  (fc1): Linear(in_features=1350, out_features=10, bias=True)\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "class Net(nn.Module):\n",
    "    def __init__(self):\n",
    "        # A subclass of nn.Module must call the parent constructor in its own constructor\n",
    "        super(Net, self).__init__()\n",
    "        \n",
    "        # Convolution layer: '1' means the input image has a single channel, '6' is the number of output channels, '3' means a 3*3 kernel\n",
    "        self.conv1 = nn.Conv2d(1, 6, 3)\n",
    "        # Linear layer: 1350 input features, 10 output features\n",
    "        self.fc1 = nn.Linear(1350, 10)  # How is 1350 computed? See the forward function below\n",
    "    # Forward pass\n",
    "    def forward(self, x):\n",
    "        print(x.size())  # result: [1, 1, 32, 32]\n",
    "        # convolution -> activation -> pooling\n",
    "        x = self.conv1(x)  # By the convolution output-size formula the result is 30; the formula is covered in detail in Chapter 2 Section 4, Convolutional Neural Networks\n",
    "        x = F.relu(x)\n",
    "        print(x.size())  # result: [1, 6, 30, 30]\n",
    "        x = F.max_pool2d(x, (2, 2))  # Apply pooling; the result is 15\n",
    "        x = F.relu(x)\n",
    "        print(x.size())  # result: [1, 6, 15, 15]\n",
    "        # reshape; '-1' means the size is inferred\n",
    "        # This flattens the [1, 6, 15, 15] tensor into [1, 1350]\n",
    "        x = x.view(x.size()[0], -1)\n",
    "        print(x.size())  # This is the 1350 inputs of the fc1 layer\n",
    "        x = self.fc1(x)\n",
    "        return x\n",
    "\n",
    "net = Net()\n",
    "print(net)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The learnable parameters of the network are returned by net.parameters()."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Parameter containing:\n",
      "tensor([[[[ 0.2745, 0.2594, 0.0171],\n",
      " [ 0.0429, 0.3013, -0.0208],\n",
      " [ 0.1459, -0.3223, 0.1797]]],\n",
      "\n",
      "\n",
      " [[[ 0.1847, 0.0227, -0.1919],\n",
      " [-0.0210, -0.1336, -0.2176],\n",
      " [-0.2164, -0.1244, -0.2428]]],\n",
      "\n",
      "\n",
      " [[[ 0.1042, -0.0055, -0.2171],\n",
      " [ 0.3306, -0.2808, 0.2058],\n",
      " [ 0.2492, 0.2971, 0.2277]]],\n",
      "\n",
      "\n",
      " [[[ 0.2134, -0.0644, -0.3044],\n",
      " [ 0.0040, 0.0828, -0.2093],\n",
      " [ 0.0204, 0.1065, 0.1168]]],\n",
      "\n",
      "\n",
      " [[[ 0.1651, -0.2244, 0.3072],\n",
      " [-0.2301, 0.2443, -0.2340],\n",
      " [ 0.0685, 0.1026, 0.1754]]],\n",
      "\n",
      "\n",
      " [[[ 0.1691, -0.0790, 0.2617],\n",
      " [ 0.1956, 0.1477, 0.0877],\n",
      " [ 0.0538, -0.3091, 0.2030]]]], requires_grad=True)\n",
      "Parameter containing:\n",
      "tensor([ 0.2355, 0.2949, -0.1283, -0.0848, 0.2027, -0.3331],\n",
      " requires_grad=True)\n",
      "Parameter containing:\n",
      "tensor([[ 2.0555e-02, -2.1445e-02, -1.7981e-02, ..., -2.3864e-02,\n",
      " 8.5149e-03, -6.2071e-04],\n",
      " [-1.1755e-02, 1.0010e-02, 2.1978e-02, ..., 1.8433e-02,\n",
      " 7.1362e-03, -4.0951e-03],\n",
      " [ 1.6187e-02, 2.1623e-02, 1.1840e-02, ..., 5.7059e-03,\n",
      " -2.7165e-02, 1.3463e-03],\n",
      " ...,\n",
      " [-3.2552e-03, 1.7277e-02, -1.4907e-02, ..., 7.4232e-03,\n",
      " -2.7188e-02, -4.6431e-03],\n",
      " [-1.9786e-02, -3.7382e-03, 1.2259e-02, ..., 3.2471e-03,\n",
      " -1.2375e-02, -1.6372e-02],\n",
      " [-8.2350e-03, 4.1301e-03, -1.9192e-03, ..., -2.3119e-05,\n",
      " 2.0167e-03, 1.9528e-02]], requires_grad=True)\n",
      "Parameter containing:\n",
      "tensor([ 0.0162, -0.0146, -0.0218, 0.0212, -0.0119, -0.0142, -0.0079, 0.0171,\n",
      " 0.0205, 0.0164], requires_grad=True)\n"
     ]
    }
   ],
   "source": [
    "for parameters in net.parameters():\n",
    "    print(parameters)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "net.named_parameters returns both the learnable parameters and their names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "conv1.weight : torch.Size([6, 1, 3, 3])\n",
      "conv1.bias : torch.Size([6])\n",
      "fc1.weight : torch.Size([10, 1350])\n",
      "fc1.bias : torch.Size([10])\n"
     ]
    }
   ],
   "source": [
    "for name, parameters in net.named_parameters():\n",
    "    print(name, ':', parameters.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both the input and the output of the forward function are Tensors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([1, 1, 32, 32])\n",
      "torch.Size([1, 6, 30, 30])\n",
      "torch.Size([1, 6, 15, 15])\n",
      "torch.Size([1, 1350])\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "torch.Size([1, 10])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "input = torch.randn(1, 1, 32, 32)  # this matches the 32*32 input expected by the forward function above\n",
    "out = net(input)\n",
    "out.size()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "torch.Size([1, 1, 32, 32])"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "input.size()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before backpropagating, first zero the gradients of all parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "net.zero_grad()\n",
    "out.backward(torch.ones(1, 10))  # Backpropagation is implemented automatically by PyTorch; we only need to call this function"
   ]
  },
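  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check (an added sketch, assuming only the cells above have run): after backward(), every learnable parameter carries a populated .grad buffer, which is exactly what net.zero_grad() resets before the next pass."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# The gradient buffer of conv1's bias, filled in by the backward() call above\n",
    "print(net.conv1.bias.grad)"
   ]
  },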
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Note**: torch.nn only supports mini-batches; it does not accept a single sample at a time, so the input must always be a batch.\n",
    "\n",
    "In other words, even a single sample has to be passed in as a batch, so every input gains an extra dimension. Compare with the input we just used: nn expects a 3-D sample, but we manually added one more dimension when creating it, making it 4-D; the leading 1 is the batch size."
   ]
  },
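  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you do have a single 3-D sample, Tensor.unsqueeze(0) adds the missing leading batch dimension. A minimal sketch (an added illustration, not part of the original flow):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# A single sample shaped [channels, height, width]\n",
    "sample = torch.randn(1, 32, 32)\n",
    "# unsqueeze(0) inserts a batch dimension at position 0, giving [1, 1, 32, 32]\n",
    "batch = sample.unsqueeze(0)\n",
    "print(sample.size(), batch.size())"
   ]
  },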
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Loss functions\n",
    "PyTorch also provides the common loss functions in nn; below we use MSELoss to compute the mean squared error."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "28.92203712463379\n"
     ]
    }
   ],
   "source": [
    "y = torch.arange(0, 10).view(1, 10).float()\n",
    "criterion = nn.MSELoss()\n",
    "loss = criterion(out, y)\n",
    "# loss is a scalar; we can get its Python numeric value directly with item()\n",
    "print(loss.item())"
   ]
  },
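  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what MSELoss actually computes, here is a hand-written check (an added sketch): with the default reduction='mean', the loss is simply the mean of the squared element-wise differences, so this prints the same value as loss.item() above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Mean squared error written out by hand; matches nn.MSELoss(reduction='mean')\n",
    "print(((out - y) ** 2).mean().item())"
   ]
  },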
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Optimizers\n",
    "After backpropagation has computed the gradients of all parameters, we still need an optimization method to update the network's weights. For example, stochastic gradient descent (SGD) uses the following update rule:\n",
    "\n",
    "weight = weight - learning_rate * gradient\n",
    "\n",
    "torch.optim implements most of the common optimization methods, such as RMSProp, Adam, and SGD. Below we use SGD for a simple example."
   ]
  },
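  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the update rule concrete, here is a hand-rolled single SGD step (an added sketch; in practice you would use torch.optim as in the cells that follow):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "learning_rate = 0.01\n",
    "# Update each parameter in place: weight = weight - learning_rate * gradient.\n",
    "# no_grad() keeps the update itself out of the autograd graph.\n",
    "with torch.no_grad():\n",
    "    for p in net.parameters():\n",
    "        if p.grad is not None:\n",
    "            p -= learning_rate * p.grad"
   ]
  },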
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch.optim"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([1, 1, 32, 32])\n",
      "torch.Size([1, 6, 30, 30])\n",
      "torch.Size([1, 6, 15, 15])\n",
      "torch.Size([1, 1350])\n"
     ]
    }
   ],
   "source": [
    "out = net(input)  # calling the net prints the sizes of x that we print in the forward function\n",
    "criterion = nn.MSELoss()\n",
    "loss = criterion(out, y)\n",
    "# Create an optimizer; SGD only needs the parameters to update and a learning rate\n",
    "optimizer = torch.optim.SGD(net.parameters(), lr=0.01)\n",
    "# Zero the gradients first (same effect as net.zero_grad())\n",
    "optimizer.zero_grad()\n",
    "loss.backward()\n",
    "\n",
    "# Update the parameters\n",
    "optimizer.step()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With that, one complete pass of data through a neural network has been implemented in PyTorch. The next chapter introduces the data loading and processing tools that PyTorch provides, which make it easy to prepare the data you need.\n",
    "\n",
    "After this section you may still have questions about how some of the parameters inside the network model are computed; this is covered in detail in Chapter 2 Section 4, Convolutional Neural Networks, and the practical code in Chapter 3 Section 2, handwritten digit recognition on the MNIST dataset, is thoroughly annotated."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}