pytorch-handbook/chapter1/autograd_tutorial.ipynb
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\nAutograd: Automatic Differentiation\n===================================\n\nCentral to all neural networks in PyTorch is the ``autograd`` package.\nLet\u2019s first briefly visit this, and we will then go to training our\nfirst neural network.\n\n\nThe ``autograd`` package provides automatic differentiation for all operations\non Tensors. It is a define-by-run framework, which means that your backprop is\ndefined by how your code is run, and that every single iteration can be\ndifferent.\n\nLet us see this in more simple terms with some examples.\n\nTensor\n--------\n\n``torch.Tensor`` is the central class of the package. If you set its attribute\n``.requires_grad`` as ``True``, it starts to track all operations on it. When\nyou finish your computation you can call ``.backward()`` and have all the\ngradients computed automatically. The gradient for this tensor will be\naccumulated into ``.grad`` attribute.\n\nTo stop a tensor from tracking history, you can call ``.detach()`` to detach\nit from the computation history, and to prevent future computation from being\ntracked.\n\nTo prevent tracking history (and using memory), you can also wrap the code block\nin ``with torch.no_grad():``. This can be particularly helpful when evaluating a\nmodel because the model may have trainable parameters with `requires_grad=True`,\nbut for which we don't need the gradients.\n\nThere\u2019s one more class which is very important for autograd\nimplementation - a ``Function``.\n\n``Tensor`` and ``Function`` are interconnected and build up an acyclic\ngraph, that encodes a complete history of computation. Each tensor has\na ``.grad_fn`` attribute that references a ``Function`` that has created\nthe ``Tensor`` (except for Tensors created by the user - their\n``grad_fn is None``).\n\nIf you want to compute the derivatives, you can call ``.backward()`` on\na ``Tensor``. If ``Tensor`` is a scalar (i.e. it holds a one element\ndata), you don\u2019t need to specify any arguments to ``backward()``,\nhowever if it has more elements, you need to specify a ``gradient``\nargument that is a tensor of matching shape.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import torch"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a tensor and set requires_grad=True to track computation with it\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"x = torch.ones(2, 2, requires_grad=True)\nprint(x)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Do an operation of tensor:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"y = x + 2\nprint(y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"``y`` was created as a result of an operation, so it has a ``grad_fn``.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(y.grad_fn)"
]
},
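{
"cell_type": "markdown",
"metadata": {},
"source": [
"In contrast, ``x`` was created directly by the user rather than as the result of\nan operation, so (as noted above) its ``grad_fn`` is ``None``. A quick check:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(x.grad_fn)  # None: x is a leaf tensor created by the user"
]
},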
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Do more operations on y\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"z = y * y * 3\nout = z.mean()\n\nprint(z, out)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"``.requires_grad_( ... )`` changes an existing Tensor's ``requires_grad``\nflag in-place. The input flag defaults to ``False`` if not given.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"a = torch.randn(2, 2)\na = ((a * 3) / (a - 1))\nprint(a.requires_grad)\na.requires_grad_(True)\nprint(a.requires_grad)\nb = (a * a).sum()\nprint(b.grad_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Gradients\n---------\nLet's backprop now\nBecause ``out`` contains a single scalar, ``out.backward()`` is\nequivalent to ``out.backward(torch.tensor(1))``.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"out.backward()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"print gradients d(out)/dx\n\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(x.grad)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You should have got a matrix of ``4.5``. Let\u2019s call the ``out``\n*Tensor* \u201c$o$\u201d.\nWe have that $o = \\frac{1}{4}\\sum_i z_i$,\n$z_i = 3(x_i+2)^2$ and $z_i\\bigr\\rvert_{x_i=1} = 27$.\nTherefore,\n$\\frac{\\partial o}{\\partial x_i} = \\frac{3}{2}(x_i+2)$, hence\n$\\frac{\\partial o}{\\partial x_i}\\bigr\\rvert_{x_i=1} = \\frac{9}{2} = 4.5$.\n\n"
]
},
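{
"cell_type": "markdown",
"metadata": {},
"source": [
"As mentioned above, gradients are *accumulated* into the ``.grad`` attribute on\nevery call to ``.backward()``. A minimal sketch of this behaviour on a fresh\ntensor (``w`` is just an illustrative name, not part of the tutorial), including\nhow to reset the accumulated gradient in-place:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"w = torch.ones(2, 2, requires_grad=True)\n\n# first backward pass: d((w * 3).sum())/dw = 3 everywhere\n(w * 3).sum().backward()\nprint(w.grad)\n\n# second backward pass on a new graph: the new gradient is added to .grad\n(w * 3).sum().backward()\nprint(w.grad)\n\n# reset the accumulated gradient in-place before the next backward pass\nw.grad.zero_()\nprint(w.grad)"
]
},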
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can do many crazy things with autograd!\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"x = torch.randn(3, requires_grad=True)\n\ny = x * 2\nwhile y.data.norm() < 1000:\n y = y * 2\n\nprint(y)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)\ny.backward(gradients)\n\nprint(x.grad)"
]
},
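{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since ``y`` is just ``x`` doubled some number of times, each entry of ``x.grad``\nshould be the corresponding entry of ``gradients`` multiplied by that overall\npower of two. A small sanity check of this reasoning (recovering the scale\nfactor from ``y`` and ``x`` is purely for illustration):\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# y = (2 ** k) * x element-wise, so x.grad should equal gradients * 2 ** k\nscale = (y / x).detach()  # per-element scale factor; all entries are equal\nprint(scale)\nprint(gradients * scale)  # should match x.grad printed above"
]
},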
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also stop autograd from tracking history on Tensors\nwith ``.requires_grad=True`` by wrapping the code block in\n``with torch.no_grad()``:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(x.requires_grad)\nprint((x ** 2).requires_grad)\n\nwith torch.no_grad():\n\tprint((x ** 2).requires_grad)"
]
},
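{
"cell_type": "markdown",
"metadata": {},
"source": [
"The introduction also mentioned ``.detach()``: it returns a new tensor with the\nsame content but detached from the computation history, so operations on it are\nno longer tracked. A minimal sketch:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(x.requires_grad)\ny = x.detach()\nprint(y.requires_grad)  # False: y is cut off from the graph\nprint(x.eq(y).all())    # True: same values as x"
]
},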
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Read Later:**\n\nDocumentation of ``autograd`` and ``Function`` is at\nhttps://pytorch.org/docs/autograd\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
},
"nbformat": 4,
"nbformat_minor": 0
}