Published: July 13, 2020, 15:50

【Python】A simple way to use numpy and a simple way to build a neural network

This blog has an English translation


Here's a little more about the "【Python】A simple way to use numpy and a simple way to build a neural network" video I uploaded to YouTube.




This video shows you how to build a simple neural network using a handy module called numpy, which makes the math needed for "machine learning" and "deep learning" easy.

The first half describes the basic matrix calculation method using numpy, and the second half describes how to build a neural network.


It's a video you can understand in a short time even when you're busy, with no unnecessary explanation.


Table of Contents

From installing numpy to simple matrix calculations


First we use pip to install numpy.

pip install numpy


Next, we'll launch the Python interactive console so we can easily try out numpy.

The shorter a name, the easier it is to type, so we use "as" (an alias) so that numpy can be called as np.

import numpy as np


Let's try numpy right away.

x = np.array([1,2,3])


First, we create a numpy array (ndarray).



shape returns the size of each dimension of a numpy array (ndarray).

A row or column vector has a shape like (a,), a matrix has a shape like (a, b), and a three-dimensional array has a shape like (a, b, c).



ndim returns the number of dimensions of a numpy array (ndarray).
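To make shape and ndim concrete, here is a small sketch (the variable names are my own):

```python
import numpy as np

v = np.array([1, 2, 3])               # a vector
m = np.array([[1, 2, 3], [4, 5, 6]])  # a 2x3 matrix
t = np.ones((2, 3, 4))                # a three-dimensional array

print(v.shape, v.ndim)  # (3,) 1
print(m.shape, m.ndim)  # (2, 3) 2
print(t.shape, t.ndim)  # (2, 3, 4) 3
```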

Four arithmetic operations

Next, let's see how to apply the four arithmetic operations to a numpy array (ndarray).


Arithmetic between matrices is very simple: each element of one matrix is computed one-to-one with the corresponding element of the other matrix.

>>> a = np.array([[1,2,3],[1,2,3]])
>>> a * np.array([[1,2],[1,2]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (2,3) (2,2)
>>> a * np.array([[1,2,3],[1,2,3],[1,2,3]])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (2,3) (3,3)



These are errors because the shapes do not match, so adjust the size of each dimension until they line up.

Once the shapes match, you can simply add, subtract, multiply, and divide as usual.


>>> b = np.array([[1,2,3],[1,2,3]])
>>> a+b
array([[2, 4, 6],
       [2, 4, 6]])
>>> a-b
array([[0, 0, 0],
       [0, 0, 0]])
>>> a*b
array([[1, 4, 9],
       [1, 4, 9]])
>>> a/b
array([[1., 1., 1.],
       [1., 1., 1.]])


Please see the video for the detailed calculation process of the matrix.



Next, let's look at numpy's broadcasting. It is used in calculations between a scalar (a single real number) and a matrix. Broadcasting is simple: numpy automatically expands the scalar to the shape of the matrix, so the operation can be treated as a calculation between matrices.

>>> np.array([[[1,2],[1,2]],[[1,2],[1,2]]])*2
array([[[2, 4],
        [2, 4]],

       [[2, 4],
        [2, 4]]])
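Broadcasting is not limited to scalars. As a small extra sketch (going slightly beyond the video), numpy can also expand a vector across the rows of a matrix when the trailing shapes match:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [1, 2, 3]])       # shape (2, 3)
row = np.array([10, 20, 30])    # shape (3,)

# The (3,) vector is virtually copied onto each of the two rows of a.
print(a + row)
# [[11 22 33]
#  [11 22 33]]
```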

A simple way to build a neural network


For a fuller explanation of neural networks, read my previous blog post; here I'll keep it brief.


A very simple neural network has only input and output values.


However, if the network simply outputs the input value as-is, it cannot learn.


So we add layers to the network so that it can be treated as a function, and then compare its output with the correct data.


The biggest advantage of treating it as a function is that you can reduce the error between the output value and the correct value by changing the function's variables.


After all, a neural network is just a function: it performs some computation on an input and returns an output. If you know how the computation works, you can adjust the variables to bring the output closer to the desired answer.


The variables are weight and bias.


The Japanese word for weight is hard to grasp, so think of it instead as "a value that adjusts how important each input value is".


Bias is also difficult to grasp in Japanese. In the graph of a linear function, it would be the "intercept". Thanks to the bias, even if the input values or the weights are zero, some number is still passed on to the output layer.
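To make weight and bias concrete, here is a minimal sketch of a single neuron; the numbers and variable names are my own illustration, not from the video:

```python
import numpy as np

x = np.array([1.0, 2.0])   # input values
w = np.array([0.5, 0.25])  # weights: how important each input is
b = 0.5                    # bias: works like the intercept of a linear function

# A single neuron: the weighted sum of the inputs plus the bias.
y = np.dot(x, w) + b
print(y)  # 1.5
```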


It's the same bias we speak of in biased thinking, the bias that often prevents people from seeing things correctly.

Why do this? Because neural networks imitate the networks of the human brain. We assumed that people always learn under some bias, and concluded that a bias should therefore be necessary in the learning process. That is, people learn with a bias, are bound to make mistakes at first, then notice those mistakes and adjust the bias as learning proceeds. In other words, a neural network is designed to make mistakes first and then learn by correcting them.



This is getting long, so I'll close with an explanation of the matrix product.


This matrix product is a little different from the element-wise four arithmetic operations on matrices described above.

For example, take a matrix with two rows and three columns. The number of rows of the matrix on the right must match the number of columns (three) of the matrix on the left. That is, against a 2×3 matrix you must prepare a 3×N matrix.
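In numpy terms, this shape rule can be sketched like this (the shapes here are my own illustration):

```python
import numpy as np

A = np.ones((2, 3))  # 2 rows, 3 columns
B = np.ones((3, 4))  # 3 rows, 4 columns: the inner sizes (3 and 3) match

C = np.dot(A, B)     # matrix product of a 2x3 and a 3x4 matrix
print(C.shape)       # (2, 4)

# np.dot(B, A) would raise a ValueError, because 4 does not match 2.
```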


I'd like you to watch the video for the actual calculation method, but let's consider what this calculation means.


The bottom line is that you can do a lot of calculations easily and quickly.


In the first place, computers are good at "parallel computing".


The matrix makes this "parallel computing" possible.


Let me show you an example. First, compare "calculating one element at a time" with an ordinary for statement against "calculating everything at once in parallel" using linear algebra with numpy.

# Calculation without numpy
x = [[1,2],[3,4],[5,6]]  # three rows, two columns
y = [[1,2,3],[4,5,6]]    # two rows, three columns
a = []
for i in range(len(x)):
    row = []
    for j in range(len(y[0])):
        elem = 0
        for k in range(len(y)):
            elem += x[i][k] * y[k][j]
        row.append(elem)
    a.append(row)
print(a)
# [[9, 12, 15], [19, 26, 33], [29, 40, 51]]

# Using linear algebra with numpy
X = np.array([[1,2],[3,4],[5,6]])
Y = np.array([[1,2,3],[4,5,6]])
A = np.dot(X, Y)
print(A)
# [[ 9 12 15]
#  [19 26 33]
#  [29 40 51]]

The amount of code is reduced and the result is cleaner. What's more, note that numpy treats the matrix as a single block and computes it all at once; you can see this intuitively just by looking at the code. The matrix mechanism is what lets the computer do parallel processing. Deep learning deals with arrays of well over tens of thousands of elements; you can imagine how much work and time it would take to compute them one by one in for statements, even on a CPU hundreds of times faster than a human.

Numpy is written in C/C++, and an ndarray is allocated on a contiguous block of memory. Numpy also provides "universal functions" that apply an operation to every element of an ndarray. Thanks to these, as mentioned above, numpy benefits from fast parallel computation, which makes possible the enormous amount of calculation needed for deep learning.
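As a small sketch of a universal function in action: one call to np.add operates on every element of the ndarray at once, with the loop running inside numpy's compiled code rather than in Python:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]], stored contiguously

# np.add is a universal function: a single call covers all elements.
print(np.add(a, 10))
# [[10 11 12]
#  [13 14 15]]

# np.sqrt, np.exp and so on are universal functions too.
print(np.sqrt(np.array([1.0, 4.0, 9.0])))  # [1. 2. 3.]
```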


Now, recall how the matrix product is calculated.



Let's change the point of view a little: the vector on the right goes in from the top, and the result comes out on the right of the matrix.



Don't you think several calculations can be done at once?



How does it compare to calculating one element at a time?



By the way, this is just a metaphor. Imagine the numbers in the computer's memory scattered about, going in and out.



What if the numbers were arranged neatly and flowed in smoothly, or could even be calculated at the same time? How does that compare to before?



It's just a metaphor. But the more you look at it, the more the idea and calculation method of the matrix resemble making good use of memory in exactly this way.


See You Next Page!