Numpy - Arrays

By Marcelo Fernandes Dec 3, 2017

Numpy Arrays - Basics

The array is the main object of NumPy. It is a table of elements (usually numbers), all of the same type (homogeneous), indexed by a tuple of non-negative integers.

In NumPy, the dimensions are called axes and the number of axes is called the rank.

For example, the coordinates of a point in 3D space, [1, 2, 1], form an array of rank 1, because it has one axis. That axis has a length of 3.

In the example below, the array has rank 2 (it is 2-dimensional). The first dimension (axis) has a length of 2, the second dimension has a length of 3.



[[ 1., 0., 0.],
 [ 0., 1., 2.]]
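
As a minimal sketch of the two examples above (the names point and matrix are only illustrative), ndim reports the rank, shape reports the length of each axis, and an element is addressed with one integer per axis:

```python
import numpy as np

# Rank 1: one axis of length 3.
point = np.array([1, 2, 1])

# Rank 2: first axis of length 2, second axis of length 3.
matrix = np.array([[1., 0., 0.],
                   [0., 1., 2.]])

print(point.ndim, point.shape)    # 1 (3,)
print(matrix.ndim, matrix.shape)  # 2 (2, 3)

# Indexing uses a tuple of integers, one per axis (row 1, column 2 here).
print(matrix[1, 2])               # 2.0
```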


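NumPy’s array class is called ndarray (N-dimensional array). It is also known by the alias array. Note that numpy.array is not the same as the standard library class array.array, which only handles one-dimensional arrays and offers less functionality.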

Let's see what the numpy array is capable of:


import numpy as np

# Instantiation of an array:
array = np.array([(1,2,3,4), (5,6,7,8)])

# Getting the number of dimensions (axes), also known as the rank:
array.ndim  # 2

# Getting the shape. The size of the array in each dimension (Tuple):
array.shape  # (2, 4)

# Getting the size, or the total number of elements in the array.
array.size  # 8

# Getting the type of the elements in the array. You can use standard Python
# types, but NumPy also provides its own: numpy.int32, numpy.int16, etc.
array.dtype  # dtype('int64') (the default integer width depends on the platform)

# The size in bytes for each type of the elements in the array can also be
# checked:
array.itemsize  # 8

# Getting the buffer that contains the actual elements of the array.
# (rarely needed, since elements are normally accessed by index)
array.data  # <memory at 0x10565db40>
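
The attributes above are related to each other. The sketch below checks two of those relations; the name grid is illustrative, the dtype is fixed explicitly so the byte counts do not depend on the platform, and nbytes is an extra ndarray attribute not covered in the listing above:

```python
import numpy as np

# dtype is given explicitly so the byte counts are the same on every platform.
grid = np.array([(1, 2, 3, 4), (5, 6, 7, 8)], dtype=np.int64)

# size is the product of the entries of shape.
print(grid.size == grid.shape[0] * grid.shape[1])  # True

# nbytes is the memory taken by the elements: size * itemsize.
print(grid.itemsize)  # 8
print(grid.nbytes)    # 64
```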



The element type of a NumPy array can also be set explicitly when the array is created:



>>> complex_array = np.array([(1,2,3), (4,5,6)], dtype=complex)

>>> complex_array
array([[ 1.+0.j,  2.+0.j,  3.+0.j],
       [ 4.+0.j,  5.+0.j,  6.+0.j]])
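
A short sketch of how the element type is chosen when dtype is omitted, next to a few explicit alternatives. The variable names are illustrative, and astype is used here although the text above does not mention it:

```python
import numpy as np

ints = np.array([1, 2, 3])        # integer dtype (default width is platform dependent)
mixed = np.array([1, 2.5, 3])     # a single float makes the whole array float64
small = np.array([1, 2, 3], dtype=np.int16)

print(mixed.dtype)     # float64
print(small.dtype)     # int16
print(small.itemsize)  # 2

# astype returns a converted copy; the original array is left unchanged.
as_float = small.astype(np.float32)
print(as_float.dtype)  # float32
print(small.dtype)     # int16
```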


Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation.

The function zeros creates an array full of zeros, the function ones creates an array full of ones, and the function empty creates an array whose initial content is arbitrary: the memory is left uninitialized, so the values depend on whatever was there before. By default, the dtype of the created array is float64.



>>> np.zeros((2,4))
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

>>> np.ones((2,4), dtype=np.int64)
array([[ 1,  1,  1,  1],
       [ 1,  1,  1,  1]])

>>> np.empty((2,4))
array([[  0.00000000e+000,   0.00000000e+000,   1.26480805e-321,
          0.00000000e+000],
       [  3.50977866e+064,   3.25938554e-311,               nan,
          1.11687648e-308]])
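
To illustrate why placeholder arrays help, here is a minimal sketch that allocates once and fills by index, compared with growing an array step by step. The names n, squares and grown are illustrative, and np.append is used only as one possible way of growing an array:

```python
import numpy as np

n = 5  # number of results, known in advance (illustrative value)

# Allocate once; the contents of an empty array are arbitrary until assigned.
squares = np.empty(n, dtype=np.int64)
for i in range(n):
    squares[i] = i * i

print(squares)  # [ 0  1  4  9 16]

# Growing instead: np.append returns a new array and copies the data each time.
grown = np.array([], dtype=np.int64)
for i in range(n):
    grown = np.append(grown, i * i)

print(np.array_equal(squares, grown))  # True
```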



To create sequences of numbers, NumPy provides the function arange, which is analogous to Python's built-in range but returns an array.



>>> np.arange(10, 30, 5)
array([10, 15, 20, 25])

>>> np.arange(0, 2, 0.3)  # it accepts float arguments
array([ 0. ,  0.3,  0.6,  0.9,  1.2,  1.5,  1.8])


As stated in the documentation, when arange is used with floating point arguments, it is generally not possible to predict the number of elements obtained, due to the finite floating point precision. For this reason, it is usually better to use the function linspace, which takes the number of elements we want as an argument instead of the step:



>>> from numpy import pi

>>> np.linspace(0, 2, 9)                 # 9 numbers from 0 to 2
array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ,  1.25,  1.5 ,  1.75,  2.  ])

>>> x = np.linspace(0, 2 * pi, 10)        # useful to evaluate function at lots of points

>>> f = np.sin(x)

>>> f
array([  0.00000000e+00,   6.42787610e-01,   9.84807753e-01,
         8.66025404e-01,   3.42020143e-01,  -3.42020143e-01,
        -8.66025404e-01,  -9.84807753e-01,  -6.42787610e-01,
        -2.44929360e-16])
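
The difference between the two functions can be sketched as follows. The end points 1 and 1.3 are an illustrative choice; the length of the arange result is deliberately not asserted, because it depends on how the division rounds:

```python
import numpy as np

# The length of a float arange depends on how (stop - start) / step rounds,
# so the end point can unexpectedly show up in the result.
stepped = np.arange(1, 1.3, 0.1)
print(len(stepped))            # 3 or 4, depending on floating point rounding

# linspace takes the number of points, so the length is fixed by construction
# and both end points are included.
spaced = np.linspace(1, 1.3, 4)
print(len(spaced))             # 4
print(spaced[0], spaced[-1])   # 1.0 1.3
```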




Arithmetic Operations

Arithmetic operations on NumPy arrays (+ - * /) are applied elementwise, meaning that the operation is carried out one element at a time. In addition, a new array is created and filled with the result:



>>> b = np.arange(4)

>>> b
array([0, 1, 2, 3])

>>> b**2
array([0, 1, 4, 9])

>>> b + b
array([0, 2, 4, 6])

>>> b+b > 2
array([False, False,  True,  True], dtype=bool)
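
That the result lives in a new array can be checked directly. This is a minimal sketch; the name c is illustrative and np.shares_memory is used only as a convenient check:

```python
import numpy as np

b = np.arange(4)
c = b + b          # elementwise sum, stored in a newly allocated array

print(c)                        # [0 2 4 6]
print(c is b)                   # False
print(np.shares_memory(b, c))   # False

# The operand is unchanged by the operation.
print(b)                        # [0 1 2 3]
```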



Attention: Unlike in many matrix languages, the product operator * operates elementwise on NumPy arrays. The matrix product can be performed using the dot function or method:


>>> A = np.array([(1, 2),
                  (4, 5)])

>>> B = np.array([(0, 1),
                  (0, 1)])

>>> A*B
array([[0, 2],
       [0, 5]])

>>> np.dot(A, B)
array([[0, 3],
       [0, 9]])
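
The function and method forms give the same result, and so does the @ operator, which is not mentioned above but is available from Python 3.5 on. A small sketch reusing the same A and B (the via_* names are illustrative):

```python
import numpy as np

A = np.array([(1, 2),
              (4, 5)])
B = np.array([(0, 1),
              (0, 1)])

via_function = np.dot(A, B)
via_method = A.dot(B)
via_operator = A @ B     # matrix multiplication operator

print(via_operator)
# [[0 3]
#  [0 9]]

print(np.array_equal(via_function, via_method))    # True
print(np.array_equal(via_function, via_operator))  # True
```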


Some operations, such as += and *=, act in place to modify an existing array rather than create a new one.


>>> a = np.ones((2,3), dtype=int)

>>> a *= 2

>>> a
array([[2, 2, 2],
       [2, 2, 2]])
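
The practical difference between a *= 2 and a = a * 2 shows up when a second name refers to the same array. The sketch below uses an illustrative name, alias, for that second reference, and also shows that an in-place operation cannot change the dtype of the array:

```python
import numpy as np

a = np.ones((2, 3), dtype=int)
alias = a            # second name for the same array object

a *= 2               # modifies the existing array in place
print(alias[0, 0])   # 2, the change is visible through the alias

a = a * 2            # builds a new array and rebinds the name a
print(alias[0, 0])   # still 2, the original array was not touched
print(a[0, 0])       # 4

# An in-place operation cannot change the dtype: adding a float to an
# integer array in place is rejected instead of silently upcasting.
try:
    alias += 0.5
except TypeError as exc:
    print("in-place add failed:", type(exc).__name__)
```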


Many unary operations, such as computing the sum of all the elements in the array, are implemented as methods of the ndarray class.


>>> a = np.random.random((2,3))

>>> a
array([[ 0.17745364,  0.3097356 ,  0.79262014],
       [ 0.52188135,  0.5828747 ,  0.09173246]])

>>> a.sum()
2.4762978922296273

>>> a.max()
0.79262013676275278

>>> a.min()
0.091732460205380062

>>> a.mean()
0.41271631537160453

>>> a.std()
0.24296158183677824
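
By default these methods treat the array as a flat list of numbers. They also accept an axis argument, which is not shown above; the sketch below uses it on a small deterministic array (the name m and the use of reshape are illustrative choices):

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

print(m.sum())         # 66, over all elements
print(m.sum(axis=0))   # [12 15 18 21], one sum per column
print(m.min(axis=1))   # [0 4 8], one minimum per row
```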


Notes