Numerical Python, or NumPy, is the fundamental package for high-performance scientific computing and data analysis. It’s the foundation on which all of the higher-level tools for data analysis are built. Its main features are:
- ndarray, a fast and space-efficient multidimensional array providing vectorized arithmetic operations and sophisticated broadcasting capabilities.
- Standard mathematical functions on entire arrays of data without having to write loops.
- Tools for reading and writing array data to disk and working with memory-mapped files.
- Linear algebra, random number generation, and Fourier transform capabilities.
- Tools for integrating code written in C, C++ and Fortran.
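As a rough illustration of several of the features listed above, the sketch below runs vectorized arithmetic, an element-wise math function, broadcasting, a small linear-algebra solve, and a save/load round trip. All values and variable names are made up for the example, and an in-memory buffer stands in for a file on disk.

```python
import io

import numpy as np

# Hypothetical sample data, chosen only for illustration.
values = np.array([1.0, 4.0, 9.0, 16.0])

# Vectorized arithmetic: the operation applies to every element, no loop.
doubled = values * 2

# A standard math function applied to the whole array at once.
roots = np.sqrt(values)

# Broadcasting: a (3, 1) column combined with a (4,) row gives a (3, 4) grid.
column = np.array([[0.0], [10.0], [20.0]])
grid = column + values

# Linear algebra: solve the system A @ x = b.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)

# Reading and writing: round-trip the array through NumPy's .npy format.
# A BytesIO buffer is used here instead of a real file path.
buffer = io.BytesIO()
np.save(buffer, values)
buffer.seek(0)
restored = np.load(buffer)

print(doubled)
print(roots)
print(grid.shape)
print(x)
print(np.array_equal(values, restored))
```

None of these operations needs an explicit Python loop; the looping happens inside NumPy's compiled code.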
Essential
If you end up working with data … you will work with NumPy. … whether you want to or not!
Although NumPy is not specialized in data analysis, most data analysis tools, such as pandas, depend on it. It is worth getting to know the ndarray object, because it is the basis for vector and matrix computation.
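ndarray is a multidimensional container for homogeneous data: all of its elements have the same type.
Every array has a shape (a tuple giving the size of each dimension) and a dtype (the data type of its elements).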
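A minimal sketch of inspecting these two attributes follows. The array contents and the variable names (matrix, as_float, mixed) are arbitrary and chosen only for illustration.

```python
import numpy as np

# A 2 x 3 array of integers (values are arbitrary).
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(matrix.shape)  # size of each dimension
print(matrix.ndim)   # number of dimensions
print(matrix.dtype)  # an integer type; the exact width depends on the platform

# The dtype can be set explicitly; every element is stored as that type.
as_float = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float64)
print(as_float.dtype)

# Mixed numeric input is converted to one common type (homogeneous data).
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```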
One of the basic things you need to know about ndarray is that when you select part of it with a slice, the data IS NOT copied: the slice is a view of the original array. Let’s try this example:
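    # Handling data with NumPy
    import numpy as np

    arr = np.arange(10)
    print(arr)
    print(arr[4:6])

    barr = np.array(arr)  # new array with its own copy of the data
    carr = arr[4:6]       # slice: a view of arr, no data copied
    carr[:] = 12          # writes through the view into arr

    print("Original")
    print(arr)
    print("Copy")
    print(barr)
    print("Slicing and Index")
    print(carr)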
First of all, we load the NumPy package and create an ndarray with 10 elements.
Next, we create a new array, barr, based on the original. As the output confirms, barr shows NO modifications, because np.array creates a new object with its own copy of the data.
Finally, we create a new variable, carr, by selecting only a few elements of the original array. When we modify their values, the corresponding elements of the original array are modified too.
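One way to check whether two arrays share data is np.shares_memory, and an explicit .copy() gives an independent array when a view is not wanted. The sketch below is a small variation on the example above; the names view and independent are illustrative.

```python
import numpy as np

arr = np.arange(10)

view = arr[4:6]                # slice: shares memory with arr
independent = arr[4:6].copy()  # explicit copy: owns its own data

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False

view[:] = 12         # changes arr as well
independent[:] = -1  # leaves arr untouched

print(arr)  # [ 0  1  2  3 12 12  6  7  8  9]
```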
NumPy arrays are not like lists
Let’s look at another example of how ndarrays and lists differ, using this simple code:
    import numpy as np

    mi_lista = [1, 4, 6, 3]
    print(mi_lista)
    print(mi_lista * 2)   # repeats the list

    mi_array = np.array(mi_lista)
    print(mi_array)
    print(mi_array * 2)   # multiplies every element
And the results:

    [1, 4, 6, 3]
    [1, 4, 6, 3, 1, 4, 6, 3]
    [1 4 6 3]
    [ 2  8 12  6]
Multiplying a list by 2 lengthens it from 4 to 8 elements by repeating its contents, whereas the same operation on an ndarray multiplies every element by two (as expected). Of course, we could get the element-wise result from the list by writing [x * 2 for x in mi_lista], but the NumPy version is more intuitive.
In summary, NumPy is a complex library that you need to know, so start working with it!
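The same difference shows up with the + operator, which concatenates lists but adds arrays element by element. The short sketch below reuses mi_lista and mi_array from the example; doubled_list and doubled_array are names introduced here for illustration.

```python
import numpy as np

mi_lista = [1, 4, 6, 3]
mi_array = np.array(mi_lista)

# Element-wise doubling: a list needs a comprehension, an array does not.
doubled_list = [x * 2 for x in mi_lista]
doubled_array = mi_array * 2
print(doubled_list == doubled_array.tolist())  # True

# + concatenates lists but adds arrays element by element.
print(mi_lista + mi_lista)  # [1, 4, 6, 3, 1, 4, 6, 3]
print(mi_array + mi_array)  # [ 2  8 12  6]
```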