NumPy User Guide, Release 1.9.0
CONTENTS

1 Introduction
  1.1 What is NumPy?
  1.2 Building and installing NumPy
  1.3 How to find documentation

2 Numpy basics
  2.1 Data types
  2.2 Array creation
  2.3 I/O with Numpy
This guide is intended as an introductory overview of NumPy and explains how to install and make use of the most important features of NumPy. For detailed reference documentation of the functions and classes contained in the package, see the reference.

Warning: This "User Guide" is still a work in progress; some of the material is not organized, and several aspects of NumPy are not yet covered in sufficient detail.
CHAPTER ONE
INTRODUCTION

1.1 What is NumPy?

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
This saves all the overhead involved in interpreting the Python code and manipulating Python objects, but at the expense of the benefits gained from coding in Python. Furthermore, the coding work required increases with the dimensionality of our data.
A lightweight alternative is to download the Python installer from www.python.org and the NumPy installer for your Python version from the Sourceforge download site. The NumPy installer includes binaries for different CPUs (without SSE instructions, with SSE2, or with SSE3) and installs the correct one automatically. If needed, this can be bypassed from the command line with numpy-<1.y.z>-superpack-win32.exe /arch nosse or 'sse2' or 'sse3' instead of 'nosse'.
FORTRAN ABI mismatch

The two most popular open source Fortran compilers are g77 and gfortran. Unfortunately, they are not ABI compatible, which means that concretely you should avoid mixing libraries built with one compiler with libraries built with the other. In particular, if your blas/lapack/atlas is built with g77, you must use g77 when building numpy and scipy; conversely, if your atlas is built with gfortran, you must build numpy/scipy with gfortran.
Ubuntu 8.04 and lower

You can install the necessary packages for optimized ATLAS with this command: sudo apt-get install atlas3-base-dev. If you have a recent CPU with SIMD support (SSE, SSE2, etc.), you should also install the corresponding package for optimal performance. For example, for SSE2: sudo apt-get install atlas3-sse2

1.3 How to find documentation

See Also: Numpy-specific help functions How to find things in NumPy.
CHAPTER TWO
NUMPY BASICS

2.1 Data types

See Also: Data type objects

2.1.1 Array types and conversions between types

Numpy supports a much greater variety of numerical types than Python does. This section shows which are available, and how to modify an array's data-type.
>>> import numpy as np

the dtypes are available as np.bool_, np.float32, etc. Advanced types, not listed in the table above, are explored in section Structured arrays (aka "Record arrays"). There are 5 basic numerical types representing booleans (bool), integers (int), unsigned integers (uint), floating point (float) and complex. Those with numbers in their name indicate the bitsize of the type (i.e. how many bits are needed to represent a single value in memory).
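As a quick sketch of the bitsize naming, the number in a sized type's name is its width in bits, which you can confirm from the dtype's itemsize (reported in bytes):

```python
import numpy as np

# The number in a sized type's name is its bit width; itemsize reports bytes.
x = np.float32(1.0)
y = np.int16(7)
print(x.dtype.itemsize * 8)   # 32 bits per float32 value
print(y.dtype.itemsize * 8)   # 16 bits per int16 value
```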
>>> np.issubdtype(d, float)
False

2.1.2 Array Scalars

Numpy generally returns elements of arrays as array scalars (a scalar with an associated dtype). Array scalars differ from Python scalars, but for the most part they can be used interchangeably (the primary exception is for versions of Python older than v2.x, where integer array scalars cannot act as indices for lists and tuples).
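A small sketch of the interchangeability described above: an indexed element is a numpy scalar type, yet it still behaves like a Python integer in most contexts, including as a sequence index:

```python
import numpy as np

a = np.array([1, 2, 3])
elem = a[0]                           # an array scalar, not a plain Python int
print(isinstance(elem, np.integer))   # True: it carries a numpy dtype
print([10, 20, 30][elem])             # but it still works as a sequence index
```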
and types:

>>> x = np.array([[ 1.+0.j, 2.+0.j], [ 0.+0.j, 0.+0.j], [ 1.+1.j, 3.+0.j]])

2.2.3 Intrinsic Numpy Array Creation

Numpy has built-in functions for creating arrays from scratch: zeros(shape) will create an array filled with 0 values with the specified shape. The default dtype is float64.

>>> np.zeros((2, 3))
array([[ 0., 0., 0.],
       [ 0., 0., 0.]])

ones(shape) will create an array filled with 1 values. It is identical to zeros in all other respects.
Examples of formats that cannot be read directly but for which it is not hard to convert are those formats supported by libraries like PIL (able to read and write many image formats such as jpg, png, etc).

Common ASCII Formats

Comma Separated Value files (CSV) are widely used (and an export and import option for programs like Excel). There are a number of ways of reading these files in Python. There are CSV functions in Python and functions in pylab (part of matplotlib).
Splitting the lines into columns

The delimiter argument

Once the file is defined and open for reading, genfromtxt splits each non-empty line into a sequence of strings. Empty or commented lines are just skipped. The delimiter keyword is used to define how the splitting should take place. Quite often, a single character marks the separation between columns.
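A minimal sketch of the delimiter keyword (using Python 3's io.StringIO; the guide's Python 2 examples use StringIO.StringIO):

```python
import numpy as np
from io import StringIO

data = "1, 2, 3\n4, 5, 6"
arr = np.genfromtxt(StringIO(data), delimiter=",")
print(arr.shape)   # (2, 3): each non-empty line became one row of columns
print(arr)
```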
>>> data = """#
... # Skip me !
... # Skip me too !
... 1, 2
... 3, 4
... 5, 6 #This is the third line of the data
... 7, 8
... # And here comes the last line
... 9, 0
... """
>>> np.genfromtxt(StringIO(data), comments="#", delimiter=",")
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.],
       [ 9.,  0.]])

Note: There is one notable exception to this behavior: if the optional argument names=True, the first commented line will be examined for names.
NumPy User Guide, Release 1.9.0 >>> data = "1 2 3\n4 5 6" >>> np.genfromtxt(StringIO(data), ... names="a, b, c", usecols=("a", "c")) array([(1.0, 3.0), (4.0, 6.0)], dtype=[(’a’, ’>> np.genfromtxt(StringIO(data), ... names="a, b, c", usecols=("a, c")) array([(1.0, 3.0), (4.0, 6.0)], dtype=[(’a’, ’
array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
      dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])
deletechars
Gives a string combining all the characters that must be deleted from the name. By default, invalid characters are ~!@#$%^&*()-=+~\|]}[{';: /?.>,<.

excludelist
Gives a list of the names to exclude, such as return, file, print... If one of the input names is part of this list, an underscore character ('_') will be appended to it.
NumPy User Guide, Release 1.9.0 >>> data = "1, , 3\n 4, 5, 6" >>> convert = lambda x: float(x.strip() or -999) >>> np.genfromtxt(StringIO(data), delimiter=",", ... converter={1: convert}) array([[ 1., -999., 3.], [ 4., 5., 6.]]) Using missing and filling values Some entries may be missing in the dataset we are trying to import. In a previous example, we used a converter to transform an empty string into a float. However, user-defined converters may rapidly become cumbersome to manage.
NumPy User Guide, Release 1.9.0 >>> data = "N/A, 2, 3\n4, ,???" >>> kwargs = dict(delimiter=",", ... dtype=int, ... names="a,b,c", ... missing_values={0:"N/A", ’b’:" ", 2:"???"}, ... filling_values={0:0, ’b’:0, 2:-999}) >>> np.genfromtxt(StringIO.
2.4.2 Single element indexing

Single element indexing for a 1-D array is what one expects. It works exactly like that for other standard Python sequences. It is 0-based, and accepts negative indices for indexing from the end of the array.

>>> x = np.arange(10)
>>> x[2]
2
>>> x[-2]
8

Unlike lists and tuples, numpy arrays support multidimensional indexing for multidimensional arrays.
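A brief sketch of multidimensional indexing: a single bracket with a comma-separated tuple of indices selects one element, and negative indices work per dimension:

```python
import numpy as np

x = np.arange(10).reshape(2, 5)   # a 2-D array
print(x[1, 3])    # row 1, column 3
print(x[0, -1])   # negative indices work in each dimension
print(x[1][3])    # equivalent, but x[1] first builds a temporary array
```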
array([[ 7, 10, 13],
       [21, 24, 27]])

Note that slices of arrays do not copy the internal array data but instead produce new views of the original data. It is possible to index arrays with other arrays for the purposes of selecting lists of values out of arrays into new arrays. There are two different ways of accomplishing this. One uses one or more arrays of index values. The other involves giving a boolean array of the proper shape to indicate the values to be selected.
>>> y[np.array([0,2,4]), np.array([0,1,2])]
array([ 0, 15, 30])

In this case, if the index arrays have a matching shape, and there is an index array for each dimension of the array being indexed, the resultant array has the same shape as the index arrays, and the values correspond to the index set for each position in the index arrays. In this example, the first index value is 0 for both index arrays, and thus the first value of the resultant array is y[0,0].
>>> y[b[:,5]]
array([[21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])

Here the 4th and 5th rows are selected from the indexed array and combined to make a 2-D array. In general, when the boolean array has fewer dimensions than the array being indexed, this is equivalent to y[b, ...], which means y is indexed by b followed by as many : as are needed to fill out the rank of y.
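The two shapes of boolean indexing can be sketched side by side: a full-shape mask flattens the selection, while a lower-dimensional mask selects whole subarrays:

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
b = y > 20
print(y[b].shape)        # (14,): a 1-D array of just the selected elements
rows = b[:, 5]           # a boolean array over the first dimension only
print(y[rows].shape)     # (2, 7): whole rows are selected, shape is kept
```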
>>> y[:,np.newaxis,:].shape
(5, 1, 7)

Note that there are no new elements in the array, just that the dimensionality is increased. This can be handy to combine two arrays in a way that otherwise would require explicit reshaping operations. For example:

>>> x = np.arange(5)
>>> x[:,np.newaxis] + x[np.
array([ 0, 10, 20, 30, 40])
>>> x[np.array([1, 1, 3, 1])] += 1
>>> x
array([ 0, 11, 20, 31, 40])

Where people expect that the 1st location will be incremented by 3. In fact, it will only be incremented by 1. The reason is that a new array is extracted from the original (as a temporary) containing the values at 1, 1, 3, 1, then the value 1 is added to the temporary, and then the temporary is assigned back to the original array.
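When repeated indices really should accumulate, the unbuffered ufunc method np.add.at (added in NumPy 1.8) applies each increment in turn instead of going through a temporary:

```python
import numpy as np

x = np.arange(0, 50, 10)
x[np.array([1, 1, 3, 1])] += 1    # buffered: index 1 is incremented only once
print(x)                          # [ 0 11 20 31 40]

y = np.arange(0, 50, 10)
np.add.at(y, [1, 1, 3, 1], 1)     # unbuffered: index 1 is incremented 3 times
print(y)                          # [ 0 13 20 31 40]
```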
Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations. There are, however, cases where broadcasting is a bad idea because it leads to inefficient use of memory that slows computation. NumPy operations are usually done on pairs of arrays on an element-by-element basis.
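The simplest broadcast is between an array and a scalar: the scalar is conceptually stretched to the array's shape, but no copy of it is actually made:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0
# b behaves as if it were the array [2.0, 2.0, 2.0], without being copied
print(a * b)   # [ 2.  4.  6.]
```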
>>> x.shape
(4,)
>>> z.shape
(3, 4)
>>> (x + z).shape
(3, 4)
>>> x + z
array([[ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.],
       [ 1.,  2.,  3.,  4.]])

Broadcasting provides a convenient way of taking the outer product (or any other outer operation) of two arrays. The following example shows an outer addition operation of two 1-d arrays:

>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[  1.,   2.,   3.],
       [ 11.,  12.,  13.],
       [ 21.,  22.,  23.],
       [ 31.,  32.,  33.]])
>>> big_end_str = chr(0) + chr(1) + chr(3) + chr(2)
>>> big_end_str
'\x00\x01\x03\x02'

We might want to use an ndarray to access these integers. In that case, we can create an array around this memory, and tell numpy that there are two integers, and that they are 16 bit and big-endian:

>>> import numpy as np
>>> big_end_arr = np.ndarray(shape=(2,), dtype='>i2', buffer=big_end_str)
>>> big_end_arr[0]
1
>>> big_end_arr[1]
770

Note the array dtype above of >i2.
Note that the array has not changed in memory:

>>> fixed_end_dtype_arr.tobytes() == big_end_str
True

Data and type endianness don't match, change data to match dtype

You might want to do this if you need the data in memory to be a certain ordering. For example you might be writing the memory out to a file that needs a certain byte ordering.

>>> fixed_end_mem_arr = wrong_end_dtype_arr.
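A small sketch of the change-data-to-match-dtype approach (written for Python 3, where the buffer is bytes rather than a chr-built string): byteswap changes the bytes in a copy of the data, and relabelling the dtype with newbyteorder makes the interpretation agree again, so the values survive:

```python
import numpy as np

# Two big-endian 16-bit integers, as in the guide's example: values 1 and 770.
big = np.ndarray(shape=(2,), dtype='>i2', buffer=bytes([0, 1, 3, 2]))
swapped = big.byteswap()    # swap the bytes of each element (returns a copy)
fixed = swapped.view(swapped.dtype.newbyteorder())  # relabel dtype to match
print(fixed[0], fixed[1])   # 1 770: values preserved, memory now native-order
```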
Here we have created a one-dimensional array of length 2. Each element of this array is a record that contains three items, a 32-bit integer, a 32-bit float, and a string of length 10 or less. If we index this array at the second position we get the second record:

>>> x[1]
(2, 3.0, 'World')

Conveniently, one can access any field of the array by indexing using the string that names that field. In this case the fields have received the default names 'f0', 'f1' and 'f2'.
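A runnable sketch of field access with the same record layout (a 32-bit int, a 32-bit float, a 10-byte string) and the default field names:

```python
import numpy as np

x = np.array([(1, 2.0, 'Hello'), (2, 3.0, 'World')], dtype='i4,f4,a10')
print(x['f1'])       # the float field of every record
print(x[1]['f0'])    # one field of a single record
print(x['f2'][0])    # the string field of the first record
```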
>>> x = np.zeros(3, dtype='3int8, float32, (2,3)float64')
>>> x
array([([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
       ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]),
       ([0, 0, 0], 0.0, [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])],
      dtype=[('f0', '|i1', 3), ('f1', '>f4'), ('f2', '>f8', (2, 3))])

By using strings to define the record structure, it precludes being able to name the fields in the original definition. The names can be changed as shown later, however.
>>> x.dtype.names
('col1', 'col2')
>>> x.dtype.names = ('x', 'y')
>>> x
array([(0, 0.0), (0, 0.0), (0, 0.0)],
      dtype=[(('title 1', 'x'), '|i1'), (('title 2', 'y'), '>f4')])
>>> x.dtype.names = ('x', 'y', 'z')  # wrong number of names: must replace all names at once with a sequence of length 2

Accessing field titles

The field titles provide a standard place to put associated info for fields. They do not have to be strings.

>>> x.dtype.
More information

You can find some more information on recarrays and structured arrays (including the difference between the two) here.

2.8 Subclassing ndarray

2.8.1 Credits

This page is based with thanks on the wiki page on subclassing by Pierre Gerard-Marchant http://www.scipy.org/Subclasses.

2.8.2 Introduction

Subclassing ndarray is relatively simple, but it has some complications compared to other Python objects.
2.8.4 Creating new from template

New instances of an ndarray subclass can also come about by a very similar mechanism to View casting, when numpy finds it needs to create a new instance from a template instance. The most obvious place this has to happen is when you are taking slices of subclassed arrays.
>>> c = C('hello')
Cls in __new__: <class 'C'>
Args in __new__: ('hello',)
type(self) in __init__: <class 'C'>
Args in __init__: ('hello',)

When we call C('hello'), the __new__ method gets its own class as first argument, and the passed argument, which is the string 'hello'. After python calls __new__, it usually (see below) calls our __init__ method, with the output of __new__ as the first argument (now a class instance), and the passed arguments following.
Remember that subclass instances can come about in these three ways:

1. explicit constructor call (obj = MySubClass(params)). This will call the usual sequence of MySubClass.__new__ then (if it exists) MySubClass.__init__.
2. View casting
3. Creating new from template

Our MySubClass.__new__ method only gets called in the case of the explicit constructor call, so we can't rely on MySubClass.__new__ or MySubClass.__init__ to deal with the view casting and new-from-template.
The signature of __array_finalize__ is:

def __array_finalize__(self, obj):

ndarray.__new__ passes __array_finalize__ the new object, of our own class (self) as well as the object from which the view has been taken (obj).
        # type(obj) is InfoArray
        #
        # Note that it is here, rather than in the __new__ method,
        # that we set the default value for 'info', because this
        # method sees all creation of default objects - with the
        # InfoArray.__new__ constructor, but also with
        # arr.view(InfoArray).
        self.info = getattr(obj, 'info', None)
        # We do not need to return anything

Using the object looks like this:

>>> obj = InfoArray(shape=(3,))  # explicit constructor
>>> type(obj)
>>> obj.
So:

>>> arr = np.arange(5)
>>> obj = RealisticInfoArray(arr, info='information')
>>> type(obj)
>>> obj.info
'information'
>>> v = obj[1:]
>>> type(v)
>>> v.info
'information'

2.8.9 __array_wrap__ for ufuncs

__array_wrap__ gets called at the end of numpy ufuncs and other numpy functions, to allow a subclass to set the type of the return value and update attributes and metadata.
MySubClass([1, 3, 5, 7, 9])
>>> ret.info
'spam'

Note that the ufunc (np.add) has called the __array_wrap__ method of the input with the highest __array_priority__ value, in this case MySubClass.__array_wrap__, with arguments self as obj, and out_arr as the (ndarray) result of the addition. In turn, the default __array_wrap__ (ndarray.__array_wrap__) has cast the result to class MySubClass, and called __array_finalize__, hence the copying of the info attribute.
>>> # base points to the view it derived from
>>> v2.base is v1
True

In general, if the array owns its own memory, as for arr in this case, then arr.base will be None - there are some exceptions to this - see the numpy book for more details. The base attribute is useful in being able to tell whether we have a view or the original array. This in turn can be useful if we need to know whether or not to do some specific cleanup when the subclassed array is deleted.
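The view-versus-owner distinction can be sketched end to end in a few lines:

```python
import numpy as np

arr = np.arange(5)        # arr owns its own memory
v = arr[1:]               # v is a view into arr
print(arr.base is None)   # True: arr is the owner
print(v.base is arr)      # True: follow .base to find the owner
v[0] = 99                 # writing through the view changes arr too
print(arr[1])             # 99
```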
CHAPTER THREE
PERFORMANCE

Placeholder for Improving Performance documentation.
CHAPTER FOUR
MISCELLANEOUS

4.1 IEEE 754 Floating Point Special Values

Special values defined in numpy: nan, inf. NaNs can be used as a poor-man's mask (if you don't care what the original value was).

Note: cannot use equality to test NaNs. E.g.:

>>> myarr = np.array([1., 0., np.nan, 3.])
>>> np.where(myarr == np.nan)
>>> np.nan == np.nan  # is always False! Use special numpy functions instead.
False
>>> myarr[myarr == np.nan] = 0.  # doesn't work
>>> myarr
array([  1.,   0.,  NaN,   3.])
>>> myarr[np.
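The working approach hinted at by the truncated last line above is np.isnan, which locates NaNs reliably where equality cannot:

```python
import numpy as np

myarr = np.array([1., 0., np.nan, 3.])
print(np.isnan(myarr))        # [False False  True False]: the reliable test
myarr[np.isnan(myarr)] = 0.   # replace the NaNs in place
print(myarr)                  # [ 1.  0.  0.  3.]
```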
4.2 How numpy handles numerical exceptions

The default is to 'warn' for invalid, divide, and overflow and 'ignore' for underflow. But this can be changed, and it can be set individually for different kinds of exceptions. The different behaviors are:

• 'ignore' : Take no action when the exception occurs.
• 'warn' : Print a RuntimeWarning (via the Python warnings module).
• 'raise' : Raise a FloatingPointError.
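The behaviors above are set with np.seterr, which returns the old settings so they can be restored afterwards:

```python
import numpy as np

old = np.seterr(divide='ignore')   # silence divide-by-zero for this section
r = np.array([1.0]) / 0.0
print(np.isinf(r[0]))              # True, and no RuntimeWarning was issued
np.seterr(**old)                   # restore the previous settings
```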
– Efficient
– No dependencies on other tools

• Minuses:

– Lots of learning overhead:
  * need to learn basics of Python C API
  * need to learn basics of numpy C API
  * need to learn how to handle reference counting and love it.
– Reference counting often difficult to get right.
  * getting it wrong leads to memory leaks, and worse, segfaults
– API will change for Python 3.0!

2.
– around a long time
– multiple scripting language support
– C++ support
– Good for wrapping large (many functions) existing C libraries

• Minuses:

– generates lots of code between Python and the C code
– can cause performance problems that are nearly impossible to optimize out
– interface files can be hard to write
– doesn't necessarily avoid reference counting issues or needing to know API's

7. scipy.
5. SIP (used mainly in PyQT)

4.7 Methods vs. Functions

Placeholder for Methods vs. Functions documentation.
CHAPTER FIVE
USING NUMPY C-API

5.1 How to extend NumPy

That which is static and repetitive is boring. That which is dynamic and random is confusing. In between lies art.
— John A. Locke

Science is a differential equation. Religion is a boundary condition.
— Alan Turing

5.1.1 Writing an extension module

While the ndarray object is designed to allow rapid computation in Python, it is also designed to be general-purpose and satisfy a wide variety of computational needs.
multiple modules will be defined by that file. However, there are some tricks to get that to work correctly and it is not covered here.
Functions without keyword arguments

Functions that don't accept keyword arguments should be written as:

static PyObject*
nokeyword_cfunc (PyObject *dummy, PyObject *args)
{
    /* convert Python arguments */
    /* do function */
    /* return something */
}

The dummy argument is not used in this context and can be safely ignored. The args argument contains all of the arguments passed in to the function as a tuple.
Functions with keyword arguments

These functions are very similar to functions without keyword arguments. The only difference is that the function signature is:

static PyObject*
keyword_cfunc (PyObject *dummy, PyObject *args, PyObject *kwds)
{
    ...
}

The kwds argument holds a Python dictionary whose keys are the names of the keyword arguments and whose values are the corresponding keyword-argument values. This dictionary can be processed however you see fit.
5.1.4 Dealing with array objects

Most extension modules for NumPy will need to access the memory for an ndarray object (or one of its subclasses). The easiest way to do this doesn't require you to know much about the internals of NumPy. The method is to

1. Ensure you are dealing with a well-behaved array (aligned, in machine byte-order and single-segment) of the correct type and number of dimensions.
NPY_DOUBLE, NPY_LONGDOUBLE, NPY_CFLOAT, NPY_CDOUBLE, NPY_CLONGDOUBLE, NPY_OBJECT.

Alternatively, the bit-width names can be used as supported on the platform. For example: NPY_INT8, NPY_INT16, NPY_INT32, NPY_INT64, NPY_UINT8, NPY_UINT16, NPY_UINT32, NPY_UINT64, NPY_FLOAT32, NPY_FLOAT64, NPY_COMPLEX64, NPY_COMPLEX128. The object will be converted to the desired type only if it can be done without losing precision. Otherwise NULL will be returned and an error raised.
NPY_ARRAY_ENSUREARRAY
Make sure the resulting object is an actual ndarray and not a subclass.

Note: Whether or not an array is byte-swapped is determined by the data-type of the array. Native byte-order arrays are always requested by PyArray_FROM_OTF and so there is no need for a NPY_ARRAY_NOTSWAPPED flag in the requirements argument. There is also no way to get a byte-swapped array from this routine.
Whether or not the striding pattern matches a standard C or Fortran one can be tested using PyArray_ISCONTIGUOUS (obj) and PyArray_ISFORTRAN (obj) respectively. Most third-party libraries expect contiguous arrays. But, often it is not difficult to support general-purpose striding. I encourage you to use the striding information in your own code whenever possible, and reserve single-segment requirements for wrapping third-party code.
agrees.
— Michel de Montaigne

Duct tape is like the force. It has a light side, and a dark side, and it holds the universe together.
— Carl Zwanzig

Many people like to say that Python is a fantastic glue language. Hopefully, this Chapter will convince you that this is true. The first adopters of Python for science were typically people who used it to glue together large application codes running on super-computers.
5.2.2 Hand-generated wrappers

Extension modules were discussed in Chapter 1. The most basic way to interface with compiled code is to write an extension module and construct a module method that calls the compiled code. For improved readability, your method should take advantage of the PyArg_ParseTuple call to convert between Python objects and C data-types. For standard C data-types there is probably already a built-in converter.
You should be able to run this command assuming your search-path is set-up properly. This command will produce an extension module named addmodule.c in the current directory. This extension module can now be compiled and used from Python just like any other extension module.

Creating a compiled extension module

You can also get f2py to compile add.
subroutine zadd(a,b,c,n) ! in :add:add.f
  double complex dimension(n) :: a
  double complex dimension(n) :: b
  double complex intent(out),dimension(n) :: c
  integer intent(hide),depend(a) :: n=len(a)
end subroutine zadd

The intent directive, intent(out) is used to tell f2py that c is an output variable and should be created by the interface before being passed to the underlying code.
interface simply by placing the INTENT(OUT) :: C comment line in the source code. The only difference is that N would be an optional input that would default to the length of A.

A filtering example

For comparison with the other methods to be discussed, here is another example of a function that filters a two-dimensional array of double precision floating-point numbers using a fixed averaging filter.
valid setup.py file allowing distribution of the add.f module (as part of the package f2py_examples so that it would be loaded as f2py_examples.add) is:

def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('f2py_examples', parent_package, top_path)
    config.add_extension('add', sources=['add.pyf', 'add.f'])
    return config

if __name__ == '__main__':
    from numpy.distutils.
asked for (and the array types are the same), the already-compiled shared library will be loaded and used. Because Blitz makes extensive use of C++ templating, it can take a long time to compile the first time. After that, however, the code should evaluate more quickly than the equivalent NumPy expression. This is especially true if your array sizes are large and the expression would require NumPy to create several temporaries.
Variable      Type             Contents
myvar         {dtype}*         Pointer to the first element of the array
Nmyvar        npy_intp*        A pointer to the dimensions array
Smyvar        npy_intp*        A pointer to the strides array
Dmyvar        int              The number of dimensions
myvar_array   PyArrayObject*   The entire structure for the array

The in-lined code can contain references to any of these variables as well as to the standard macros MYVAR1(i), MYVAR2(i,j), MYVAR3(i,j,k), and MYVAR4(i,j,k,l).
Several examples are available in the examples directory where weave is installed on your system. Look particularly at ramp2.py, increment_example.py and fibonacii.py

Conclusion

Weave is a useful tool for quickly writing routines in C/C++ and linking them into Python. Its caching mechanism allows for on-the-fly compilation which makes it particularly attractive for in-house code.
If you just use Pyrex to compile a standard Python module, then you will get a C-extension module that runs either as fast or, possibly, more slowly than the equivalent Python module. Speed increases are possible only when you use cdef to statically define C variables and use a special construct to create for loops:

cdef int i
for i from start <= i < stop

Let's look at two examples we've seen before to see how they might be implemented using Pyrex.
Here is the Pyrex-file I named image.pyx.

cimport c_numpy
from c_numpy cimport import_array, ndarray, npy_intp, \
    NPY_DOUBLE, NPY_CDOUBLE, NPY_FLOAT, NPY_CFLOAT, NPY_ALIGNED

# We need to initialize NumPy
import_array()

def filter(object ao):
    cdef ndarray a, b
    cdef npy_intp i, j, M, N, oS
    cdef npy_intp r, rm1, rp1, c, cm1, cp1
    cdef double value
    # Require an ALIGNED array
    # (but not necessarily contiguous)
    # We will use strides to access the elements.
    a = c_numpy.
are getting and how to interface them with C-like constructs.

2. Inappropriate Pyrex syntax or incorrect calls to C-code or type mismatches can result in failures such as
(a) Pyrex failing to generate the extension module source code,
(b) Compiler failure while generating the extension module binary due to incorrect C syntax,
(c) Python failure when trying to use the module.

3.
available to you). Items to remember are:

• A shared library must be compiled in a special way (e.g. using the -shared flag with gcc).
• On some platforms (e.g. Windows), a shared library requires a .def file that specifies the functions to be exported. For example a mylib.def file might contain:

LIBRARY mylib.
2. Set the argtypes attribute to a list whose entries contain objects with a classmethod named from_param that knows how to convert your object to an object that ctypes can understand (an int/long, string, unicode, or object with the _as_parameter_ attribute). NumPy uses both methods with a preference for the second method because it can be safer.
Using an ndpointer class in the argtypes method can make it significantly safer to call a C-function using ctypes and the data-area of an ndarray. You may still want to wrap the function in an additional Python wrapper to make it user-friendly (hiding some obvious arguments and making some arguments output arguments). In this process, the requires function in NumPy may be useful to return the right kind of array from a given input.
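The validation that ndpointer performs can be sketched without loading any shared library at all, since its from_param classmethod (the hook described above) can be exercised directly; no real C function is involved here:

```python
import numpy as np
import numpy.ctypeslib as npct

# An argtypes entry that only accepts aligned, contiguous, writeable
# 2-D double arrays.
ptr = npct.ndpointer(dtype=np.double, ndim=2,
                     flags='aligned, contiguous, writeable')

good = np.zeros((3, 4))
print(ptr.from_param(good) is not None)   # accepted

try:
    ptr.from_param(np.zeros((3, 4), dtype=np.int32))  # wrong dtype
except TypeError as e:
    print('rejected:', e)
```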
S0 = astrides[0]/sizeof(double);
S1 = astrides[1]/sizeof(double);
for (i=1; i
lib.dfilter2d.restype = None
lib.dfilter2d.argtypes = [N.ctypeslib.ndpointer(float, ndim=2, flags='aligned'),
                          N.ctypeslib.ndpointer(float, ndim=2,
                                                flags='aligned, contiguous,'
                                                      'writeable'),
                          ctypes.POINTER(N.ctypeslib.c_intp),
                          ctypes.POINTER(N.ctypeslib.c_intp)]

Next, define a simple selection function that chooses which addition function to call in the shared library based on the data-type:

def select(dtype):
    if dtype.char in '?bBhHf':
        return lib.sadd, single
    elif dtype.
Its disadvantages include

• It is difficult to distribute an extension module made using ctypes because of a lack of support for building shared libraries in distutils (but I suspect this will change in time).
• You must have shared-libraries of your code (no static libraries).
• Very little support for C++ code and its different library-calling conventions. You will probably need a C wrapper around C++ code to use with ctypes (or just use Boost.Python instead).
Boost Python

Boost is a repository of C++ libraries and Boost.Python is one of those libraries which provides a concise interface for binding C++ classes and functions to Python. The amazing part of the Boost.Python approach is that it works entirely in pure C++ without introducing a new syntax. Many users of C++ report that Boost.Python makes it possible to combine the best of both worlds in a seamless fashion. I have not used Boost.
PyInline

This is a much older module that allows automatic building of extension modules so that C-code can be included with Python code. Its latest release (version 0.03) was in 2001, and it appears that it is not being updated.

PyFort

PyFort is a nice tool for wrapping Fortran and Fortran-like C-code into Python with support for Numeric arrays. It was written by Paul Dubois, a distinguished computer scientist and the very first maintainer of Numeric (now retired).
5.3.2 Example Non-ufunc extension For comparison and general edification of the reader we provide a simple implementation of a C extension of logit that uses no NumPy. To do this we need two files. The first is the C file which contains the actual code, and the second is the setup.py file used to create the module.

#include <Python.h>
#include <math.h>

/*
 * spammodule.
/* This builds the answer back into a python object */ return Py_BuildValue("d", p); } /* This initiates the module using the above definitions.
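The extension above computes the logit (log-odds) function. As a reference for checking the compiled module, the same computation in pure Python, with no NumPy required:

```python
import math

def logit(p):
    """Log-odds of a probability p in the open interval (0, 1)."""
    return math.log(p / (1.0 - p))

print(logit(0.5))    # 0.0: even odds
print(logit(0.75))   # log(3): three-to-one odds
```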
will install the module in your site-packages directory. See the distutils section of 'Extending and Embedding the Python Interpreter' at docs.python.org for more information. '''

from distutils.core import setup, Extension
module1 = Extension('spam', sources=['spammodule.c'], include_dirs=['/usr/local/lib'])
setup(name = 'spam', version='1.
#include "numpy/npy_3kcompat.h"

/*
 * single_type_logit.c
 * This is the C code for creating your own
 * NumPy ufunc for a logit function.
 *
 * In this code we only define the ufunc for
 * a single dtype. The computations that must
 * be replaced to create a ufunc for
 * a different function are marked with BEGIN
 * and END.
 *
 * Details explaining the Python-C API can be found under
 * 'Extending and Embedding' and 'Python/C API' at
 * docs.python.org .
'''
setup.py file for logit.c

Note that since this is a NumPy extension we use numpy.distutils instead of distutils from the Python standard library.

Calling
$python setup.py build_ext --inplace
will build the extension library in the current directory.

Calling
$python setup.py build
will build the extension under ./build/lib*, where lib* is a directory whose name begins with lib. The compiled library will be inside, with a platform-dependent C extension suffix such as .so.

Calling $python setup.
The places in the code corresponding to the actual computations for the ufunc are marked with /*BEGIN main ufunc computation*/ and /*END main ufunc computation*/. The code in between those lines is the primary thing that must be changed to create your own ufunc.

#include "Python.h"
#include "math.h"
#include "numpy/ndarraytypes.h"
#include "numpy/ufuncobject.h"
#include "numpy/halffloat.h"

/*
 * multi_type_logit.
}
#else

PyMODINIT_FUNC initnpufunc(void)
{
    PyObject *m, *logit, *d;
    m = Py_InitModule("npufunc", LogitMethods);
    if (m == NULL) {
        return;
    }
    import_array();
    import_umath();
    logit = PyUFunc_FromFuncAndData(funcs, data, types, 4, 1, 1,
                                    PyUFunc_None, "logit",
                                    "logit_docstring", 0);
    d = PyModule_GetDict(m);
    PyDict_SetItemString(d, "logit", logit);
    Py_DECREF(logit);
}
#endif

This is a setup.py file for the above code. As before, the module can be built by calling python setup.
from numpy.distutils.misc_util import Configuration
from numpy.distutils.misc_util import get_info

# Necessary for the half-float d-type.
info = get_info('npymath')
config = Configuration('npufunc_directory', parent_package, top_path)
config.add_extension('npufunc', ['multi_type_logit.c'], extra_info=info)
return config

if __name__ == "__main__": from numpy.distutils.
 * ufunc computation is carried out are marked
 * with comments.
 *
 * Details explaining the Python-C API can be found under
 * 'Extending and Embedding' and 'Python/C API' at
 * docs.python.org .
 */

static PyMethodDef LogitMethods[] = {
    {NULL, NULL, 0, NULL}
};

/* The loop definition must precede the PyMODINIT_FUNC.
5.3.6 Example Numpy ufunc with structured array dtype arguments This example shows how to create a ufunc for a structured array dtype. For the example we show a trivial ufunc for adding two arrays with dtype ‘u8,u8,u8’. The process is a bit different from the other examples since a call to PyUFunc_FromFuncAndData doesn’t fully register ufuncs for custom dtypes and structured array dtypes. We need to also call PyUFunc_RegisterLoopForDescr to finish setting up the ufunc.
z = (uint64_t*)op;
z[0] = x[0] + y[0];
z[1] = x[1] + y[1];
z[2] = x[2] + y[2];
i1 += is1;
i2 += is2;
op += os;
} }

/* This is a pointer to the above function */
PyUFuncGenericFunction funcs[1] = {&add_uint64_triplet};

/* These are the input and return dtypes of add_uint64_triplet.
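What add_uint64_triplet computes element-wise can be checked from Python by adding each field of a 'u8,u8,u8' structured array separately. This is only a reference sketch of the arithmetic, not the registered ufunc itself:

```python
import numpy as np

dt = np.dtype('u8,u8,u8')
a = np.array([(1, 2, 3)], dtype=dt)
b = np.array([(10, 20, 30)], dtype=dt)

# Field-by-field addition, mirroring z[k] = x[k] + y[k] in the C loop.
out = np.zeros(1, dtype=dt)
for name in dt.names:   # ('f0', 'f1', 'f2')
    out[name] = a[name] + b[name]

print(out[0])   # (11, 22, 33)
```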
args
An array of pointers to the actual data for the input and output arrays. The input arguments are given first, followed by the output arguments.

dimensions
A pointer to the size of the dimension over which this function is looping.

steps
A pointer to the number of bytes to jump to get to the next element in this dimension for each of the input and output arguments.

data
Arbitrary data (extra arguments, function names, etc.
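The dimensions/steps convention can be mimicked in pure Python: read n doubles out of a raw byte buffer, advancing by a byte step each time, exactly as an inner loop walks its char* pointers. A sketch, not the C API itself:

```python
import struct

def sum_strided(buf, n, step, offset=0):
    """Sum n float64 values from bytes buf, advancing step bytes per
    element -- the walk a ufunc inner loop performs with args/steps."""
    total = 0.0
    for i in range(n):
        (val,) = struct.unpack_from('<d', buf, offset + i * step)
        total += val
    return total

data = struct.pack('<4d', 1.0, 2.0, 3.0, 4.0)
# Contiguous double data: step == sizeof(double) == 8 bytes.
print(sum_strided(data, 4, 8))    # 10.0
# Every other element: step == 16 bytes picks 1.0 and 3.0.
print(sum_strided(data, 2, 16))   # 4.0
```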
identity
Either PyUFunc_One, PyUFunc_Zero, or PyUFunc_None. This specifies what should be returned when an empty array is passed to the reduce method of the ufunc.

name
A NULL-terminated string providing the name of this ufunc (should be the Python name it will be called by).

doc
A documentation string for this ufunc (it will be used to generate the response to {ufunc_name}.__doc__). Do not include the function signature or the name, as these are generated automatically.
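The identity value is observable from Python: reducing an empty array returns 0 for ufuncs registered with PyUFunc_Zero (such as np.add) and 1 for those registered with PyUFunc_One (such as np.multiply):

```python
import numpy as np

empty = np.array([], dtype=np.float64)
print(np.add.reduce(empty))       # 0.0 -- np.add uses PyUFunc_Zero
print(np.multiply.reduce(empty))  # 1.0 -- np.multiply uses PyUFunc_One
```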
— Albert Szent-Gyorgi

5.4.1 Iterating over elements in the array

Basic Iteration

One common algorithmic requirement is to be able to walk over all elements in a multidimensional array. The array iterator object makes this easy to do in a generic way that works for arrays of any dimension. Naturally, if you know the number of dimensions you will be using, then you can always write nested for loops to accomplish the iteration.
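The same dimension-agnostic walk is available from Python through np.nditer, which visits every element of an array regardless of its number of dimensions:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
# nditer yields each element without nested loops over each axis.
total = 0
for x in np.nditer(a):
    total += int(x)
print(total)   # 15, same as a.sum()
```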
Iterating over all but one axis

A common algorithm is to loop over all elements of an array and perform some function with each element by issuing a function call. As function calls can be time consuming, one way to speed up this kind of algorithm is to write the function so it takes a vector of data and then write the iteration so the function call is performed for an entire dimension of data at a time.
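At the Python level this pattern corresponds to np.nditer with the external_loop flag, which hands the function whole 1-d chunks instead of single elements. A sketch of the idiom using np.sum as the per-vector function:

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)
# external_loop delivers 1-d vectors, so the "function call"
# (here np.sum) runs once per chunk rather than once per element.
chunk_sums = [float(np.sum(chunk))
              for chunk in np.nditer(a, flags=['external_loop'])]
print(sum(chunk_sums))   # 66.0 == a.sum()
```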
while (size--) {
    ptr1 = PyArray_MultiIter_DATA(mobj, 0);
    ptr2 = PyArray_MultiIter_DATA(mobj, 1);
    /* code using contents of ptr1 and ptr2 */
    PyArray_MultiIter_NEXT(mobj);
}

The function PyArray_RemoveSmallest(multi) can be used to take a multi-iterator object and adjust all the iterators so that iteration does not take place over the largest dimension (it makes that dimension of size 1).
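The multi-iterator's broadcasting behaviour is exposed in Python as np.broadcast, which pairs up elements of differently shaped inputs much as PyArray_MultiIterNew does in C:

```python
import numpy as np

col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4)                 # shape (4,)
b = np.broadcast(col, row)
print(b.shape)     # (3, 4): the broadcast result shape
pairs = list(b)    # 12 (x, y) element pairs, in C order
print(pairs[0])    # (0, 0)
```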
Registering a casting function

You may want to allow built-in (and other user-defined) data-types to be cast automatically to your data-type. In order to make this possible, you must register a casting function with the data-type you want to be able to cast from. This requires writing low-level casting functions for each conversion you want to support, and then registering these functions with the data-type descriptor. A low-level casting function has the signature:
int PyUFunc_RegisterLoopForType(PyUFuncObject* ufunc, int usertype, PyUFuncGenericFunction function, int* arg_types, void* data)

ufunc
The ufunc to attach this loop to.

usertype
The user-defined type this loop should be indexed under. This number must be a user-defined type or an error occurs.

function
The ufunc inner 1-d loop. This function must have the signature as explained in Section 3.
1. If needed, create a new C-structure to handle each instance of your type. A typical C-structure would be:

typedef struct {
    PyArrayObject base;
    /* new things here */
} NewArrayObject;

Notice that the full PyArrayObject is used as the first entry in order to ensure that the binary layout of instances of the new type is identical to the PyArrayObject. 2.
an operation involving two or more sub-types arises. In operations where different sub-types are being used, the sub-type with the largest __array_priority__ attribute will determine the sub-type of the output(s). If two sub-types have the same __array_priority__ then the sub-type of the first argument determines the output. The default __array_priority__ attribute returns a value of 0.0 for the base ndarray type and 1.0 for a sub-type.
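The effect of __array_priority__ can be seen with two minimal ndarray subclasses: the ufunc output is wrapped as the subclass with the higher priority, whichever operand position it occupies. A sketch (subclass names are illustrative):

```python
import numpy as np

class LowPriority(np.ndarray):
    __array_priority__ = 1.0

class HighPriority(np.ndarray):
    __array_priority__ = 10.0

lo = np.arange(3).view(LowPriority)
hi = np.arange(3).view(HighPriority)
# The higher-priority subclass determines the output type in
# both argument orders.
print(type(lo + hi).__name__)   # HighPriority
print(type(hi + lo).__name__)   # HighPriority
```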