User guide
NumPy User Guide, Release 1.9.0
Iterating over all but one axis
A common algorithm is to loop over all elements of an array and apply some function to each element by issuing
a function call. Because function calls can be time-consuming, one way to speed up this kind of algorithm is to write
the function so that it takes a vector of data, and then write the iteration so that the function is called once for an
entire dimension of data. This increases the amount of work done per function call, reducing the function-call
overhead to a smaller fraction of the total time. Even if the interior of the loop is performed without a function call,
it can be advantageous to run the inner loop over the dimension with the largest number of elements, to take
advantage of speed enhancements available on microprocessors that use pipelining to accelerate fundamental operations.
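The trade-off described above can be sketched in plain C, independent of the NumPy C-API. This is only an illustration under stated assumptions: scale_one, scale_vector, scale_elementwise and scale_by_rows are hypothetical names invented for this sketch, and the array is assumed to be a C-contiguous rows x cols block of doubles.

```c
#include <stddef.h>

/* Hypothetical per-element kernel: one call handles one element. */
double scale_one(double x, double factor)
{
    return x * factor;
}

/* Hypothetical vector kernel: one call handles n elements that are
   `stride` doubles apart in memory. */
void scale_vector(double *data, size_t n, size_t stride, double factor)
{
    size_t i;
    for (i = 0; i < n; i++) {
        data[i * stride] *= factor;
    }
}

/* Scale every element with one kernel call per element.
   Returns the number of kernel calls made (rows * cols). */
size_t scale_elementwise(double *a, size_t rows, size_t cols, double factor)
{
    size_t i, calls = 0;
    for (i = 0; i < rows * cols; i++) {
        a[i] = scale_one(a[i], factor);
        calls++;
    }
    return calls;
}

/* Scale every element with one kernel call per row.
   Returns the number of kernel calls made (rows). */
size_t scale_by_rows(double *a, size_t rows, size_t cols, double factor)
{
    size_t r, calls = 0;
    for (r = 0; r < rows; r++) {
        scale_vector(a + r * cols, cols, 1, factor);
        calls++;
    }
    return calls;
}
```

Both drivers produce the same array contents; the row-wise version simply issues rows calls instead of rows * cols, so the fixed cost of each call is spread over cols elements.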
The function PyArray_IterAllButAxis(array, &dim) constructs an iterator object that is modified so that it
will not iterate over the dimension indicated by dim. The only restriction on this iterator object is that the
PyArray_ITER_GOTO1D(it, ind) macro cannot be used (so flat indexing will not work either if you pass
this object back to Python, which you should therefore not do). Note that the object returned from this routine is
still usually cast to PyArrayIterObject *. All that has been done is to modify the strides and dimensions of the
returned iterator to simulate iterating over array[...,0,...], where 0 is placed on the dim-th dimension. If dim is
negative, the routine chooses the axis itself and stores its choice in dim, which is why dim is passed by address;
according to the C-API reference, the axis with the smallest stride is chosen.
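The effect of skipping one axis can be modelled without the NumPy headers. The sketch below is not the NumPy implementation: sum_along_axis and MAX_NDIM are hypothetical names, and shapes and strides are assumed to be given in elements rather than bytes. The outer loop visits every position with the skipped axis held at index 0 (the array[...,0,...] view described above), and the inner loop then runs along the skipped axis.

```c
#include <stddef.h>

#define MAX_NDIM 8  /* hypothetical limit for this sketch */

/* Sum a strided array of doubles along `axis`.
   `shape` and `strides` have `ndim` entries; strides are in elements.
   `out` receives one sum per position of the remaining axes, in C order.
   Returns the number of inner loops run, or 0 on invalid arguments. */
size_t sum_along_axis(const double *data, int ndim,
                      const size_t *shape, const size_t *strides,
                      int axis, double *out)
{
    size_t outer_shape[MAX_NDIM];
    size_t index[MAX_NDIM] = {0};
    size_t outer_size = 1, n, k;
    int d;

    if (ndim < 1 || ndim > MAX_NDIM || axis < 0 || axis >= ndim) {
        return 0;
    }

    /* Treat the skipped axis as having length 1, so the outer
       iteration never moves along it. */
    for (d = 0; d < ndim; d++) {
        outer_shape[d] = (d == axis) ? 1 : shape[d];
        outer_size *= outer_shape[d];
    }

    for (n = 0; n < outer_size; n++) {
        size_t offset = 0;
        double total = 0.0;

        /* Element offset of the current outer position. */
        for (d = 0; d < ndim; d++) {
            offset += index[d] * strides[d];
        }

        /* Inner loop over the skipped axis. */
        for (k = 0; k < shape[axis]; k++) {
            total += data[offset + k * strides[axis]];
        }
        out[n] = total;

        /* Advance the outer index like an odometer, last axis fastest. */
        for (d = ndim - 1; d >= 0; d--) {
            if (++index[d] < outer_shape[d]) {
                break;
            }
            index[d] = 0;
        }
    }
    return outer_size;
}
```

For a 2 x 3 array stored contiguously (strides 3 and 1), choosing axis 1 gives two inner loops of length 3, while choosing axis 0 gives three inner loops of length 2.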
Iterating over multiple arrays
Very often, it is desirable to iterate over several arrays at the same time. The universal functions are an example of
this kind of behavior. If all you want to do is iterate over arrays with the same shape, then simply creating several
iterator objects is the standard procedure. For example, the following code iterates over two arrays assumed to be the
same shape and size (strictly, obj1 only needs to have at least as many total elements as obj2):
    /* It is already assumed that obj1 and obj2
       are ndarrays of the same shape and size. */
    iter1 = (PyArrayIterObject *)PyArray_IterNew(obj1);
    if (iter1 == NULL) goto fail;
    iter2 = (PyArrayIterObject *)PyArray_IterNew(obj2);
    if (iter2 == NULL) goto fail;  /* assume iter1 is DECREF'd at fail */
    while (iter2->index < iter2->size) {
        /* process with iter1->dataptr and iter2->dataptr */
        PyArray_ITER_NEXT(iter1);
        PyArray_ITER_NEXT(iter2);
    }
Broadcasting over multiple arrays
When multiple arrays are involved in an operation, you may want to use the same broadcasting rules that the math
operations (i.e. the ufuncs) use. This can be done easily using the PyArrayMultiIterObject. This is the
object returned from the Python command numpy.broadcast, and it is almost as easy to use from C. The function
PyArray_MultiIterNew(n, ...) is used (with n input objects in place of ...). The input objects can be
arrays or anything that can be converted into an array. A pointer to a PyArrayMultiIterObject is returned.
Broadcasting has already been accomplished, adjusting the iterators so that all that needs to be done to advance to
the next element in each array is for PyArray_ITER_NEXT to be called for each of the inputs. This incrementing
is performed automatically by the PyArray_MultiIter_NEXT(obj) macro (which can handle a multi-iterator
obj as either a PyArrayMultiIterObject * or a PyObject *). The data from input number i is available using
PyArray_MultiIter_DATA(obj, i), and the total (broadcasted) size as PyArray_MultiIter_SIZE(obj).
An example of using this feature follows.
    mobj = PyArray_MultiIterNew(2, obj1, obj2);
    size = PyArray_MultiIter_SIZE(mobj);