User guide
NumPy User Guide, Release 1.9.0
>>> data = """#
... # Skip me !
... # Skip me too !
... 1, 2
... 3, 4
... 5, 6 #This is the third line of the data
... 7, 8
... # And here comes the last line
... 9, 0
... """
>>> np.genfromtxt(StringIO(data), comments="#", delimiter=",")
array([[ 1., 2.],
       [ 3., 4.],
       [ 5., 6.],
       [ 7., 8.],
       [ 9., 0.]])
Note: There is one notable exception to this behavior: if the optional argument names=True is given, the first
commented line is examined for column names.
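The exception described in the note can be sketched as follows. The sample data and variable names are illustrative assumptions, not taken from the guide, and the sketch assumes a recent NumPy on Python 3, where io.StringIO input is accepted. With names=True, the commented first line supplies the field names instead of being discarded.

```python
from io import StringIO

import numpy as np

# Illustrative data (an assumption): the first line is commented out,
# but it carries the column names.
data = "#a,b\n1,2\n3,4"

# With names=True, the first commented line is examined for names,
# so the result is a structured array with fields "a" and "b".
arr = np.genfromtxt(StringIO(data), comments="#", delimiter=",", names=True)

print(arr.dtype.names)  # ('a', 'b')
print(arr["a"])         # values of the first column
```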
Skipping lines and choosing columns
The skip_header and skip_footer arguments
A header in the file can hinder data processing. In that case, we need to use the skip_header
optional argument. Its value must be an integer corresponding to the number of lines to skip
at the beginning of the file, before any other action is performed. Similarly, we can skip the last n lines of the file by
using the skip_footer argument and giving it a value of n:
>>> data = "\n".join(str(i) for i in range(10))
>>> np.genfromtxt(StringIO(data))
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> np.genfromtxt(StringIO(data),
... skip_header=3, skip_footer=5)
array([ 3., 4.])
By default, skip_header=0 and skip_footer=0, meaning that no lines are skipped.
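As a further sketch of the same two arguments, consider a file with descriptive header lines and a trailing summary line. The file contents below are hypothetical, chosen only to show both arguments used together; a recent NumPy on Python 3 is assumed.

```python
from io import StringIO

import numpy as np

# Hypothetical file contents: two header lines, two data rows, and a
# final line of column totals that should not be loaded as data.
data = "Station report\nx y\n1 2\n3 4\n4 6"

# Skip the two header lines and the single footer line.
arr = np.genfromtxt(StringIO(data), skip_header=2, skip_footer=1)

print(arr)  # only the two data rows remain
```

Because the header lines are dropped before any parsing takes place, they may contain arbitrary text.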
The usecols argument
In some cases, we are not interested in all the columns of the data but only a few of them. We can select which
columns to import with the usecols argument. This argument accepts a single integer or a sequence of integers
corresponding to the indices of the columns to import. Remember that by convention, the first column has an index of
0. Negative integers behave the same as regular Python negative indexes.
For example, if we want to import only the first and the last columns, we can use usecols=(0, -1):
>>> data = "1 2 3\n4 5 6"
>>> np.genfromtxt(StringIO(data), usecols=(0, -1))
array([[ 1., 3.],
       [ 4., 6.]])
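Since usecols also accepts a single integer, a one-column selection can be sketched in the same way. The variable names are illustrative, and a recent NumPy on Python 3 is assumed; in that case the result is a one-dimensional array.

```python
from io import StringIO

import numpy as np

data = "1 2 3\n4 5 6"

# A single integer selects one column; the result is a 1-D array.
middle = np.genfromtxt(StringIO(data), usecols=1)

# A negative index counts from the last column, as in regular Python.
last = np.genfromtxt(StringIO(data), usecols=-1)

print(middle)  # second column: 2 and 5
print(last)    # last column: 3 and 6
```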
If the columns have names, we can also select which columns to import by giving their name to the usecols
argument, either as a sequence of strings or a comma-separated string: