User guide

NumPy User Guide, Release 1.9.0

array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
      dtype=[('A', '<f8'), ('B', '<f8'), ('C', '<f8')])

In the example above, we relied on the fact that dtype=float by default. Giving a sequence of names forces the output to a structured dtype.
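As a minimal sketch of that difference, the snippet below parses the same text twice, once without names and once with them. It assumes Python 3 and a NumPy version whose genfromtxt accepts io.StringIO input; the input string and the variable names text, plain and structured are illustrative and do not come from the guide.

```python
from io import StringIO

import numpy as np

# Illustrative input, not taken from the guide's earlier example.
text = "1 2 3\n4 5 6"

# Without names, the default dtype=float gives an ordinary 2-D float array.
plain = np.genfromtxt(StringIO(text))

# With a sequence of names, each row becomes a record with named float fields.
structured = np.genfromtxt(StringIO(text), names="A, B, C")

print(plain.shape)             # (2, 3)
print(structured.shape)        # (2,)
print(structured.dtype.names)  # ('A', 'B', 'C')
```

The structured result is one-dimensional: each of its two elements is a record, and a column is retrieved by field name, for instance structured["B"].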

We may sometimes need to define the column names from the data itself. In that case, we must set the names keyword to True. The names are then read from the first line that remains after the skip_header lines have been skipped, even if that line is commented out:

>>> data = StringIO("So it goes\n#a b c\n1 2 3\n 4 5 6")
>>> np.genfromtxt(data, skip_header=1, names=True)
array([(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)],
      dtype=[('a', '<f8'), ('b', '<f8'), ('c', '<f8')])
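Once the names have been read from the header, each column can be retrieved by its field name. The following is a small sketch of that usage, assuming Python 3 and a NumPy version whose genfromtxt accepts io.StringIO input; the variable name arr is illustrative.

```python
from io import StringIO

import numpy as np

# One line to skip, then a commented header line, then two data rows.
data = StringIO("So it goes\n#a b c\n1 2 3\n 4 5 6")
arr = np.genfromtxt(data, skip_header=1, names=True)

# The field names come from the commented header line.
print(arr.dtype.names)     # ('a', 'b', 'c')

# Indexing with a field name returns that column as a plain float array.
print(arr["b"].tolist())   # [2.0, 5.0]
```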

The default value of names is None. If we give any other value to the keyword, the new names will overwrite the field names we may have defined with the dtype:

>>> data = StringIO("1 2 3\n 4 5 6")
>>> ndtype = [('a', int), ('b', float), ('c', int)]
>>> names = ["A", "B", "C"]
>>> np.genfromtxt(data, names=names, dtype=ndtype)
array([(1, 2.0, 3), (4, 5.0, 6)],
      dtype=[('A', '<i8'), ('B', '<f8'), ('C', '<i8')])

The defaultfmt argument

If names=None but a structured dtype is expected, names are defined with the standard NumPy default of "f%i", yielding names like f0, f1 and so forth:

>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int))
array([(1, 2.0, 3), (4, 5.0, 6)],
      dtype=[('f0', '<i8'), ('f1', '<f8'), ('f2', '<i8')])

In the same way, if we don't give enough names to match the length of the dtype, the missing names will be defined with this default template:

>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int), names="a")
array([(1, 2.0, 3), (4, 5.0, 6)],
      dtype=[('a', '<i8'), ('f0', '<f8'), ('f1', '<i8')])

We can override this default with the defaultfmt argument, which takes any format string:

>>> data = StringIO("1 2 3\n 4 5 6")
>>> np.genfromtxt(data, dtype=(int, float, int), defaultfmt="var_%02i")
array([(1, 2.0, 3), (4, 5.0, 6)],
      dtype=[('var_00', '<i8'), ('var_01', '<f8'), ('var_02', '<i8')])

Note: defaultfmt is used only if some names are expected but not defined.
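To illustrate the note, the sketch below passes the same defaultfmt twice: once with no names, so every name comes from the template, and once with a full set of names, so the template is never consulted. It assumes Python 3 and a NumPy version whose genfromtxt accepts io.StringIO input; the variable names text, fmt, unnamed and named are illustrative.

```python
from io import StringIO

import numpy as np

text = "1 2 3\n 4 5 6"
fmt = "var_%02i"

# No names given: all three field names are generated from defaultfmt.
unnamed = np.genfromtxt(StringIO(text), dtype=(int, float, int),
                        defaultfmt=fmt)

# All names given: nothing is missing, so defaultfmt has no effect.
named = np.genfromtxt(StringIO(text), dtype=(int, float, int),
                      names="a, b, c", defaultfmt=fmt)

print(unnamed.dtype.names)  # ('var_00', 'var_01', 'var_02')
print(named.dtype.names)    # ('a', 'b', 'c')
```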

Validating names

NumPy arrays with a structured dtype can also be viewed as recarray, where a field can be accessed as if it were an attribute. For that reason, we may need to make sure that a field name doesn't contain any space or invalid character, and that it does not match the name of a standard attribute (like size or shape), which would confuse the interpreter. genfromtxt accepts three optional arguments that provide finer control over the names:
