Why does np.array([1, "a"]) consume Unicode String of 21 characters?

715 Views Asked by Turtle At 25 December 2020 at 11:20

When I check the data type of an array holding a one-character string, I get the dtype <U1, as expected.

print(numpy.array(["a"]).dtype)

Output : <U1

But after adding an integer to the array, why does it consume 21 characters?

print(numpy.array([1,"a"]).dtype)

Output : <U21

Dani Mesejo On 25 December 2020 at 11:48 BEST ANSWER

Why does it consume 21 characters?

Because the elements are being promoted. This means NumPy converts them to a common data type, which the promote_types documentation describes as:

"the data type with the smallest size and smallest scalar kind to which both type1 and type2 may be safely cast."

For example, if we use promote_types:

print(np.promote_types('i8', '<U1'))

Output

<U21
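
As a small sketch of the same idea (the loop and the choice of integer types are only illustrative, not part of the original answer), you can ask promote_types what each integer width becomes when combined with a one-character string:

```python
import numpy as np

# Illustrative only: promote several integer dtypes against a
# one-character Unicode string and report the resulting dtype.
for int_type in (np.int8, np.int16, np.int32, np.int64):
    promoted = np.promote_types(int_type, "U1")
    # itemsize is in bytes; NumPy stores 4 bytes per Unicode character
    print(np.dtype(int_type).name, "->", promoted, "chars:", promoted.itemsize // 4)
```

The int64 row should match the <U21 shown above, and narrower integer types should need fewer characters.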

Regarding the U21, it consists of two parts: as you already know, the U denotes Unicode, and the 21 denotes the number of characters each element can hold; see more on this answer.

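So, as 'i8' is int64 (an 8-byte integer), and the string form of an int64 can need up to 20 characters (19 digits plus a sign), the result is U21, which is wide enough for any int64 value. The default integer type is platform dependent, though. To find the number of characters a number can have, you can do: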

ii64 = np.iinfo(np.int64)
print(ii64)

Output

Machine parameters for int64
---------------------------------------------------------------
min = -9223372036854775808
max = 9223372036854775807
---------------------------------------------------------------

In particular:

print(len(str(ii64.min)))

Output

20
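
To generalize that check, here is a minimal sketch (the helper name max_str_len is made up for this example) that computes the longest decimal string each integer type can produce:

```python
import numpy as np

def max_str_len(int_type):
    """Longest decimal string (sign included) among the type's extreme values."""
    info = np.iinfo(int_type)
    return max(len(str(info.min)), len(str(info.max)))

# Compare a few integer types; the selection here is arbitrary.
for int_type in (np.int8, np.int16, np.int32, np.int64, np.uint64):
    print(np.dtype(int_type).name, max_str_len(int_type))
```

For int64 the longest value is the minimum, whose sign takes one extra character, so the 21 characters NumPy reserves are enough for any int64.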

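You can keep U1 by doing:

print(np.array(["a", 1]).dtype) # put the string first

Output

<U1

Note that this result depends on the NumPy version: array coercion was reworked around NumPy 1.20, so newer releases may give <U21 here as well, regardless of element order.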
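
Since the inferred width depends on how NumPy promotes the elements, a more explicit option is to pass dtype yourself. This is a sketch under that assumption, not something taken from the linked issue:

```python
import numpy as np

# Request a one-character Unicode dtype explicitly, whatever the element order.
fixed = np.array([1, "a"], dtype="U1")
print(fixed.dtype)

# Or keep the original Python objects instead of converting them to strings.
mixed = np.array([1, "a"], dtype=object)
print(mixed.dtype, [type(x).__name__ for x in mixed])
```

With a fixed width such as U1, a value with more characters than the requested width cannot be stored in full, so pick a width that fits your data.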

See more on this GitHub issue.
