1

Say I have a 1D array:

import numpy as np
my_array = np.arange(0,10)
my_array.shape
(10,)

In Pandas I would like to create a DataFrame with only one row and 10 columns using this array. For example:

import pandas as pd
import random, string
# Random list of characters to be used as columns
cols = [random.choice(string.ascii_uppercase) for x in range(10)]

But when I try:

pd.DataFrame(my_array, columns = cols)

I get:

ValueError: Shape of passed values is (1,10), indices imply (10,10)

I presume this is because Pandas expects a 2D array, and I have a (flat) 1D array. Is there a way to inflate my 1D array into a 2D array, or to have Pandas use a 1D array when creating the DataFrame?

Note: I am using the latest stable version of Pandas (0.11.0)

Amelio Vazquez-Reina

4 Answers

3

Your value array has length 9 (values from 1 to 9), and your cols list has length 10.

I don't understand your error message. Based on your code, I get:

ValueError: Shape of passed values is (1, 9), indices imply (10, 9)

Which makes sense.

Try:

my_array = np.arange(10).reshape(1,10)

cols = [random.choice(string.ascii_uppercase) for x in range(10)]

pd.DataFrame(my_array, columns=cols)

Which results in:

   F  H  L  N  M  X  B  R  S  N
0  0  1  2  3  4  5  6  7  8  9
Rutger Kassies
2

Either of these should do it:

my_array2 = my_array[None] # same as my_array2 = my_array[np.newaxis]

or

my_array2 = my_array.reshape((1,10)) 
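
As a quick check, here is a minimal sketch confirming that both forms turn the flat array into a single-row 2D array. The variable names `a` through `d` and the `reshape(1, -1)` variant (which lets NumPy infer the column count) are additions for illustration, not part of the answer above.

```python
import numpy as np

my_array = np.arange(10)            # shape (10,)

a = my_array[None]                  # add a leading axis
b = my_array[np.newaxis]            # np.newaxis is an alias for None
c = my_array.reshape((1, 10))       # explicit target shape
d = my_array.reshape(1, -1)         # -1 lets NumPy infer the column count

shapes = [x.shape for x in (a, b, c, d)]
print(shapes)
```

All four results hold the same values; only the shape changes from one dimension to two.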
Paul
1

A single-row, many-columned DataFrame is unusual. A more natural, idiomatic choice would be a Series indexed by what you call cols:

pd.Series(my_array, index=cols)

But, to answer your question: the DataFrame constructor assumes that my_array is a single column of 10 data points. Try pd.DataFrame(my_array.reshape((1, 10)), columns=cols). That works for me.
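
The following sketch illustrates the point under a few assumptions: `cols` is replaced by a fixed list of ten distinct letters so the result is reproducible, and the names `as_column`, `s`, and `row` are hypothetical. It shows that a 1D array becomes a single column by default, and that the Series can be turned into a one-row DataFrame if that layout is really needed.

```python
import numpy as np
import pandas as pd

my_array = np.arange(10)
cols = list("ABCDEFGHIJ")           # fixed labels instead of random ones (assumption)

# A 1D array is read as one column with 10 rows.
as_column = pd.DataFrame(my_array)
print(as_column.shape)

# The Series form: one value per label.
s = pd.Series(my_array, index=cols)

# If a one-row DataFrame is still wanted, transpose the Series' frame.
row = s.to_frame().T
print(row.shape)
```

Transposing the Series is one option among several; reshaping the array before constructing the DataFrame gives the same layout.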

Dan Allan
1

By using one of the alternate DataFrame constructors, you can create the DataFrame without reshaping my_array.

import numpy as np
import pandas as pd
import random, string
my_array = np.arange(0,10)
cols = [random.choice(string.ascii_uppercase) for x in range(10)]
pd.DataFrame.from_records([my_array], columns=cols)

Out[22]: 
   H  H  P  Q  C  A  G  N  T  W
0  0  1  2  3  4  5  6  7  8  9
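
For comparison, a small sketch suggesting that wrapping the array in a list also seems to work with the plain constructor, since each element of the list is treated as one row. The fixed `cols` list and the names `df_records` and `df_plain` are assumptions made for reproducibility.

```python
import numpy as np
import pandas as pd

my_array = np.arange(0, 10)
cols = list("ABCDEFGHIJ")           # fixed labels instead of random ones (assumption)

# Alternate constructor from the answer: a list holding one record.
df_records = pd.DataFrame.from_records([my_array], columns=cols)

# Plain constructor: a list of arrays is read as a list of rows.
df_plain = pd.DataFrame([my_array], columns=cols)

print(df_records.shape, df_plain.shape)
```

In both cases the array itself is left untouched; the enclosing list supplies the row dimension.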
Wouter Overmeire