1

I have two lists that I want to combine into a pandas DataFrame. columns holds the header names (the CSV header), and data holds the rows as a single list.

import pandas as pd
columns = [column[0] for column in cursor.description]
len(columns)
>5

data = cursor.fetchall()
len(data)
>2458

len(data[0])
>5

df = pd.DataFrame(data=data, index=None, columns=columns)
>ValueError: Shape of passed values is (1, 2458), indices imply (5, 2458).

Can someone help me merge these two lists into a pandas DataFrame? Please let me know if I am missing any other details. Thank you!

Scott Boston
Rishik Mani

2 Answers

1

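The presence of a cursor suggests you're using pyodbc. In that case, data contains pyodbc.Row objects rather than plain tuples, so the pd.DataFrame constructor treats each row as a single value instead of splitting it into five columns. That is why the error reports a passed shape of (1, 2458).

Try this: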

df = pd.DataFrame([tuple(t) for t in cursor.fetchall()], columns=columns)
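
A minimal, self-contained sketch of the same idea follows. It assumes an in-memory sqlite3 database as a stand-in for the pyodbc connection (the table `items` and its columns are made up for illustration), with `sqlite3.Row` playing the role of a row object that is not a plain tuple.

```python
import sqlite3

import pandas as pd

# Assumption: sqlite3 stands in for pyodbc; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows come back as Row objects, not plain tuples
conn.execute("CREATE TABLE items (id INTEGER, name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "apple", 0.5), (2, "pear", 0.75)],
)

cursor = conn.execute("SELECT id, name, price FROM items ORDER BY id")
columns = [column[0] for column in cursor.description]  # header names, as in the question
data = cursor.fetchall()

# Convert every row object to a plain tuple so pandas sees one value per column.
records = [tuple(row) for row in data]
df = pd.DataFrame(records, columns=columns)
conn.close()
```

If the rows have already been fetched into data, as in the question, convert that list rather than calling cursor.fetchall() a second time, because the second call only returns rows that have not been fetched yet.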
Yuca
  • Yes, indeed, this definitely helped. For any future travellers, see the thorough explanation in this question: [PYODBC to Pandas - DataFrame not working - Shape of passed values is (x,y), indices imply (w,z)](https://stackoverflow.com/questions/20055257/pyodbc-to-pandas-dataframe-not-working-shape-of-passed-values-is-x-y-indi) – Rishik Mani Oct 17 '18 at 13:56
0

Your CSV file apparently has 5 columns, but your data is a single list of values, which means you only need 1 column header. Pandas complains because the length of the column list (5) does not match the number of columns in your data (1). You could fix this, for example, with:

df = pd.DataFrame(data=data, index=None, columns=[columns[0]])

This assumes that you want to use the first column name.
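
As a rough illustration of this single-column approach, here is a small sketch with made-up values. It only applies if data really is a flat list of scalars; in the question, len(data[0]) is 5, so each row there carries five fields.

```python
import pandas as pd

# Hypothetical header names and a flat list of scalar values.
columns = ["id", "name", "price", "quantity", "updated"]
data = [10, 20, 30]

# One value per row, so only one column header is passed.
df = pd.DataFrame(data=data, index=None, columns=[columns[0]])
```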

rje