How to merge two lists of different lengths as a Pandas dataframe?

Question

I have two lists that I want to combine into a pandas DataFrame. `columns` holds the header names (the CSV header) and `data` holds the rows, as a single list of rows.

import pandas as pd
columns = [column[0] for column in cursor.description]
len(columns)
>5

data = cursor.fetchall()
len(data)
>2458

len(data[0])
>5

df = pd.DataFrame(data=data, index=None, columns=columns)
>ValueError: Shape of passed values is (1, 2458), indices imply (5, 2458).

Can someone help me merge these two lists into a pandas DataFrame? Please let me know if I am missing any other details. Thank you!

@Yuca, yes it is. There are total of 2458 records with 5 different attributes and all the rows have been wrapped as a list into a list. — Rishik Mani, Oct 17 '18 at 13:24
just to make sure i'm understanding correctly, doing `df = pd.DataFrame(data)` works, no? — Yuca, Oct 17 '18 at 13:27

score 1 · Accepted Answer · answered Oct 17 '18 at 13:51

The presence of a cursor suggests you're using pyodbc. In that case `data` contains pyodbc.Row objects rather than plain tuples, and hence the pd.DataFrame constructor fails to split each row into columns.

Try this

df = pd.DataFrame([tuple(row) for row in data], columns=columns)
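
As a minimal sketch of the tuple conversion, here is a self-contained version that uses the standard-library sqlite3 module as a stand-in for pyodbc (an assumption, so that it runs without a database server). The `items` table and its column names are hypothetical. sqlite3 already returns plain tuples, so the conversion is a no-op there; with pyodbc it is the step that turns each Row into something the constructor can split.

```python
import sqlite3

import pandas as pd

# Stand-in for the pyodbc connection (assumption): an in-memory SQLite
# database with a hypothetical 5-column table.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute(
    "CREATE TABLE items (id INTEGER, name TEXT, qty INTEGER, price REAL, note TEXT)"
)
cursor.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [(1, "a", 10, 1.5, "x"), (2, "b", 20, 2.5, "y"), (3, "c", 30, 3.5, "z")],
)
cursor.execute("SELECT * FROM items")

# Column names come from the DB-API cursor description, as in the question.
columns = [column[0] for column in cursor.description]

# Fetch once, then convert every row object to a plain tuple.
data = cursor.fetchall()
rows = [tuple(row) for row in data]

df = pd.DataFrame(rows, columns=columns)
print(df.shape)

conn.close()
```

Fetching once and reusing `data` matters: a second `cursor.fetchall()` on the same result set returns an empty list.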

answered Oct 17 '18 at 13:51 by Yuca

Yes, indeed. This helped definitely. For any future travellers refer to the thorough explanation on this question [PYODBC to Pandas - DataFrame not working - Shape of passed values is (x,y), indices imply (w,z)](https://stackoverflow.com/questions/20055257/pyodbc-to-pandas-dataframe-not-working-shape-of-passed-values-is-x-y-indi) – Rishik Mani Oct 17 '18 at 13:56

score 0 · Answer 2 · answered Oct 17 '18 at 13:18

Your CSV file apparently has 5 columns, but your data is a single list of values, so you only need one column header. Pandas complains because the length of the column list (5) does not match the number of columns in your data (1). You could fix this, for example, with:

df = pd.DataFrame(data=data, index=None, columns=[columns[0]])

That is assuming that you want to use the first column name.

answered Oct 17 '18 at 13:18 by rje

that only assigns the first column attribute to all the rows. – Rishik Mani Oct 17 '18 at 13:28
