
I'm new to Python and pandas, and I don't understand why assigning an iloc slice of one DataFrame (df1) to another variable (df2) shares the data with df1 by reference instead of copying it.

The example code below illustrates my problem. If I use the assignment y = 123, then x stays unchanged. However, if I modify y through y.iloc, then x changes too, as if by magic. Is y.iloc a pointer to x? If it is, why does x stay unchanged when I reassign y directly with y = 123?

import pandas as pd

d = {'col1': [5]}
x = pd.DataFrame(data=d)
print('x equals')
print(x)
y = x.iloc[:,:]
print('y equals')
print(y)

# Why does y.iloc[0, 0] = 123 also modify x?
y.iloc[0, 0] = 123
# y = 123  # This way x is unchanged.

print('y equals')
print(y)
print('x equals')
print(x)
Abhi
user2337857
  • After some googling, my noob intuition tells me that this might be related to deep and shallow copies. If z_shallow = x.copy(deep=False), then changes to z_shallow also change x, while z_deep = x.copy(deep=True) does not modify the original x. However, I'm still not 100% sure how .iloc is related to deep and shallow copies. Anybody? – user2337857 Aug 29 '18 at 22:01
  • Simple assignment, `x = ` **never copies** and simply says "the name `x` now refers to ". Read https://nedbatchelder.com/text/names.html – juanpa.arrivillaga Aug 29 '18 at 22:03
  • `y = x.iloc[:,:]` assigns a *view* of `x` to `y`, not a copy at all. So when you mutate `y`, you will see this effect on `x`. However, doing `y = 123` is **not a mutation**, it simply assigns `y` to another object. – juanpa.arrivillaga Aug 29 '18 at 22:05
  • Thank you for explaining. So the problem arose because y was initially a view of x, and we then mutated it (as opposed to assigning a new object to y)? – user2337857 Aug 29 '18 at 22:11
  • Read this link: https://nedbatchelder.com/text/names.html for how assignment works in Python. – juanpa.arrivillaga Aug 29 '18 at 22:12
  • Note, slicing generally returns *copies* for built-in python objects, e.g. with `list` objects. However, `numpy` and `pandas` objects will return views if possible – juanpa.arrivillaga Aug 29 '18 at 22:14
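
The distinction drawn in the comments above (mutating an object versus rebinding a name, and list slices being copies) can be sketched in plain Python without pandas. This is only an illustration; the variable names ending in `_after_...` are made up for the example.

```python
# Plain-Python sketch: rebinding a name vs. mutating a shared object.
x = [5]
y = x                          # y and x are two names for the same list
y[0] = 123                     # mutation: visible through both names
shared_after_mutation = x[0]   # read the element back through x

y = 123                        # rebinding: y now refers to an int, x is untouched
x_after_rebinding = x          # x still refers to the (mutated) list

# Slicing a built-in list creates a new list (a shallow copy),
# so editing the slice does not affect the original.
a = [5]
b = a[:]
b[0] = 123

print(shared_after_mutation, x_after_rebinding, a, b)
```

In the question's code, `y.iloc[0, 0] = 123` plays the role of the mutation and `y = 123` plays the role of the rebinding.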

1 Answer


This might answer your question: "why should I make a copy of a data frame in pandas"

'In Pandas, indexing a DataFrame returns a reference to the initial DataFrame'
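
If the goal is for y to be independent of x, one option is to request a copy explicitly. The following is a minimal sketch reusing the question's variable names; whether a plain `x.iloc[:, :]` shares data with `x` can depend on the pandas version and its Copy-on-Write setting, so the explicit `.copy()` removes the ambiguity.

```python
import pandas as pd

x = pd.DataFrame({'col1': [5]})

# Take the same slice as in the question, but copy it explicitly
# so that y owns its own data.
y = x.iloc[:, :].copy()

# Mutating y no longer affects x.
y.iloc[0, 0] = 123

print('x equals')
print(x)
print('y equals')
print(y)
```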

ipramusinto