Editing multi-dimensional arrays in Python

Question

I am a bit confused about the behaviour of multi-dimensional Python "arrays" (actually lists of lists, not numpy arrays). Suppose I have a 4 x 4 array whose entries are 1 x 2 arrays (that is, entries of the 4 x 4 array are lists containing 2 lists). I can initialize such an empty array with the command:

b = [[[[], ]*2, ]*4, ]*4

This creates the following multidimensional empty array:

[[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]]]

Now I want to modify a single entry of this 4 x 4 array; for instance, I want to set the component b[0][0] equal to [[1],[2]]:

[[[[1], [2]], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]],
[[[], []], [[], []], [[], []], [[], []]]]

My expectation was that the following command would do that job:

b[0][0] = [[1],[2]]

However, this leads instead to the following matrix b:

[[[[1], [2]], [[], []], [[], []], [[], []]],
[[[1], [2]], [[], []], [[], []], [[], []]],
[[[1], [2]], [[], []], [[], []], [[], []]],
[[[1], [2]], [[], []], [[], []], [[], []]]]

What is the proper way to achieve this?

If the arrays you're using really are that simple and uniform, you should give *strong* consideration to numpy. In that case, you would use `b = numpy.empty((4,4,2), dtype=int)`. Even the simple step of creating the array is 20 to 30 times faster. And if you do anything interesting with the data, numpy is frequently hundreds or thousands of times faster than MightyPork's (entirely valid) solution, and basically never slower (as far as I've seen). – Mike Jan 10 '15 at 00:31

score 6 · Accepted Answer · edited May 23 '17 at 11:58

The problem is really that when you do

b = [[[[], ]*2, ]*4, ]*4

you are copying references to the same arrays, so your new large multi-dimensional array contains many references to the same arrays.

Here's an example with IPython:

In [22]: a = [[],]*10

In [23]: a
Out[23]: [[], [], [], [], [], [], [], [], [], []]

In [24]: a[1].append(12)

In [25]: a
Out[25]: [[12], [12], [12], [12], [12], [12], [12], [12], [12], [12]]

For a better way to create your array, head over here

Possible solution to what you want:

m = [[[[] for x in range(2)] for x in range(4)] for x in range(4)]
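
As a quick check, here is a minimal sketch that builds the array with the comprehension above and then repeats the assignment from the question. The index choices and the appended value 7 are only for illustration.

```python
# Build the 4 x 4 grid of [[], []] cells with nested comprehensions,
# so that every inner list is a distinct object.
m = [[[[] for _ in range(2)] for _ in range(4)] for _ in range(4)]

# Replace a single cell, as in the question.
m[0][0] = [[1], [2]]

# Only the first cell of the first row has changed.
print(m[0][0])  # [[1], [2]]
print(m[1][0])  # [[], []]

# In-place mutation also stays local to one cell.
m[2][3][0].append(7)
print(m[2][3])  # [[7], []]
print(m[2][2])  # [[], []]
```

Because each comprehension evaluates its inner expression once per iteration, no two cells share a list, which is why the other rows stay empty.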

answered Jan 10 '15 at 00:16 by MightyPork

Oh, I see. And what is the proper way to initialize such an empty array so that its components are actually independent? Maybe using `numpy` and then converting back to a list of lists? – STU Jan 10 '15 at 00:20
You can do it using a list comprehension; see the question I linked. It'd be silly to just copy it here. NumPy would work too, of course. – MightyPork Jan 10 '15 at 00:20
Why would you need to convert to a list of lists? – Andrew_Lvov Jan 10 '15 at 00:21
@STU see now, I added something that might work for you – MightyPork Jan 10 '15 at 00:24
Thanks, now it is clear! Andrew_Lvov, the reason I want to use lists of lists is that this is a small piece of a long module that uses `+` to concatenate lists, `append`, etc. So it would take me considerable effort to modify the code so that it works with `numpy` arrays instead of lists of lists (and performance of the module is not an important aspect here). – STU Jan 10 '15 at 00:31

score 0 · Answer 2 · answered Jan 10 '15 at 00:20

[[]]*2 will create an array of two references to the same empty array. When you try to change one of them, the other reference reflects the change since it points to the same one. In your case each row points to the same array. When you change the first one, you're changing the array all four rows point to.
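
One way to see the sharing directly is with the `is` operator, which tests whether two names refer to the same object. The following is a small sketch using the initialization from the question; the variable name `row` is only for illustration.

```python
# Multiplying a list repeats references; it does not copy the elements.
row = [[]] * 2
print(row[0] is row[1])  # True: both slots hold the same empty list

# The same holds at every level of the question's initialization.
b = [[[[], ] * 2, ] * 4, ] * 4
print(b[0] is b[3])        # True: all four rows are one list object
print(b[0][0] is b[0][1])  # True: all cells in that row are one object

# Assigning to b[0][0] rebinds slot 0 of the single shared row,
# so the change is visible through every "row".
b[0][0] = [[1], [2]]
print(b[3][0])  # [[1], [2]]
print(b[0][1])  # [[], []]
```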

Feel free to get acquainted with the NumPy library.
