Shallow and Deep Copy
Introduction
In this chapter, we will cover the question of how to copy lists and nested lists. Trying to copy lists can be a stumping experience for newbies. But first we want to summarize some insights from the previous chapter "Data Types and Variables". Compared with some other traditional programming languages, Python shows a behaviour that may look strange to beginners, even when assigning and copying simple data types like integers and strings. The difference between shallow and deep copying is only relevant for compound objects, i.e. objects containing other objects, like lists or class instances.
In the following code snippet, y points to the same memory location as x. We can see this by applying the id() function to x and y. But unlike "real" pointers like those in C and C++, things change when we assign a new value to y. In this case y will receive a separate memory location, as we have seen in the chapter "Data Types and Variables" and can see in the following example:
>>> x = 3
>>> y = x
>>> print(id(x), id(y))
9251744 9251744
>>> y = 4
>>> print(id(x), id(y))
9251744 9251776
>>> print(x,y)
3 4

Even if this internal behaviour appears strange compared to programming languages like C, C++ and Perl, the observable results of the assignments meet our expectations. But it can be problematic if we copy mutable objects like lists and dictionaries.
Python creates real copies only if it has to, i.e. if the user, the programmer, explicitly demands it.
We will introduce the most crucial problems which can occur when copying mutable objects, i.e. when copying lists and dictionaries.
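The point that a plain assignment never creates a copy can also be checked with the is operator, which tests whether two names refer to the same object. The following is only a small sketch; the names numbers and alias are chosen for illustration and do not appear in the examples of this chapter:

```python
# Illustrative names; not taken from the chapter's own examples.
numbers = [1, 2, 3]
alias = numbers            # plain assignment: no copy is made

# Both names refer to the very same list object.
same_object = alias is numbers
print(same_object)

# Rebinding the name to a new list leaves the original object untouched.
alias = [1, 2, 3]
equal_but_distinct = (alias == numbers) and (alias is not numbers)
print(equal_but_distinct)
```

Both print calls should show True: the first because the two names share one object, the second because the new list has equal content but is a different object.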
Copying a list
>>> colours1 = ["red", "green"]
>>> colours2 = colours1
>>> print(colours1)
['red', 'green']
>>> print(colours2)
['red', 'green']
>>> print(id(colours1),id(colours2))
43444416 43444416

The id() function shows us that both variables point to the same list object, i.e. they share this object.
Now we want to see what happens if we assign a new list object to colours2.

>>> colours2 = ["rouge", "vert"]
>>> print(colours1)
['red', 'green']
>>> print(colours2)
['rouge', 'vert']
>>> print(id(colours1),id(colours2))
43444416 43444200

As expected, the values of colours1 remained unchanged. As in our example in the chapter "Data Types and Variables", a new memory location was allocated for colours2, because we assigned a completely new list, i.e. a new list object, to this variable.
>>> colours1 = ["red", "green"]
>>> colours2 = colours1
>>> print(id(colours1),id(colours2))
14603760 14603760
>>> colours2[1] = "blue"
>>> print(id(colours1),id(colours2))
14603760 14603760
>>> print(colours1)
['red', 'blue']
>>> print(colours2)
['red', 'blue']
In the example above, we assign a new value to the second element of colours2, i.e. the element with the index 1. Lots of beginners will be stunned that the list colours1 has been "automatically" changed as well.
The explanation is that there has been no new assignment to colours2, only to one of its elements. Both variables still point to the same list object.
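The difference between changing a shared list in place and assigning a new list to a name can be sketched as follows. This is a minimal illustration that reuses the names colours1 and colours2; the helper variables shared_after_append and still_shared are introduced here only for the demonstration:

```python
colours1 = ["red", "green"]
colours2 = colours1                 # both names share one list object

colours2.append("blue")             # in-place change, visible through both names
shared_after_append = list(colours1)

colours2 = colours2 + ["yellow"]    # builds a new list and rebinds colours2 only
still_shared = colours1 is colours2

print(shared_after_append)
print(colours1)
print(colours2)
print(still_shared)
```

The append call modifies the one shared object, so colours1 sees the change. The concatenation creates a new list, so colours1 keeps its three elements and the two names no longer share an object.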
Copy with the Slice Operator
It's possible to completely copy shallow list structures, i.e. lists without sublists, with the slice operator without any of the side effects which we have described above:
>>> list1 = ['a','b','c','d']
>>> list2 = list1[:]
>>> list2[1] = 'x'
>>> print(list2)
['a', 'x', 'c', 'd']
>>> print(list1)
['a', 'b', 'c', 'd']

But as soon as a list contains sublists, we have the same difficulty: the copy contains just pointers to the sublists, not copies of them.

>>> lst1 = ['a','b',['ab','ba']]
>>> lst2 = lst1[:]
This behaviour is depicted in the following diagram:
If you assign a new value to the 0th element of one of the two lists, there will be no side effect. Problems arise if you change one of the elements of the sublist.
>>> lst1 = ['a','b',['ab','ba']]
>>> lst2 = lst1[:]
>>> lst2[0] = 'c'
>>> lst2[2][0] = 'd'
>>> print(lst1)
['a', 'b', ['d', 'ba']]
The following diagram depicts what happens if one of the elements of a sublist is changed: the contents of both lst1 and lst2 are changed.
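That a shallow copy shares its sublists with the original can be verified directly with the is operator. The sketch below also uses list.copy(), list() and copy.copy(); these are not covered in this chapter but are standard Python ways of making a shallow copy and behave like the slice operator in this respect. The variable names are chosen for illustration only:

```python
import copy

lst1 = ['a', 'b', ['ab', 'ba']]

# Four common ways to make a shallow copy of a list.
by_slice = lst1[:]
by_method = lst1.copy()
by_constructor = list(lst1)
by_module = copy.copy(lst1)

shallow_copies = [by_slice, by_method, by_constructor, by_module]

# Each copy is a new outer list ...
outer_is_new = all(c is not lst1 for c in shallow_copies)
# ... but the sublist inside is still the same object as in lst1.
inner_is_shared = all(c[2] is lst1[2] for c in shallow_copies)

print(outer_is_new, inner_is_shared)
```

Both values should be True, which is exactly the situation described above: the outer list is new, while the sublist exists only once.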
Using the Method deepcopy from the Module copy
The module "copy" provides a solution to the problems described above. This module provides the function "deepcopy", which allows a complete copy of an arbitrary list, i.e. shallow lists as well as lists containing sublists. The following script uses our example above and this function:
from copy import deepcopy

lst1 = ['a','b',['ab','ba']]
# deepcopy also copies the sublist, not just the outer list
lst2 = deepcopy(lst1)
# changes to the copy no longer affect lst1
lst2[2][1] = "d"
lst2[0] = "c"
print(lst2)
print(lst1)

If we save this script under the name deep_copy.py and call it with "python deep_copy.py", we will receive the following output:
$ python deep_copy.py
['c', 'b', ['ab', 'd']]
['a', 'b', ['ab', 'ba']]
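The same distinction applies to dictionaries, which were mentioned in the introduction as another mutable type. The following sketch compares copy.copy() and copy.deepcopy() on a dictionary whose values are lists. The dictionary palette and its content are hypothetical and serve only as an illustration:

```python
from copy import copy, deepcopy

# Hypothetical data, chosen only for illustration.
palette = {"warm": ["red", "orange"], "cold": ["blue"]}

shallow = copy(palette)      # new dictionary, but the value lists are shared
deep = deepcopy(palette)     # new dictionary with its own copies of the lists

palette["warm"].append("yellow")

print(shallow["warm"])
print(deep["warm"])
print(shallow["warm"] is palette["warm"])
print(deep["warm"] is palette["warm"])
```

The shallow copy sees the appended colour because it shares the list with the original, while the deep copy keeps its own unchanged list. A deep copy has to duplicate every nested object, so it generally costs more time and memory than a shallow copy.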

