15-112 Lecture 7 (Tuesday, June 4, 2013)

Sorting in Python

Among the collections we have seen in Python, strings, tuples, and lists are ordered. Of these, only lists can be changed; strings and tuples are immutable. One consequence of this is that a list, unlike the other collections, can be sorted in place.

The two ways of sorting a list are via the sort() method, and the sorted() function. In the simple case, the examples below do exactly what one would expect:


sl = sorted ( [5, 7, 3, 4, 1] )
print sl                            # prints [1, 3, 4, 5, 7]

usl = [5, 7, 3, 4, 1]
usl.sort()
print usl                           # prints [1, 3, 4, 5, 7]
  

The example below highlights an important difference. sorted() returns a new sorted list, whereas the sort() method modifies the list in place and returns None. This is consistent with the idea of a function being a mapping from the input(s) to an output, whereas a method is an action of and upon an object that includes data and behaviors.


  

numbers = [5, 7, 3, 4, 1]
print sorted (numbers)              # Prints [1, 3, 4, 5, 7]
print numbers                       # Prints [5, 7, 3, 4, 1]

numbers = [5, 7, 3, 4, 1]
print numbers.sort()                # Prints None
print numbers                       # Prints [1, 3, 4, 5, 7]
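
Because sorted() builds a new list instead of changing its argument, it also accepts the immutable ordered collections. The short sketch below is not one of the original lecture examples, and its variable names are only illustrative:

```python
# print(...) with one argument runs under both Python 2 and Python 3.

# sorted() never modifies its argument, so it also works on tuples and strings.
number_tuple = (5, 7, 3, 4, 1)
sorted_from_tuple = sorted(number_tuple)
print(sorted_from_tuple)               # a list: [1, 3, 4, 5, 7]
print(number_tuple)                    # the tuple is unchanged: (5, 7, 3, 4, 1)

sorted_letters = sorted("kesden")
print(sorted_letters)                  # ['d', 'e', 'e', 'k', 'n', 's']

# Tuples have no sort() method, because sorting in place would change them.
print(hasattr(number_tuple, "sort"))   # False
print(hasattr([5, 7, 3], "sort"))      # True
```

Note that the result is always a list, whatever kind of collection was passed in.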

Sorting w/Key Functions

The sort() method and the sorted() function also accept an optional key function parameter. If you pass in a key function, each item in the list is passed to this function, and the items are sorted based upon the values the function returns, rather than upon the items themselves.

We haven't talked about optional arguments to functions yet. We will soon. But, for now, just note that since the key function isn't required, we need to tell Python which optional argument we are passing in, hence the "key=" syntax.

The example below illustrates how the lower() method can be used to force a case-insensitive sort, by sorting on lower-case versions of the strings:


#!/usr/bin/python

# Notice the case-based sorting
names = [ "barak", "George", "bill", "George", "Ron", "jimmy" ]
names.sort()
print names

# The same case-based sorting, this time with sorted()
names = [ "barak", "George", "bill", "George", "Ron", "jimmy" ]
names  = sorted(names)
print names


# Notice that the key argument foiled my attempt to break the
# sorting by lowercasing the Democrats, but didn't change the capitalization
names = [ "barak", "George", "bill", "George", "Ron", "jimmy" ]
names.sort(key=str.lower)
print names

# And...with sorted()
names = [ "barak", "George", "bill", "George", "Ron", "jimmy" ]
names  = sorted(names, key=str.lower)
print names
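
The key function does not have to be a built-in method such as str.lower. Here is a minimal sketch, assuming a hypothetical helper named last_name and made-up sample data, neither of which comes from the lecture:

```python
# print(...) with one argument runs under both Python 2 and Python 3.

def last_name(full_name):
    # Hypothetical key function: the lower-cased final word of a name.
    return full_name.split()[-1].lower()

people = ["Gregory Kesden", "barack obama", "Bill Clinton", "jimmy Carter"]

# Each name is passed to last_name(); the names are ordered by the results.
by_last_name = sorted(people, key=last_name)
print(by_last_name)
# ['jimmy Carter', 'Bill Clinton', 'Gregory Kesden', 'barack obama']

# Built-in functions work too: len orders the names from shortest to longest.
by_length = sorted(people, key=len)
print(by_length)
# ['barack obama', 'Bill Clinton', 'jimmy Carter', 'Gregory Kesden']
```

In the second call three names have the same length. Python's sort is stable, so those ties keep their original relative order.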

Parallel Assignments

One of my favorite features of Python is its ability to perform parallel assignments. It is a pretty unique feature, which isn't available in any of the other languages I regularly use.

Essentially, it works like this. Python can copy from one tuple to another, one list to another, or between lists and tuples. This is because lists and tuples are both fully ordered collections with very similar properties. The essential difference, you might remember, is that tuples are immutable. The ability to do this type of copy, in one way or another, is pretty common fare for scripting and higher-level languages, to be honest.

But, what is unique is that Python takes it one step further and lets you treat variables as if they are temporarily part of a tuple, placing them in an order, so that you can copy from a list or a tuple to or from them. Check out the examples below:


#!/usr/bin/python

personTuple = ( "Gregory Kesden", "GHC 7711", "412-268-1590" )
personList = [ "Barack Obama", "1600 Pennsylvania Ave", "202-456-1111" ]


(name, address, phoneNumber) =  personTuple
print name
print address
print phoneNumber
print ""

(name, address, phoneNumber) =  personList
print name
print address
print phoneNumber
print ""

(name, address, phoneNumber) = ("Gregory Kesden", "GHC 7711", "412-268-1590")
print name
print address
print phoneNumber
print ""

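# A plain assignment is not a parallel assignment: personList now names the
# same tuple as personTuple, and name, address, and phoneNumber are unchanged
personList = personTuple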
print name
print address
print phoneNumber
print ""
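
Two further consequences of parallel assignment are sketched below, using illustrative variable names that do not appear in the lecture. Values can be swapped without a temporary variable, and the number of names on the left must match the number of items on the right.

```python
# print(...) with one argument runs under both Python 2 and Python 3.

first = "Gregory"
second = "Kesden"

# The right-hand side is evaluated as a tuple first, then unpacked into the names.
first, second = second, first
print(first)     # Kesden
print(second)    # Gregory

# The counts must match; otherwise Python raises a ValueError.
try:
    (name, address) = ("Gregory Kesden", "GHC 7711", "412-268-1590")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
print(mismatch_detected)   # True
```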

Stripping White Space

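People tend to be very insensitive to white space in strings; we just don't notice it very easily. As a result, when processing strings entered by humans, we often want to strip out the extra white space (for example spaces, tabs, etc.), leaving the rest of the string. There are three methods of help to us: strip(), which removes white space from both ends of the string; lstrip(), which removes it from the left (leading) end only; and rstrip(), which removes it from the right (trailing) end only. None of them touches white space in the interior of the string, and each returns a new string rather than changing the original.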

Please consider the example below:


#!/usr/bin/python

spacedPhrase = "   Greetings and      Welcome     "

print "phrase: ---" + spacedPhrase + "---"
print "strip: ---" + spacedPhrase.strip() + "---"
print "lstrip: ---" + spacedPhrase.lstrip() + "---"
print "rstrip: ---" + spacedPhrase.rstrip() + "---"
print "lstrip + rstrip: ---" + spacedPhrase.lstrip().rstrip() + "---"
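
To make the results of the three methods easier to check, the sketch below builds the same phrase from its parts and compares each result with the expected string. The variable names are illustrative. The last step goes beyond the lecture: collapsing interior white space with split() and join() is a common idiom, shown here only as an assumption about what one might want next.

```python
# print(...) with one argument runs under both Python 2 and Python 3.

core = "Greetings and      Welcome"        # interior spaces are part of the text
spaced_phrase = "   " + core + "     "     # 3 leading and 5 trailing spaces

stripped = spaced_phrase.strip()
left_stripped = spaced_phrase.lstrip()
right_stripped = spaced_phrase.rstrip()

print(stripped == core)                    # True: both ends trimmed
print(left_stripped == core + "     ")     # True: trailing spaces remain
print(right_stripped == "   " + core)      # True: leading spaces remain

# Strings are immutable, so the original still has all of its spaces.
print(len(spaced_phrase) == len(core) + 8) # True

# Not from the lecture: split() with no arguments breaks the string on runs
# of white space, and join() glues the words back with single spaces.
collapsed = " ".join(spaced_phrase.split())
print(collapsed)                           # Greetings and Welcome
```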

Slicing Strings, Lists, and Tuples

We've played with three important ordered collections: strings, lists, and tuples. Although these structures differ in their properties and their intended uses, they are all indexable and maintain a totally ordered list of elements. In other words, without loss of generality, we can ask for the 0th, 1st, 2nd, 3rd, ..., (n-1)th element of any of them. We do this with the []-bracket notation, e.g. namesList[3], or personAttributeTuple[1].
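
As a quick illustration of the bracket notation, the sketch below indexes a string, a list, and a tuple in the same way. The sample data and variable names are made up for this example.

```python
# print(...) with one argument runs under both Python 2 and Python 3.

name_string = "Kesden"
names_list = ["Gregory", "Michael", "Kesden"]
person_tuple = ("Gregory Kesden", "GHC 7711", "412-268-1590")

print(name_string[0])      # K
print(names_list[2])       # Kesden
print(person_tuple[1])     # GHC 7711

# Valid indexes run from 0 to n-1, where n is the length of the collection.
last_index = len(names_list) - 1
print(names_list[last_index])   # Kesden

# Asking for index n itself raises an IndexError.
try:
    names_list[len(names_list)]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)        # True
```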

But, recall from earlier in the semester that we can also specify a range within the brackets, as [startInclusive:endExclusive], which returns a new tuple/list/string containing the items from startInclusive up to, but not including, endExclusive. The resulting slice is of the same type as the original, be it string, tuple, or list. In other words, slicing a list results in a list, slicing a tuple results in a tuple, and slicing a string results in a string.

Please find a quick example below:


#!/usr/bin/python

# Prints ('Gregory', 'Michael')
# Notice the ()-parens: the slice is still a tuple
# Notice the element at index 0 is included, but the one at index 2 is not.
nameTuple = ("Gregory", "Michael", "Kesden" )
print nameTuple[0:2] 


# Prints ['Gregory', 'Michael']
# Notice the []-brackets: the slice is still a list
# Notice the element at index 0 is included, but the one at index 2 is not.
nameList = ["Gregory", "Michael", "Kesden"]
print nameList[0:2] 


# Prints Gr
# print shows no quotes, but the slice is still a string
# Notice the character at index 0 is included, but the one at index 2 is not.
nameString = "Gregory Michael Kesden"
print nameString[0:2]
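
The claim that a slice has the same type as the original can be checked directly. This is a small sketch that reuses the data from the example above under illustrative variable names; the appended name "Smith" is made up.

```python
# print(...) with one argument runs under both Python 2 and Python 3.

name_tuple = ("Gregory", "Michael", "Kesden")
name_list = ["Gregory", "Michael", "Kesden"]
name_string = "Gregory Michael Kesden"

tuple_slice = name_tuple[0:2]
list_slice = name_list[0:2]
string_slice = name_string[0:2]

print(type(tuple_slice) is tuple)    # True
print(type(list_slice) is list)      # True
print(type(string_slice) is str)     # True

# A slice is a new object, so changing it does not affect the original.
list_slice.append("Smith")
print(list_slice)                    # ['Gregory', 'Michael', 'Smith']
print(name_list)                     # ['Gregory', 'Michael', 'Kesden']
```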