A list is a way of storing an ordered series of values in a structure referenced by a single variable.
#!/usr/bin/python

#initialize a couple of lists
li = ['duck','season']
li2 = ['wabbit','season']
li3 = ['...Fire']

#print lists
print li
print li2
print li3
print

#access single item in list
print li[0], li2[0], li3[0]
print li[:]
print
Here, we put something in a variable, asked for it back, and got what we put in. By enclosing a series of values separated by commas in a pair of square brackets, we now have a list.
Lists are special and, as such, have lots of really useful features. You can access individual items in a list or entire sections (slices). You can also call methods on your list using the '.' operator.
Adding to a list
#add single item to list
li.append(li2[0])
print li
li.append(li2[1])
print li
print

#combine two lists by extension
li.extend(li2)
print li2
print li
print

#combine two lists by concatenation
li4 = li + li3
print li
print li3
print li4
print

#print out slice of list
print li4
print li4[2:]
print li4[:4]
print li4[1:3]
print
This script illustrates several of the most common ways to grow lists:
1) The list method append(), which adds a single item to the end of a list
2) The list method extend(), which adds a whole list to the end of the list you ask to extend itself
3) The list concatenation operator, which stitches two things together to make a new whole, without changing either one
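The three ways of growing a list can be compared side by side. The sketch below is not part of the original script: the variable names are made up for illustration, and it uses Python 3 print() syntax rather than the Python 2 print statement used in this tutorial. It also shows what happens if you append() a whole list instead of extending with it.

```python
# Illustrative names only; Python 3 syntax (print is a function).
base = ['duck', 'season']
extra = ['wabbit', 'season']

appended = list(base)      # copy so that base itself stays unchanged
appended.append(extra)     # append() adds ONE item, here the whole list
print(appended)            # ['duck', 'season', ['wabbit', 'season']]

extended = list(base)
extended.extend(extra)     # extend() adds each item of extra in turn
print(extended)            # ['duck', 'season', 'wabbit', 'season']

combined = base + extra    # + builds a new list; base and extra are untouched
print(combined)            # ['duck', 'season', 'wabbit', 'season']
print(base, extra)
```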
Shrinking a list
#sick of old quote
#initialize new list
li = ['Why','do','superheroes','wear','spandex?']
print li
print

#remove a particular item from the list
del li[3]
print li
print

#remove the last item from the list and save it
word = li.pop()
print li
print word
print
Here, we are removing things from lists in two ways:
1) The del statement removes a particular item from the list
2) The list method pop() removes the last item from the list and returns it
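As a small companion sketch (Python 3 print() syntax, variable names chosen for illustration), here are del and pop() next to two related tools the script above does not use: pop() with an index, and remove(), which deletes by value rather than by position.

```python
words = ['Why', 'do', 'superheroes', 'wear', 'spandex?']

del words[3]               # delete by position; nothing is returned
last = words.pop()         # remove the last item and hand it back
first = words.pop(0)       # pop() also accepts an index
words.remove('do')         # remove() deletes the first matching value

print(words)               # ['superheroes']
print(last, first)         # spandex? Why
```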
Changing lists in place
In addition to adding things to lists and taking them away again, we can also change lists in place.
#and now for some numbers
#create list of zeros
noLi = 4*[0]
print noLi
print

#modify items in list
Matt = 10
Rich = 20
Terry = 500
noLi[2] = Matt
noLi[1] = Terry
noLi[3] = Rich
print noLi
print

#sort list
noLi.sort()
print noLi

#reverse order
noLi.reverse()
print noLi
print

#sort string list
print li
li.sort()
print li
li.reverse()
print li
Now, we've modified the list in a couple of important ways:
1) Overwrote items in the list by assigning to an index
2) Sorted the list using the list method sort()
3) Reversed the order of the list using the list method reverse()
These are, by the way, all demonstrations of the mutability of lists. Unlike when you change a string or a number, you don't have to perform an operation and store the result; these operations change the list in place.
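To make "in place" concrete, here is a minimal sketch (Python 3 print() syntax, illustrative names) that contrasts a list method, which changes the list and returns nothing, with a string method, which leaves the string alone and returns a new one. The last part shows a consequence of mutability: two names for the same list see the same changes.

```python
numbers = [0, 500, 10, 20]
result = numbers.sort()    # sorts in place and returns None
print(numbers)             # [0, 10, 20, 500]
print(result)              # None

word = 'spandex'
shouted = word.upper()     # strings are immutable: a new string comes back
print(word, shouted)       # spandex SPANDEX

alias = numbers            # a second name for the SAME list, not a copy
alias[0] = -1
print(numbers)             # [-1, 10, 20, 500]
```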
Characterizing lists
#figure out how long a list is
print li
print len(li)
print noLi
print len(noLi)
print 'Max =', max(noLi)
print 'Min =', min(noLi)
print

#iterate over list
for x in li:
    print x
print

#find where something is stored
ind = li.index('Why')
print ind
giving:
$./data_structures_lists.py
['superheroes', 'do', 'Why']
3
[500, 20, 10, 0]
4
Max = 500
Min = 0
superheroes
do
Why
2
Here, we have started to characterize our lists.
1) The built-in functions len(), max() and min() tell us how many items are in the list and the maximum and minimum values in the list.
2) The list method index() tells us where an item is in the list.
3) We can iterate over each item in the list and print it using the syntax for X in []:
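A short sketch of the same ideas in Python 3 print() syntax. Two additions go beyond the script above and are offered only as illustration: the in test, which checks membership without risking the ValueError that index() raises for a missing item, and enumerate(), which hands back each position along with each item.

```python
li = ['superheroes', 'do', 'Why']

print(len(li))                      # 3
print(li.index('Why'))              # 2
print('spandex?' in li)             # False; index() would raise ValueError here

# enumerate() yields (position, item) pairs while iterating
for position, item in enumerate(li):
    print(position, item)
```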
Lists are the MOST complicated data structure. It gets easier from here :O)
Dictionaries
You can imagine a dictionary as just that-- a dictionary. To retrieve information out of it, you look up a word, called a key, and you find information associated with that word, called the key's value.
#!/usr/bin/python

#initialize a couple of dictionaries
names = {'terry':'lang','matt':'davis','rich':'lusk'}
drinks = {'terry':'coffee','brian':'tea','peter':'soda'}
wines = {'red':'cabernet','white':'pinot grigio', \
         'sparkling':'blanc de noirs','sticky':'muscato'}

#print dictionaries
print names
print drinks
print wines
print

#find a particular value
print names['terry']
print drinks['peter']
print wines['sparkling']
print
We have changed two values in place by modifying the key:value pair. Why won't the list method sort() work?
Characterizing dictionaries
#identify components of the dictionary
keys = wines.keys()
values = wines.values()
print keys
print values
print

#iterate over keys
for x in keys:
    print 'The category is', x, 'and the varietal is', wines[x]
print

#sort keys and iterate over keys
keys.sort()
for x in keys:
    print 'The category is', x, 'and the varietal is', wines[x]
print

#find if something is stored
if wines.has_key('red'): print wines['red']
print
The category is white and the varietal is pinot grigio
The category is sparkling and the varietal is blanc de noirs
The category is red and the varietal is cabernet
The category is sticky and the varietal is muscato
The category is red and the varietal is cabernet
The category is sparkling and the varietal is blanc de noirs
The category is sticky and the varietal is muscato
The category is white and the varietal is pinot grigio
cabernet
Here, we have started to characterize our dictionaries.
1) The dictionary methods keys() and values() return lists containing the keys or values. These lists can be stored and acted on as lists.
2) We can iterate over the keys and print the values using the syntax for X in []:
3) We can quickly find out if a particular key already exists. NOTE: each key MUST be unique, but multiple keys can have the same value.
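The script above is Python 2. If you run Python 3, two things differ: keys() no longer returns a list (so keys.sort() fails), and has_key() has been removed. The sketch below shows equivalents that work in both versions; it uses Python 3 print() syntax, and the items() loop is an extra that the original script does not use.

```python
wines = {'red': 'cabernet', 'white': 'pinot grigio',
         'sparkling': 'blanc de noirs', 'sticky': 'muscato'}

# sorted() returns a new sorted list of the keys in Python 2 and 3 alike
for category in sorted(wines):
    print('The category is', category, 'and the varietal is', wines[category])

# 'in' tests for a key; has_key() was removed in Python 3
if 'red' in wines:
    print(wines['red'])

# items() walks over key/value pairs together
pairs = sorted(wines.items())
print(pairs[0])            # ('red', 'cabernet')
```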
You will use dictionaries and lists almost exclusively in your coding. However, there are two remaining data structures that you should know about to make your life a little easier.
Sets
Sets are unordered and unique bags of variables-- you can't index them, and you can't have more than one of any given object.
#!/usr/bin/python

#create list and then convert it into a set
list_of_letters = ['a','a','b','c','c','c','d','e']
set_of_letters = set(list_of_letters)
competing_set_of_letters = set(['c','d','e','f'])
print list_of_letters
print set_of_letters
print competing_set_of_letters
print
Here we have created several sets from lists. How are lists different from sets?
Adding to a set
#add a new element to the set
set_of_letters.add('q')
print set_of_letters
print

#combine two sets
set_of_letters.update(competing_set_of_letters)
print set_of_letters
print
We have added to the set in two ways:
1) Using the set method add(), we have added a single element
2) Using the set method update(), we have created a single nonredundant set from two separate sets
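A minimal sketch of add() and update() in Python 3 print() syntax, with illustrative variable names. The output is passed through sorted() only so that it prints in a predictable order, since sets themselves have none.

```python
letters = set(['a', 'a', 'b', 'c'])     # duplicates collapse: 3 elements
other = set(['c', 'd'])

letters.add('q')           # add() inserts one element
letters.add('a')           # adding an existing element changes nothing
letters.update(other)      # update() merges every element of other in place

print(sorted(letters))     # ['a', 'b', 'c', 'd', 'q']
print(sorted(other))       # ['c', 'd'], left unchanged
```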
Shrinking a set
#remove an item from a set
set_of_letters.discard('q')
print set_of_letters
set_of_letters.discard('q')
print set_of_letters
print
set_of_letters.remove('f')
print set_of_letters
set_of_letters.remove('f')
print set_of_letters
set(['a', 'c', 'b', 'e', 'd'])
Traceback (most recent call last):
File "./data_structures_sets.py", line 36, in <module>
set_of_letters.remove('f')
KeyError: 'f'
We have deleted items from a set using two set methods, discard() and remove(). The difference between the two becomes evident when the item to be deleted is not in the set: discard() silently does nothing, whereas remove() raises a KeyError. We will learn a lot more about these error messages and their uses later in the week.
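The difference can be seen without crashing the script by catching the error. This is only a sketch: try/except has not been introduced at this point in the course, the names are illustrative, and the syntax is Python 3.

```python
letters = set(['a', 'b', 'f'])

letters.discard('q')       # 'q' is absent: discard() quietly does nothing

try:
    letters.remove('q')    # 'q' is absent: remove() raises KeyError
    outcome = 'removed'
except KeyError as err:
    outcome = 'KeyError for %s' % err
print(outcome)             # KeyError for 'q'
print(sorted(letters))     # ['a', 'b', 'f']
```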
Changing sets in place
NOT going to happen. Why?
Characterizing sets
You can treat sets like Venn diagrams.
set_of_letters = set(list_of_letters)

#combination of sets
union = competing_set_of_letters.union(set_of_letters)
print union

#where the sets overlap
intersection = competing_set_of_letters.intersection(set_of_letters)
print intersection

#in one set but not in the other
difference1 = set_of_letters.difference(competing_set_of_letters)
print difference1
difference2 = competing_set_of_letters.difference(set_of_letters)
print difference2

#in either set but not both
symmetric_difference = \
    competing_set_of_letters.symmetric_difference(\
    set_of_letters)
print symmetric_difference

#subset
subset = competing_set_of_letters.issubset(set_of_letters)
print subset
competing_set_of_letters.remove('f')
subset = competing_set_of_letters.issubset(set_of_letters)
print subset
Here we have used a variety of mathematical descriptors to compare two sets. We can also use the methods we used to characterize lists to explore sets.
#length and iterating over sets
print set_of_letters, len(set_of_letters)
print
for x in set_of_letters:
    print x
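The Venn-diagram methods used earlier in this section also have operator spellings. The sketch below is an aside rather than part of the original script; it rebuilds the two sets under shorter, made-up names and uses Python 3 print() syntax.

```python
letters = set('abcde')
competing = set('cdef')

print(sorted(letters | competing))   # union: a b c d e f
print(sorted(letters & competing))   # intersection: c d e
print(sorted(letters - competing))   # difference: a b
print(sorted(letters ^ competing))   # symmetric difference: a b f
print(competing <= letters)          # subset test: False, 'f' is missing
```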
OK. One more data structure and then we're done...
Tuples
A tuple is like an immutable list: it can neither be grown nor shrunk in place, but tuples can be combined to make new tuples.
#!/usr/bin/python
print "Welcome to tuples..."

#initialize a tuple
li = ('duck','season')
li2 = 'wabbit','season'

#print tuple
print li
print li2
print

#access single item in tuple
print li[0], li2[0]
print
print "Making tuple longer..."

#combine two tuples by concatenation
li4 = li + li2
print li
print li2
print li4
print

#print out slice of tuple
print li4
print li4[2:]
print li4[:4]
print li4[1:3]
print li4[-1]
print
print "Characterize tuple..."

#figure out how long a tuple is
print li4
print len(li4)
print 'Max =', max(li4)
print 'Min =', min(li4)
print

#iterate over tuple
for x in li4:
    print x
print
giving:
$./data_structures_tuples.py
Welcome to tuples...
('duck', 'season')
('wabbit', 'season')
Characterize tuple...
('duck', 'season', 'wabbit', 'season')
4
Max = wabbit
Min = duck
duck
season
wabbit
season
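What "immutable" means in practice can be sketched as follows (Python 3 print() syntax, illustrative names; the try/except is used only so the example keeps running after the error):

```python
pair = ('duck', 'season')

try:
    pair[0] = 'wabbit'     # item assignment is not allowed on a tuple
    outcome = 'changed'
except TypeError:
    outcome = 'TypeError'
print(outcome)             # TypeError

longer = pair + ('wabbit', 'season')   # concatenation builds a NEW tuple
print(pair)                # ('duck', 'season')
print(longer)              # ('duck', 'season', 'wabbit', 'season')

single = ('x',)            # the trailing comma makes a one-item tuple
print(len(single))         # 1
```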
Phew! All done :O) Now, let's review...
Summary
Lists:
1) Ordered collection of arbitrary variables
2) Accessible by slicing
3) Can be grown or shrunk in place
4) Mutable (can be changed in place)
5) list = [X,Y]
Dictionaries:
1) Unordered collection of arbitrary variables
2) Accessible by keys
3) Can be grown or shrunk in place
4) Mutable
5) dict = {X:Y}
Sets:
1) Unordered collection of arbitrary variables
2) Not accessible by indexing or slicing
3) Can be grown or shrunk in place
4) Mutable (elements can be added or removed, but not changed by index)
5) Set = set([X,Y])
Tuples:
1) Ordered collection of arbitrary variables
2) Accessible by slicing
3) Fixed length (but nestable -- see this afternoon)
4) Immutable
5) tuple = (X,Y) OR X,Y
Exercises
1) Get comfortable with new data structures (adapted from Learning Python)
Create a new file. Type in the commands below. Execute the script. Add comments to your script describing what is happening in each line.
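print ('x',)[0]
print ('x','y')[1]
L = [1,2,3] + [4,5,6]
print L, L[:], L[:0], L[-2], L[-2:]
print ([1,2,3]+[4,5,6])[2:4]
L.reverse()
print L
L.sort()
print L
L.index(4)
print {'a':1, 'b':2}['b']
D = {'x':1, 'y':2, 'z':3}
D['w'] = 0
print D['x'] + D['w']
print D.keys(), D.values(), D.has_key('z')
Solution: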
print ('x',)[0]
#print the first value in this tuple -> x
print ('x','y')[1]
#print the second value in this tuple -> y
L = [1,2,3] + [4,5,6]
#concatenate these two lists L=[1,2,3,4,5,6]
print L, L[:], L[:0], L[-2], L[-2:]
#print the full list, then a copy of the full list (print L[:] gives the same results as print L),
#then an empty list (L[:0] stops before the first element),
#then the second element from the end (in this case, 5),
#then from the second to last element to the end of the list ([5, 6])
print ([1,2,3]+[4,5,6])[2:4]
#concatenate these two lists and print the 3rd and 4th elements ([3,4])
L.reverse()
#reverse L, now L=[6,5,4,3,2,1]
print L
#[6,5,4,3,2,1]
L.sort()
#sort L in numerical order, now L=[1,2,3,4,5,6]
print L
#[1,2,3,4,5,6]
L.index(4)
#3, the index of the element '4'
print {'a':1, 'b':2}['b']
#Print the value of the key 'b' from this dictionary (2)
D = {'x':1, 'y':2, 'z':3}
#make a dictionary with 3 key/value pairs
D['w'] = 0
#add another key value pair to the dictionary
print D['x'] + D['w']
#print the value of adding the values D['x'] and D['w']. Since the values of this
#dictionary are integers, they are added together. If the values were '1' and '0',
#"+" would concatenate them, and 10 would be printed
print D.keys(), D.values(), D.has_key('z')
#print the keys of the dictionary ['w', 'y', 'x', 'z'], print the values of
#the dictionary [0, 2, 1, 3], see if the dictionary has a key 'z' (True)
2) Getting to know your TAs (adapted from Learning Python)
Introduce yourselves to your TAs and find out their names. Create a new file. Define a new list containing the names of the TAs and the names of your teachers. Add appropriate print statements to the file to answer the following:
a. What happens when you try to index out of bounds (eg. L[10])?
b. What about slicing out of bounds (eg. L[-100:100])?
c. What happens when you try to extract a sequence in reverse, with the lower bound greater than the higher bound (eg. L[3:1])? Hint: try assigning to this slice (L[3:1] = ['?']) and see where the value is put. Do you think this may be the same phenomenon you saw when slicing out of bounds?
Solution:
teachers = ['Rich','Angela','Rose','Aaron','Rich','Terry','Matt']
print "teachers=%s" % teachers
print
print "teachers[10] will give an error, because the list index only goes to 6"
print
print """teachers[-100:100] will return %s""" % teachers
print
print "Python does an out-of-bounds check when you index a single item"
print "but it truncates out-of-bounds indices when you take a slice."
print
print
print "teachers[3:1] returns [], an empty list"
print
print "Slicing will return the items in the list STARTING with the first index"
print "UP TO, but not including, the last index. It will not slice"
print "in a reverse direction, unless specified to do so: try teachers[3:1:-1]."
print
print "This phenomenon is different than slicing out of bounds. Try teachers[100:-100]"
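The solution above describes the behaviour in print statements. The sketch below actually exercises it, including the hint about assigning to L[3:1]. It uses Python 3 print() syntax, and try/except is used only so the script survives the out-of-bounds index.

```python
teachers = ['Rich', 'Angela', 'Rose', 'Aaron', 'Rich', 'Terry', 'Matt']

try:
    teachers[10]                       # single index: bounds are checked
    outcome = 'no error'
except IndexError:
    outcome = 'IndexError'
print(outcome)                         # IndexError

print(teachers[-100:100] == teachers)  # True: slice bounds are clipped
print(teachers[3:1])                   # []: start is past stop, nothing selected
print(teachers[3:1:-1])                # ['Aaron', 'Rose']: a negative step walks backwards

copy = list(teachers)
copy[3:1] = ['?']                      # assigning to the empty slice inserts at index 3
print(copy[2:5])                       # ['Rose', '?', 'Aaron']
```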
3) Getting to know your neighbors (adapted from Learning Python)
Introduce yourself to four of your neighbors and find out what lab they work in. Create a new file. Define a new dictionary containing the names of your neighbors as keys and their labs as values. Add appropriate print statements to the file to answer the following:
a. What happens if you try to index a non-existent key (eg print D['Terry'])?
b. What happens if you try to assign to a non-existent key (eg D['Terry'] = 'Alber')?
c. How does this compare to out-of-bound assignments for lists?
Solution:
#!/usr/bin/env python

#create an empty dictionary to store neighbors information
neighbor_dict = {}

#get name of neighbor and store it in the variable 'name'
name = raw_input("Enter neighbors name: ")
#get the lab of your neighbor and store in 'lab'
lab = raw_input("Enter neighbors lab: ")
#add key:value pair to dictionary
neighbor_dict[name] = lab

#repeat for three more neighbors
name = raw_input("Enter neighbors name: ")
lab = raw_input("Enter neighbors lab: ")
neighbor_dict[name] = lab
name = raw_input("Enter neighbors name: ")
lab = raw_input("Enter neighbors lab: ")
neighbor_dict[name] = lab
name = raw_input("Enter neighbors name: ")
lab = raw_input("Enter neighbors lab: ")
neighbor_dict[name] = lab

# Now lets print out dictionary
for neighbor in neighbor_dict:
    print "%s works in the %s lab" % (neighbor, neighbor_dict[neighbor])

# every key must have a value
# indexing with key 'Terry' produces a KeyError
# because 'Terry' does not yet exist as a key
# (left commented out so that the rest of the script still runs)
#print neighbor_dict['Terry'] # bad

# by associating a value with 'Terry', 'Terry' now exists as a key
neighbor_dict['Terry'] = 'Alber'
print neighbor_dict['Terry'] # good

# dictionary keys are created during assignment and are not confined to
# 'boundaries' as is the case with list assignments, which must be within
# the boundary of the list
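The three answers can also be checked without typing in any neighbors. In this sketch the dictionary contents are a made-up stand-in for your own entries, get() is an extra method the solution does not use, and the syntax is Python 3.

```python
labs = {'Rich': 'Lusk'}    # hypothetical entry, for illustration only

try:
    labs['Terry']          # reading a missing key raises KeyError
    outcome = 'found'
except KeyError:
    outcome = 'KeyError'
print(outcome)             # KeyError

print(labs.get('Terry'))   # None: get() returns a default instead of raising

labs['Terry'] = 'Alber'    # assigning to a missing key simply creates it
print(labs['Terry'])       # Alber

names = ['Rich']
try:
    names[5] = 'Terry'     # a list cannot be assigned past its end
    list_outcome = 'assigned'
except IndexError:
    list_outcome = 'IndexError'
print(list_outcome)        # IndexError
```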
4) Pulling it all together
Your boss has asked you to do a small bioinformatics project on LeuT (pdb code 2Q6H), a bacterial homolog of the neurotransmitter transporters that antidepressants act on. To help you out, I am providing a script that will read in a file and save the protein sequence to a list called protSeq. I have commented the code, but there are many things in here you haven't learned yet (which is why I'm giving it to you for free).
a. How many total amino acids are in the protein?
b. Print out total count of each amino acid in alphabetical order.
HINT: Protein Databank (PDB) structure files are stored at http://www.pdb.org/. Use the pdb code to find the structure. The structure file can be downloaded from Download Files >> PDB text. The structure file must be in the same directory as the script for the script to run properly.
#!/usr/bin/python

#initialize list to store sequence
protSeq = []

#open pdb file
f1 = open('2Q6H.pdb','r')

#loop over lines in file
for next in f1:
    #identify lines that contain sequences
    if next[:6] == 'SEQRES':
        #strip away white space and
        #convert line into list
        line = next.strip().split()
        #delete descriptor information
        #at beginning of each line
        del line[:4]
        #loop over amino acids in line
        for aa in line:
            #add to sequence list
            protSeq.append(aa)

#close file
f1.close()
ANSWER:
There are 519 amino acids in the protein
ALA = 54
ARG = 21
ASN = 14
ASP = 12
GLN = 6
GLU = 24
GLY = 45
HIS = 6
ILE = 54
LEU = 61
LYS = 19
MET = 13
PHE = 50
PRO = 25
SER = 18
THR = 27
TRP = 16
TYR = 17
VAL = 37
Solution:
#!/usr/bin/python

#initialize list to store sequence
protSeq = []

#open pdb file
f1 = open('2Q6H.pdb','r')

#loop over lines in file
for next in f1:
    #identify lines that contain sequences
    if next[:6] == 'SEQRES':
        #strip away white space and
        #convert line into list
        line = next.strip().split()
        #delete descriptor information
        #at beginning of each line
        del line[:4]
        #loop over amino acids in line
        for aa in line:
            #add to sequence list
            protSeq.append(aa)

print "Number of amino acids: %d" % len(protSeq)

aa_set = set(protSeq)
aa_list = list(aa_set)

# Initialize dictionary where the key is the amino acid and the value is a
# counter for how many times the amino acid is in the sequences
aa_dict = {}
for aa in aa_set:
    aa_dict[aa] = 0
# aa_dict = {}.fromkeys(aa_set, 0)...a shortcut for initializing a dictionary
print aa_dict

for aa in protSeq:
    aa_dict[aa] += 1
    # Alternatively
    # aa_dict[aa] = aa_dict[aa] + 1

print "After all the counts are done, aa_dict = ",
print aa_dict

aa_list.sort()
for aa in aa_list:
    print "%s = %d" % (aa, aa_dict[aa])

#close file
f1.close()
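The set-then-dictionary counting in the solution above can also be done with collections.Counter from the standard library. This is an alternative to the solution, not the method the exercise asks for. The short protSeq list here is a made-up stand-in so the sketch runs without the PDB file, and the syntax is Python 3.

```python
from collections import Counter

# Hypothetical stand-in for protSeq; the real list comes from 2Q6H.pdb
protSeq = ['MET', 'GLU', 'VAL', 'LYS', 'ARG', 'GLU', 'HIS', 'TRP', 'GLU']

counts = Counter(protSeq)          # maps each residue name to its count
print('Number of amino acids: %d' % len(protSeq))
for aa in sorted(counts):          # alphabetical order of residue names
    print('%s = %d' % (aa, counts[aa]))
```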
Introduction to Data Structures
Lists
A list is a way of storing an ordered series of values in a structure referenced by a single variable.
giving:
$./data_structures_lists.py
['duck', 'season']
['wabbit', 'season']
['...Fire']
duck wabbit ...Fire
Here, we put something in a variable, asked for it back, and got what we put in. By enclosing a series of values separated by commas in a pair of square brackets, we now have a list.
Lists are special and, as such, have lots of really useful features. You can access individual items in a list or entire sections (slices). You can also call methods on your list using the '.' operator.
Adding to a list
giving:
$./data_structures_lists.py
['duck', 'season', 'wabbit']
['duck', 'season', 'wabbit', 'season']
['wabbit', 'season']
['duck', 'season', 'wabbit', 'season', 'wabbit', 'season']
['duck', 'season', 'wabbit', 'season', 'wabbit', 'season']
['...Fire']
['duck', 'season', 'wabbit', 'season', 'wabbit', 'season', '...Fire']
['duck', 'season', 'wabbit', 'season', 'wabbit', 'season', '...Fire']
['wabbit', 'season', 'wabbit', 'season', '...Fire']
['duck', 'season', 'wabbit', 'season']
['season', 'wabbit']
...Fire
This script illustrates several of the most common ways to grow lists:
1) The list method append(), which adds a single item to the end of a list
2) The list method extend(), which adds a whole list to the end of the list you ask to extend itself
3) The list concatenation operator, which stitches two things together to make a new whole, without changing either one
Shrinking a list
giving:
$ ./data_structures_lists.py
['Why', 'do', 'superheroes', 'wear', 'spandex?']
['Why', 'do', 'superheroes', 'spandex?']
['Why', 'do', 'superheroes']
spandex?
Here, we are removing things from lists in two ways:
1) The del statement removes a particular item from the list
2) The list method pop() removes the last item from the list and returns it
Changing lists in place
In addition to adding things to lists and taking them away again, we can also change lists in place.
giving:
$ ./data_structures_lists.py
[0, 0, 0, 0]
[0, 500, 10, 20]
[0, 10, 20, 500]
[500, 20, 10, 0]
['Why', 'do', 'superheroes']
['superheroes', 'do', 'Why']
Now, we've modified the list in a couple of important ways:
1) Overwrote items in the list by assigning to an index
2) Sorted the list using the list method sort()
3) Reversed the order of the list using the list method reverse()
These are, by the way, all demonstrations of the mutability of lists. Unlike when you change a string or a number, you don't have to perform an operation and store the result; these operations change the list in place.
Dictionaries
You can imagine a dictionary as just that-- a dictionary. To retrieve information out of it, you look up a word, called a key, and you find information associated with that word, called the key's value.
giving:
$./data_structures_dictionaries.py
{'matt': 'davis', 'rich': 'lusk', 'terry': 'lang'}
{'brian': 'tea', 'peter': 'soda', 'terry': 'coffee'}
{'white': 'pinot grigio', 'sparkling': 'blanc de noirs', 'red': 'cabernet', 'sticky': 'muscato'}
lang
soda
blanc de noirs
To create a dictionary, you write each key-value pair as key:value, divide the pairs with commas, and surround the entire structure with curly braces.
Just like lists, dictionaries are special and have their own useful features.
Adding to a dictionary
giving:
$./data_structures_dictionaries.py
{'brian': 'tea', 'peter': 'soda', 'rich': 'tea', 'terry': 'coffee'}
{'matt': 'davis', 'rich': 'lusk', 'terry': 'lang'}
{'brian': 'tea', 'peter': 'soda', 'rich': 'tea', 'terry': 'coffee'}
{'brian': 'tea', 'matt': 'davis', 'rich': 'tea', 'terry': 'coffee', 'peter': 'soda'}
We are adding to the dictionaries in two ways:
1) Add a single new item to the dictionary by defining a new key:value pair
2) Merge two dictionaries together using the update() method. How is this different from the list method append()?
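The script that produced this output is not reproduced in this section. The following is a minimal sketch that is consistent with the printed dictionaries, assuming the names and drinks dictionaries defined at the start of the Dictionaries section. It uses Python 3 print() syntax, so the key order in the output may differ from the Python 2 output shown above.

```python
names = {'terry': 'lang', 'matt': 'davis', 'rich': 'lusk'}
drinks = {'terry': 'coffee', 'brian': 'tea', 'peter': 'soda'}

drinks['rich'] = 'tea'     # assigning to a new key adds a key:value pair
print(drinks)

names.update(drinks)       # copy every pair from drinks into names;
print(names)               # shared keys ('terry', 'rich') take the drinks value
print(drinks)              # drinks itself is unchanged by the update
```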
Shrinking a dictionary
giving:
$./data_structures_dictionaries.py
{'brian': 'tea', 'rich': 'tea', 'terry': 'coffee', 'peter': 'soda'}
We have now deleted a key:value pair from the dictionary. Why can't we use something like the list operator pop()?
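The deleting step is likewise not shown here. The sketch below assumes it was a del on the 'matt' key, which matches the output above. It also shows that dictionaries do have a pop() method of their own, but unlike the list version it must be told which key to remove. Python 3 print() syntax; the starting dictionary is copied from the previous output.

```python
# State of names after the update() step shown earlier
names = {'brian': 'tea', 'matt': 'davis', 'rich': 'tea',
         'terry': 'coffee', 'peter': 'soda'}

del names['matt']          # del removes the pair stored under a key
print(names)

# dict.pop() requires a key argument and returns the value it removed
drink = names.pop('peter')
print(drink)               # soda
print(sorted(names))       # ['brian', 'rich', 'terry']
```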
Changing dictionaries in place
giving:
$./data_structures_dictionaries.py
{'brian': 'tea', 'rich': 'lusk', 'terry': 'lang', 'peter': 'soda'}
We have changed two values in place by modifying the key:value pair. Why won't the list method sort() work?
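Again the script itself is missing from this section. A sketch that reproduces the output above, assuming the two changes were plain assignments to existing keys, and that hints at the sort() question (Python 3 print() syntax):

```python
names = {'brian': 'tea', 'rich': 'tea', 'terry': 'coffee', 'peter': 'soda'}

names['rich'] = 'lusk'     # assigning to an existing key overwrites its value
names['terry'] = 'lang'
print(names)

# A dictionary has no sort() method; sorted() gives a sorted LIST of keys
ordered_keys = sorted(names)
print(ordered_keys)        # ['brian', 'peter', 'rich', 'terry']
print(hasattr(names, 'sort'))   # False
```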
Characterizing dictionaries
giving:
$./data_structures_dictionaries.py
['white', 'sparkling', 'red', 'sticky']
['pinot grigio', 'blanc de noirs', 'cabernet', 'muscato']
The category is white and the varietal is pinot grigio
The category is sparkling and the varietal is blanc de noirs
The category is red and the varietal is cabernet
The category is sticky and the varietal is muscato
The category is red and the varietal is cabernet
The category is sparkling and the varietal is blanc de noirs
The category is sticky and the varietal is muscato
The category is white and the varietal is pinot grigio
cabernet
Here, we have started to characterize our dictionaries.
1) The dictionary methods keys() and values() return lists containing the keys or values. These lists can be stored and acted on as lists.
2) We can iterate over the keys and print the values using the syntax for X in []:
3) We can quickly find out if a particular key already exists. NOTE: each key MUST be unique, but multiple keys can have the same value.
You will use dictionaries and lists almost exclusively in your coding. However, there are two remaining data structures that you should know about to make your life a little easier.
Sets
Sets are unordered and unique bags of variables-- you can't index them, and you can't have more than one of any given object.
giving:
$./data_structures_sets.py
['a', 'a', 'b', 'c', 'c', 'c', 'd', 'e']
set(['a', 'c', 'b', 'e', 'd'])
set(['c', 'e', 'd', 'f'])
Here we have created several sets from lists. How are lists different from sets?
Adding to a set
giving:
$./data_structures_sets.py
set(['a', 'c', 'b', 'e', 'd', 'q'])
set(['a', 'c', 'b', 'e', 'd', 'f', 'q'])
We have added to the set in two ways:
1) Using the set method add(), we have added a single element
2) Using the set method update(), we have created a single nonredundant set from two separate sets
Shrinking a set
giving:
$./data_structures_sets.py
set(['a', 'c', 'b', 'e', 'd', 'f'])
set(['a', 'c', 'b', 'e', 'd', 'f'])
set(['a', 'c', 'b', 'e', 'd'])
Traceback (most recent call last):
File "./data_structures_sets.py", line 36, in <module>
set_of_letters.remove('f')
KeyError: 'f'
We have deleted items from a set using two set methods, discard() and remove(). The difference between the two becomes evident when the item to be deleted is not in the set: discard() silently does nothing, whereas remove() raises a KeyError. We will learn a lot more about these error messages and their uses later in the week.
Changing sets in place
NOT going to happen. Why?
Characterizing sets
You can treat sets like Venn diagrams.
giving:
$./data_structures_sets.py
set(['a', 'c', 'b', 'e', 'd', 'f'])
set(['c', 'e', 'd'])
set(['a', 'b'])
set(['f'])
set(['a', 'b', 'f'])
False
True
Here we have used a variety of mathematical descriptors to compare two sets. We can also use the methods we used to characterize lists to explore sets.
giving:
$./data_structures_sets.py
set(['a', 'c', 'b', 'e', 'd']) 5
a
c
b
e
d
which should make sense at this point.
OK. One more data structure and then we're done...
Tuples
A tuple is like an immutable list: it can neither be grown nor shrunk in place, but tuples can be combined to make new tuples.
giving:
$./data_structures_tuples.py
Welcome to tuples...
('duck', 'season')
('wabbit', 'season')
duck wabbit
Making tuple longer...
('duck', 'season')
('wabbit', 'season')
('duck', 'season', 'wabbit', 'season')
('duck', 'season', 'wabbit', 'season')
('wabbit', 'season')
('duck', 'season', 'wabbit', 'season')
('season', 'wabbit')
season
Characterize tuple...
('duck', 'season', 'wabbit', 'season')
4
Max = wabbit
Min = duck
duck
season
wabbit
season