Functions, modules, and pickles
Introduction
I spoke a couple days ago about treating your computer as a beast of labor that responds to your commands: "Do my work," you say, and it is done. I then talked about 'if,' and then Matt and Terry talked about loops and files. Clearly, although we have come some distance, we have a long way to go until our beast becomes worth its while. However, by the end of this morning's section, you will be much farther along-- you'll be able to train your beast to have complex thoughts. After that, with a little practice, "Do my work" becomes very workable.
For an analogy, let's move our beast from the computational world to the physical world for just a paragraph or two. I'm basically imagining my desktop with short furry legs at this point, but you can imagine R2D2, Optimus Prime, whatever (Mechagodzilla?). You want this beast to keep you supplied with ramen. Each week, you say:
"Beast, go to the store. Exit my building, go northeast on Warfield, then northwest on Mandana, and then northeast on Grand until the Safeway. Go into the Safeway. If my checking account has less than $10, buy ramen. Else, if my checking account has less than $20, buy better ramen. Else, if my checking account has less than $30, buy..."
This goes on for a while. In any case, you say this week after week. Not only does this wear upon your patience and tongue, but sometimes, when you're tired, a couple beers in, or just not paying attention, you slip up the wording and your poor beast winds up bringing back the wrong ramen, emptying your checking account, or, worse still, just wandering off into East Oakland, being found days later unconscious at 66th & International.
You would like to prevent this. Instead of repeating lengthy and complex instructions time and time again, which is both annoying and likely to go wrong, you'd just like to say:
"Beast, get me ramen."
and have it do the right thing every time. And in fact, once you're able to turn complex instructions into simple ones, you can start to nest them together to create even more complexity: "Beast, on Wednesdays you must get me ramen, do my laundry, write a publishable paper with my name on it, and give me a haircut more fashionable than the one I currently have. It is Wednesday: do your thing."
Today, you will learn how to train your beast.
Functions
Functions are the basic means to manage complexity in your programs, allowing you to avoid nesting and repeating large chunks of code that could otherwise make your tasks unmanageable. They allow you to bundle code with a known input and a known output into single lines, and you should use them frequently from now on. We will start with the syntax:
#!/usr/bin/env python
# define the function
def hello(name):
    greeting = "Hello %s!" % (name)
    return greeting

# use the function
functionInput = 'Zaphod Beeblebrox'
functionOutput = hello(functionInput)
print functionOutput
To define a function, you use the word 'def.' Then follows the function name, in this case 'hello,' with parentheses containing any input the function might need. In this case, it needs a name to form a proper greeting, so we're giving it a variable 'name.' After that, the function does its thing-- executes the indented block of code immediately below. In this case, it creates a greeting "Hello <name>!". The last thing that it does is return that greeting to the rest of the program.
Note that the variable names are different on the inside and the outside of the function: I give it 'functionInput', although it takes 'name', and it returns 'greeting', although that return value is fed into 'functionOutput.' I did this on purpose-- I want to emphasize that the function only knows to expect something, which it then internally refers to as 'name', and then to give something else back. In fact, there is some insulation against the outside world, as you can see in this example:
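A minimal sketch of such an example, assuming a function that assigns 3 to its own 'testVariable' while the main program holds a different value under the same name. The function name 'changeTestVariable' and the outside value 5 are made up for illustration, and the print calls are written with parentheses so that the sketch runs under both Python 2 and Python 3:

```python
# Hypothetical example: 'changeTestVariable' and the value 5 are assumptions.
def changeTestVariable():
    # this 'testVariable' lives only inside the function
    testVariable = 3
    return testVariable

testVariable = 5                     # a different variable that happens to share the name
insideValue = changeTestVariable()
print(insideValue)                   # the value assigned inside the function
print(testVariable)                  # the outside variable, untouched by the call
```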
While something called 'testVariable' was assigned to 3 inside the function, nothing happened to that variable outside the function. Variables created inside functions have their own space in memory distinct from variables outside functions, and so sharing names between the two can be done without you having to keep track of it. This way, you can use functions that other people wrote without having to look inside them-- they can't mess up your own code. Aside from a subtlety or two-- if you're interested, you can see last year's equivalent lecture-- for the most part, what happens in the function stays in the function (an important exception lies with lists and dictionaries, which you will examine in the exercises).
Let's have another example with a more pressing subject:
#!/usr/bin/env python
def whichFood(balance):
    if balance < 10:
        return 'ramen'
    elif balance < 100:
        return 'good ramen'
    elif balance < 200:
        return 'better ramen'
    else:
        return 'ramen that is truly profound in its goodness'

print whichFood(14)
Here we've made a slightly more complicated function-- it has some control statements in there, and there is more than one way for it to return. We also never explicitly create an input variable, like 'functionInput' in the last example, and we don't create an output variable either. However, it functions just like that block of code that we saw earlier. Finally, functions don't necessarily need to take anything as input, and certainly not just one thing, and they don't need to return anything back to the program (or just one thing). They can even have other functions nested inside them! For a few more examples of the syntax:
# functions can do their thing without taking input or returning output
def giveRichABetterHaircut():
    print 'Impossible!'
# this is not because my hair is any good-- it's just that computers don't have arms

# functions can take multiple items in and return multiple items out
def doLaundry(amtDetergent,dirtyLoads):
    cleanLoads = []
    for load in dirtyLoads:
        amtDetergent -= 1
        cleanLoads.append(load)
    return (amtDetergent,cleanLoads)

amtTide = 5
dirtyLaundry = ['socks','shirts','pants']
(amtTide,cleanLoads) = doLaundry(amtTide,dirtyLaundry)
print amtTide
print cleanLoads
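The examples above don't show a function nested inside another one, so here is a small sketch of what that can look like. The names 'greetEveryone' and 'politeGreeting' and the wording of the greeting are made up for illustration. The print call uses parentheses so the sketch runs under both Python 2 and Python 3:

```python
# Hypothetical example: the names and the 'Dear ...' wording are assumptions.
def greetEveryone(names):
    # helper defined inside greetEveryone; code outside cannot call it by name
    def politeGreeting(name):
        return "Dear %s, hello!" % name

    greetings = []
    for name in names:
        greetings.append(politeGreeting(name))
    return greetings

result = greetEveryone(['Ford', 'Arthur'])
print(result)
```

The inner function is just another local name, so it follows the same "what happens in the function stays in the function" rule as any other variable created there.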
We should go into a little more detail on returning values. Above, in 'doLaundry', I returned a tuple-- two variables enclosed in parentheses. You could also return a list, which works much the same way. However, the real complication comes in what variable you store that value in:
def returnStuff():
    a = 3
    b = 4
    return [a,b]

(x,y) = returnStuff()
print x,y
both = returnStuff()
print both
# works both ways!
So, how do we tend to use them? We tend to use functions to break difficult tasks into a number of easier tasks, and then these easier tasks into ones easier still, and so on. As a result, even the largest 'raw' code blocks, those with few function calls, tend to be only tens of lines long, and many functions are only a handful of lines. This allows us to program in large, structural sweeps, rather than getting lost in the details, which makes programs both easier to write and easier to read:
def publishAPaper(authors,topic,journal):
    data = doWork(topic)
    analysis = analyze(data)
    paper = writePaper(data,analysis)
    submit(authors,paper,journal)
And a big part of that ease comes with the use of:
Modules
In all of the examples above, we defined our functions right above the code that we hoped to execute. If you have many functions, you can see how this would get messy in a hurry. Furthermore, part of the benefit of functions is that you can call them multiple times within a program to execute the same operations without tiresomely writing them all out again-- wouldn't it be nice to share functions across programs, too? For example, I work in evolutionary biology, and that means that I spend a lot of my time parsing annotations, getting sequence out of fasta files, and in general shuttling sequence from program to program. While I of course write different programs to accomplish different tasks, most of my programs overlap to a significant degree with others-- many need to parse fasta files, others need to calculate evolutionary rates, most need to interface with our lab's cluster-- which means that most of them share functions. And if the same function exists in two or more different programs, we hit the same problems that we hit before: complex debugging, decreased readability, and, of course, just plain Too Much Typing.
Modules solve these problems. In short, they're collections of functions and variables (and often objects, which we'll get to towards the end of the class) that are kept together in a single file that can be read and imported by any number of programs. We already saw them before, and now you have all been using re and sys-- you know they're easy to use.
Using a module: the basics
To illustrate the basics, I'm going to go through the use of two modules, sys and math, one of which we use basically all the time. In fact, it's a very, very rare program indeed (of mine) that doesn't use the sys module. sys contains a lot of really, really esoteric functions, but it also contains a very simple, everyday thing-- what you typed on the command line. To illustrate, if you type:
$ ./testprogram.py argument1 argument2 argument3
then the sys module contains a list that contains './testprogram.py', 'argument1', 'argument2', and 'argument3.' This list is called argv.
#!/usr/bin/env python
import sys # gaining access to the module
# you can access variables stored in the module by using a dot
# to get at the variable 'argv' which is stored in 'sys', type:
commandLine = sys.argv
print commandLine
We can also use functions stored inside other modules. To demonstrate this, I'll use the module 'math.'
#!/usr/bin/env python
import sys
import math
# sys.argv contains only strings, even if you type integers.
# And, remember, the first element is the command itself-- usually
# not very useful.
x = float(sys.argv[1])
logX = math.log(x)
print logX
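Combining this with the first half of the lecture, the argument handling can itself be wrapped in a function, which makes it easy to try out without typing anything at the command line. This is only a sketch: 'logOfFirstArgument' and the stand-in list 'fakeArgv' are hypothetical names, not part of sys or math.

```python
import math

def logOfFirstArgument(argv):
    # argv is shaped like sys.argv: element 0 is the program name,
    # element 1 is the first thing typed after it (always a string)
    x = float(argv[1])
    return math.log(x)      # natural logarithm

# a stand-in for what sys.argv would hold after typing: ./testprogram.py 1
fakeArgv = ['./testprogram.py', '1']
result = logOfFirstArgument(fakeArgv)
print(result)
```

In a real program you would import sys and pass the real list instead: logOfFirstArgument(sys.argv).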
Great! Not so hard. It turns out that they're easy to write, too:
Making a module
greeting_module.py
def hello(name):
    greeting = "Hello %s!" % name
    return greeting

def ahoy_hoy(name):
    greeting = "Ahoy-hoy %s!" % name
    return greeting
test.py
#!/usr/bin/python
import greeting_module
hi = greeting_module.hello('class')
print hi
And that's it! See-- no more messy function declarations at the beginning of your script. And if you need another program to say hi to you, then all you need to do is import the greeting module.
Using modules: slightly more than just 'import'
Although creating a basic module is easy, sometimes you want more than just the basics. And although using a module in the most basic manner is easy, it's best to get a more thorough picture of how modules behave.
First, what if you only want one function from a given module? Let's say, as an antique telephone enthusiast, you really only dealt in ahoys rather than hellos. What you can do is to say "from greeting_module import ahoy_hoy." This also changes the syntax for accessing the function:
test.py
#!/usr/bin/python
from greeting_module import ahoy_hoy
hi = ahoy_hoy('class') # if grabbed with a 'from' statement, you don't need to use the <module>.<function> syntax
print hi
We see that we can now write ahoy_hoy('class') directly, instead of having to write greeting_module.ahoy_hoy('class'). What if we wanted to do this with every function in the module? Rather than writing out all of their import statements one at a time-- there could be a lot of them-- you can just use the "*" symbol to refer to them, just as it acts as a wildcard at the Linux command prompt.
test.py
#!/usr/bin/python
from greeting_module import * # equivalent to: from greeting_module import hello
                              #                from greeting_module import ahoy_hoy
hi = ahoy_hoy('class')
hi2 = hello('class')
So, with this under our belts, why don't we start using an example module? This one here is handy:
Pickling
There are many modules that come with a default installation of python, and one of the more useful ones is 'pickle.' It allows you to store data from a python script very easily into a file-- and then when you want it again, you can use another script to 'unpickle' the very same stuff!
Although, like many built-in pieces of python, there's a lot to it, here we'll just cover the basic functionality, which accounts for most of its use anyway.
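A minimal sketch of that basic functionality, assuming the two halves would normally live in separate scripts (one that pickles, one that unpickles); here they share a single file so the round trip can be run in one go. The file name 'ramen.pickle', the temporary directory, and the dictionary contents are made up for illustration. The only pickle functions needed are pickle.dump and pickle.load:

```python
import os
import pickle
import tempfile

# hypothetical data and file name, chosen only for this sketch
ramenPrices = {'ramen': 1, 'good ramen': 3, 'better ramen': 5}
pickleFile = os.path.join(tempfile.mkdtemp(), 'ramen.pickle')

# first half (what a 'pickling' script would do): store the data in a file.
# pickle files are opened in binary mode: 'wb' to write, 'rb' to read
fh = open(pickleFile, 'wb')
pickle.dump(ramenPrices, fh)
fh.close()

# second half (what an 'unpickling' script would do): read the data back
fh = open(pickleFile, 'rb')
unpickled = pickle.load(fh)
fh.close()

print(unpickled['good ramen'])
```

The dictionary comes back as a dictionary, not as text, which is what sets pickling apart from writing strings to a file. Lists, tuples, and nested combinations of these can be stored the same way.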
Exercises
1. Make the following functions:
a) Takes an integer x as input, prints x**2
b) Takes integers x and y as input, prints x * y
c) Takes a list xs as input, prints xs[0] * xs[1]
#!/usr/bin/env python
def square(x):
    print x ** 2

def product(x, y):
    print x * y

def multiplyList(xs):
    print xs[0] * xs[1]

x = 5
y = 10
xs = [1,2,3]
square(x)
product(x,y)
multiplyList(xs)
2. Modify the above programs so that the function returns the result instead of printing it. This result is then printed by the program that called the function.
#!/usr/bin/env python
def square(x):
    return x ** 2

def product(x, y):
    return x * y

def multiplyList(xs):
    return xs[0] * xs[1]

x = 5
y = 10
xs = [1,2,3]
print square(x)
print product(x,y)
print multiplyList(xs)
3. As promised, most things that happen in functions stay in the functions, but there are important exceptions. Make the following functions, which should illustrate this property:
a) The function takes an integer as input, and it increments that integer by one using the '+=' operator. Print the value of the integer before and after the function is called.
b) The function takes a list as input, and it changes the first element of the list to the string 'x'. Print the value of the list before and after the function is called.
c) The function takes a dictionary as input, and it adds the key 'x' with value 'y' to this dictionary. Print the dictionary before and after the function is called.
#!/usr/bin/env python
def increment(x):
    x += 1
x = 5
print x
increment(x)
print x
print
def changeList(List):
    List[0] = 'x'
List = [0,1,2]
print List
changeList(List)
print List
print
def addToDict(dictionary):
    dictionary['x'] = 'y'
dictionary = {'a' : 'b','c' : 'd'}
print dictionary
addToDict(dictionary)
print dictionary
4. Create a module called 'exercises.py.' Put your functions from exercise two into this module. Now write two programs that each call all of the functions in the module:
a) A program that uses the 'import exercises' line.
b) A program that uses the 'from exercises import *' line
#!/usr/bin/env python
import exercises
x = 5
y = 10
xs = [1,2,3]
print exercises.square(x)
print exercises.product(x,y)
print exercises.multiplyList(xs)

#!/usr/bin/env python
from exercises import *
x = 5
y = 10
xs = [1,2,3]
print square(x)
print product(x,y)
print multiplyList(xs)
5. Make a fasta parser function. This function should take a fasta-format file as input, read through the file using 'open', distinguish between ID-containing lines and sequence-containing lines, and return a dictionary with IDs as keys and sequences as values. Put this function in your exercises.py module.
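exercises.py
#!/usr/bin/env python
def square(x):
    return x ** 2

def product(x, y):
    return x * y

def multiplyList(xs):
    return xs[0] * xs[1]

def fastaParser(fasta_file):
    geneDict = {}
    key = ''
    fh = open(fasta_file,'r')
    lines = fh.readlines()
    fh.close()
    for line in lines:
        line = line.strip()
        if line == '':
            continue # skip blank lines, which would otherwise break line[0]
        if line[0] == '>':
            # ID line: start a new entry, dropping the leading '>'
            key = line[1:]
            geneDict[key] = ''
        else:
            # sequence line: append to the most recent ID
            geneDict[key] += line
    return geneDict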
Copy and paste the following lines into a file called 'testFasta.fa.' Create a program that imports the exercises.py module and prints the sequence corresponding to the gene ID 'gene3.'
6. Modify your program from (5) such that instead of printing the data, it pickles it. Now write another program that unpickles that pickle file and prints the sequence of gene3.
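One possible shape for (6), as a sketch rather than the official solution. To keep it runnable on its own, it repeats a compact version of the parser instead of importing exercises.py, and it writes a tiny fasta file whose sequences are invented for this sketch (they are NOT the contents of testFasta.fa). The file names and the temporary directory are also assumptions, and the two "programs" are shown one after the other in a single file:

```python
import os
import pickle
import tempfile

def fastaParser(fasta_file):
    # same idea as the parser from (5): '>' lines are IDs, the rest is sequence
    geneDict = {}
    key = ''
    fh = open(fasta_file, 'r')
    for line in fh:
        line = line.strip()
        if line == '':
            continue
        if line[0] == '>':
            key = line[1:]
            geneDict[key] = ''
        else:
            geneDict[key] += line
    fh.close()
    return geneDict

# made-up stand-in for testFasta.fa; these sequences are invented
workDir = tempfile.mkdtemp()
fastaFile = os.path.join(workDir, 'madeUp.fa')
fh = open(fastaFile, 'w')
fh.write('>gene1\nATGAAATAG\n>gene3\nATGCCC\nGGGTAA\n')
fh.close()

# "first program": parse, then pickle the dictionary instead of printing it
pickleFile = os.path.join(workDir, 'pickle_file')
fh = open(pickleFile, 'wb')
pickle.dump(fastaParser(fastaFile), fh)
fh.close()

# "second program": unpickle the dictionary and print the sequence of gene3
fh = open(pickleFile, 'rb')
fastaDict = pickle.load(fh)
fh.close()
print(fastaDict['gene3'])
```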
7. (bonus) Create an ORF finder. For our purposes, we will define an open reading frame (ORF) as a start codon followed at some distance by a stop codon in the same frame. This program should take a pickled fasta file as in (6) as input and output a pickled dictionary of gene name->ORF sequence key-value pairs. If the sequence does not contain an ORF, then the gene name should not be in the dictionary.
#!/usr/bin/python
import exercises
import pickle
# section 4.2 exercise 7
start_codon_list = ['ATG']
stop_codon_list = ['TAG','TAA','TGA']
# pickle files are opened in binary mode
pickle_in = open('pickle_file','rb')
fasta_dict = pickle.load(pickle_in)
pickle_in.close()

def get_codon_positions(seq,codon):
    ''' takes sequence string and codon string and returns a list of indices for
    codon in sequence '''
    position = 0
    codon_list = []
    while position < len(seq):
        start_pos = seq[position:].find(codon)
        if start_pos == -1:
            break
        else:
            codon_list.append(position+start_pos)
            position += start_pos+1
    return(codon_list)

def build_orf_list(starts,stops,seq):
    '''takes two lists of indices (starts and stops) and a sequence string and
    constructs a list of substrings'''
    orf_list = []
    for start in starts:
        for stop in stops:
            if (stop-start) > 0 and (stop-start) % 3 == 0:
                orf_list.append(seq[start:stop+3])
                break
            else:
                continue
    return orf_list

def make_orf_dict(fasta_d):
    ''' takes a fasta dictionary and returns a dictionary of lists with id's as
    keys and a list of sequences as values'''
    orf_dict = {}
    for id in fasta_d:
        start_positions = []
        stop_positions = []
        for codon in start_codon_list:
            start_positions.extend(get_codon_positions(fasta_d[id],codon))
        for codon in stop_codon_list:
            stop_positions.extend(get_codon_positions(fasta_d[id],codon))
        # sort so that the first in-frame stop found is also the nearest one
        stop_positions.sort()
        orf_list = build_orf_list(start_positions,stop_positions,fasta_d[id])
        if len(orf_list) > 0:
            orf_dict[id] = orf_list
    return(orf_dict)

orf_dict = make_orf_dict(fasta_dict)
for gene in orf_dict:
    print gene
    for orf in orf_dict[gene]:
        print orf
pickle_out = open('orf_pickle','wb')
pickle.dump(orf_dict,pickle_out)
pickle_out.close()