Types

Integers

1,2,3,-1,-100,-200

Floats

0.01, 1e-6,-3.14

Bools

True, False
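
The built-in type function reports which of these types a value has. A minimal sketch; the variable names are illustrative assumptions, not part of the course material:

```python
# Illustrative values; the variable names are assumptions for this sketch.
count = 3            # integer
ratio = 1e-6         # float written in scientific notation
is_coding = True     # bool

print(type(count))      # <class 'int'>
print(type(ratio))      # <class 'float'>
print(type(is_coding))  # <class 'bool'>

# Mixing an integer and a float in arithmetic gives a float.
total = count + ratio
print(type(total))      # <class 'float'>
```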

Interactive Python

ipython

Using Python as a calculator

Print "Hello World!"

Log in to Luria.

# Land on a compute node
srun --pty bash

# Make a directory for the course
mkdir test_python

# Go to the directory for the course
cd test_python

# Invoke Python
module load python

# Start Python interpreter
ipython
In [1]: 1+2

Out[1]: 3

In [2]: 1+2*3+4*5

Out[2]: 27
In [1]: print("Hello World!")
Hello World!

Lists

  • A list consists of a sequence of items, possibly of different types, delimited by square brackets [ and ]

  • Unlike strings which only contain characters, list elements can be anything, including other lists

In [1]: alist=['The','principal','author','of','Python','is',['Guido','van','Rossum']]

In [2]: alist[6]
Out[2]: ['Guido', 'van', 'Rossum']

In [3]: alist[6][0]
Out[3]: 'Guido'
  • Lists are mutable and strings are immutable

In [4]: alist[1]='first'

In [5]: alist
Out[5]: ['The', 'first', 'author', 'of', 'Python', 'is', ['Guido', 'van', 'Rossum']]

In [6]: astring="Van Rossum is a big fan of Monty Python's Flying Circus"

In [7]: astring[1]='x'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/net/rowley/ifs/data/bcc/duan/test_python/<ipython-input-7-7abb36ddd2dd> in <module>()
----> 1 astring[1]='x'

TypeError: 'str' object does not support item assignment
  • We can grow lists in several ways - using insert, append and list concatenation

In [8]: blist=['Python','was','conceived','in','1989']

In [9]: blist
Out[9]: ['Python', 'was', 'conceived', 'in', '1989']

In [10]: blist.append('around Christmas')

In [11]: blist
Out[11]: ['Python', 'was', 'conceived', 'in', '1989', 'around Christmas']

In [12]: blist=blist+["as","a","hobby","project"]

In [13]: blist
Out[13]: 
['Python',
 'was',
 'conceived',
 'in',
 '1989',
 'around Christmas',
 'as',
 'a',
 'hobby',
 'project']

In [14]: blist[4:4]=["December",]

In [15]: blist
Out[15]: 
['Python',
 'was',
 'conceived',
 'in',
 'December',
 '1989',
 'around Christmas',
 'as',
 'a',
 'hobby',
 'project']
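
The session above grows the list with append, concatenation, and slice assignment, but does not show insert. A minimal sketch of insert, using a hypothetical list modeled on blist:

```python
# Hypothetical list for this sketch, modeled on blist from the session above.
words = ['Python', 'was', 'conceived', 'in', '1989']

# insert(index, item) places item before the element currently at index.
words.insert(4, 'December')
print(words)  # ['Python', 'was', 'conceived', 'in', 'December', '1989']

# The same effect with slice assignment, as used in the session above.
other = ['Python', 'was', 'conceived', 'in', '1989']
other[4:4] = ['December']
print(other == words)  # True
```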
  • We can remove items from a list by using pop, del or assigning a slice to the empty list

In [16]: blist.pop()
Out[16]: 'project'
In [17]: blist
Out[17]: 
['Python',
 'was',
 'conceived',
 'in',
 'December',
 '1989',
 'around Christmas',
 'as',
 'a',
 'hobby']

In [18]: del blist[4]

In [19]: blist
Out[19]: 
['Python',
 'was',
 'conceived',
 'in',
 '1989',
 'around Christmas',
 'as',
 'a',
 'hobby']

In [20]: del blist[6:9]

In [21]: blist
Out[21]: ['Python', 'was', 'conceived', 'in', '1989', 'around Christmas']

In [22]: blist[3:5]=[]

In [23]: blist
Out[23]: ['Python', 'was', 'conceived', 'around Christmas']

Introduction to Python for Biologists

Strings

  • Strings are anything within 'single quotes', "double quotes", """triple quotes""", or a combination of quotes such as "'python2.7.2'"

In [1]: s="Python is named after the British comedy skit Monty Python"

In [2]: s[0]
Out[2]: 'P'

In [3]: s[0:6]
Out[3]: 'Python'

In [4]: s[0:6:2]
Out[4]: 'Pto'

In [5]: s[-1]
Out[5]: 'n'

In [6]: s[::-1]
Out[6]: 'nohtyP ytnoM tiks ydemoc hsitirB eht retfa deman si nohtyP'

In [7]: s.upper()
Out[7]: 'PYTHON IS NAMED AFTER THE BRITISH COMEDY SKIT MONTY PYTHON'

In [8]: s.lower()
Out[8]: 'python is named after the british comedy skit monty python'

In [9]: s.(tab)
s.capitalize  s.endswith    s.isalnum     s.istitle     s.lstrip      s.rjust       s.splitlines  s.translate   
s.center      s.expandtabs  s.isalpha     s.isupper     s.partition   s.rpartition  s.startswith  s.upper       
s.count       s.find        s.isdigit     s.join        s.replace     s.rsplit      s.strip       s.zfill       
s.decode      s.format      s.islower     s.ljust       s.rfind       s.rstrip      s.swapcase    
s.encode      s.index       s.isspace     s.lower       s.rindex      s.split       s.title       

In [10]: help(s.islower)
Help on built-in function islower:

islower(...) method of builtins.str instance
    S.islower() -> bool
    
    Return True if all cased characters in S are lowercase and there is
    at least one cased character in S, False otherwise.

In [11]: s.islower()
Out[11]: False
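
A few of the string methods listed above are handy for sequence data. The following is only a sketch; the DNA sequence is a made-up example, not a course file:

```python
# Hypothetical DNA sequence used only for illustration.
dna = "atggaattccgt"

upper = dna.upper()                              # 'ATGGAATTCCGT'
gc_count = upper.count("G") + upper.count("C")   # count returns occurrences
gc_fraction = gc_count / len(upper)
rna = upper.replace("T", "U")                    # swap every T for U

print(upper.startswith("ATG"))   # True
print(gc_count)                  # 5
print(round(gc_fraction, 2))     # 0.42
print(rna)                       # AUGGAAUUCCGU
```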

Loops

While Loops

In [1]: i=1

In [2]: while (i<5):
   ...:     print(i)
   ...:     i=i+1
   ...:     
1
2
3
4

In [3]: i=0

In [4]: instructors=["Duan","Charlie","Stuart","Allen"]

In [5]: while (i<4):
   ....:     print("Hello,", instructors[i])
   ....:     i=i+1
   ....:     
Hello, Duan
Hello, Charlie
Hello, Stuart
Hello, Allen

Dictionaries

A dictionary is a fancy list
  • A dictionary consists of (key,value) pairs

  • The key must be an immutable type (e.g. a number, a string, a tuple)

  • The value can be anything

  • We retrieve the value in a dictionary by using the associated key

  • Dictionaries are fancy lists that are not restricted to consecutive integers for indexing

  • We create dictionaries with curly braces { }

  • We assign elements to and retrieve elements from dictionaries with square brackets [key]

In [1]: emails={}

In [2]: emails['Duan']='[email protected]'

In [3]: emails['Charlie']='[email protected]'

In [4]: emails['Stuart']='[email protected]'

In [5]: emails['Allen']='[email protected]'

In [6]: emails.keys()
Out[6]: dict_keys(['Duan', 'Charlie', 'Stuart', 'Allen'])

In [7]: emails.values()
Out[7]: dict_values(['[email protected]', '[email protected]', '[email protected]', '[email protected]'])

In [8]: emails
Out[8]: 
{'Charlie': '[email protected]',
 'Duan': '[email protected]',
 'Allen': '[email protected]',
 'Stuart': '[email protected]'}

In [9]: emails['Duan']
Out[9]: '[email protected]'
  • Dictionaries can be constructed from a list of (key,value) pairs (2-tuples), or from two matching lists of keys and values

In [10]: instructors=['Duan','Charlie','Stuart','Allen']

In [11]: email=['[email protected]','[email protected]','[email protected]','[email protected]']

In [12]: adict=dict(zip(instructors,email))

In [13]: adict
Out[13]: 
{'Charlie': '[email protected]',
 'Duan': '[email protected]',
 'Allen': '[email protected]',
 'Stuart': '[email protected]'}
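
Beyond square-bracket lookup, dictionaries support membership tests, lookups with a fallback, and looping over pairs. A minimal sketch; the names come from the session above, while the office numbers are made-up placeholder data:

```python
# Placeholder data: the office numbers are invented for this sketch.
offices = {'Duan': 101, 'Charlie': 102, 'Stuart': 103}

# Membership tests look at keys, not values.
print('Duan' in offices)    # True
print('Allen' in offices)   # False

# get returns a fallback instead of raising KeyError for a missing key.
print(offices.get('Allen', 'unknown'))   # unknown

# items() yields (key, value) pairs, convenient in a for loop.
for name, office in offices.items():
    print(name, office)
```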

Tuples

  • For the most part, we can just consider tuples to be immutable lists

  • Tuples are defined by ( )

  • The items of tuples are separated by commas

  • A neat trick we can do with tuples is unpacking
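
Unpacking assigns the items of a tuple to separate names in a single statement. A minimal sketch, using a hypothetical tuple that is not part of the course files:

```python
# Hypothetical tuple for illustration.
release = ('Python', '2.0', 2000)

# Unpacking assigns each item to its own name in one statement.
name, version, year = release
print(name, version, year)   # Python 2.0 2000

# Unpacking also gives a compact way to swap two values.
a, b = 1, 2
a, b = b, a
print(a, b)                  # 2 1
```

The number of names on the left must match the number of items in the tuple; otherwise Python raises a ValueError.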

In [1]: atuple=('Python','2.0', ('was','released'), ['on','October','16th','2000'])

In [2]: atuple[3][3]
Out[2]: '2000'

In [3]: atuple[1]='3.0'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/net/rowley/ifs/data/bcc/duan/test_python/<ipython-input-3-6f783710c508> in <module>()
----> 1 atuple[1]='3.0'

TypeError: 'tuple' object does not support item assignment

Control Flows and Loops

  • Example 1: finding palindrome sequences among many candidates

In [1]: manyseqs=['ACTGATG','ACTGGTCA','ATGATG','TCGAAGCT','GCAGGCG','GATCCTAG','CATGTCGT','CTCTATCTC']
In [2]: for seq in manyseqs:
   ...:     if (seq==seq[::-1]):
   ...:         print(seq)
   ...:         
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC

  • Example 2: print odd numbers 1 to 20

In [1]: for i in range(1,21):
   ...:     if (i%2==1) :
   ...:         print("%02d" %i)
   ...:         
01
03
05
07
09
11
13
15
17
19

For Loops

In [1]: for number in range(1,5):
   ...:     print(number)
   ...:     
1
2
3
4


In [2]: for nt in "ACTG":
   ...:     print(nt)
   ...:     
A
C
T
G


In [3]: for number in range(1,11):
   ....:     print('Hello,World!\n')
   ....:     
Hello,World!

Hello,World!

Hello,World!

Hello,World!

Hello,World!

Hello,World!

Hello,World!

Hello,World!

Hello,World!

Hello,World!

In [5]: instructors=["Duan","Charlie","Stuart","Allen"]

In [6]: for name in instructors:
   ...:     print("Hello", name)
   ...:     
Hello Duan
Hello Charlie
Hello Stuart
Hello Allen

Nested loops

Labeling 96-well plates
In [10]: wells=[]
In [11]: for rows in 'ABCDEFGH':
   ....:     for columns in range(1,13):
   ....:         wells.append('%s%02d'%(rows,columns))
   ....:         

In [12]: wells
Out[12]: 
['A01',
 'A02',
 'A03',
 'A04',
 'A05',
 'A06',
 'A07',
 'A08',
 'A09',
 'A10',
 'A11',
 'A12',
 'B01',
 'B02',
 'B03',
 'B04',
 'B05',
 'B06',
 'B07',
 'B08',
 'B09',
 'B10',
 'B11',
 'B12',
 'C01',
 'C02',
 'C03',
 'C04',
 'C05',
 'C06',
 'C07',
 'C08',
 'C09',
 'C10',
 'C11',
 'C12',
 'D01',
 'D02',
 'D03',
 'D04',
 'D05',
 'D06',
 'D07',
 'D08',
 'D09',
 'D10',
 'D11',
 'D12',
 'E01',
 'E02',
 'E03',
 'E04',
 'E05',
 'E06',
 'E07',
 'E08',
 'E09',
 'E10',
 'E11',
 'E12',
 'F01',
 'F02',
 'F03',
 'F04',
 'F05',
 'F06',
 'F07',
 'F08',
 'F09',
 'F10',
 'F11',
 'F12',
 'G01',
 'G02',
 'G03',
 'G04',
 'G05',
 'G06',
 'G07',
 'G08',
 'G09',
 'G10',
 'G11',
 'G12',
 'H01',
 'H02',
 'H03',
 'H04',
 'H05',
 'H06',
 'H07',
 'H08',
 'H09',
 'H10',
 'H11',
 'H12']

Storing Programs for Re-use

  • Using ipython is great for learning because of the instant feedback, but at some point you will want to save your code for another day. To do so, use a text editor to write the code in a file that ends with the extension '.py'.

  • All Python scripts and test files used in the course can be downloaded from https://ki-data.mit.edu/bcc/teaching/IntroToPython.tgz

  • Login and password will be provided during class

Examples

  • Example 1

Open a text editor
type: print("I can program in Python!\n")
Save the file as expert.py

1) in Unix, type python expert.py
python expert.py 
I can program in Python!

2) in ipython, type run expert.py
In [1]: run expert.py
I can program in Python!

3) include the following as the first line of your script, make the script executable with chmod +x expert.py, and type ./expert.py in Unix
#!/usr/bin/env python

4) in Canopy
Open expert.py
run expert.py

  • Example 2

In a text editor, type the palindrome code from Example 1 of the "Control Flows and Loops" section.
Name the file palindrome.py. The script should look like the following:

manyseqs=['ACTGATG','ACTGGTCA','ATGATG','TCGAAGCT','GCAGGCG','GATCCTAG','CATGTCGT','CTCTATCTC']
for seq in manyseqs:
    if (seq==seq[::-1]):
        print("%s" %seq)

1) in Unix: python palindrome.py
python palindrome.py 
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC

2) in ipython: run palindrome.py
In [1]: run palindrome.py
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC

3) include the following as the first line of your script, make the script executable with chmod +x palindrome.py, and type ./palindrome.py in Unix
#!/usr/bin/env python

Reading and Writing Files

  • We open files using the built-in open function. We need to tell the function whether the file is to be used for reading, writing, or appending with the 'r', 'w', and 'a' flags. If no flag is given, the file is opened for reading.

  • All the test files for the course are located at https://ki-data.mit.edu/bcc/teaching/IntroToPython.tgz.

  • If you are on our cluster, you can copy them all to the current directory by typing:

cp /net/bmc-pub15/data/bmc/public/BCC/external/teaching/IntroToPython/* ./

Examples

  • Reading file seq.txt

In [1]: fin=open('seq.txt')

In [2]: fin=open('/net/rowley/ifs/data/bcc/dropbox/teaching_python/seq.txt')

In [3]: fin=open('seq.txt','r')
  • Writing to file seq2.txt

In [1]: Aa="GLECDGRTNLCCRQQFF"
In [2]: fo=open('seq2.txt','w')
In [3]: fo.write(Aa)
Out[3]: 17
In [4]: fo.close()
In [5]: less seq2.txt
GLECDGRTNLCCRQQFF

* There is no need to remember to close the file handle when using a "with" statement
In [1]: with open('seq2.txt','w') as fo:
   ...:     fo.write("ABC")
   ...:     
In [2]: less seq2.txt
ABC

Note: opening a file with the 'w' flag deletes the existing content of the file
  • Appending to file seq2.txt

In [1]: with open('seq2.txt','a') as fo:
   ...:     fo.write("CDE")
   ...:     
In [2]: less seq2.txt
ABCCDE
  • Reading files with read() and readlines()

The read and readlines methods both load the contents of a file for further processing.
The difference is that read returns the content as a single string,
while readlines returns it as a list of lines.

In [1]: seq=open('seq.txt','r').read()

In [2]: seq
Out[2]: 'ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\nGCAGGCG\nGATCCTAG\nCATGTCGT\nCTCTATCTC\n'

In [3]: type(seq)
Out[3]: str


In [1]: seq=open ('seq.txt','r').readlines() 

In [2]: seq
Out[2]: 
['ACTGATG\n',
 'ACTGGTCA\n',
 'ATGATG\n',
 'TCGAAGCT\n',
 'GCAGGCG\n',
 'GATCCTAG\n',
 'CATGTCGT\n',
 'CTCTATCTC\n']

In [3]: type(seq)
Out[3]: list
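
A third option, not shown above, is to loop over the file object itself, which yields one line at a time. The sketch below first writes a small stand-in file so that it runs anywhere; the file name seq_demo.txt and its contents are assumptions for illustration:

```python
import os
import tempfile

# Create a small stand-in for seq.txt in a temporary directory so the
# sketch runs anywhere; the file name and contents are assumptions.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'seq_demo.txt')
with open(path, 'w') as fo:
    fo.write('ACTGATG\nACTGGTCA\nATGATG\n')

# Looping over the file object yields one line at a time;
# strip() removes the trailing newline.
lengths = []
with open(path, 'r') as fin:
    for line in fin:
        seq = line.strip()
        lengths.append(len(seq))
        print(seq, len(seq))

print(lengths)  # [7, 8, 6]
```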
  • We can read in a file using our Python script, process it, and output the results to an output file

    • Let's read in file seq.txt

    • find the palindrome sequences using our python script

    • Then output the palindrome sequences to file palindrome.txt

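example1: select palindrome sequences

write palindrome2.py using a text editor:

# read all sequences, one per line
manyseqs=open('seq.txt','r').readlines()
# open the output file once; 'w' starts from an empty file on every run
with open('palindrome.txt','w') as fo:
    for seq in manyseqs:
        s=seq.strip()   # drop the trailing newline
        if (s==s[::-1]):
            fo.write(s)
            fo.write("\n")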



In Unix:
python palindrome2.py 
less palindrome.txt 
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC
  • Let's do an exercise by writing a Python script to say hello to the class

    • First read in file class_list as a list

    • Then output our greetings to file greetings

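example2: Say Hi to our class

write hello_class.py using a text editor:

# read the class list, one student per line
classlist=open('class_list.txt','r').readlines()
# open the output file once; 'w' starts from an empty file on every run
with open('greetings','w') as fo:
    for student in classlist:
        fo.write("Hello,")
        fo.write(student)   # student still ends with a newline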


In Unix:
python hello_class.py
less greetings
Hello,Manijeh
Hello,Shawn
Hello,Giorgio
Hello,Shuyu
Hello,Britt
Hello,Tu
Hello,Benjamin
Hello,Priyanka
Hello,Sabrina
Hello,Eric
  • To avoid changing scripts, we can use arguments to read input files and to write output files

    • ./hello_class2.py class_list greetings_again

hello_class2.py


#!/usr/bin/env python
import sys

InFileName=sys.argv[1]
OutFileName=sys.argv[2]

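#open input file and read one student per line
classlist=open(InFileName,'r').readlines()

#open output file once; 'w' starts from an empty file on every run
with open(OutFileName,'w') as fo:
    for students in classlist:
        student=students.strip()   # drop the trailing newline
        fo.write("Hello,")
        fo.write(student)
        fo.write(". It is nice to have you here!\n")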
  • Passing another class list to hello_class2.py writes greetings for another class

    • ./hello_class2.py future_class_list greetings_to_future_class
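
A script that reads sys.argv fails with an IndexError when an argument is missing, so it can help to check the argument count first. A minimal sketch; parse_args is a hypothetical helper, not part of the course scripts:

```python
def parse_args(argv):
    """Return (input_name, output_name), or None if the count is wrong.

    argv is expected to look like sys.argv: the script name comes first.
    parse_args is a hypothetical helper written for this sketch.
    """
    if len(argv) != 3:
        return None
    return argv[1], argv[2]

# Simulated command lines; in a real script, pass sys.argv instead.
print(parse_args(['hello_class2.py', 'class_list', 'greetings_again']))
print(parse_args(['hello_class2.py']))
```

In a real script, a None result would typically lead to printing a short usage message and stopping before any file is opened.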

Functions

Functions are like machines

  • Learning to write your own functions will greatly increase the complexity of the programs that you can write

  • A function is a black box: it takes some input, does something with it, and spits out some output

  • Functions hide details away, allowing you to solve problems at a higher level without getting bogged down

Examples

  • A typical function looks like this:

def function_name(function_arguments):
   """optional string describing the function"""
   statements ...
   return result

  • A function example: sum

In [1]: numbers=[1,2,3,4,5]

In [2]: sum(numbers)
Out[2]: 15

  • Anatomy of sum function

1. initialize the sum to zero
2. loop over each number while adding the number to the sum variable
3. return the value of sum

def sum(xs):
   """Given a sequence of numbers, return the sum."""
   s=0
   for x in xs:
      s=s+x
   return s

  • Writing your own function

Open a text editor, type the following and save it as MyMathFunctions.py:
def mysum(numbers):
    """Given a sequence of numbers, return the sum"""
    s=0
    # loop over the items themselves, not over range(numbers)
    for x in numbers:
        s=s+x
    return s

def myproduct(numbers):
    """Given a sequence of numbers, return the product"""
    p=1
    for x in numbers:
        p=p*x
    return p

  • Importing a function from a file (module)

    • Once we import MyMathFunctions module we just wrote, we can use the mysum function and myproduct function just like the built-in function

    • To call a function, give the function name followed by parentheses containing a value for each expected argument

In [1]: import MyMathFunctions

In [2]: numbers=[1,2,3,4]

In [3]: MyMathFunctions.mysum(numbers)
Out[3]: 10

In [4]: MyMathFunctions.myproduct(numbers)
Out[4]: 24
Too many keystrokes? Try the following:

In [5]: import MyMathFunctions as f

In [6]: f.mysum(numbers)
Out[6]: 10
Still too many keystrokes? Try the following:
In [7]: import MyMathFunctions

In [8]: a=MyMathFunctions.mysum

In [9]: a(numbers)
Out[9]: 10
  • Function arguments

    • We can define functions with more than one argument

Example: restriction.py
def finder(DNA,enzyme):
    # map enzyme names to their recognition sites
    name=['ECOR1','BAMH1','HINDIII']
    site=['GAATTC','GGATCC','AAGCTT']
    db=dict(zip(name,site))
    # compare in upper case so the case of the input does not matter
    DNA=DNA.upper()
    enzyme=enzyme.upper()
    # find returns the 0-based start index, or -1 if the site is absent
    index=DNA.find(db[enzyme])
    if (index>-1):
        print ("The restriction site starts at base pair %d\n" %index)
    else:
        print ("No such restriction site\n")

In [1]: import restriction
In [2]: restriction.finder('ATGGAATTCCGT','EcoR1')
The restriction site starts at base pair 3

In [3]: restriction.finder('ATGGAATTCCGT','BamH1')
No such restriction site
  • Arguments with default values do not need to be supplied when calling a function. If they are supplied, they override the default values

Example: restriction_BamH1_default.py
def finder(DNA,enzyme='BamH1'):
    # map enzyme names to their recognition sites
    name=['ECOR1','BAMH1','HINDIII']
    site=['GAATTC','GGATCC','AAGCTT']
    db=dict(zip(name,site))
    # compare in upper case so the case of the input does not matter
    DNA=DNA.upper()
    enzyme=enzyme.upper()
    # find returns the 0-based start index, or -1 if the site is absent
    index=DNA.find(db[enzyme])
    if (index>-1):
        print ("The restriction site starts at base pair %d" %index)
    else:
        print ("No such restriction site\n")

In [1]: import restriction_BamH1_default

In [2]: restriction_BamH1_default.finder('ATGGAATTCCGT')
No such restriction site


In [3]: restriction_BamH1_default.finder('ATGGAATTCCGT','Ecor1')
The restriction site starts at base pair 3
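
Functions that return a value rather than print it are easier to reuse in other code. The following is a sketch of such a variant; find_site and SITES are hypothetical names, not part of the course files, and the enzyme table is copied from the example above:

```python
# Hypothetical variant of finder that returns a value instead of printing.
SITES = {'ECOR1': 'GAATTC', 'BAMH1': 'GGATCC', 'HINDIII': 'AAGCTT'}

def find_site(dna, enzyme='BamH1'):
    """Return the 0-based start of the enzyme's site in dna, or -1.

    Raises KeyError if the enzyme is not in SITES.
    """
    return dna.upper().find(SITES[enzyme.upper()])

print(find_site('ATGGAATTCCGT'))                  # -1 (default enzyme BamH1)
print(find_site('ATGGAATTCCGT', 'EcoR1'))         # 3
print(find_site('ATGGAATTCCGT', enzyme='EcoR1'))  # 3 (keyword form)
```

Passing the argument by name, as in the last call, makes the call easier to read when a function has several optional arguments.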
  • Some existing Python modules

    • The os module provides a platform independent way to work with the operating system, make or remove files and directories

    • The csv module provides readers and writers for comma separated value data

    • The sys module contains many objects and functions for dealing with how python was compiled or called when executed

    • The glob module provides the glob function to perform file globbing similar to what the unix shell provides

    • The math module provides common algebraic and trigonometric functions along with several math constants

    • The re module provides access to powerful regular expressions

    • The datetime module provides time and datetime objects, allowing easy comparison of times and dates

    • The time module provides time-related functions, including simple timing of how long a command takes

    • The pickle module provides a way to save python objects to a file that you can unpickle later in a different program

    • PyPI (the Python Package Index) is not a module but the repository from which pip installs third-party packages

    • The numpy module is the de facto standard for numerical computing

    • The pandas module is useful for tabular data managing

    • The matplotlib module is the most frequently used plotting package in Python

    • The seaborn module is a module based on matplotlib. It provides a high-level interface for drawing attractive graphics.
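
A short taste of a few of the standard-library modules listed above. This is only a sketch; the path and sequences are made-up inputs chosen for illustration:

```python
import math
import os
import re
from datetime import date

# math: common functions and constants
print(math.sqrt(16))   # 4.0

# os: platform-independent path handling (the path is a made-up example)
print(os.path.basename('/tmp/data/seq.txt'))   # seq.txt

# re: find every occurrence of a pattern in a string
print(re.findall('GAATTC', 'ATGGAATTCCGTGAATTC'))   # ['GAATTC', 'GAATTC']

# datetime: subtracting two dates gives the time between them
print((date(2000, 10, 16) - date(2000, 1, 1)).days)   # 289
```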

Control Flow

The if-else statement determines the logic flow of the program
  • The structure of the if-else statement has the form:

if (condition1 is true):
  do A
elif (condition2 is true):
  do B
else:
  do C
  • example 1: grading test scores

In [1]: grade=None

In [2]: score=86

In [3]: if (score>93):
   ...:     grade='A'
   ...: elif (score>85):
   ...:     grade='B'
   ...: else:
   ...:     grade='C'
   ...:     

In [4]: grade
Out[4]: 'B'
  • example 2: palindrome

In [8]: seq1="ACTCA"

In [9]: if seq1==seq1[::-1]:
   ...:     print("%s is a palindrome!" % seq1)
   ...:     
ACTCA is a palindrome!
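
The grading logic from example 1 can be wrapped in a function so that it can be applied to many scores. A minimal sketch; letter_grade is a hypothetical name, and the cutoffs are the ones used in example 1:

```python
def letter_grade(score):
    """Map a numeric score to a letter using the cutoffs from example 1."""
    if score > 93:
        return 'A'
    elif score > 85:
        return 'B'
    else:
        return 'C'

# Conditions are tested top to bottom; the first true branch wins.
for score in [95, 86, 85]:
    print(score, letter_grade(score))
```

Because the comparisons are strict, a score of exactly 85 falls through to the else branch and receives a 'C'.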