Reading and Writing Files

  • We open files with the built-in open function. Its second argument, the mode, says whether the file is opened for reading ('r'), writing ('w'), or appending ('a'). If the mode is omitted, the file is opened for reading.

  • All the test files for the course are located at https://ki-data.mit.edu/bcc/teaching/IntroToPython.tgz.

  • If you are on our cluster, you can copy them all to the current directory by typing:

cp /net/bmc-pub15/data/bmc/public/BCC/external/teaching/IntroToPython/* ./

Examples

  • Reading file seq.txt

In [1]: fin=open('seq.txt')

In [2]: fin=open('/net/rowley/ifs/data/bcc/dropbox/teaching_python/seq.txt')

In [3]: fin=open('seq.txt','r')
  • Writing to file seq2.txt

In [1]: Aa="GLECDGRTNLCCRQQFF"
In [2]: fo=open('seq2.txt','w')
In [3]: fo.write(Aa)
In [4]: fo.close()
In [5]: less seq2.txt
GLECDGRTNLCCRQQFF

* With a "with" statement the file is closed automatically at the end of the block, so there is no need to call close()
In [1]: with open('seq2.txt','w') as fo:
   ...:     fo.write("ABC")
   ...:     
In [2]: less seq2.txt
ABC

Note: opening an existing file in 'w' mode deletes its current content
  • Appending to file seq2.txt

In [1]: with open('seq2.txt','a') as fo:
   ...:     fo.write("CDE")
   ...:     
In [2]: less seq2.txt
ABCCDE
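
The difference between the 'w' and 'a' modes can also be checked in a short script. This is a minimal sketch rather than part of the course material: it writes to a scratch file in a temporary directory (the name demo.txt is made up for illustration) so that seq2.txt is not touched.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is overwritten.
# The file name demo.txt is hypothetical, chosen only for this sketch.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'demo.txt')

# 'w' creates the file, or empties it if it already exists
with open(path, 'w') as fo:
    fo.write("GLECDGRTNLCCRQQFF")

# opening with 'w' again discards the previous content
with open(path, 'w') as fo:
    fo.write("ABC")

# 'a' keeps the existing content and adds to the end
with open(path, 'a') as fo:
    fo.write("CDE")

# read the file back to see what survived
with open(path, 'r') as fin:
    content = fin.read()

print(content)  # prints ABCCDE
```

The first string is gone because the second open in 'w' mode emptied the file before "ABC" was written.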
  • Reading files with read() and readlines()

Both read and readlines load the whole file into memory for further processing.
The difference is that read returns the content as a single string,
while readlines returns it as a list of lines, each still ending with its newline character.

In [1]: seq=open('seq.txt','r').read()

In [2]: seq
Out[2]: 'ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\nGCAGGCG\nGATCCTAG\nCATGTCGT\nCTCTATCTC\n'

In [3]: type(seq)
Out[3]: str


In [1]: seq=open('seq.txt','r').readlines()

In [2]: seq
Out[2]: 
['ACTGATG\n',
 'ACTGGTCA\n',
 'ATGATG\n',
 'TCGAAGCT\n',
 'GCAGGCG\n',
 'GATCCTAG\n',
 'CATGTCGT\n',
 'CTCTATCTC\n']

In [3]: type(seq)
Out[3]: list
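
As a rough sketch of how the two methods compare, the script below writes a three-line scratch file and reads it back both ways. The file name demo_seq.txt and its contents are made up for illustration; with seq.txt the calls would be the same. It also strips the trailing newline from each line, since readlines keeps it.

```python
import os
import tempfile

# Hypothetical scratch file holding three short sequences
path = os.path.join(tempfile.mkdtemp(), 'demo_seq.txt')
with open(path, 'w') as fo:
    fo.write("ACTGATG\nACTGGTCA\nATGATG\n")

# read() returns the whole file as one string
with open(path, 'r') as fin:
    text = fin.read()

# readlines() returns a list with one string per line, newline included
with open(path, 'r') as fin:
    lines = fin.readlines()

# strip() removes the trailing newline from each line
seqs = []
for line in lines:
    seqs.append(line.strip())

print(len(text))
print(lines)
print(seqs)
```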
  • We can read in a file using our Python script, process it, and output the results to an output file

    • Let's read in file seq.txt

    • Find the palindrome sequences (those that read the same forwards and backwards) using our Python script

    • Then output the palindrome sequences to file palindrome.txt

Example 1: select palindrome sequences

Write palindrome2.py using a text editor:

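# read all sequences, one per line
manyseqs = open('seq.txt', 'r').readlines()
for seq in manyseqs:
    # remove the trailing newline before comparing
    s = seq.strip()
    # a palindrome reads the same forwards and backwards
    if s == s[::-1]:
        # append each hit to the output file
        with open('palindrome.txt', 'a') as fo:
            fo.write(s)
            fo.write("\n")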



In Unix:
python palindrome2.py 
less palindrome.txt 
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC
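
Because the script opens palindrome.txt in append mode, running it a second time adds the same sequences again. One possible variation, sketched below, opens the output file once in 'w' mode before the loop so that each run starts from an empty file. To keep the sketch self-contained it creates its own small input file; the names demo_seq.txt, demo_palindrome.txt, and select_palindromes are assumptions made for illustration, not course files.

```python
import os
import tempfile

# Hypothetical scratch files so the course files are left alone
tmpdir = tempfile.mkdtemp()
infile = os.path.join(tmpdir, 'demo_seq.txt')
outfile = os.path.join(tmpdir, 'demo_palindrome.txt')

with open(infile, 'w') as fo:
    fo.write("ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\n")

def select_palindromes(in_name, out_name):
    # open both files once; 'w' empties the output file at the start
    with open(in_name, 'r') as fin, open(out_name, 'w') as fo:
        for line in fin:
            s = line.strip()
            # keep the sequence if it equals its own reverse
            if s == s[::-1]:
                fo.write(s + "\n")

# running twice leaves the same result as running once
select_palindromes(infile, outfile)
select_palindromes(infile, outfile)

with open(outfile, 'r') as fin:
    result = fin.read()

print(result)
```

Opening the file once also avoids reopening it for every matching sequence, which matters little here but adds up for large inputs.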
  • Let's do an exercise by writing a Python script to say hello to the class

    • First read in file class_list.txt as a list

    • Then output our greetings to file greetings

Example 2: say hi to our class

Write hello_class.py using a text editor:

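# read the class list, one name per line
classlist = open('class_list.txt', 'r').readlines()
for student in classlist:
    with open('greetings', 'a') as fo:
        fo.write("Hello,")
        # student still ends with a newline, so each greeting gets its own line
        fo.write(student)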


In Unix:
python hello_class.py
less greetings
Hello,Manijeh
Hello,Shawn
Hello,Giorgio
Hello,Shuyu
Hello,Britt
Hello,Tu
Hello,Benjamin
Hello,Priyanka
Hello,Sabrina
Hello,Eric
  • To avoid editing the script for every new file, we can pass the input and output file names as command-line arguments

    • ./hello_class2.py class_list greetings_again

hello_class2.py


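#!/usr/bin/env python
import sys

# sys.argv[0] is the script name; the file names follow it
InFileName = sys.argv[1]
OutFileName = sys.argv[2]

# open the input file and read the names, one per line
classlist = open(InFileName, 'r').readlines()

for line in classlist:
    # remove the trailing newline so the greeting stays on one line
    student = line.strip()
    with open(OutFileName, 'a') as fo:
        fo.write("Hello,")
        fo.write(student)
        fo.write(". It is nice to have you here!\n")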
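
The script above stops with an IndexError if it is run without both file names. One way to guard against that is sketched below, under some assumptions: the names greet_class and main, the usage message, and the two student names are illustrative choices, not part of the course script. The sketch also opens the output file once in 'w' mode, so repeated runs do not pile up greetings, which differs from the append behaviour above.

```python
import os
import sys
import tempfile

def greet_class(in_name, out_name):
    # write one greeting per non-empty line of the input file
    with open(in_name, 'r') as fin, open(out_name, 'w') as fo:
        for line in fin:
            student = line.strip()
            if not student:
                continue  # skip blank lines
            fo.write("Hello," + student + ". It is nice to have you here!\n")

def main(argv):
    # argv[0] is the script name, so two file names give a length of 3
    if len(argv) != 3:
        sys.stderr.write("Usage: " + argv[0] + " input_file output_file\n")
        return 1
    greet_class(argv[1], argv[2])
    return 0

# Demonstration with a made-up class list in a temporary directory.
# In a real script the last line would be: sys.exit(main(sys.argv))
tmpdir = tempfile.mkdtemp()
infile = os.path.join(tmpdir, 'demo_class_list')
outfile = os.path.join(tmpdir, 'demo_greetings')
with open(infile, 'w') as fo:
    fo.write("Ada\nGrace\n")

status = main(['hello_class3.py', infile, outfile])

with open(outfile, 'r') as fin:
    greetings = fin.read()
print(greetings)
```

Returning a status code from main, instead of calling sys.exit inside it, keeps the function easy to call from other code or from a test.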
  • Passing a different class list to hello_class2.py writes greetings for that class instead

    • ./hello_class2.py future_class_list greetings_to_future_class
