Reading and Writing Files
We open files with the built-in open function. Its second argument is the mode, which tells the function whether the file is used for reading ('r', the default), writing ('w'), or appending ('a').
All the test files for the course are located at https://ki-data.mit.edu/bcc/teaching/IntroToPython.tgz.
If you are on our cluster, you can copy them all to the current directory by typing:
cp /net/bmc-pub15/data/bmc/public/BCC/external/teaching/IntroToPython/* ./
Examples
Reading file seq.txt
In [1]: fin=open('seq.txt')
In [2]: fin=open('/net/rowley/ifs/data/bcc/dropbox/teaching_python/seq.txt')
In [3]: fin=open('seq.txt','r')
Writing to file seq2.txt
In [1]: Aa="GLECDGRTNLCCRQQFF"
In [2]: fo=open('seq2.txt','w')
In [3]: fo.write(Aa)
In [4]: fo.close()
In [5]: less seq2.txt
GLECDGRTNLCCRQQFF
Note: when a file is opened with a "with" statement, it is closed automatically at the end of the block, so there is no need to call close().
In [1]: with open('seq2.txt','w') as fo:
   ...:     fo.write("ABC")
   ...:
In [2]: less seq2.txt
ABC
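The automatic close can be checked through the file object's closed attribute. The following is a minimal sketch, assuming a hypothetical scratch file (with_demo.txt in a temporary directory) so that seq2.txt is left untouched:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

with open(path, "w") as fo:
    fo.write("ABC")
    inside = fo.closed  # still open while we are inside the block

after = fo.closed       # closed automatically once the block has ended

print(inside, after)    # prints: False True
```

The variable fo still exists after the block, but any further fo.write call would raise an error because the file is closed.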
Note: opening an existing file in 'w' mode deletes its current content before anything is written.
Appending to file seq2.txt
In [1]: with open('seq2.txt','a') as fo:
   ...:     fo.write("CDE")
   ...:
In [2]: less seq2.txt
ABCCDE
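The 'w' and 'a' modes can also be compared side by side in one script. This is a small sketch under stated assumptions: the scratch file modes_demo.txt and the helpers write_text and read_text are hypothetical names introduced only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

def write_text(mode, text):
    """Open the scratch file in the given mode and write text to it."""
    with open(path, mode) as fo:
        fo.write(text)

def read_text():
    """Return the whole content of the scratch file as one string."""
    with open(path) as fin:
        return fin.read()

write_text("w", "GLECDGRTNLCCRQQFF")
write_text("w", "ABC")   # 'w' discards the previous content first
after_w = read_text()

write_text("a", "CDE")   # 'a' keeps the content and adds to the end
after_a = read_text()

print(after_w)  # prints: ABC
print(after_a)  # prints: ABCCDE
```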
Reading files with read() and readlines()
The read and readlines methods both load the contents of a file for further processing.
The difference is that read returns the content as a single string,
while readlines returns it as a list of lines, with the newline characters kept.
In [1]: seq=open('seq.txt','r').read()
In [2]: seq
Out[2]: 'ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\nGCAGGCG\nGATCCTAG\nCATGTCGT\nCTCTATCTC\n'
In [3]: type(seq)
Out[3]: str
In [1]: seq=open('seq.txt','r').readlines()
In [2]: seq
Out[2]:
['ACTGATG\n',
'ACTGGTCA\n',
'ATGATG\n',
'TCGAAGCT\n',
'GCAGGCG\n',
'GATCCTAG\n',
'CATGTCGT\n',
'CTCTATCTC\n']
In [3]: type(seq)
Out[3]: list
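Because readlines keeps the trailing "\n" on each line, the lines usually need to be stripped before further processing. The sketch below illustrates this, assuming a hypothetical three-line scratch file in place of seq.txt; str.splitlines is shown as one alternative way to obtain the same clean list from the single string returned by read.

```python
import os
import tempfile

# Hypothetical scratch file with three sequences, standing in for seq.txt.
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, "w") as fo:
    fo.write("ACTGATG\nACTGGTCA\nATGATG\n")

with open(path) as fin:
    whole = fin.read()        # one string, newlines included

with open(path) as fin:
    lines = fin.readlines()   # list of strings, each ending in "\n"

# Remove the surrounding whitespace (here the trailing newline) from each line.
clean = [line.strip() for line in lines]

print(clean)                        # prints: ['ACTGATG', 'ACTGGTCA', 'ATGATG']
print(whole.splitlines() == clean)  # prints: True
```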
We can read in a file using our Python script, process it, and write the results to an output file.
Let's read in file seq.txt, find the palindrome sequences using our Python script, then output the palindrome sequences to file palindrome.txt.
example1: select palindrome sequences
write palindrome2.py using a text editor:
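# Read all sequences into a list, one line per element
manyseqs=open('seq.txt','r').readlines()
for seq in manyseqs:
    # Remove the trailing newline before comparing
    s=seq.strip()
    # A palindrome reads the same forwards and backwards
    if (s==s[::-1]):
        with open('palindrome.txt','a') as fo:
            fo.write(s)
            fo.write("\n")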
In Unix:
python palindrome2.py
less palindrome.txt
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC
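Since palindrome2.py opens palindrome.txt in append mode, running it a second time adds the same sequences to the file again. One possible variation, sketched here under stated assumptions, opens the output file once in 'w' mode so that each run starts from an empty file. The function names is_palindrome and select_palindromes and the scratch file names are hypothetical, chosen only for this illustration.

```python
import os
import tempfile

def is_palindrome(s):
    """Return True if s reads the same forwards and backwards."""
    return s == s[::-1]

def select_palindromes(in_path, out_path):
    """Copy the palindromic lines of in_path to out_path; return them as a list."""
    kept = []
    # Both files are opened once and closed automatically at the end.
    with open(in_path) as fin, open(out_path, "w") as fo:
        for line in fin:              # a file can be looped over line by line
            s = line.strip()
            if s and is_palindrome(s):  # skip blank lines
                fo.write(s + "\n")
                kept.append(s)
    return kept

# Hypothetical scratch files with four sequences, standing in for seq.txt.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "seq_demo.txt")
out_path = os.path.join(workdir, "palindrome_demo.txt")
with open(in_path, "w") as fo:
    fo.write("ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\n")

found = select_palindromes(in_path, out_path)
print(found)  # prints: ['ACTGGTCA', 'TCGAAGCT']
```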
Let's do an exercise by writing a Python script to say hello to the class.
First read in file class_list as a list, then output our greetings to file greetings.
example2: Say Hi to our class
write hello_class.py using a text editor:
classlist=open('class_list.txt','r').readlines()
for student in classlist:
    with open('greetings','a') as fo:
        fo.write("Hello,")
        fo.write(student)
In Unix:
python hello_class.py
less greetings
Hello,Manijeh
Hello,Shawn
Hello,Giorgio
Hello,Shuyu
Hello,Britt
Hello,Tu
Hello,Benjamin
Hello,Priyanka
Hello,Sabrina
Hello,Eric
To avoid editing the script for every new input, we can pass the names of the input and output files as command-line arguments:
./hello_class2.py class_list greetings_again
hello_class2.py
#!/usr/bin/env python
import sys
InFileName=sys.argv[1]
OutFileName=sys.argv[2]
#open input file
classlist=open(InFileName,'r').readlines()
for students in classlist:
    student=students.strip()
    with open(OutFileName,'a') as fo:
        fo.write("Hello,")
        fo.write(student)
        fo.write(". It is nice to have you here!\n")
Passing another class list to hello_class2.py will output greetings to another class:
./hello_class2.py future_class_list greetings_to_future_class
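If hello_class2.py is run without both file names, sys.argv[1] or sys.argv[2] does not exist and the script stops with an IndexError. A possible way to guard against that is sketched below. The functions greet_class and main, the usage message, and the scratch file names are assumptions made for this illustration and are not part of the original script. Unlike the original, this sketch opens the output file in 'w' mode, so rerunning it replaces the greetings instead of appending to them.

```python
import os
import tempfile

def greet_class(in_path, out_path):
    """Write one greeting line to out_path for each name listed in in_path."""
    with open(in_path) as fin, open(out_path, "w") as fo:
        for line in fin:
            student = line.strip()
            if student:  # skip blank lines
                fo.write("Hello," + student + ". It is nice to have you here!\n")

def main(argv):
    """argv has the same layout as sys.argv: script name, input file, output file."""
    if len(argv) != 3:
        print("usage: hello_class2.py INPUT_FILE OUTPUT_FILE")
        return 1
    greet_class(argv[1], argv[2])
    return 0

# In a real script the last lines would be:
#     import sys
#     sys.exit(main(sys.argv))
# Here main is called directly with hypothetical scratch files instead.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "class_list_demo")
out_path = os.path.join(workdir, "greetings_demo")
with open(in_path, "w") as fo:
    fo.write("Manijeh\nShawn\n")

status_ok = main(["hello_class2.py", in_path, out_path])
status_bad = main(["hello_class2.py"])  # too few arguments: prints the usage line
print(status_ok, status_bad)            # prints: 0 1
```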