We open files with the built-in open function. Its second argument tells Python whether the file is opened for reading ('r'), writing ('w'), or appending ('a'). If the mode is omitted, the file is opened for reading.
All the test files for the course are located at .
If you are on our cluster, you can copy them all to the current directory by typing:
In [1]: fin=open('seq.txt')
In [2]: fin=open('/net/rowley/ifs/data/bcc/dropbox/teaching_python/seq.txt')
In [3]: fin=open('seq.txt','r')
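The three calls above differ only in how the path and the mode are spelled out. As a small sketch of the same idea, assuming a throwaway file named demo_seq.txt (a hypothetical name, created in a temporary directory so the example runs anywhere):

```python
import os
import tempfile

# Hypothetical scratch file, so the sketch does not depend on seq.txt being present
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'demo_seq.txt')

with open(path, 'w') as fo:      # 'w' creates the file for writing
    fo.write("ACTGATG\n")

fin = open(path)                 # no mode given: defaults to 'r' (reading)
default_mode = fin.mode
first_line = fin.readline()
fin.close()                      # release the file handle when done

print(default_mode)              # r
print(repr(first_line))          # 'ACTGATG\n'
```

A full path, as in In [2] above, works from any directory. A bare file name is looked up in the current working directory.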
Writing to file seq2.txt
In [1]: Aa="GLECDGRTNLCCRQQFF"
In [2]: fo=open('seq2.txt','w')
In [3]: fo.write(Aa)
In [4]: fo.close()
In [5]: less seq2.txt
GLECDGRTNLCCRQQFF
*With a "with" statement there is no need to remember to close the file handle: the file is closed automatically when the block ends.
In [1]: with open('seq2.txt','w') as fo:
   ...:     fo.write("ABC")
   ...:
In [2]: less seq2.txt
ABC
Note: opening a file with 'w' deletes its existing content as soon as the file is opened
Appending to file seq2.txt
In [1]: with open('seq2.txt','a') as fo:
   ...:     fo.write("CDE")
   ...:
In [2]: less seq2.txt
ABCCDE
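The write and append sessions above can be put side by side in one short script. This is only a sketch: the file name seq2_demo.txt is made up for illustration, and it lives in a temporary directory so that no real file is overwritten.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'seq2_demo.txt')

with open(path, 'w') as fo:      # first write creates the file
    fo.write("GLECDGRTNLCCRQQFF")

with open(path, 'w') as fo:      # 'w' again: the old content is discarded
    fo.write("ABC")

with open(path, 'a') as fo:      # 'a' keeps the content and adds to the end
    fo.write("CDE")

with open(path) as fin:
    content = fin.read()

print(content)                   # ABCCDE
```

Note that write does not add a newline by itself, which is why "ABC" and "CDE" end up on the same line.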
Reading files with read() and readlines()
The read and readlines methods both load the whole content of a file into memory for further processing.
The difference is that read returns the content as a single string,
while readlines returns it as a list of lines, each still ending with its newline character
In [1]: seq=open('seq.txt','r').read()
In [2]: seq
Out[2]: 'ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\nGCAGGCG\nGATCCTAG\nCATGTCGT\nCTCTATCTC\n'
In [3]: type(seq)
Out[3]: str
In [1]: seq=open('seq.txt','r').readlines()
In [2]: seq
Out[2]:
['ACTGATG\n',
'ACTGGTCA\n',
'ATGATG\n',
'TCGAAGCT\n',
'GCAGGCG\n',
'GATCCTAG\n',
'CATGTCGT\n',
'CTCTATCTC\n']
In [3]: type(seq)
Out[3]: list
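To see the two return types next to each other, here is a minimal sketch. It writes the first three sequences from above to a hypothetical scratch file (seq_demo.txt) and reads them back three ways. The third way, looping over the file object itself, is a common alternative that hands over one line at a time.

```python
import os
import tempfile

# Hypothetical scratch file holding the first three sequences from seq.txt
path = os.path.join(tempfile.mkdtemp(), 'seq_demo.txt')
with open(path, 'w') as fo:
    fo.write("ACTGATG\nACTGGTCA\nATGATG\n")

with open(path) as fin:
    whole = fin.read()           # one string, newlines included

with open(path) as fin:
    lines = fin.readlines()      # list of strings, each ending in "\n"

with open(path) as fin:
    # iterate line by line and strip the trailing newline
    stripped = [line.strip() for line in fin]

print(type(whole).__name__, type(lines).__name__)   # str list
print(stripped)                  # ['ACTGATG', 'ACTGGTCA', 'ATGATG']
```

For very large files the loop form is usually preferable, since it avoids holding the whole file in memory at once.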
We can read a file in a Python script, process it, and write the results to an output file.
Let's read in the file seq.txt,
find the palindrome sequences using our Python script,
and then write the palindrome sequences to the file palindrome.txt.
Example 1: select palindrome sequences
Write palindrome2.py using a text editor:
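# read all lines of the input file into a list
manyseqs=open('seq.txt','r').readlines()
for seq in manyseqs:
    # remove the trailing newline before comparing
    s=seq.strip()
    # a palindrome reads the same forwards and backwards
    if s==s[::-1]:
        # 'a' appends, so running the script twice adds the matches twice
        with open('palindrome.txt','a') as fo:
            fo.write(s)
            fo.write("\n")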
In Unix:
python palindrome2.py
less palindrome.txt
ACTGGTCA
TCGAAGCT
GATCCTAG
CTCTATCTC
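One possible variation on palindrome2.py is sketched below. It is not the script from the course: the helper names is_palindrome and select_palindromes are made up here, and the output file is opened once with 'w', so re-running the script replaces the old results instead of appending to them. The demo uses scratch copies of the files in a temporary directory.

```python
import os
import tempfile

def is_palindrome(s):
    """Return True if s reads the same forwards and backwards."""
    return s == s[::-1]

def select_palindromes(in_path, out_path):
    """Copy the palindromic lines of in_path to out_path and return them."""
    kept = []
    # open both files once; 'w' starts the output file from scratch
    with open(in_path) as fin, open(out_path, 'w') as fo:
        for line in fin:
            s = line.strip()
            if s and is_palindrome(s):    # skip blank lines
                fo.write(s + "\n")
                kept.append(s)
    return kept

# Demo on a scratch copy of the sequences shown earlier
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, 'seq.txt')
out_path = os.path.join(workdir, 'palindrome.txt')
with open(in_path, 'w') as fo:
    fo.write("ACTGATG\nACTGGTCA\nATGATG\nTCGAAGCT\nGCAGGCG\n"
             "GATCCTAG\nCATGTCGT\nCTCTATCTC\n")

found = select_palindromes(in_path, out_path)
print(found)    # ['ACTGGTCA', 'TCGAAGCT', 'GATCCTAG', 'CTCTATCTC']
```

Putting the test in its own function makes it easy to swap in a different selection rule later without touching the file handling.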
Let's practice by writing a Python script that says hello to the class.
First read the file class_list.txt into a list.
Then write our greetings to the file greetings.
Example 2: say hi to our class
Write hello_class.py using a text editor:
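# read the class list, one student name per line
classlist=open('class_list.txt','r').readlines()
for student in classlist:
    with open('greetings','a') as fo:
        fo.write("Hello,")
        # student still ends with "\n", so each greeting gets its own line
        fo.write(student)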
In Unix:
python hello_class.py
less greetings
Hello,Manijeh
Hello,Shawn
Hello,Giorgio
Hello,Shuyu
Hello,Britt
Hello,Tu
Hello,Benjamin
Hello,Priyanka
Hello,Sabrina
Hello,Eric
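To avoid editing the script for every new file, we can pass the input and output file names as command-line arguments (to run the script as ./hello_class2.py it must be executable, e.g. after chmod +x hello_class2.py):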
./hello_class2.py class_list greetings_again
hello_class2.py
#!/usr/bin/env python
import sys
# sys.argv[0] is the script name; the arguments follow it
InFileName=sys.argv[1]
OutFileName=sys.argv[2]
#open input file
classlist=open(InFileName,'r').readlines()
for students in classlist:
    # remove the trailing newline from the name
    student=students.strip()
    with open(OutFileName,'a') as fo:
        fo.write("Hello,")
        fo.write(student)
        fo.write(". It is nice to have you here!\n")
Passing a different class list to hello_class2.py writes greetings for that class, with no change to the script
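If the script is started with too few arguments, sys.argv[1] raises an IndexError. A rough sketch of one way to guard against that follows. The names greet_class, main, and hello_class3.py are invented for this illustration and are not part of the course scripts. To keep the example runnable on its own, main is called with a hand-built argument list instead of the real sys.argv.

```python
import os
import sys
import tempfile

def greet_class(in_path, out_path):
    """Write one greeting per non-empty line of in_path to out_path."""
    with open(in_path) as fin, open(out_path, 'w') as fo:
        for line in fin:
            student = line.strip()
            if student:
                fo.write("Hello," + student + ". It is nice to have you here!\n")

def main(argv):
    # argv mirrors sys.argv: script name, input file, output file
    if len(argv) != 3:
        sys.stderr.write("usage: %s INPUT_FILE OUTPUT_FILE\n" % argv[0])
        return 1
    greet_class(argv[1], argv[2])
    return 0

# Demo with scratch files standing in for class_list and greetings_again
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, 'class_list')
out_path = os.path.join(workdir, 'greetings_again')
with open(in_path, 'w') as fo:
    fo.write("Manijeh\nShawn\n")

status = main(['hello_class3.py', in_path, out_path])
with open(out_path) as fin:
    greetings = fin.read()
print(greetings)
```

In a real script the last lines would be replaced by sys.exit(main(sys.argv)), so that the shell sees a non-zero exit status when the arguments are missing.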