Manipulate alignment files using UNIX commands
# copy the file from the stock directory to the current directory
cp /net/ostrom/data/rowleybcc/charliew/old_teaching/Biol2Bioinfo/SAM/myfile.sam .

What is in the file?
# display the content of the file on the screen, one page at a time
more myfile.sam
# display the content of the whole file on the screen
cat myfile.sam
# display the first 10 lines
head myfile.sam
# open the file with the text editor nano
nano myfile.sam
How many alignments are in the file?
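One possible approach, assuming the file follows the standard SAM layout in which header lines start with "@" and every other line is one alignment record. The helper name count_alignments is made up for this sketch.

```shell
# count_alignments is a hypothetical helper name used only for illustration
count_alignments() {
    # -v inverts the match (drop header lines), -c prints the number of remaining lines
    grep -vc '^@' "$1"
}
```

Usage: count_alignments myfile.sam. If the file also contains unmapped reads (flag 4), those records are included in this count.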

How many reads are in the file?
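A read can appear on several lines, so a reasonable sketch is to count distinct read names in column 1 (QNAME). The helper name count_reads is hypothetical, and the sketch assumes single-end data: the two mates of a paired-end read share one name and would be counted once.

```shell
# count_reads is a hypothetical helper name used only for illustration
count_reads() {
    # drop header lines, keep column 1 (read name), keep one copy of each name, count them
    grep -v '^@' "$1" | cut -f1 | sort -u | wc -l
}
```

Usage: count_reads myfile.sam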

How many distinct sequences are in the file?
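The read sequence is stored in column 10 (SEQ), so one way to answer this is to count the distinct strings in that column. The helper name count_sequences is hypothetical. Note that SEQ is written relative to the forward reference strand, so a read reported on both strands shows up as two different strings, and some aligners write "*" for secondary alignments.

```shell
# count_sequences is a hypothetical helper name used only for illustration
count_sequences() {
    # drop header lines, keep column 10 (SEQ), keep one copy of each string, count them
    grep -v '^@' "$1" | cut -f10 | sort -u | wc -l
}
```

Usage: count_sequences myfile.sam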

How many reads are uniquely mapped?
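A minimal sketch, assuming "uniquely mapped" means that the read name occurs on exactly one alignment line (single-end data, no unmapped records). The helper name count_unique_reads is hypothetical.

```shell
# count_unique_reads is a hypothetical helper name used only for illustration
count_unique_reads() {
    # uniq -u prints only the names that occur exactly once in the sorted list
    grep -v '^@' "$1" | cut -f1 | sort | uniq -u | wc -l
}
```

Usage: count_unique_reads myfile.sam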

How many reads are multi-hits?
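Under the same assumption as for unique reads, a multi-hit read is one whose name occurs on more than one alignment line. The helper name count_multihit_reads is hypothetical.

```shell
# count_multihit_reads is a hypothetical helper name used only for illustration
count_multihit_reads() {
    # uniq -d prints one copy of each name that occurs more than once in the sorted list
    grep -v '^@' "$1" | cut -f1 | sort | uniq -d | wc -l
}
```

Usage: count_multihit_reads myfile.sam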
How many alignments are reported for each read?
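This can be sketched by counting how often each read name occurs. The helper name alignments_per_read is hypothetical. The output has one line per read: the number of alignments, then the read name.

```shell
# alignments_per_read is a hypothetical helper name used only for illustration
alignments_per_read() {
    # uniq -c prefixes each distinct name with the number of times it occurs
    grep -v '^@' "$1" | cut -f1 | sort | uniq -c
}
```

Usage: alignments_per_read myfile.sam | more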

What are the top 10 multi-hit reads?
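Building on the per-read counts, one possible sketch keeps only reads with more than one alignment, sorts them by decreasing count and shows the first 10. The helper name top_multihit_reads is hypothetical. Reads with equal counts appear in whatever order sort gives them.

```shell
# top_multihit_reads is a hypothetical helper name used only for illustration
top_multihit_reads() {
    grep -v '^@' "$1" | cut -f1 | sort | uniq -c |
        awk '$1 > 1' |        # keep reads reported more than once
        sort -k1,1nr |        # largest number of alignments first
        head -n 10            # first 10 lines only
}
```

Usage: top_multihit_reads myfile.sam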

What chromosomes are represented in the file?
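The reference sequence name of each alignment is in column 3 (RNAME), so a simple sketch lists the distinct values of that column. The helper name list_chromosomes is hypothetical. If the file contains unmapped reads, their RNAME is "*" and it will appear in the list.

```shell
# list_chromosomes is a hypothetical helper name used only for illustration
list_chromosomes() {
    # drop header lines, keep column 3 (reference name), print each name once
    grep -v '^@' "$1" | cut -f3 | sort -u
}
```

Usage: list_chromosomes myfile.sam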


List the chromosomes according to the read distribution
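A sketch under one interpretation of the question: for each chromosome, count the distinct reads that have at least one alignment there, then sort by decreasing count. The helper name reads_per_chromosome is hypothetical.

```shell
# reads_per_chromosome is a hypothetical helper name used only for illustration
reads_per_chromosome() {
    grep -v '^@' "$1" |
        cut -f1,3 |       # read name and reference name
        sort -u |         # one line per (read, chromosome) pair
        cut -f2 |         # keep the chromosome only
        sort | uniq -c |  # number of distinct reads per chromosome
        sort -k1,1nr      # most represented chromosome first
}
```

Usage: reads_per_chromosome myfile.sam. Replacing the first two steps with cut -f3 and dropping sort -u would count alignments per chromosome instead of reads.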


Which are the most and least represented chromosomes?
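Once the chromosomes are sorted by the number of distinct reads, the most and least represented ones are the first and last lines of that list. A possible sketch, with the hypothetical helper name most_and_least_chromosome:

```shell
# most_and_least_chromosome is a hypothetical helper name used only for illustration
most_and_least_chromosome() {
    grep -v '^@' "$1" | cut -f1,3 | sort -u | cut -f2 |
        sort | uniq -c | sort -k1,1nr |
        sed -n '1p;$p'    # print the first line (most) and the last line (least)
}
```

Usage: most_and_least_chromosome myfile.sam. If several chromosomes share the highest or lowest count, only one of them is shown.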

What mapping flags are in the file?
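The flag is column 2 (FLAG), a sum of bit values such as 4 (unmapped), 16 (reverse strand) and 256 (secondary alignment). A small sketch that lists each flag value with the number of alignments carrying it, using the hypothetical helper name flag_counts:

```shell
# flag_counts is a hypothetical helper name used only for illustration
flag_counts() {
    # keep column 2 (FLAG), sort numerically, count how often each value occurs
    grep -v '^@' "$1" | cut -f2 | sort -n | uniq -c
}
```

Usage: flag_counts myfile.sam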

What reads and alignments are mapped on the reverse strand?
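An alignment is on the reverse strand when bit 16 of its flag is set, which covers 16 itself but also combined values such as 272 (256 + 16). A sketch in plain awk, which has no bitwise operators, so the bit is extracted with integer arithmetic. The helper names reverse_alignments and reverse_reads are hypothetical.

```shell
# reverse_alignments and reverse_reads are hypothetical helper names used only for illustration
reverse_alignments() {
    # int(FLAG / 16) % 2 is 1 exactly when bit 16 of the flag is set
    awk -F'\t' '!/^@/ && int($2 / 16) % 2 == 1' "$1"
}

reverse_reads() {
    # names of the reads that have at least one reverse-strand alignment
    reverse_alignments "$1" | cut -f1 | sort -u
}
```

Usage: reverse_alignments myfile.sam | more, or reverse_reads myfile.sam | wc -l to count the reads.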

Split the alignments to chromosome 1 and chromosome 3 into separate files
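One way to do this is to filter on column 3 and redirect the result to a new file. The sketch below also copies the header lines so that each output is still a complete SAM file. The helper name split_by_chrom and the output file names are hypothetical, and the reference names "chr1" and "chr3" are an assumption: check how chromosomes are actually named in column 3 of the file (for example "1" and "3").

```shell
# split_by_chrom is a hypothetical helper name used only for illustration
# arguments: input SAM file, reference name, output file
split_by_chrom() {
    # keep header lines (starting with "@") and alignments whose column 3 equals the name
    awk -F'\t' -v chrom="$2" '/^@/ || $3 == chrom' "$1" > "$3"
}
```

Usage: split_by_chrom myfile.sam chr1 chr1.sam, then split_by_chrom myfile.sam chr3 chr3.sam.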

What is the relationship between mapping qualities and number of multiple hits in the file?
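A possible way to explore this is to tabulate, for every alignment, the number of alignments reported for its read against its mapping quality (column 5, MAPQ). The sketch reads the file twice: once to count alignments per read, once to print the pairs. The helper name hits_vs_mapq is hypothetical. Each output line gives the number of alignments, then the number of hits for the read, then the MAPQ value.

```shell
# hits_vs_mapq is a hypothetical helper name used only for illustration
hits_vs_mapq() {
    awk -F'\t' '
        /^@/ { next }                     # skip header lines in both passes
        NR == FNR { hits[$1]++; next }    # first pass: count alignments per read name
        { print hits[$1] "\t" $5 }        # second pass: hits of the read and MAPQ of the alignment
    ' "$1" "$1" | sort -k1,1n -k2,2n | uniq -c
}
```

Usage: hits_vs_mapq myfile.sam. If each number of hits always pairs with the same MAPQ value, the aligner is probably deriving MAPQ from the number of hits.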

How many reported alignments are gapless?
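Gaps are recorded in column 6 (CIGAR) as the operations I (insertion) and D (deletion). A minimal sketch, assuming "gapless" means a CIGAR string with neither of them. The helper name count_gapless is hypothetical. For spliced alignments, N (skipped region) could be added to the bracket expression if introns should count as gaps too.

```shell
# count_gapless is a hypothetical helper name used only for illustration
count_gapless() {
    # count alignments whose CIGAR is present ("*" means unavailable) and contains no I or D
    awk -F'\t' '!/^@/ && $6 != "*" && $6 !~ /[ID]/ { n++ } END { print n + 0 }' "$1"
}
```

Usage: count_gapless myfile.sam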


