Writing Scripts

Instead of running one command at a time, you can combine multiple commands into a file to create a script. A script reduces a multi-step process to a single invocation, saving you time in the long run.

Writing a script is as simple as writing a file. Usually, the first line of the script is #!/bin/bash (known as a shebang), which tells the system to run the script with bash.

You can create variables in a bash script using the = operator, with no spaces around it. So to make a variable named myname, you'd write myname="Allen". To use the variable later in the script, prefix it with a $. For example, $myname.
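
As a minimal sketch of assigning and using a variable (the name greeting is only illustrative and does not appear elsewhere in this page):

```shell
#!/bin/bash

# Assign a variable. No spaces are allowed around the = sign.
myname="Allen"

# Use the variable by prefixing it with $. Double quotes keep the
# value in one piece even if it contains spaces.
greeting="Hello, $myname"
echo "$greeting"
```

Writing myname = "Allen" with spaces would make bash try to run a command called myname, which is a common source of "command not found" errors.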

You can also ask for a user's input and store it in a variable by using the read command. Precede it with echo "question?" so the user knows what to enter. For example:

echo "What is your name?"
read name

echo "Hello, $name"
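
Because read takes its input from standard input, the same pattern also works when the answer is piped in rather than typed. The sketch below assumes a hypothetical helper function named greet, used only for illustration:

```shell
#!/bin/bash

# greet is a hypothetical helper: it reads one line from standard
# input and prints a greeting.
greet() {
    local name
    # -r stops read from treating backslashes as escape characters.
    read -r name
    echo "Hello, $name"
}

# Non-interactive use: pipe the answer in instead of typing it.
echo "Allen" | greet
```

Calling greet on its own, without the pipe, would wait for you to type a name at the terminal.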

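You can run shell commands inside of a script and store their results in a variable. To do so, wrap the command in $(). For example, to get the size of a file and store it in a variable, you'd write file_size=$(du file). As with any assignment, there must be no spaces around the =. Note that du prints the size followed by the file name, so the script below keeps only the first field with cut -f1.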
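
Here is a small sketch of capturing a command's output with $(). It swaps in wc -c (a byte count) for du so that the result is easy to predict. The temporary file and the name byte_count are assumptions made for this illustration:

```shell
#!/bin/bash

# Create a small throwaway file to measure (mktemp picks a unique name).
tmpfile=$(mktemp)
printf 'ACGT\n' > "$tmpfile"

# wc -c prints the number of bytes. Reading the file through < keeps
# the file name out of the output, and tr removes the padding spaces
# that some systems add.
byte_count=$(wc -c < "$tmpfile" | tr -d ' ')
echo "The file is $byte_count bytes"

# Clean up the throwaway file.
rm -f "$tmpfile"
```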

Knowing this, let's create a script that looks through the files in a directory and moves any file above a certain size to a new folder. Name it sizewatcher.sh:

#!/bin/bash

# Make variables for directory to loop through
# and directory where files should be moved.
directory="/home/asoberan/unixclass"
new_directory="/home/asoberan/bigfiles"

# If the new directory doesn't exist, then
# create the directory
if [ ! -d "$new_directory" ]; then
    mkdir "$new_directory"
fi

# Loop through the files in the directory.
# Store the size of each file (in KB) in
# file_size. If the file size is greater
# than 400000 KB, move the file to the
# new directory.
for file in "$directory"/*; do
    # Skip anything that is not a regular file
    [ -f "$file" ] || continue
    # du -k reports the size in KB; cut keeps only the number
    file_size=$(du -k "$file" | cut -f1)
    if [ "$file_size" -gt 400000 ]; then
        mv "$file" "$new_directory"
    fi
done
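
The square brackets in the if lines are the test command, so every bracket and operator must be separated by spaces. A minimal sketch of the two kinds of test used above follows. The variable names and values are made up for illustration:

```shell
#!/bin/bash

# Hypothetical values, chosen only for this illustration.
size_kb=500000
limit_kb=400000

# -gt means "greater than" and compares whole numbers.
if [ "$size_kb" -gt "$limit_kb" ]; then
    verdict="move"
else
    verdict="keep"
fi
echo "Decision: $verdict"

# -d is true when the path exists and is a directory.
# The ! in front of it reverses the result.
if [ ! -d "/" ]; then
    root_is_dir="no"
else
    root_is_dir="yes"
fi
echo "Is / a directory? $root_is_dir"
```

Leaving out the space before the closing bracket, as in [ -d "$dir"], makes the test fail with a "missing ]" error.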

Once you've created and saved this file, make sure to modify its permissions to let it be executed. An easy way of doing this is running the following command:

chmod +x sizewatcher.sh

Then run the script:

./sizewatcher.sh
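
To see what chmod +x changes, the following sketch creates a tiny script in a temporary directory, checks whether it is executable, adds the permission, and runs it. The file name hello.sh and the variable names are assumptions for this example, and it presumes the temporary directory allows running programs:

```shell
#!/bin/bash

# Work in a throwaway directory so nothing real is touched.
tmpdir=$(mktemp -d)
script="$tmpdir/hello.sh"   # hypothetical script name

# Write a two-line script: the shebang, then one command.
printf '%s\n' '#!/bin/bash' 'echo "Hello from a script"' > "$script"

# A file created this way does not have execute permission.
if [ -x "$script" ]; then
    before="executable"
else
    before="not executable"
fi
echo "Before chmod: $before"

# Add execute permission, then run the script by its path.
chmod +x "$script"
output=$("$script")
echo "After chmod: $output"

# Remove the throwaway directory.
rm -rf "$tmpdir"
```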

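Let's create another script that asks a user for a directory, then sorts the files in that directory into folders that correspond to the year and month in which each file was last modified. (date -r reports a file's last modification time, which for files that were never edited matches when they were created.) Name it sorter.sh: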

#!/bin/bash

# Ask for directory and store it in a variable
# called "directory"
echo "What directory to check?"
read directory

# Check that the path the user entered is a directory
if [ ! -d "$directory" ]; then
    echo "Error: $directory is not a directory."
    exit 1
fi

# Iterate over files in the directory
for file in "$directory"/*; do
    # Check that the entry is really a regular file
    # and not a directory, etc.
    if [ -f "$file" ]; then
        # Get the year and month the file was last modified
        year=$(date -r "$file" +%Y)
        month=$(date -r "$file" +%m)

        # Create directory for the year and month if it doesn't exist
        mkdir -p "$directory/$year-$month"

        # Move the file to the corresponding directory
        mv "$file" "$directory/$year-$month"
        echo "Moved $file to $directory/$year-$month"
    fi
done

echo "File sorting complete."

Make it executable and then run it.
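
To check how the folder name is derived before running the script on real data, you can try date -r on a file with a known timestamp. This sketch assumes GNU date, as found on Linux systems, and uses touch -t to set a made-up modification time of 15 March 2021:

```shell
#!/bin/bash

# Create a throwaway file and give it a known modification time.
# The touch -t format is YYYYMMDDhhmm.
tmpfile=$(mktemp)
touch -t 202103151200 "$tmpfile"

# date -r FILE prints the file's last modification time in the
# requested format: %Y is the four-digit year, %m the two-digit month.
year=$(date -r "$tmpfile" +%Y)
month=$(date -r "$tmpfile" +%m)

# Build the folder name the same way sorter.sh does.
folder="$year-$month"
echo "This file would be moved into: $folder"

# Clean up the throwaway file.
rm -f "$tmpfile"
```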
