Introduction to Unix

Creating and Editing Files

  • nano

    • A basic text editor found on most UNIX operating systems.

    • Gives instructions on how to use it at the bottom of the screen.

      • ^ stands for Ctrl

Example output from running nano arrayDat.txt. Press ^O to save your changes; exiting with ^X also prompts you to save any unsaved changes:

  GNU nano 2.3.1                                      File: arrayDat.txt                                                                                    

ProbeID Sample1 Sample2 Sample3 Sample4
1007_s_at       10.93   11.44   11.19   11.64
1053_at 8.28    7.54    8.06    7.32
117_at  3.31    3.41    3.13    3.13
121_at  4.42    4.32    4.46    4.63
1255_g_at       1.8     1.7     1.75    1.81

^G Get Help               ^O WriteOut               ^R Read File              ^Y Prev Page              ^K Cut Text               ^C Cur Pos
^X Exit                   ^J Justify                ^W Where Is               ^V Next Page              ^U UnCut Text             ^T To Spell

If you recall, you can mount the storage servers to your local machine using SMB. Since mounting the server makes it look like a local directory on your computer, you can use whatever text editor you're comfortable with to edit text files on the computing cluster.

Schedule Scripts Using Crontab

Many Unix systems come preinstalled with crontab, a tool that manages a file in which each row is a cron job: a program or script set to run at a specific time or interval.

Creating cron jobs is useful for periodically running programs. For example, you could create a script that searches your directory for files older than 1 year, and then create a cron job that runs this script once a month.

To specify the schedule on which a cron job should run, you need to use a cron schedule expression. Cron schedule expressions have the following format:

  *        *              *                 *              *
(min)    (hour)    (day of the month)    (month)    (day of the week)

The first field is the minute at which the cron job should run. The second field is the hour, the third is the day of the month, the fourth is the month, and the fifth is the day of the week (0 through 6, where 0 is Sunday).

A * means "any." So a * in the month position means run in any month.

Putting */ in front of a number means "every." So */5 in the minute position means run every 5 minutes.

The following expression runs the cron job at 6:00 AM every day, that is, at the 0th minute of the 6th hour on any day of the month, in any month, and on any day of the week.

0 6 * * *

The following expression runs the cron job every 30 minutes on Saturdays, that is, at minutes 0 and 30 of every hour, on any day of the month, in any month, when the day of the week is 6 (Saturday).

*/30 * * * 6

It's easy to get confused making cron schedule expressions, so I recommend using a site such as https://crontab.guru/, which helps you construct expressions and describes them in plain language.

Once you have a cron schedule expression, you can create a cron job in your crontab using the command crontab -e, which will open an editor for you to add your cron job. Put your cron job on an empty line with the following format:

0 6 * * * /home/asoberan/your/script.sh
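As a minimal sketch of the cleanup idea mentioned earlier in this section, the script below lists files that were last modified more than a year ago. The function name list_old_files and the demo file names are made up for illustration, and touch -d assumes GNU coreutils, which most Linux systems ship.

```shell
#!/bin/bash

# Hypothetical helper: print every regular file under a directory
# that was last modified more than 365 days ago.
list_old_files() {
    find "$1" -type f -mtime +365
}

# Demo on a throwaway directory so nothing real is touched
demo_dir=$(mktemp -d)
touch -d "2 years ago" "$demo_dir/old_report.txt"   # backdated file
touch "$demo_dir/new_report.txt"                    # fresh file

old_files=$(list_old_files "$demo_dir")
echo "$old_files"
```

If a script like this were saved as /home/asoberan/your/script.sh, the cron line 0 0 1 * * /home/asoberan/your/script.sh would run it at midnight on the first day of every month.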

The Unix Tree

The UNIX file system consists of files and directories organized in a hierarchical structure. When visualized, this hierarchical structure looks like a tree, with roots and many branches.

Output of the tree -L 1 / command, showing the Unix filesystem tree:

/
├── bin
├── boot
├── dev
├── etc
├── home
├── lib
├── lib64
├── mnt
├── nfs
├── opt
├── proc
├── root
├── run
├── srv
├── sys
├── tmp
├── usr
└── var

19 directories, 0 files

Every file or directory in a UNIX operating system is somewhere on this "tree." / is referred to as "root," because it's the root of the tree, and every other file or directory is inside it.

For example, every UNIX user's home directory is in home, which is in /. In other words, every user's home directory is in /home.

Unix has some shortcuts for referring to directories.

  • . stands for "my current directory."

  • .. stands for "my parent directory," a.k.a. the directory one branch higher in the tree.

  • ~ stands for "my home directory."

Example of Unix directory shortcuts:

/
├── home
│   ├── asoberan ( ~ )
│   │   └──Documents ( .. )
│   │      └──unix_class (I am here) ( . )
│   ├── duan
│   └── yourusername
├── etc
├── bin
├── tmp
...

The organization of the filesystem is not set in stone. However, there is a standard that many UNIX operating systems follow called the Filesystem Hierarchy Standard (FHS). The FHS defines a standard filesystem layout for greater uniformity and easier documentation across UNIX-like operating systems.

Some of what the FHS dictates includes:

  • / must have everything needed to boot, restore, recover, or repair a system.

  • /etc holds configuration files for the system and programs present on the system. For example, if I install the SSH server, I can reasonably expect configuration files for it to be found at /etc/ssh/

  • /home holds users' home directories

  • /tmp holds temporary files that can be deleted on reboot

  • /bin holds essential programs that are needed for system recovery

  • /usr/bin holds non-essential programs

Thanks to the FHS, you can expect most UNIX-like operating systems to follow the top-level layout shown at the start of this section.
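To see the directory shortcuts from this section in action, here is a small sketch that builds a throwaway tree and resolves ., .., and ~ from inside it. The directory names mirror the example above, but the base directory comes from mktemp, so nothing in your real home directory is touched.

```shell
# Build a small throwaway tree: <base>/Documents/unix_class
base=$(mktemp -d)
mkdir -p "$base/Documents/unix_class"
cd "$base/Documents/unix_class"

here=$(cd . && pwd)       # "." is the current directory
parent=$(cd .. && pwd)    # ".." is one branch higher in the tree
home=$(cd ~ && pwd)       # "~" expands to your home directory

echo "current: $here"
echo "parent:  $parent"
echo "home:    $home"
```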

Symbolic Links

UNIX filesystems let you create links, which are files that point to another file, called the target. They're similar to shortcuts in Windows.

There are two kinds of links: hard links and soft links. Soft links are also called symbolic links, or symlinks.

Hard Links

A hard link is a file that points to the same space on the hard drive as the file being linked to, rather than to that file itself.

/home/me/file      /home/you/file
    |                    |
 hard link           hard link
    |                   ,'
    V               _.-`
HARD DRIVE <....--``

A file's data won't be deleted from the hard drive until every hard link to it is deleted.

To create a hard link, you use the ln command with the existing file first, followed by the name of the new hard link:

ln arrayDat.txt arrayHard.txt

ls

Notice that any changes you make to arrayDat.txt will be reflected in arrayHard.txt.

Soft Links

Soft links create a file which points to the original file or directory.

/home/me/file <---Soft link--- /home/you/file
     |
     |
     V
HARD DRIVE

Soft links break if the original file is deleted.

To create a soft link, you use the ln command with the -s flag.

ln -s arrayDat.txt arraySoft.txt

ls -l

Notice that any changes you make to arrayDat.txt will be reflected in arraySoft.txt. The ls -l command will show that arraySoft.txt points to the file arrayDat.txt.

Your Luria home folder has a couple of soft links automatically set up pointing to storage servers so that common programs don't take up too much space on the head node.
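The difference between the two kinds of links is easiest to see when the original file is deleted. The sketch below works in a temporary directory; the file names follow the examples above, and the file contents are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "probe data" > arrayDat.txt

ln arrayDat.txt arrayHard.txt      # hard link
ln -s arrayDat.txt arraySoft.txt   # soft link

# Delete the original name
rm arrayDat.txt

# The hard link still reaches the data on disk
hard_contents=$(cat arrayHard.txt)

# The soft link now points at a name that no longer exists.
# [ -e ] follows the link, so it fails for a broken link.
if [ -e arraySoft.txt ]; then
    soft_state="ok"
else
    soft_state="broken"
fi

echo "hard link contents: $hard_contents"
echo "soft link state: $soft_state"
```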

File Transfer (in-progress)

  • scp

    • Allows you to transfer files between a local and remote host. It does this by leveraging ssh.

    • Shares the same syntax as the normal cp command.

# Transfer from local computer to remote computer

scp <source file> <username>@<remote host>:<destination>

scp agenda.txt <username>@<remote host>:/home/asoberan/unixclass

# Transfer from remote computer to local computer

scp <username>@<remote host>:<source file> <destination>

scp <username>@<remote host>:/home/asoberan/arrayDat.txt .
Home Directory

Why Unix?

Unix is an operating system (a suite of programs), originally developed in 1969 at the Computing Science Research Group at Bell Labs. In 1991, Linus Torvalds released a Unix-like kernel called the Linux kernel that he subsequently open-sourced. Linux-based operating systems are the most widely used Unix-like operating systems, and they are what we use in our local high-performance computing cluster.

From here on out, any mention of UNIX should be regarded as meaning both UNIX and UNIX-like operating systems, primarily Linux-based operating systems.

PC vs. Shared System

There is a distinction between the computing paradigm popular at the time of UNIX's creation and the one present within UNIX: personal computing (PC) vs. a shared system.

Whereas previous systems were meant to be used by one person at a time, UNIX has multi-user and multi-tasking support.

Aside from that feature, UNIX also had these features that made it popular:

  • network-ready (built-in TCP/IP networking makes it easy to communicate between computers).

  • very powerful programming environments (free of the many limits imposed by other operating systems).

  • robust and stable.

  • scalable, portable, flexible.

  • open source.

About the Unix Shell

One very important utility is the shell.

Shell - user interface to the system, lets you communicate with a UNIX computer via the keyboard.

The shell interfaces you with the computer through three channels: standard input (STDIN), standard output (STDOUT), and standard error (STDERR). STDERR is the channel used for communicating program errors. We'll focus on STDIN and STDOUT.

STDIN is the channel, or stream, from which a program reads your input. When you type in your terminal, you're typing into STDIN.

STDOUT is the channel, or stream, to which a program writes its output. When you run echo 'Hello, World!', the output is written to STDOUT. In this case, STDOUT is the terminal screen.

[asoberan@luria net]$ echo 'Hello, World!'
Hello, World!

STDIN and STDOUT are the ways that you and the shell communicate. But what is the shell doing when you tell it to run a command?

Process of Running a Shell Command

Let's say we want to run the echo command, as we did above. We type echo in the terminal, then press Enter. The shell then reads the PATH environment variable, which is a system variable that lists the directories that programs are stored in.

Using printenv to print the PATH environment variable. Notice how many paths in this environment variable are ones dictated by the FHS:

[asoberan@luria net]$ printenv PATH
/home/asoberan/.conda/envs/class/bin:/home/software/conda/miniconda3/condabin:/home/software/conda/miniconda3/bin:/home/asoberan/.local/bin:/opt/ohpc/pub/mpi/mvapich2-gnu/2.2/bin:/opt/ohpc/pub/compiler/gcc/5.4.0/bin:/opt/ohpc/pub/prun/1.1:/opt/ohpc/pub/autotools/bin:/opt/ohpc/pub/bin:/home/asoberan/.local/bin:/home/software/google-cloud-sdk/google-cloud-sdk-193.0.0/google-cloud-sdk/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin

Then, it will go through these paths and look for the program you are trying to run. For example, the echo command is stored in /usr/bin, a directory we can see in the PATH environment variable:

Using which to find the location of a program:

[asoberan@luria net]$ which echo
/usr/bin/echo

Once it finds it, it will run the program in a new process. The program writes any output to STDOUT, in this case our terminal screen, and the shell waits for the program to finish and return an exit status. Now, the shell is ready to run another command.
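The search through PATH described above can be mimicked with a few lines of shell. This is only a sketch of the idea, not how the shell is actually implemented, and the function name find_in_path is made up for illustration.

```shell
#!/bin/bash

# Hypothetical helper that mimics the shell's PATH search:
# check each directory in PATH, in order, for an executable
# file with the given name and print the first match.
find_in_path() {
    local program="$1" dir
    local IFS=':'              # PATH entries are separated by colons
    for dir in $PATH; do
        if [ -x "$dir/$program" ] && [ ! -d "$dir/$program" ]; then
            echo "$dir/$program"
            return 0
        fi
    done
    return 1                   # not found in any PATH directory
}

find_in_path ls
```

One caveat: bash also has its own builtin version of echo, which it uses before searching PATH. Running type -a echo lists both the builtin and the program that which reports.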

SSH

The shell is used to interface with a UNIX computer, but we're not going to go to the server room, plug in a keyboard, and start typing commands for our shell. We're going to use the Secure Shell (SSH) to remotely connect to Luria.

SSH is a program that lets you open a remote shell to your UNIX system. It's secure because your connection to the system is kept encrypted. For instructions on using SSH to connect to Luria, refer to the page below:

Accessing Luria

Using the Unix Shell

Editing the Unix Tree

Make sure to run all of the copy commands below, as we'll be using files from the Ostrom server later in the course.

  • mkdir

    • This command name stands for "make a directory".

    • It creates a new folder (or directory). If no path is specified, the new directory is created in the current directory.

# start in your home directory
cd ~

# create a directory named "unixclass"
# with a subdirectory named "testdir"
mkdir unixclass
mkdir unixclass/testdir

# change current directory directly to "testdir"
cd unixclass/testdir 

# go to the parent directory (i.e. unixclass)
# and print the working directory
cd ..
pwd
  • touch

    • This command creates an empty file with the given name.

# Go to /home/<your username>/unixclass
cd ~/unixclass

# Create an empty file named "hello.txt"
touch hello.txt

# List the files in the directory to verify that worked
ls
  • cp and mv

    • These commands stand for "copy" and "move," respectively.

    • They copy / move files and directories to the specified location.

    • Wildcard symbols such as "*" or "?" are commonly used to copy multiple files with a single command.

      • The symbol "*" stands for any number of characters, including none.

      • The symbol "?" stands for exactly one character.

# Start in ~/unixclass
cd ~/unixclass

# copy the file named arrayDat.txt into your unixclass directory
cp /net/ostrom/data/dropbox/arrayDat.txt .
ls

# copy all the files whose names start with "array"
# into the current directory 
cp /net/ostrom/data/dropbox/array* .
ls

# copy any file whose extension is "txt" 
cp /net/ostrom/data/dropbox/*.txt .
ls

# copy all files
cp /net/ostrom/data/dropbox/* .
ls
  • rmdir and rm

    • rmdir only removes empty directories, rm removes both directories and files.

    • rm needs -r flag to remove directories.

# Start in ~/unixclass
cd ~/unixclass

# Create a temporary directory
mkdir trash

# create copies of arrayDat.txt in the temporary directory
cp arrayDat.txt trash/arrayDat1.txt
cp arrayDat.txt trash/arrayDat2.txt
cp arrayDat.txt trash/arrayDat3.txt
cp arrayDat.txt trash/arrayDat4.txt

ls trash
# Try to delete the directory with `rmdir`
# (this fails because the directory is not empty)
rmdir trash

# Delete the directory and everything in it with `rm -r`
rm -r trash
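To check what a wildcard will match before using it with cp, mv, or rm, you can hand it to echo first. The sketch below does this in a temporary directory; the file names are borrowed from the class files, but the files themselves are empty stand-ins.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch arrayDat.txt arrayAnnot.txt arraylen.txt beep.txt test_1.fastq test_2.fastq

# "*" matches any number of characters after "array"
star_matches=$(echo array*)

# "?" matches exactly one character between "test_" and ".fastq"
question_matches=$(echo test_?.fastq)

echo "array* matches:       $star_matches"
echo "test_?.fastq matches: $question_matches"
```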

File Storage and Compression

  • quota

    • Luria has a disk usage quota on the head node. You can use the quota command to check what the size of the quota is and how much space you're currently using.

    • The -s flag will display the quota in a human-readable format. (e.g. 18K, 14M, 65G).

quota -s

Disk quotas for user asoberan (uid 247789): 
     Filesystem   space   quota   limit   grace   files   quota   limit   grace
/dev/mapper/cl-home
                  4391M  18432M    100G           24925       0       0        

# Space column shows how much space you're currently using on the head node
# Quota column is total amount of space allotted to you
  • du

    • Stands for "disk usage". Reports how much disk space a directory is taking up.

    • Defaults to displaying the disk usage in kilobytes, but passing it the -h flag will return disk usage in a human-readable format. (e.g. 18K, 14M, 65G).

    • Will search the entire depth of the Unix tree starting from the directory you give it. You can control the depth which it searches by passing the -d <num> flag.

# Displays the disk usage of every directory under your home directory
# The last entry will be the disk usage of your entire home directory
du -h ~

# Displays the disk usage of each directory immediately
# under your home directory
du -h -d 1 ~

# Displays the disk usage of one file, file.txt
du -h file.txt
  • tar

    • Combines multiple files or directories into one archive for easy sharing. Similar to "zipping" files, however tar does not compress by default.

    • Create an archive by passing the -cf flags

    • Can compress multiple files/folders using gzip by passing the -z flag when creating a new archive.

    • Useful when you have files that take up a lot of space and you want to save space.

    • Extract an archive by passing the -xf flags. To extract a gzip-compressed archive, add the -z flag as well (-xzf).

# Create an archive of a directory and name it my_directory.tar
tar -cf my_directory.tar <directory>

# Create a compressed archive of a directory and
# name it my_compressed_directory.tar.gz
tar -czf my_compressed_directory.tar.gz <directory>

# Extract archive my_directory.tar
tar -xf my_directory.tar

# Uncompress the archive my_compressed_directory.tar.gz
tar -xzf my_compressed_directory.tar.gz
  • zip

    • Zip multiple files and directories into one file. Zipping is similar to archiving with tar, but zipped files are easier to deal with on Windows machines.

# Zip a directory and its contents to the file my_directory.zip
# (-r is needed to include the files inside the directory)
zip -r my_directory.zip <directory>

# Unzip my_directory.zip
unzip my_directory.zip
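As a quick way to convince yourself that a compressed archive holds everything, the sketch below archives a small directory with tar, lists the archive, and extracts it somewhere else. The directory and file contents are made up for illustration. -t (list contents) and -C (extract into a given directory) are standard tar options that aren't covered above.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A small directory to archive
mkdir my_directory
echo "ProbeID Sample1" > my_directory/arrayDat.txt

# Create a gzip-compressed archive
tar -czf my_compressed_directory.tar.gz my_directory

# List what is inside the archive without extracting it
listing=$(tar -tzf my_compressed_directory.tar.gz)
echo "$listing"

# Extract into a separate directory and read the file back
mkdir restored
tar -xzf my_compressed_directory.tar.gz -C restored
restored_contents=$(cat restored/my_directory/arrayDat.txt)
echo "$restored_contents"
```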

Searching the Unix Tree

  • find

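    • Searches directories to find a directory or file by name.

    • Quote the name pattern so that the shell passes the wildcard to find instead of expanding it first.

find ~ -name "arrayDat*"
  • grep

    • Searches the contents of files for a specific pattern.

    • Unlike find, which matches file names, grep matches the text inside files. The -r flag searches every file under a directory.

grep -r "arrayDat" .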
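The difference between searching by name and searching by content is easier to see side by side. In this sketch the two files and their contents are made up: find matches only the file whose name fits the pattern, while grep matches both files because both contain the probe ID. The -l flag makes grep print only the names of matching files.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Two small files: one data file, one notes file that mentions a probe ID
printf 'ProbeID\tSample1\n1007_s_at\t10.93\n' > arrayDat.txt
printf 'notes mention 1007_s_at here\n' > notes.txt

# find matches on file names
found_by_name=$(find . -name "arrayDat*")

# grep matches on file contents
found_by_content=$(grep -rl "1007_s_at" . | sort)

echo "by name:"
echo "$found_by_name"
echo "by content:"
echo "$found_by_content"
```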

Files

The following sections explain how to view, create, and edit files all through the Unix shell. While it is possible to do these things straight from the shell, if you are using Visual Studio Code set up as described on the page linked below, you can view, create, and edit files through it instead of the shell.

Visual Studio Code

Unix Shell Manual Pages

Unix comes with preloaded documentation, also known as "man pages".

Anatomy of a man page

Each page is a self-contained document consisting of the following sections:

  • NAME: the name and a brief description of the command.

  • SYNOPSIS: how to use the command, including a listing of the various options and arguments you can use with the command. Square brackets ([ ]) are often used to indicate optional arguments. Any arguments or options that are not in square brackets are required.

  • DESCRIPTION: a more detailed description of the command including descriptions of each option.

  • SEE ALSO: References to other man pages that may be helpful in understanding how to use the command in question.

Search for man pages

When you are not sure of the exact name of a command, you can use the apropos command to see all the commands with the given keyword on their man page.

apropos keyword 

For example, to see which commands are relevant to the task of copying, type the following:

apropos copy

View man pages

To view a manual page, use the command man.

man command_name

When viewing a man page, use the following keys to navigate the pages:

  • spacebar - view the next screen;

  • b (for "back") - view the previous screen;

  • arrow keys - navigate within the pages;

  • q (for "quit") - quit and return to the shell prompt.

For example, to view the man page associated with the command cp, type the following:

man cp

The Unix Shell

How Unix File Ownership Works

UNIX is a multi-user environment, so how does it keep each user's files secure?

Every file has an owner and permissions.

There are three levels of ownership:

  • User

  • Group

  • Other

Three levels of permissions:

  • Read

  • Write

  • Execute

How is this useful? Imagine a lab. There are files that everyone in the lab should have access to. If all users in the lab are put into a lab group, then sharing a file within the lab just means making the lab group the group owner of that file. This is already what we do on Luria!

Read File Ownership and Permissions

You can view the ownership and permissions of a file by running ls -l. Here's an example of the output of ls -l:

[asoberan@luria unixclass]$ ls -l
total 40
-rwxr-xr-x 1 asoberan ki-bcc 3845 Apr 28 21:48 arrayAnnot.txt
-rwxr-xr-x 2 asoberan ki-bcc 3134 Apr 28 22:11 arrayDat.txt
-rwxr-xr-x 2 asoberan ki-bcc 3134 Apr 28 22:11 arrayHard.txt
-rwxr-xr-x 1 asoberan ki-bcc 1634 Apr 28 21:48 arraylen.txt
lrwxrwxrwx 1 asoberan ki-bcc   12 Apr 28 22:13 arraySoft.txt -> arrayDat.txt
-rwxr-xr-x 1 asoberan ki-bcc 3128 Apr 28 21:48 beep.txt
-rw-r--r-- 1 asoberan ki-bcc  528 Apr 28 21:48 ex1.sh
-rw-r--r-- 1 asoberan ki-bcc  479 Apr 28 21:48 ex2.sh
-rw-r--r-- 1 asoberan ki-bcc  368 Apr 28 21:48 ex3.sh
-rwxr-xr-- 1 asoberan ki-bcc  340 Apr 28 21:48 test_1.fastq
-rwxr-xr-- 1 asoberan ki-bcc  340 Apr 28 21:48 test_2.fastq

Let's focus on the arrayDat.txt file.

-rwxr-xr-x 2 asoberan ki-bcc 3134 Apr 28 22:11 arrayDat.txt

asoberan ki-bcc describes the ownership of a file. In this case, the user asoberan and the group ki-bcc own the file.

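-rwxr-xr-x describes the file type and the permissions on the file. The first character is the file type: - for a regular file, d for a directory, or l for a soft link (as with arraySoft.txt above).

The remaining nine characters can be broken down into three parts:

  • The user's permissions

    • rwx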

    • The user asoberan has read (r), write (w), and execute (x) permissions for this file.

  • The group's permissions

    • r-x

    • The group ki-bcc has read (r) and execute (x) permissions for this file.

  • Everyone else's permissions

    • r-x

    • Anyone who isn't asoberan or in the group ki-bcc has read (r) and execute (x) permissions for this file.
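The breakdown above can be reproduced by slicing the permission string by character position. This is just an illustrative sketch; the variable names are made up, and the string is the one from the arrayDat.txt line.

```shell
perm_string="-rwxr-xr-x"

# Character 1 is the file type; the rest are three groups of three
file_type=$(printf '%s\n' "$perm_string" | cut -c1)
user_perms=$(printf '%s\n' "$perm_string" | cut -c2-4)
group_perms=$(printf '%s\n' "$perm_string" | cut -c5-7)
other_perms=$(printf '%s\n' "$perm_string" | cut -c8-10)

echo "type:  $file_type"
echo "user:  $user_perms"
echo "group: $group_perms"
echo "other: $other_perms"
```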

To check what group you are in, you can use the groups command:

[asoberan@luria unixclass]$ groups
ki-bcc

Traversing the Unix Tree

  • cd

    • This command name stands for "change directory".

    • It changes your current working directory to the specified location.

    • The last visited directory is referred to with a hyphen ("-").

# go to root directory of the system and print the working directory
cd /
pwd

# go to the home directory and print the working directory
cd ~
pwd

# change directory using the absolute path and print the working directory
cd /net/bmc-pub14/data/
pwd
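The hyphen shortcut mentioned above isn't shown in the commands, so here is a small sketch of it. The two directories are temporary ones created only for the demonstration.

```shell
# Two throwaway directories to move between
start=$(mktemp -d)
other=$(mktemp -d)

cd "$start"
cd "$other"

# "cd -" returns to the last visited directory.
# It also prints that directory, which we discard here.
cd - > /dev/null
back=$(pwd)

echo "now in: $back"
```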

Viewing File Contents

  • cat

    • Concatenates one or more files together, then prints the result to STDOUT.

    • Most commonly, it's used to print out the contents of a single file.

  • head

    • By default, prints the first 10 lines of a file to STDOUT.

    • You can pass the -n flag to specify the number of lines to print.

  • tail

    • By default, prints the last 10 lines of a file to STDOUT.

    • You can pass the -n flag to specify the number of lines to print.

  • less

    • Lets you page through a long file or stream of text a.k.a. you can scroll.

    • Exit by pressing q

# Start in ~/unixclass
cd ~/unixclass

cat arrayDat.txt
# Start in ~/unixclass
cd ~/unixclass

# Print the first 10 lines of arrayDat.txt
head arrayDat.txt

# Print the first 5 lines of arrayDat.txt
head -n 5 arrayDat.txt
# Start in ~/unixclass
cd ~/unixclass

# Print the last 10 lines of arrayDat.txt
tail arrayDat.txt

# Print the last 5 lines of arrayDat.txt
tail -n 5 arrayDat.txt
# Start in ~/unixclass
cd ~/unixclass

# Scroll through the contents of arrayDat.txt
less arrayDat.txt
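If you don't have arrayDat.txt handy, the same commands can be tried on a generated file. This sketch uses made-up files, numbers.txt and letters.txt, and also shows cat joining two files, which is where its name comes from.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A 20-line file containing the numbers 1 to 20, and a 2-line file
seq 1 20 > numbers.txt
printf 'a\nb\n' > letters.txt

first_three=$(head -n 3 numbers.txt)   # first 3 lines
last_two=$(tail -n 2 numbers.txt)      # last 2 lines

# cat prints both files one after the other; count the combined lines
combined_lines=$(cat numbers.txt letters.txt | wc -l)

echo "$first_three"
echo "$last_two"
echo "combined lines: $combined_lines"
```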

The Unix Filesystem

Manipulating Files

Until now, all the examples of using a shell that we've seen have used our keyboard input as STDIN and the terminal screen as STDOUT. However, the shell gives us the ability to change STDIN and STDOUT by using two mechanisms: pipes and redirectors.

Redirectors

A redirector is denoted by a >. A redirector redirects the output of a command into a file, essentially changing the STDOUT to a file rather than the terminal screen.

For example, echo 'Hello, World!' usually outputs the message "Hello, World!" onto our terminal screen. However, if we run

echo 'Hello, World!' > hello.txt

nothing will be printed to the screen. Instead, STDOUT is changed to the file hello.txt, so the output of echo 'Hello, World!' will be written to the file hello.txt.

We can verify that by running cat hello.txt, which should print out the contents of hello.txt to our terminal screen.

The > redirector always replaces the contents of the file. If you wish to append content to the file instead, you can use >>.

Using >> to append contents to a file:

echo 'Hello, again!' >> hello.txt
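Here is a short sketch that puts both redirectors side by side in a temporary directory. It also uses <, which goes the other way: it feeds a file to a command's STDIN.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# ">" creates hello.txt (or replaces its contents)
echo 'Hello, World!' > hello.txt

# ">>" appends a second line
echo 'Hello, again!' >> hello.txt

# "<" makes hello.txt the STDIN of wc, which counts its lines
lines_after_append=$(wc -l < hello.txt)

# ">" again: the two lines are replaced by one
echo 'Overwritten' > hello.txt
final_contents=$(cat hello.txt)

echo "lines after append: $lines_after_append"
echo "final contents: $final_contents"
```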

Pipes

A pipe is denoted by a |. Pipes allow us to use the output of a command as the input of another command.

For example, let's say you wanted to search through the contents of cities.csv to search for every line that includes "TX". You can do so by printing the contents of the file using cat, then piping the output into grep "TX".

cat cities.csv | grep "TX"

This will print every line that includes "TX".

Piping is a very powerful feature of the shell, because it lets us compose many small operations into a very complex set of operations. For example:

# List the files in the current directory
# and print extended information
ls -l
# Squeeze each run of multiple spaces in the
# output of ls -l into a single space for
# easier parsing
ls -l | tr -s " "
# Use the cut program to separate each line
# of output into columns, delimited by a space,
# then print columns 3-4. In the case of this
# output, we're printing the users and groups
# which own the files in the directory
ls -l | tr -s " " | cut -d" " -f3-4
# Alphabetically sort the users and groups which
# own the files in the directory
ls -l | tr -s " " | cut -d" " -f3-4 | sort
# Count the instances of users and groups so we can
# know who owns files in the current directory and
# how many files they own
ls -l | tr -s " " | cut -d" " -f3-4 | sort | uniq -c
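The cities.csv example from earlier in this section can be tried with a small made-up file. The three rows below are invented for illustration. The -c flag makes grep print the number of matching lines instead of the lines themselves.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A tiny stand-in for cities.csv
printf 'Austin,TX\nBoston,MA\nHouston,TX\n' > cities.csv

# Pipe the file contents into grep, as in the example above
tx_lines=$(cat cities.csv | grep "TX")

# grep can also read the file directly and count the matches
tx_count=$(grep -c "TX" cities.csv)

echo "$tx_lines"
echo "matching lines: $tx_count"
```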

Getting System Information

  • uname

    • uname prints various pieces of system information. The available options are listed in the uname manual page.

    • The -a flag will print all the information uname can print.

uname -a
Linux luria 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
  • file

    • This will print out the file type of the given file.

file hello.txt
hello.txt: ASCII text

file my_directory.zip
my_directory.zip: Zip archive data, at least v2.0 to extract
  • ps

    • Shows the running processes on the system. Pass it the argument aux to get extensive information about every running process on the system

# Prints information on every running process on the system
ps aux

# Pipe the output into grep and search for every process being run by a specific user
ps aux | grep "asoberan"
asoberan  6342  0.0  0.0 115716  2372 pts/124  Ss   May02   0:00 /bin/bash
asoberan  7855  0.0  0.0 302216  4668 pts/124  Sl+  15:25   0:00 srun --pty bash
asoberan  7873  0.0  0.0  96680   768 pts/124  S+   15:25   0:00 srun --pty bash
  • htop

    • An interactive utility to see every running process on the system, as well as extensive information about the CPU, RAM, etc. usage on the system.

    • Useful to check this utility if you're worried about your own CPU and RAM usage on the head node.

Viewing the Unix Tree

  • pwd

    • This command name stands for "print working directory".

    • It displays on the terminal the absolute path name of the current (working) directory.

    • It is useful to locate where you are currently in the Unix tree.

# print working directory
pwd
  • ls

    • This command stands for "list".

    • It displays the content of a directory.

    • By default, the content is listed in lexicographic order.

# list the content of the current directory
ls
ls .

# list content of the parent directory
ls ..

# list the contents of your home directory from anywhere
ls ~

Network Filesystems

NFS

The UNIX filesystem does not have to be present on the local hard drive. UNIX supports filesystems that are accessed over the network, such as the Network File System (NFS), so that you can mount a directory from another computer and traverse it just as if it were a normal directory on your local computer.

Our Luria cluster uses this technology to mount our many storage servers, which run NFS servers. Every storage server is mounted at /net.

tree -L 1 /net command, showing the different storage servers mounted at /net:

[asoberan@luria]$ tree -L 1 /net
/net
├── bmc-lab1
├── bmc-lab2
├── bmc-lab2.mit.edu
├── bmc-lab3
├── bmc-lab4
├── BMC-LAB4
├── bmc-lab5
├── bmc-lab6
├── BMC-LAB6
├── bmc-lab7
├── bmc-lab8
├── bmc-pub10
├── bmc-pub14
├── bmc-pub15
├── bmc-pub16
├── bmc-pub17
├── bmc-pub9
├── gelbart
├── ostrom
└── rowley

21 directories, 0 files

From our perspective, these look like any other directory on Luria, but they're present on completely different computers.

SMB

Another protocol for sharing filesystems over the network is SMB, which is supported on UNIX systems through Samba. In addition to NFS, our storage servers run Samba servers, which allow you to mount them on your local laptop or PC. For instructions on doing so, refer to the page linked below:

Active Data Storage

File Ownership

Writing Scripts

Instead of running a single command at a time, you can combine multiple commands into a file to create a script. Scripts can simplify multi-step processes into a single invocation, saving you time in the long run.

Writing a script is as simple as writing a file. Usually, we make the first line of the script #!/bin/bash to tell the system to use bash when running the script.

You can create variables in a bash script using the = operator. So to make a variable named myname, you'd write myname="Allen", with no spaces around the =. To use the variable later in the script, you'd prefix it with a $. For example, $myname.

You can also ask for a user's input and store that into a variable by using the read command. Preface it with echo "question?" to give context for what the user is inputting. For example:

echo "What is your name?"
read name

echo "Hello, $name"

You can run shell commands inside of a script and store their results in a variable. To do so, you wrap the command with $(). For example, to get the size of a file and store it in a variable, you'd do file_size=$(du file).
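Variables and $() can be tried out together in a few lines. In this sketch the file is a temporary one created just for the demonstration, and wc -c (which counts bytes) stands in for du so that the result is a single number.

```shell
#!/bin/bash

# Assign a variable (no spaces around "=") and use it with "$"
myname="Allen"
greeting="Hello, $myname"

# Create a small file to measure
workdir=$(mktemp -d)
printf 'abc\n' > "$workdir/file.txt"

# $() runs a command and stores its output in a variable
file_bytes=$(wc -c < "$workdir/file.txt")

echo "$greeting"
echo "file.txt is $file_bytes bytes"
```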

Knowing this, let's create a script that looks through the files in a directory, and moves any files above a certain size to a new folder. Name it sizewatcher.sh

#!/bin/bash

# Make variables for directory to loop through
# and directory where files should be moved.
directory="/home/asoberan/unixclass"
new_directory="/home/asoberan/bigfiles"

# If the new directory doesn't exist, then
# create the directory
if [ ! -d "$new_directory" ]; then
    mkdir "$new_directory"
fi

# Loop through the files in the directory.
# Store the size of the file (in KB), in
# file_size. If the file size is greater
# than 400000 KB, move the file to the
# new directory 
for file in "$directory"/*; do
    # Skip anything that is not a regular file
    [ -f "$file" ] || continue
    file_size=$(du -k "$file" | cut -f1)
    if [ "$file_size" -gt 400000 ]; then
        mv "$file" "$new_directory"
    fi
done

Once you've created and saved this file, make sure to modify its permissions to let it be executed. An easy way of doing this is running the following command:

chmod +x sizewatcher.sh

Then run the script:

./sizewatcher.sh

Let's create another script that asks a user for a directory, then sorts the files in that directory into folders that correspond to the year and month that each file was last modified. Name it sorter.sh:

#!/bin/bash

# Ask for directory and store it in a variable
# called "directory"
echo "What directory to check?"
read directory

# Check if the provided argument is a directory
if [ ! -d "$directory" ]; then
    echo "Error: $directory is not a directory."
    exit 1
fi

# Iterate over files in the directory
for file in "$directory"/*; do
    # Check to see if the file is really a file and
    # not a directory, etc.
    if [ -f "$file" ]; then
        # Get the year and month the file was last modified
        year=$(date -r "$file" +%Y)
        month=$(date -r "$file" +%m)

        # Create directory for the year and month if it doesn't exist
        mkdir -p "$directory/$year-$month"

        # Move the file to the corresponding directory
        mv "$file" "$directory/$year-$month"
        echo "Moved $file to $directory/$year-$month"
    fi
done

echo "File sorting complete."

Make it executable and then run it.

Change File Ownership and Permissions

To change the owners of a file, you can use the following commands:

  • chown

    • This changes the user who owns a particular file or directory.

  • chgrp

    • This changes the group who owns a particular file or directory.

To change the permissions that the owners of a file have, you use the chmod command.

chmod takes two arguments: the permissions to give a file, and the file to change the permissions of. The permissions are represented as a 3-digit number, where each digit represents the permissions to give the user, group, or others, respectively.

Read, write, and execute permissions are represented by the following numbers:

  • r - 4

  • w - 2

  • x - 1

If you want to give someone multiple permissions, you add the numeric representations of those permissions together. For example:

  • Read, write, execute (rwx) permissions = (4 + 2 + 1) = 7

  • Write, execute (_wx) permissions = (2 + 1) = 3

So let's say you want to give a file the following permissions:

  • The user that owns the file should be able to read, write, and execute the file. rwx = (4 + 2 + 1) = 7

  • The group that owns the file should be able to read and execute the file. r_x = (4 + 1) = 5

  • Anyone else should have no permissions for the file. ___ = 0

Then you'd run the following command:

chmod 750 arrayDat.txt

Remembering the syntax for this command can be quite cumbersome, so I recommend using a third-party website such as https://quickref.me/chmod.
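If you'd rather check the arithmetic yourself, the sketch below builds the 750 mode from the r, w, and x values and applies it to a scratch file. The file is an empty stand-in created in a temporary directory, and the variable names are made up for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch arrayDat.txt

# r = 4, w = 2, x = 1: add up the permissions for each level
user_digit=$((4 + 2 + 1))   # rwx
group_digit=$((4 + 1))      # r-x
other_digit=0               # ---

mode="$user_digit$group_digit$other_digit"
chmod "$mode" arrayDat.txt

# The first 10 characters of ls -l show the resulting permissions
perm_string=$(ls -l arrayDat.txt | cut -c1-10)

echo "mode: $mode"
echo "permissions: $perm_string"
```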