01a: From File Explorer to Command Line

Roman E. Reggiardo, Vikas Peddu

11 July, 2023

Prediction:

What does data look like? Where do we keep it?

Computers and Data

The command line gives us an opportunity to work with computers in a more direct way compared to “Folders” and “Files”.

For example:

Look at your desktop, what’s there?

Folders?

What’s inside?

This sounds like something we could see in everyday life…

Computers, on the other hand, think differently

Really they operate in binary, on 1’s and 0’s, but just beneath ‘Files’ and ‘Folders’ they operate with:

“Folders” == “Directories”

“Files” == “Paths”

Prediction:

How will your experience be different when you switch from “human think” to “command line/computer think”?

Practice 01:

Create a “Project Folder”

Using the Graphical User Interface (GUI), navigate to the Documents folder and create a Folder called:

BSCC_2023_folder

Practice 01A:

Create sub-folders and a README.txt file:

data/
notes/
code/

make sure to keep the folder names lowercase!

Practice 01B:

Edit README.txt to say something….

Reflection:

What you’ve just made is a ‘project folder’:

Why is this useful? What could it help with moving forward?

So why even bother with the command line? We just did so much!

We can create, copy, rename, and move files (etc. etc.) using a GUI. It works, and it's easy and intuitive. But it is:

  • only possible if a GUI is provided

  • not reproducible (could someone else get the same result? yes…but in the same exact way? maybe not.)

  • limited to what the GUI/Operating System designers decided was useful for most users.

Bioinformatics asks us to do much more with data, and thus computers, than most.

Beyond the GUI: Command Line interface [CLI]

On your desktops you can view and interact with the command line with a terminal – a piece of software built to enable our use of command line programming

terminal on left, file explorer on right

Prediction:

Instead of the mouse, how will we navigate the command line?

Anatomy of the command line

localuser@bscc-vm-vikas:~$

user @ host : location $ commands you enter here

user : the name your computer calls you (we’re all bscc-VM)

host : the computer you’re using

location : the directory/folder you’re in

$ : prompt that tells you commands come next
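
Each piece of the prompt can also be asked for by name. The sketch below rebuilds a prompt-like line from three standard commands. It assumes `uname -n` is a reasonable stand-in for the host name on your machine, and the variable names (user, host, location) are only for illustration.

```bash
# ask the computer for each piece of the prompt
user=$(whoami)      # the name your computer calls you
host=$(uname -n)    # the computer you're using
location=$(pwd)     # the directory you're in

# glue the pieces together the way the prompt does
echo "${user}@${host}:${location}\$"
```

Your real prompt may differ slightly, for example by shortening your home directory to ~.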

Before we start: An introduction to code chunks

If you see something like this

# print out 'hello' to the command line
echo 'hello'
hello

it's code (with the expected output below it) that you can execute in your terminal on the command line. Just copy and paste or type it in yourself, hit enter, and follow along.

Lines that start with # and are grayed out are comments that don’t run as code; they are just notes

Before we start: Some basic commands

The command line accepts…commands!

These two tell us where we are (pwd) and what’s in here (ls):

# present working directory
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
# list directory contents
ls
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files

Another basic command with big implications: cd

Our command line “mouse” – move somewhere, anywhere

# change directory to BSCC_2023_folder
cd BSCC_2023_folder
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder

hint: try hitting tab to complete the rest of your statements – it’ll auto-complete (like texting, etc)

What this looks like in file explorer

If you’re going forward, you should be able to go back too

# present working directory
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
# change directory to BSCC_2023_folder
cd BSCC_2023_folder
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder
# change directory to the one directly above the present working directory
cd .. 
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash

Briefly: The role of ‘.’ and ‘..’

These are really useful shortcuts that correspond to location

# change directory to BSCC_2023_folder
cd BSCC_2023_folder
# change directory to the one directly above the present working directory
cd .. 
pwd .
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash

as we just saw, cd .. tells us to go from:

# change directory to BSCC_2023_folder
cd BSCC_2023_folder
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder

to

/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash

looking again, pwd . is the same as pwd because . refers to the present working directory, and .. is one step up
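
To see . and .. for yourself, here is a small sketch. It works inside a throwaway directory made by mktemp -d (an assumption, so that nothing real is touched), and the names scratch and project are made up for the example.

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
mkdir "$scratch/project"
cd "$scratch/project"

# -a asks ls to show hidden entries too, including . and ..
ls -a

here=$(pwd)
cd .                # . is where we already are, so nothing changes
still_here=$(pwd)
cd ..               # .. is one step up
one_up=$(pwd)

echo "$here"
echo "$still_here"
echo "$one_up"
```

The first two lines printed should match, and the third should be the same path with project dropped from the end.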

What exactly is pwd telling us?

what pwd ‘returns’ is a string of text

# present working directory
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash

it is the path to our present working directory:

each directory along the path is separated by /

Thinking in paths

When you cd to a different directory, the name you give has to be a valid path from where you currently are

from

/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash

we can go to

/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/BSCC_2023_folder

but not to Documents

# should ERROR
cd Documents
bash: line 1: cd: Documents: No such file or directory

If you’re going somewhere, you need directions

To get to the Documents directory from

/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash

directly, we need to provide the entire path

/Users/vikas/Documents

which looks like:

cd /Users/vikas/Documents
pwd
/Users/vikas/Documents
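
There is usually more than one set of directions to the same place. The sketch below compares a full path with a route built from .. steps. It uses a made-up directory tree under a throwaway mktemp -d directory (both are assumptions for illustration, not the class folders).

```bash
# build a small throwaway tree: Documents/teaching/slides
scratch=$(mktemp -d)
mkdir -p "$scratch/Documents/teaching/slides"
cd "$scratch/Documents/teaching/slides"

# a bare name is looked up inside the present working directory,
# and there is no Documents inside slides
# (2>/dev/null hides the error message, || runs echo only if cd failed)
cd Documents 2>/dev/null || echo "no Documents directory in here"

# route 1: the full path, starting from the very top (/)
cd "$scratch/Documents"
via_full_path=$(pwd)

# route 2: start in slides again and take two steps up
cd "$scratch/Documents/teaching/slides"
cd ../..
via_steps_up=$(pwd)

echo "$via_full_path"
echo "$via_steps_up"
```

Both routes should print the same directory. The full path works from anywhere, while the .. route only works from where we started.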

Practice 02: Moving through directories

before beginning please run:

cd
  1. Where are you? What does running just cd appear to do?
  2. What’s there?
  3. Go to Pictures/ directory, is anything in there?
  4. Go to BSCC_2023_folder/data/ directory, what does pwd return?
  5. List the three ways you can get back to where you were in 1.

Reflection:

Do you like moving around with the command line? Is it easier than using the mouse? Is it more challenging?

Summary 02:

pwd : return present working directory - man pwd

ls : list contents of directory - man ls

cd : change directory - man cd

. and .. : represent the present working directory and the directory one step above it, respectively

Quick aside: the command line user manual

The man tool prints out a manual for the tool you provide it

# man [name of commandline tool]
man pwd

PWD(1)                    BSD General Commands Manual                   PWD(1)

NAME
     pwd -- return working directory name

SYNOPSIS
     pwd [-L | -P]

DESCRIPTION
     The pwd utility writes the absolute pathname of the current working
     directory to the standard output.

     Some shells may provide a builtin pwd command which is similar or
     identical to this utility.  Consult the builtin(1) manual page.

     The options are as follows:

     -L      Display the logical current working directory.

     -P      Display the physical current working directory (all symbolic
             links resolved).

     If no options are specified, the -L option is assumed.

ENVIRONMENT
     Environment variables used by pwd:

     PWD  Logical current working directory.

EXIT STATUS
     The pwd utility exits 0 on success, and >0 if an error occurs.

SEE ALSO
     builtin(1), cd(1), csh(1), sh(1), getcwd(3)

STANDARDS
     The pwd utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').

BUGS
     In csh(1) the command dirs is always faster because it is built into that
     shell.  However, it can give a different answer in the rare case that the
     current directory or a containing directory was moved after the shell
     descended into it.

     The -L option does not work unless the PWD environment variable is
     exported by the shell.

BSD                             April 12, 2003                             BSD

Prediction:

Now we know how to move, what else have we done with the file explorer/mouse that we can do on the command line?

More command line: Creating

Moving is great, but we can do much more

# create a **file**
touch temp.txt
ls
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temp.txt

touch is our command to create a file, not a directory
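
A couple of details about touch are easy to try out. This is a minimal sketch in a throwaway mktemp -d directory, and the file names are made up for the example.

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"

# touch accepts more than one name at a time; each new file starts out empty
touch sample_a.txt sample_b.txt
ls

# touching a file that already exists is harmless: nothing is erased,
# only the file's "last modified" time is updated
touch sample_a.txt
ls
```

Both ls calls should list the same two files.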

Creating continued

We also will need to create directories to help organize our files

# make a **directory**
mkdir temporary_dir

since it’s a directory, we can cd into it

cd temporary_dir
pwd
/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash/temporary_dir
# step back up to where we started
cd ..
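
mkdir can also build several directories in one go. A minimal sketch, assuming a throwaway mktemp -d directory and a made-up project name (example_project):

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"

# -p creates any missing parent directories along the way,
# so example_project does not need to exist first
mkdir -p example_project/data example_project/notes example_project/code
ls example_project
```

Without -p, mkdir example_project/data would fail here, because example_project would not exist yet.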

Moving and Copying

If we want to organize, we need to be able to move things

# move a **file** to a different **directory**
# mv [file you want to move] [destination you want to target]
mv temp.txt temporary_dir
ls temporary_dir
temp.txt

The file is no longer present in our pwd:

01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temporary_dir
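
mv does double duty: what happens depends on whether the destination already exists as a directory. A small sketch with made-up names (draft.txt, final.txt, archive) in a throwaway mktemp -d directory:

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"
touch draft.txt
mkdir archive

# the destination is a new name, not an existing directory: this renames the file
mv draft.txt final.txt
ls

# the destination is an existing directory: this moves the file into it
mv final.txt archive
ls archive
```

After both steps the file lives at archive/final.txt and nothing called draft.txt is left behind.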

Moving and Copying continued

Sometimes moving isn’t quite what we want, in that case we can copy a file and keep the original

# copy a **file** to a different **directory**
# cp [file you want to copy] [destination you want to target]
cp temporary_dir/temp.txt .
ls temporary_dir
temp.txt

The file now exists in both directories

/Users/vikas/Documents/UCSC/teaching/ucsc_scbc_2022/slides/01_command_line_and_bash
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temp.txt
temporary_dir
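
Copying a whole directory takes one extra flag. This sketch assumes a throwaway mktemp -d directory and made-up names (original_dir, copy_dir):

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"
mkdir original_dir
touch original_dir/temp.txt

# plain cp refuses to copy a directory
# (2>/dev/null hides the error message, || runs echo only if cp failed)
cp original_dir copy_dir 2>/dev/null || echo "cp needs -R to copy a directory"

# -R copies the directory and everything inside it
cp -R original_dir copy_dir
ls copy_dir
```

The original directory is left untouched, so temp.txt now exists in both places.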

Destroying cannot be undone!!

Well, we called these temporary, so let’s get rid of them carefully

# destroy a **file** NOT REVERSIBLE
rm temp.txt
ls
01a_file_explorer_to_cmdline.qmd
01a_file_explorer_to_cmdline.rmarkdown
01a_file_explorer_to_cmdline_files
01b_commandline_to_bash.html
01b_commandline_to_bash.qmd
01b_commandline_to_bash_files
01c_scripting_for_automation.html
01c_scripting_for_automation.qmd
01c_scripting_for_automation_files
01d_intermediate_bash_scripting.html
01d_intermediate_bash_scripting.qmd
01d_intermediate_bash_scripting_files
BSCC_2023_folder
temporary_dir

this doesn’t destroy the copy in temporary_dir

# list temporary_dir: the copy is still there
ls temporary_dir
temp.txt

Destroying cannot be undone continued

We can also remove directories with rm -rf (we’ll come back to this later)

# destroy a **directory** NOT REVERSIBLE
rm -rf temporary_dir

now try to cd into our deleted directory

cd temporary_dir
bash: line 0: cd: temporary_dir: No such file or directory

this also destroys the files within
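
In rm -rf, -r means "recursive" (the directory and everything inside it) and -f means "force" (don't ask, don't complain). If that feels like too much power, rmdir is a more cautious tool: it only removes directories that are already empty. A minimal sketch, with made-up directory names in a throwaway mktemp -d directory:

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"
mkdir empty_dir full_dir
touch full_dir/keep_me.txt

# empty_dir has nothing in it, so rmdir removes it
rmdir empty_dir

# full_dir has a file in it, so rmdir refuses
# (2>/dev/null hides the error message, || runs echo only if rmdir failed)
rmdir full_dir 2>/dev/null || echo "full_dir still has files in it, so rmdir leaves it alone"
ls
```

Only full_dir should be left, with keep_me.txt still safely inside.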

Prediction:

For the same task, which is faster: file explorer or command line?

Practice 03: Doing the same stuff differently

Create a “Project Directory”

Make the directory BSCC_2023_dir with the same contents as BSCC_2023_folder.

Delete the BSCC_2023_folder directory.

There are multiple ways to do this, see if you can find the fastest approach.

Reflection:

Is it starting to seem like the command line might do some things faster/cleaner/better than the file explorer?

Working with files

We’ve already experienced a minor road block in the GUI – there’s no text editor installed.

The command line can help us here, just print the whole file out in terminal:

# 'concatenate' or **print** whole **files**
cat BSCC_2023_dir/README.txt
Vikas, Roman, and Daniel love to code!

Printing continued

cat works great when we have existing files with content, but what about printing in general?

# write/print arguments 
echo 'Hello World'
Hello World

Printing to files: Standard Output

echo is not limited to sending stuff to terminal, we can also send this output to a file

# `>` tells cmd line to send the output to a specific file, rather than printing to terminal
echo 'Banana slugs have no known predators' > BSCC_2023_dir/README.txt
cat BSCC_2023_dir/README.txt
Banana slugs have no known predators

Something new here: > is letting us write our echo output to our BSCC_2023_dir/README.txt file.

But in this case, we over-wrote our original statement!

Printing to files: Thinking about writing

How can we re-generate our original BSCC_2023_dir/README.txt ?

echo 'Vikas, Roman, and Daniel love to code!' > BSCC_2023_dir/README.txt
cat BSCC_2023_dir/README.txt
Vikas, Roman, and Daniel love to code!

Printing to files: Appending instead of over-writing

Much like . and .. , > and >> have distinct functions.

## `>>` lets us append output to an existing file, line by line
echo 'Banana slugs have no known predators' >> BSCC_2023_dir/README.txt
cat BSCC_2023_dir/README.txt
Vikas, Roman, and Daniel love to code!
Banana slugs have no known predators

now we have both lines!
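
Here are > and >> side by side in one place. This is a minimal sketch that writes to a made-up file (log.txt) inside a throwaway mktemp -d directory, so your README.txt is not involved.

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"

echo 'first' > log.txt      # > creates the file
echo 'second' > log.txt     # > again: 'first' is over-written
cat log.txt

echo 'third' >> log.txt     # >> adds a new line at the end instead
cat log.txt
```

The first cat prints only second; the last one prints second and third. The line first is gone for good.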

Prediction:

echo would be a pretty laborious way to write something like an essay, but what real uses might it have?

Looking inside files with more precision

cat is great when we want to see everything, but sometimes files are huge and that would be….difficult.

We can also search for key terms in files:

# `grep` will return lines that contain the input argument
# grep [text to search for] [file to search in]
grep "Roman" BSCC_2023_dir/README.txt
Vikas, Roman, and Daniel love to code!

the text doesn’t have to be in the beginning:

grep "predators" BSCC_2023_dir/README.txt
Banana slugs have no known predators
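
grep also takes flags that change how it searches. Two that are part of standard grep are sketched here on a stand-in copy of the README, rebuilt inside a throwaway mktemp -d directory (an assumption, so the real file is left alone).

```bash
# rebuild a stand-in README.txt in a throwaway directory
scratch=$(mktemp -d)
cd "$scratch"
echo 'Vikas, Roman, and Daniel love to code!' > README.txt
echo 'Banana slugs have no known predators' >> README.txt

# -i ignores upper/lower case, so "banana" still finds "Banana"
grep -i "banana" README.txt

# -n also reports which line number the match is on
grep -n "predators" README.txt
```

The first search prints the banana slug line, and the second prints the same line with 2: in front of it.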

Reflection:

Does this beat opening up Microsoft Word, or are you still skeptical?

Summary 03:

echo : prints whatever text you want – man echo

cat : prints the entirety of a file (more uses soon) - man cat

grep : finds your input text inside of a given file - man grep

rm : irreversibly deletes files, the -rf flag enables recursive deletion of directories - man rm

mv : moves a file from directory A to directory B - man mv

cp : copies a file from directory A to directory B - man cp

A tool to copy data: cp

cp : copy

Before you run the following code, cd to your BSCC_2023_dir/data directory

# cp <this file> <to here>
cp /media/fileshare/talking.txt . 

Practice 04 (pt.1)

Now you’ve got a file: talking.txt (make sure you know where it is).

It contains two types of line : A line for speakers and a line for statements .

  1. Try out two new tools: head -2 and tail -2, what do they do?
  2. How many speakers are there?

Practice 04 (pt.2)

  1. What indicates that a line is a speaker vs. a statement?
  2. Print out only the speaker’s names.

finally, use echo to add your own speaker ID line and statement of choice (hint: \n can be used to represent a new line; in bash, echo only interprets \n when given the -e flag).

Reflection:

What other steps in the last practice could we theoretically use command line for instead of doing them manually?

Final section project:

Part 1:

Within your BSCC_2023_dir/data directory, create a new file: book_of_poems.txt .

Using echo and the \n trick, add as many lines of speakers and statements as you’d like to, following the format of Practice 04.

Final section project (pt. 2):

Part 2:

Now, utilizing grep --help, figure out a way to make two new files:

  1. poetry.txt : that contains only the statements from book_of_poems.txt (maybe they can form a coherent poem together?).
  2. authors.txt : that contains only the names of the authors (plus any annotation that already existed).

Summary 04:

cp : copy files - man cp

head : print only the first 10 lines of a file (default) - man head

tail : print only the last 10 lines of a file (default) - man tail
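
The default line count is easy to check for yourself with a numbered file. This sketch assumes the seq tool is installed (it is on most Linux and macOS systems) and uses a made-up file name inside a throwaway mktemp -d directory.

```bash
# work in a throwaway directory so nothing real is touched
scratch=$(mktemp -d)
cd "$scratch"

# seq prints the numbers 1 to 25, one per line; > saves them to a file
seq 1 25 > numbers.txt

head numbers.txt        # no number given: the first 10 lines (1 to 10)
tail numbers.txt        # no number given: the last 10 lines (16 to 25)
head -n 3 numbers.txt   # -n 3 asks for exactly 3 lines (1 to 3)
```

Because each line is just its own line number, it is easy to see exactly which lines each command kept.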

Final Reflection

What does automation mean, to you, with work that you might try to accomplish with data?