Connecting to the Linux server
Most materials modelling research of the type you will be learning about in this course is done through the Linux Operating System. For example, all the major High Performance Computing (HPC) resources you may eventually use, such as ARCHER2 which is a national HPC facility and CX1 which is the Imperial HPC system, run Linux.
In this course, you will also be using a Linux server to run your
calculations. You can remotely open a desktop session on the server we will be
using with x2go
which can be run from the
Imperial Software Hub if you’re using
a lab workstation. If you’d prefer to use your own laptop to connect, you’ll
need to download and install the x2go client that’s freely available via
their website.
If there is not already a session named “mt-studenty” or “mt-studentx” visible in the right-hand list on x2go when you start it up, you will need to create it.
- The shortcut to create a new session is “Ctrl+n”.
- In this menu, you should name the session “mt-studenty” or “mt-studentx”.
- Set the Host as “mt-studenty.mt.ic.ac.uk” or “mt-studentx.mt.ic.ac.uk”.
- Select “XFCE” as the session type in the bottom drop down menu.
- An image of this is given below:
- Once you have set this, click OK.
- Then select the session from the right-hand menu (e.g. click on the picture).
- Enter your Imperial username and password to login.
- The username is case sensitive and should be in lowercase letters.
- Note: while I try to ensure students registered for the course have access before the first class, I may not have the most up-to-date information. If it doesn’t accept your login details please let me know and I can give you access.
- If it is your first time connecting it may ask you whether you want to accept the host key (or something to that effect), to which you should click “yes”.
- If using a college workstation you may also get some pop up messages about pulseaudio which you’re free to ignore.
- It will take a minute or so for the session to fully start up so by patient.
- When logging in for the first time, once the desktop loads it will ask if you want to accept the default menu configuration. You should accept the default here. If you do not, you’ll have no desktop menus. You can still access everything by right clicking on the desktop. The desktop menus are all customizable and you can always change them later however you’d like, but the default is the best place to start. If you make any customizations, they’ll be remembered when you next log in.
- I’d suggest also turning the sound volume on your pc down as command line work can produce a lot of beeps which may become annoying.
Once you’ve loaded into the desktop session, feel free to take a look around the menus and settings that are there. This is the XFCE desktop manager, and while it is basic, it is quite light on resources (important as you are all in fact using a single powerful workstation simultaneously), and very customizable.
Using the command line
For the majority of first principles codes calculations are launched from the command line, so it’s important to have some familiarity with it. Often, when running calculations on a HPC system, the command line will be the only way to access the system. While you’ll be able to use a file manager on the desktop to access files, I encourage you to use the command line instead even if it takes you a little longer initially.
The terminal application brings you to the command line and allows you to type in various commands and instructions directly. Open the terminal application from the menu. You can customize the appearance to your preference by changing the text size, font and colour scheme.
You can launch a program from the terminal by typing its name. For example
typing gedit
will launch the gedit text editor by default. Note if you
do this now, you won’t be able to use the same terminal until gedit is closed.
A way around this is discussed later in the lab.
Files and Directories
Once you have opened the terminal, the first thing you need to be able to do is navigate the file structure.
ls
ls
is used to list the contents of a directory. Please note the first
character of this command is the lowercase L
rather than the digit 1
. Try
typing it now. By default you should start off in your home directory when you
open a terminal, so you should see several automatically generated
directories. Note ls
has been set to use colour highlighting for you by
default, so different types of files are shown in different colours.
You can list the contents of any directory by name. For example:
ls /opt/share/
will list the contents of that directory. Here we have provided the full (absolute) path, since we started with/
.- The filesystem hierarchy always has
/
as the very top level directory for a linux system. - If we specify a path to a directory, beginning with
/
it means we are giving its absolute path from the top level directory down to its location. - Your home directory is
/home/username/
whereusername
is your college id.~
is a shortened way to write this:~
also refers to your home directory.
- The filesystem hierarchy always has
ls Documents
will list the contents of theDocuments
directory below the current directory if it exists. This is the same as typingls ./Documents
as will be discussed later. Note this directory is empty to begin with so you won’t see any output from this command.- Here we specify a path without a leading
/
. It means we are referring to a file or directory relative to your current location.
- Here we specify a path without a leading
Useful Options
ls -lh
. The-l
option requests a long listing format, giving extra details about each file and folder, such as permissions, owner, size, and modification date. The-h
requests sizes in a human-readable format, otherwise sizes are listed in bytes i.e. with this optionls
would list “4.0K” instead of “4096”.ls -a
. This tellsls
to list all files, i.e. to not to ignore files starting with “.” which are typically hidden byls
and often used to store various user configuration files. Try typing this now, and you’ll see many more files and folders than previously.
pwd
pwd
will output the name of the present working directory. Try using it now.
It should respond with something like /home/username
.
cd
cd
is used to change directory. You can specify either the full path, or a
directory relative to your current directory.
cd Documents
to move to the/home/username/Documents
directory assuming we are in/home/username
.cd /home/username/Desktop
to move to the usersDesktop
directory regardless of where we start from.cd ..
to move up one directory.cd .
to move to the present directory (i.e. do nothing).cd
andcd ~
will both move to your home directory.
Task
Now that you know these few commands, let’s take a look at some important locations on the system for this course:
- The various files associated with the labs for this course are all available
in
/opt/Courses/MSE404
. - The documentation files for the version of Quantum Espresso we will be using
are all collected in
/opt/share/quantum-espresso/doc-6.3
. - Try changing to each of these directories now and taking a look at files that are contained in each of them.
mkdir
mkdir
is used to create new directories. For example mkdir carbon
will
create a new directory called carbon
.
Useful Options
mkdir -p
can be used to make parent directories as needed. For example,mkdir -p carbon/lda/dos
ifcarbon
already exists will createlda
as a subdirectory of it, anddos
as a subdirectory oflda
.
rm
rm
is used to delete files. For example, if we create a text file with the
text editor gedit
called tmp
, then we can delete it with rm tmp
in the
directory that contains it.
Useful Options
rm -r
will recursively delete a directory and any files contained in it.rm -f
can be used to force deletion without prompting. Combine with the previous option with caution. There is no recycle bin when usingrm
. When you use it to delete a file it is gone, unless you have a backup. (On mt-studenty and mt-studentx there are daily snapshots you can recover deleted files from in ~/.snapshots)
rmdir
rmdir
is used to delete directories. Note this will give an error if the
directory you try to delete is not empty.
cp
cp
is used to copy files. For example, cp sourcefile newfile
would copy
the file sourcefile
to newfile
.
Useful Options
cp -r
will recursively copy a directory and any files contained in it.
mv
mv
is used to move or rename files. For example,
mv oldfilename newfilename
would rename a file from “oldfilename” to
“newfilename”, while mv filename ../
would move “filename” to the parent
directory.
ln
ln
is used to create links. These are somewhat similar to shortcuts in
windows except that they operate at the filesystem level. A link to a file can
be used in the same way as the file itself. They are useful for many reasons:
- Convenience: you may want to frequently use a particular directory, but it has quite a long path. You can create a link to it in your home directory and use the link in exactly the same manner as the original path.
- Saving space: For example, when working with a code that expects a certain file as input in the current directory. If the file is very large you might want to avoid making copies of it, so you can instead make a link to the file.
- If a parameter file is used in several places, you can use links so that when you update one file, they are all updated, instead of having to update several copies manually.
Note, while the other commands we’ve looked at so far can all be done through the file manager rather than the command line, link creation is less well supported or at least less obvious in most file manager software.
ln
can create two types of links: hard links are created by default, and
symbolic links are created when the -s
option is specified. Symbolic links
contain text which points to a particular file, while hard links point to the
same data on the disk as the original file. It’s generally simplest to stick
to symbolic links, especially starting out. An example of how links are
used is as follows:
mkdir tmpdir1; mkdir tmpdir2; touch tmpdir1/tmpfile1
This will create two directories and a file in one of them. Note semi-colons can be used to separate commands on the same line. And we have also used thetouch
command which offers a quick way to create an empty file.cd tmpdir2
Move to the tmpdir2 directory.ln -s ../tmpdir1/tmpfile1 tmpfile2
This will create a symbolic link to the file.ln -s ../tmpdir1 .
will create a link to thetmpdir1
directory intmpdir2
.- Typing
ls
in thetmpdir2
directory will show the filestmpfile2
andtmpdir1
. If your colour highlighting is working, you should see these as different colours to usual files (probably light blue). To see where the links point you can usels -l
.
cp -s
can also be used to create symbolic links (with cp -l
creating hard
links), and can be a little easier to use, particularly for linking a set of
files.
cp -s ../results/*dat .
would create symbolic links from all the.dat
files in the results directory to the current directory.
If you create a link to a directory, one thing that may not be immediately
obvious is how to delete it. Say your link is called tmpdir
. If you
try rm tmpdir/
you’ll get the error
rm: cannot remove 'tmpdir': Is a directory
. Then if you try
rmdir tmpdir/
you’ll get the error
rmdir: failed to remove 'tmpdir/': Not a directory
. What you need to do
is use rm
but ensure you don’t have a trailing slash following the link
name: rm tmpdir
. NB if you use rm -r tmpdir/
, this will delete the
contents of the directory you have linked to which is likely not what you
want to do, so be careful.
Useful Options
ln -sf
: the additional-f
option forces the link to overwrite any pre-existing files or links.- For example, if we had incorrectly done
ln -s ../tmpdir1/tmpdir1 .
in the above example, the link would have been created, but wouldn’t point to any valid file.ls
will usually show these invalid links as red. We could then doln -sf ../tmpdir1 .
to overwrite thetmpdir1
link with the correct path, rather than deleting incorrect link and repeating theln
command.
- For example, if we had incorrectly done
Note If you create a link correctly and have colour highlighting in the
terminal application you are using, symbolic links that work will be blue. If
they are red it means they don’t point to a valid file or directory. You can
use ls -l
to see where a symbolic link points. If you’ve made a mistake
you’ll need to either delete the incorrect link, or use the -f
flag as
described above to overwrite it.
Task
Now that you know how to copy files and make directories, let’s set up some folders for the labs in the course. While it’s of course possible to do this through the file manager, please use the commands we’ve gone through so far to do the following:
- In your home directory make the directory
MSE404
. - Copy the directory with the input files for this course
/opt/Courses/MSE404/lab01
to the directory you just created~/MSE404
. Remember you need to pass an additional flag tocp
to copy a directory. - List the contents of directory
~/MSE404/lab01/methane
withls -l
- You’ll see there are three files in the directory you copied.
CH4.in
- an input file for the DFT code you’ll be using.C.pz-vbc.UPF
- a pseudopotential file for carbon, which tells the DFT code what approximation to use in your calculation.H.pz-vbc.UPF
- a pseudopotential file for hydrogen, which tells the DFT code what approximation to use in your calculation.
- You’ll learn more about these types of files in later classes.
- If you’ve followed the steps as given here, and have colour highlighting
in the terminal application you’re using, you’ll see the pseudopotential files are
highlighted in red when you did
ls -l
. This is because they are links and currently don’t point to valid files. It’s looking for files at../../pseudo/C.pz-vbc.UPF
and../../pseudo/H.pz-vbc.UPF
where../..
means go up two directories from the current directory. In other words this is looking for the files at~/MSE404/pseudo/C.pz-vbc.UPF
and~/MSE404/pseudo/H.pz-vbc.UPF
. You’ll correct this in the next step.
- You’ll see there are three files in the directory you copied.
- Go to the
~/MSE404
directory and make a copy there of the directory holding various pseudopotentials used in the course:/opt/Courses/MSE404/pseudo
. You should now see that the file in~/MSE404/lab01/methane
is blue when you dols
in this directory indicating it can now find a file in the location specified by the link. - Within your
MSE404
directory also create a symbolic link to the Quantum Espresso documentation folder/opt/share/quantum-espresso/doc-6.3
calledqe-doc
.
Tab completion
A useful feature which has long been available in the Linux terminal is tab completion. This is also referred to as command-line completion. These days this is available in both Mac and Windows systems also.
Try typing ged
in a terminal and hitting tab
. The full name of the program
gedit
should automatically fill in. If more than one match is possible,
pressing tab
twice will bring up a list of possible matches. Try typing
ge
and pressing tab twice to see this.
This can also be used to complete file and directory names. From your home
directory, try typing cd De
and pressing tab. It should automatically fill
in the full command to be cd Desktop/
.
Reading a file
There are several built-in programs that will allow you to examine a text file in the terminal.
cat
cat
is the simplest command to check the output of a file.
cat filename
will output the contents offilename
to the terminal. For a large file you may need to scroll back up in the terminal (e.g.shift+PdgUp
).
more
more
works in a somewhat similar way to cat
, except for files bigger than
the terminal window, it allows you to look through a page (screen) full of
data at a time.
more filename
to start paging through a file.- To move to the next page press
space
orz
. - To move down one line of text press
enter
. - To quit before reaching the end of the file press
q
.
less
less
is more advanced version of more
with many more features. This doesn’t
output file contents to the terminal however, so once you exit less
your
terminal will be as it was before the file was opened.
less filename
to open a file.- move down one line:
down-arrow
,enter
,j
, ore
. - move up one line:
up-arrow
,y
, ork
. - move down one page:
page-down
,space
, orf
. - move up one page:
page-up
,w
, orb
. - search for a string of text:
/
, type in the search text and press enter. - A useful feature when trying to check a file that is currently being written
to is to press
F
. This is similar totail -F
. It will scroll forward and keep trying to read when the end of the file is reached. To exit this mode pressctrl+c
. - quit:
q
.
Task
Let’s try taking a look at one of the quantum-espresso documentation files at
~/MSE404/qe-doc/INPUT_PW.txt
(since you linked the folder in the previous
set of tasks).
- Try looking at this file with
cat
. You’ll see the entire file is dumped straight to your terminal. If you want to scroll up and down from the keyboard you can do this withshift+PdgUp
andshift+PdgDwn
. - Now try looking at the same file with
less
. Now you can scroll up and down with the arrow keys or with thek
andj
keys. - Within less, search for the text
disk_io
. - Take a look at the file
~/MSE404/lab01/methane/CH4.in
inless
. This is a basic Quantum Espresso input file. You’ll be learning more about this in the next lab.
Other Useful Command Line Features
While at first, the command line can seem quite basic. There are many built-in features that allow you to work extremely efficiently after a little practice.
The Manual and man command
Linux systems provide an interface to a built-in reference manual through the
man
command. This is very useful as a quick way to check the syntax used by
a command, and what options it might take.
- Type
man ls
to check the manual page for the ls command. - Type
man man
to check the manual page for the man command. - Type
man intro
to get an introduction to user commands.
When reading the manual data, by default it should use the less
pager, so
you can use all the commands for less
listed above to navigate and search
through the manual.
Wildcards
In Linux many commands will accept wild cards as a way to perform their action on a set of files.
*
is used to represent zero or more characters. The*
wildcard is very commonly used and worth remembering.ls *.pdf
would list all the files ending in.pdf
in the current directory.rm *.o
would remove all files ending in.o
in the current directory.
?
is used to represent any single character.cp ../example?.cpp .
would copy all files with names such asexample1.cpp
,example2.cpp
,examplea.cpp
from the parent to the current directory.
[]
specifies a range.ls example_0[2,3,5]
would list files with the nameexample_02
,example_03
andexample_05
.ls example_0[3-5]
would list files with the nameexample_03
,example_04
andexample_05
.
For more information on this see man 7 glob
.
IO Redirection
IO redirection allows you to easily send the output of a command to a file, use a file as input for a command, or use the output of one command as input to another command. The most frequently used features are as follows:
>
is used to redirect the output from a command into a file.ls > dirlist.txt
will save the result of thels
command in a file calleddirlist.txt
. Take care: ifdirlist.txt
already exists it will overwrite the previous contents.
&>
is used to redirect both the standard output and standard error from a command to the same file.>>
is used to redirect the output from a command to a file, but will append the output to the end of the file if the file already exists.- If you run
ls >> dirlist.txt
you will see the ls output repeated several times in the filedirlist.txt
.
- If you run
<
is used to redirect a file to the input of a command.less < dirlist.txt
will work asless
is happy to read either from stdin or a file.
|
is used to redirect the output of a command to the input of another command.ls | less
will take the output of thels
command and display it directly inless
(this is why it’s useful forless
to be able to read from stdin).
tee
allows you to split a stream so that the output goes both to stdout and to a file.ls -l | tee dirfiles
will output the result ofls -l
to the terminal and also save it in the filedirfiles
.
History
By default the terminal keeps a history of previously entered commands. You can
scroll back through previously entered commands using the up-arrow
and
down-arrow
keys.
The history
command will output a list of all the previously entered commands
following a history index number. By default the list is usually limited to
1000 commands, with the oldest being removed as new commands are entered. You
can look at the full list with history | less
.
!!
will repeat the most recent command.!n
will repeat the command indexed byn
in the history list. For example!100
would repeat the command at history index 100.!string
will repeat the most recent command in the history starting with “string”.
For more details see man history
.
The linux shell also gives a built in way to search through previously entered
commands. You can press ctrl+r
to bring up the reverse-history-search and
type some text from the command you want to re-enter. You can keep hitting
ctrl+r
to search back through commands containing that text. If you
accidentally go past the entry you want, ctrl+s
will go forward through
your history.
An additional useful shortcut is ‘alt+.’ which will enter the most recent argument you used, and repeated presses will cycle back through arguments.
Job Control
Linux allows jobs to run both in the foreground and background of a terminal
session. If we launch a program from the terminal by typing e.g. gedit
and
pressing enter (gedit
is a text editor), then we can’t can’t do anything else
in the terminal until gedit
is closed. gedit
is said to be running in the
foreground in this case.
With gedit
running in the foreground, we can move it to the background as
follows.
- In the terminal where it is running, press
ctrl+z
. This stops the job. A message will appear in the terminal like[1]+ Stopped gedit
. If you try to type, or click any menu buttons, ongedit
it won’t respond. - Type
bg
in the terminal. This moves the most recent stopped job to the background, i.e.gedit
in this example. Now you’ll be able to use bothgedit
and the terminal as normal.
- To move a job that’s running in the background, back to the foreground you
can use the
fg
command. - To start a job in the background you can append
&
to the command when you enter it in the terminal. For examplegedit &
will launchgedit
in the background directly.
Environment Variables
There are a couple of environment variables that are important for running commands via the command line.
$PATH
- this is a colon-separated list of directories that are searched for executables to run when you type in a command without specifying any path.- If you type
echo $PATH
you can see what this is. - For example, when you type
gedit
, it searches each directory in turn for an executable namedgedit
, and runs the first one it finds. - You can use the
which
command to see where it would find an executable. - For example
which gedit
returns/usr/bin/gedit
on our system. - Note you can run a particular executable by specifying it’s path and not
relying on
$PATH
. So if$PATH
was unset, we could type/usr/bin/gedit
, or we could docd /usr/bin
,./gedit
.
- If you type
$LD_LIBRARY_PATH
- this is a colon-separated list of additional directories to search for libraries used by applications.- Libraries are pieces of code that are common to many applications, so are shared and installed separately. And when the application is run, it tries to find this library.
- Generally, most default OS software will work fine without this set, but if you’re using some custom mathematical libraries or compilers this may be needed.
Environment Modules: Lmod
Environment modules simplify the whole process of changing environment variables to get some custom software package working. When a module is loaded it sets whatever environment variables are required automatically. This lets many different and possibly conflicting software versions be installed on the same system. Environment modules are almost always used on HPC systems to make various versions of common version of computational software packages available. The version installed on mt-studenty/mt-studentx is called Lmod.
- The main command is
module
. - For help type
module help
(the same text is output if you justmodule
). - To see what modules are currently available type
module avail
. (some groups of modules which require a compiler module to be loaded will not be visible). - To load the module making software compiled using the gcc compilers
available type
module load gcc
. This makes some additional modules available. - To see a complete list of all modules type
module spider
. - The particular module implementation we use allows us to easily switch between versions of codes compiled with different compilers.
- For example to load the Quantum Espresso module, you’ll first need to have the “gcc” and “mkl” modules loaded. You can load everything in a single command though provided the order is correct:
module load gcc mkl espresso
- Throughout the course, when running on the undergrad server you’ll need to remember to load the modules above before you’ll be able to use the codes in the Quantum Espresso package.
- If you want to have the modules necessary loaded automatically whenever you
log in to the server, you can add the above module load command to the
end of the file
~/.bashrc
. This file is run whenever your start a shell and can be used to customize your environment.
Task
Now you know enough to run your first DFT calculation.
- First we need to load the Quantum Espresso module. An optimized compilation
of this is available on the system compiled with the GCC compilers and
Intel Math Kernel Libraries. Type the following in your terminal to load
everything you need in one go
module load gcc mkl espresso
to load everything you need. - Now you’ll be able to use the various Quantum Espresso package executables
directly. Go to the methane folder you copied earlier
(
~/MSE404/lab01/methane
) and look at what files are in that directory. - The input file there is for the
pw.x
code. You can pass this file topw.x
with input redirection and save the output to a file with output redirection (otherwise the output will just be in the terminal). To do this typepw.x < CH4.in > CH4.out
. - See what files are present in the directory now. As well as the output file we saved, some others were generated.
- Take a look through the output file using
less
. We’ll discuss the contents of this in next week’s lab, but for now skip to the section immediately following the line that readsEnd of self-consistent calculation
:- This lists the calculated eigenvalues.
- We have four doubly-occupied bands in our calculation, so you’ll see four numbers.
- Following this, you can see in the output, the value of the highest occupied energy level in eV, and the total energy in Rydberg.
- Congratulate yourself on running your first DFT calculation.
There are a lot of commands to remember from this week’s lab. You can always
refer back to this week, but to make things easier in future weeks you might
find a cheatsheet useful. If you don’t want to make your own, a quick Google
search of Linux cheat sheet
brings up a number of options.
If you’re interested in seeing some more command line tools, some of these are listed in linuxcommands. If you have time, I encourage you to at least skim through them, as it’s very useful to know what kinds of things are possible. We’ll refer back to some of these commands throughout the course.
Once you’re finished with everything, be sure to log out of the remote desktop session. You can see this option if you click on your name on the top right.
Summary
To run the DFT example in this lab you will need to do the following:
- Make a directory that will hold the material for the MSE404 labs:
mkdir ~/MSE404
. - Copy the directory with the inputs for this lab to this directory:
cp -r /opt/Courses/MSE404/lab01 ~/MSE404
. - Copy the directory with the various pseudopotential files to this directory
also:
cp -r /opt/Courses/MSE404/pseudo ~/MSE404
. - Make a link to the Quantum Espresso documentation directory in your home directory:
cd ~/MSE404; ln -s /opt/share/quantum-espresso/doc-6.3 ./qe-doc
. - Load the modules needed to run Quantum Espresso:
module load gcc mkl espresso
. - Go to the directory with the input files you copied and use these with
pw.x
saving the output in a file:cd ~/MSE404/lab01/methane; pw.x < CH4.in > CH4.out
. - Take a look through the output file:
less CH4.out
. - Logout of the remote server once you’re done.