Useful Linux commands
An introduction to the Linux command line is beyond the scope of this website, and many online resources provide one. Here is a loose collection of commands we have found useful for preparing and analysing calculations and for managing your data.
grep
This command searches one or more text files for lines containing a given string (more precisely, a regular expression). In its most basic form, the syntax is
grep '<pattern>' <file>
The first argument is taken as the search string, all following ones as file names. The quotes are only necessary if <pattern> contains spaces or special characters, but they never hurt.
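As a minimal sketch of a few frequently used grep options, the following creates two small files in a scratch directory and searches them. The file names and their contents are invented for this example.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two hypothetical output files, invented for this example.
printf 'Job started\nSCF energy: -40.5\nJob finished\n' > a.log
printf 'Job started\nERROR: no convergence\n' > b.log

# Print every line of a.log that contains "energy".
grep 'energy' a.log

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'energy' a.log)
echo "$numbered"

# -i ignores case. With several files, each match is prefixed by the file name.
grep -i 'error' a.log b.log

# -l prints only the names of the files that contain a match.
matching_files=$(grep -l 'ERROR' a.log b.log)
echo "$matching_files"
```

The options can be combined, for example grep -in to get case-insensitive matches with line numbers.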
find
Search for files of a given name (or other attributes) in a directory and all its subdirectories. Its most common use is
find <directory> -name '<filename>'
<directory> can be . (a shortcut for the current directory), and <filename> may contain wildcards (see below); the quotes stop the shell from expanding them before find sees them. Search criteria other than the filename are available (see man find).
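The following sketch shows find with a name pattern and with the additional -type criterion. The directory layout is made up for the example and is created in a scratch directory.

```shell
# Build a small, hypothetical directory tree in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p run1/sub run2
touch run1/job.log run1/sub/job.log run2/job.out

# All entries called job.log below the current directory.
find . -name 'job.log'

# Only regular files (-type f) whose name ends in .log.
# find prints in no particular order, so the result is sorted here.
found=$(find . -type f -name '*.log' | sort)
echo "$found"

# Only directories (-type d) whose name starts with "run".
find . -type d -name 'run*'
```

Several criteria given one after another must all match, so the second command finds nothing that is a directory and nothing that does not end in .log.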
du
The du command ("disk usage") determines the size of a directory and all its subdirectories. With the -s flag ("summarize"), it prints only the total size. The default output unit is blocks (on most computers these are kilobytes). The -h flag ("human readable") picks a suitable unit for each value, such as kilo-, mega-, giga- or terabytes. A simple example would be
du -sh .
If no directory is given as an argument, the current directory is used. Note that this command searches subdirectories recursively, which can take a while for large directories.
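A common follow-up question is which subdirectory takes up the most space. A minimal sketch, using two made-up directories called big and small in a scratch directory:

```shell
# Create two hypothetical directories of very different size.
workdir=$(mktemp -d)
cd "$workdir"
mkdir big small
dd if=/dev/urandom of=big/data bs=1024 count=1024 2>/dev/null   # about 1 MB
echo 'tiny' > small/data

# -s prints one total per argument, -k forces kilobytes as the unit.
du -sk big small

# Sort numerically by size (largest last) and keep the name of the last entry.
largest=$(du -sk big small | sort -n | tail -n 1 | cut -f 2)
echo "$largest"
```

Plain numbers (-k) are used here because they sort reliably with sort -n. On systems with GNU coreutils, du -sh together with sort -h should give the same ordering with human-readable units.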
Wildcards
Most Linux commands that operate on filenames accept "wildcards" as part of the filename argument. There are complex options for filtering your filenames with wildcards, but the most commonly used ones are * (any number of characters, including none) and ? (exactly one character). Examples would be:
ls -l *.log
List all files that end with .log.
ls -l ?.log
All files with exactly one character before .log. This would list files called a.log and 5.log, but not 5a.log.
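The two wildcards can be tried out safely with echo, which simply prints the file names the shell substitutes for the pattern. This sketch uses the file names from the examples above plus an invented notes.txt, all created empty in a scratch directory.

```shell
# Create empty example files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
touch a.log 5.log 5a.log notes.txt

# * matches any number of characters: all three .log files are printed.
echo *.log

# ? matches exactly one character: 5a.log is left out.
one_char=$(echo ?.log)
echo "$one_char"

# Inside quotes the pattern is not expanded and is passed on literally.
quoted=$(echo '*.log')
echo "$quoted"
```

The last command shows why the pattern given to find is quoted: the shell leaves it alone, and find does the matching itself in every subdirectory.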
Redirection and piping
Most commands will take their standard input from the keyboard and print their output to the screen. This can be redirected to or from files, or directly used as input for other commands. The most commonly used symbols are the arrows (<, > and >>) for file redirection and the vertical bar | to create a "pipe", buffering the output from one command and using it as input for another. In detail:
<command> > <file>
Redirect output from <command> into <file>. If <file> exists, overwrite it.
<command> >> <file>
Redirect output from <command> into <file>. If <file> exists, append to it.
<command> < <file>
Take the contents of <file> as standard input for <command>.
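The three redirections can be sketched with a throwaway file; notes.txt is a name chosen only for this example.

```shell
# Work in a scratch directory so no real files are overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# > creates notes.txt (or overwrites it if it already exists).
echo 'first line' > notes.txt

# >> appends a second line to the same file.
echo 'second line' >> notes.txt

# < feeds the file to tr as standard input; tr converts it to upper case.
upper=$(tr 'a-z' 'A-Z' < notes.txt)
echo "$upper"

# A second > discards the two lines written before.
echo 'replaced' > notes.txt
final=$(cat notes.txt)
echo "$final"
```

Because > overwrites without asking, it is worth double-checking the file name before pressing return.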
A useful command in this context is xargs, which passes its input to another command as arguments, rather than as standard input. An example command, using some of the described methods, would be
find . -name '*.log' | xargs grep methane > methane-calcs.txt
This would
- find all files ending with the suffix .log in the current directory (.) and all its subdirectories
- pass these file names as arguments (the files to be searched) to the grep command, which finds lines containing the string "methane"
- write each of these lines into the file methane-calcs.txt
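One caveat of the pipeline above is that xargs splits its input at spaces, so a file or directory name containing a space is passed on as two separate arguments. A possible alternative, sketched here with invented file names in a scratch directory, is to let find call grep itself through -exec.

```shell
# Hypothetical files, one of them inside a directory whose name contains a space.
workdir=$(mktemp -d)
cd "$workdir"
mkdir 'run 1'
echo 'methane optimisation' > 'run 1/job.log'
echo 'methane frequencies' > b.log
echo 'methane notes' > notes.txt      # not a .log file, so it is not searched

# {} + makes find append all file names it found to a single grep call.
# sort is added only to make the order of the output predictable.
find . -name '*.log' -exec grep methane {} + | sort > methane-calcs.txt

result=$(cat methane-calcs.txt)
echo "$result"
```

If xargs is preferred, GNU and BSD versions of the tools offer find -print0 together with xargs -0, which separates the file names with a null character instead of whitespace.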