BASH Frequently Asked Questions

These are answers to frequently asked questions on channel #bash on the freenode IRC network. These answers are contributed by the regular members of the channel (originally heiner, and then others including greycat and r00t), and by users like you. If you find something inaccurate or simply misspelled, please feel free to correct it!

All the information here is presented without any warranty or guarantee of accuracy. Use it at your own risk. When in doubt, please consult the man pages or the GNU info pages as the authoritative references.

BASH is a BourneShell compatible shell, which adds many new features to its ancestor. Most of them are available in the KornShell, too. The answers given in this FAQ may be slanted toward Bash, or they may be slanted toward the lowest common denominator Bourne shell, depending on who wrote the answer. In most cases, an effort is made to provide both a portable (Bourne) and an efficient (Bash, where appropriate) answer. If a question is not strictly shell specific, but rather related to Unix, it may be in the UnixFaq.

If you can't find the answer you're looking for here, try BashPitfalls. If you want to help, you can add new questions with answers here, or try to answer one of the BashOpenQuestions.

Chet Ramey's official Bash FAQ contains many technical questions not covered here.

Contents

  1. How can I read a file line-by-line?
  2. How can I store the return value of a command in a variable?
  3. How can I insert a blank character after each character?
  4. How can I check whether a directory is empty or not?
  5. How can I use array variables?
  6. How can I use associative arrays or variable variables?
  7. Is there a function to return the length of a string?
  8. How can I recursively search all files for a string?
  9. My command line produces no output: tail -f logfile | grep 'foo bar'
  10. How can I recreate a directory structure, without the files?
  11. How can I print the n'th line of a file?
  12. A program (e.g. a file manager) lets me define an external command that an argument will be appended to - but I need that argument somewhere in the middle...
  13. How can I concatenate two variables? How do I append a string to a variable?
  14. How can I redirect the output of multiple commands at once?
  15. How can I run a command on all files with the extension .gz?
  16. How can I use a logical AND in a shell pattern (glob)?
  17. How can I group expressions, e.g. (A AND B) OR C?
  18. How can I use numbers with leading zeros in a loop, e.g. 01, 02?
  19. How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?
  20. How can I find and deal with file names containing newlines, spaces or both?
  21. How can I replace a string with another string in all files?
  22. How can I calculate with floating point numbers instead of just integers?
  23. I want to launch an interactive shell that has a special set of aliases and functions, not the ones in the user's ~/.bashrc.
  24. I set variables in a loop. Why do they suddenly disappear after the loop terminates? Or, why can't I pipe data to read?
  25. How can I access positional parameters after $9?
  26. How can I randomize (shuffle) the order of lines in a file? (Or select a random line from a file, or select a random file from a directory.)
  27. How can two processes communicate using named pipes (fifos)?
  28. How do I determine the location of my script? I want to read some config files from the same place.
  29. How can I display the value of a symbolic link on standard output?
  30. How can I rename all my *.foo files to *.bar, or convert spaces to underscores, or convert upper-case file names to lower case?
  31. What is the difference between the old and new test commands ([ and [[)?
  32. How can I redirect the output of 'time' to a variable or file?
  33. How can I find a process ID for a process given its name?
    1. BEGIN greycat rant
    2. END greycat rant
  34. Can I do a spinner in Bash?
  35. How can I handle command-line arguments to my script easily?
  36. How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction).
  37. How can I print text in various colors?
  38. How do Unix file permissions work?
  39. What are all the dot-files that bash reads?
  40. How do I use dialog to get input from the user?
  41. How do I determine whether a variable contains a substring?
  42. How can I find out if a process is still running?
  43. Why does my crontab job fail? 0 0 * * * some command > /var/log/mylog.`date +%Y%m%d`
  44. How do I create a progress bar?
  45. How can I ensure that only one instance of a script is running at a time (mutual exclusion)?
  46. I want to check to see whether a word is in a list (or an element is a member of a set).
  47. How can I redirect stderr to a pipe?
  48. Eval command and security issues
    1. Examples of bad use of eval
    2. Examples of good use of eval
  49. How can I view periodic updates/appends to a file? (ex: growing log file)
  50. I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments.
  51. I want history-search just like in tcsh. How can I bind it to the up and down keys?
  52. How do I convert a file from DOS format to UNIX format (remove CRs from CR-LF line terminators)?
  53. I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is. Lines wrap around incorrectly.
  54. How can I tell whether a variable contains a valid number?
  55. Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which?
  56. How can I untar or unzip multiple tarballs at once?
  57. How can I group entries in a file by common prefixes?
  58. Can bash handle binary data?
  59. I saw this command somewhere: :(){ :|:& } (fork bomb). How does it work?
  60. I'm trying to write a script that will change directory (or set a variable), but after the script finishes, I'm back where I started (or my variable isn't set)!
  61. Is there a list of which features were added to specific releases (versions) of Bash?
  62. How do I create a temporary file in a secure manner?
  63. My ssh client hangs when I try to run a remote background job!
  64. Why is it so hard to get an answer to the question that I asked in #bash?
  65. Is there a "PAUSE" command in bash like there is in MSDOS batch scripts? To prompt the user to press any key to continue?
  66. I want to check if [[ $var == foo || $var == bar || $var == more ]] without repeating $var n times.
  67. How can I trim leading/trailing white space from one of my variables?
  68. How do I run a command, and have it abort (timeout) after N seconds?
  69. I want to automate an ssh (or scp, or sftp) connection, but I don't know how to send the password....
  70. How do I convert Unix (epoch) timestamps to human-readable values?
  71. How do I convert an ASCII character to its decimal (or hexadecimal) value and back?
  72. How can I ensure my environment is configured for cron, batch, and at jobs?
  73. How can I use parameter expansion? How can I get substrings? How can I get a file without its extension, or get just a file's extension?
  74. How do I get the effects of those nifty Bash Parameter Expansions in older shells?
  75. How do I use 'find'? I can't understand the man page at all!
  76. How do I get the sum of all the numbers in a column?
  77. How do I log history or "secure" bash against history removal?
  78. I want to set a user's password using the Unix passwd command, but how do I script that? It doesn't read standard input!
  79. How can I grep for lines containing foo AND bar, foo OR bar? Or for files containing foo AND bar, possibly on separate lines?
  80. How can I make an alias that takes an argument?
  81. How can I determine whether a command exists anywhere in my PATH?
  82. Why is $(...) preferred over `...` (backticks)?
  83. How do I determine whether a variable is already defined? Or a function?
  84. How do I return a string from a function? "return" only lets me give a number.
  85. How to write several times to a fifo without having to reopen it?
  86. How to ignore aliases or functions when running a command?
  87. How can I get the permissions of a file without parsing ls -l output?

1. How can I read a file line-by-line?

    while read line
    do
        echo "$line"
    done < "$file"          # or   <<< "$var"    to iterate over a variable 

If you want to operate on individual fields within each line, you may supply additional variables to read:

    # Input file has 3 columns separated by white space.
    while read first_name last_name phone; do
      ...
    done < "$file"

If the field delimiters are not whitespace, you can set IFS (input field separator):

    while IFS=: read user pass uid gid gecos home shell; do
      ...
    done < /etc/passwd

Also, please note that you do not necessarily need to know how many fields each line of input contains. If you supply more variables than there are fields, the extra variables will be empty. If you supply fewer, the last variable gets "all the rest" of the fields after the preceding ones are satisfied. For example,

    while read first_name last_name junk; do
      ...
    done <<< 'Bob Smith 123 Main Street Elk Grove Iowa 123-555-6789'
    # Inside the loop, first_name will contain "Bob", and
    # last_name will contain "Smith".  The variable "junk" holds
    # everything else.

The read command modifies each line read; by default it removes leading and trailing whitespace characters (blanks, tab characters -- basically any leading or trailing characters present in IFS). If that is not desired, the IFS variable has to be cleared:

    while IFS= read line
    do
        echo "$line"
    done < "$file"

As a feature, the read command concatenates lines that end with a backslash '\' character to one single line. To disable this feature, KornShell and BASH, as well as the POSIX standard for the Bourne shell, have read -r:

    while IFS= read -r line
    do
        echo "$line"
    done < "$file"

Note that reading a file line by line this way is very slow for large files. Consider using e.g. AWK instead if you get performance problems.

One may also read from a command instead of a regular file:

    some command | while read line; do
       other commands
    done

This method is especially useful for processing the output of find with a block of commands:

    find . -print0 | while IFS= read -r -d $'\0' file; do
        mv "$file" "${file// /_}"
    done

This command reads one filename at a time from the find command and renames the file so that its spaces are replaced by underscores.

Note the usage of -print0 in the find command, which uses NUL bytes as filename delimiters, and -d $'\0' in the read command to instruct it to read all text into the file variable until it finds a NUL byte. By default, find and read delimit their input with newlines; however, since filenames can potentially contain newlines themselves, this default behaviour will split those filenames with newlines up and cause the command block to fail. See FAQ #20 for more details.

Using a pipe to send find's output into a while loop places the loop in a subshell and may therefore cause problems later on if the commands inside the body of the loop attempt to set variables which need to be used outside the loop; in that case, see FAQ 24, or use process substitution like:

    while read line; do
        other commands
    done < <(some command)

Sometimes it's useful to read a file into an array, one array element per line. You can do that with the following example:

    O=$IFS IFS=$'\n' arr=($(< myfile)) IFS=$O

This temporarily changes the Input Field Separator to a newline, so that word splitting of the (unquoted) command substitution treats each line as a single field. These fields populate the array arr, and afterwards IFS is set back to what it was before.

This same trick works on a stream of data as well as a file:

    O=$IFS IFS=$'\n' arr=($(find . -type f)) IFS=$O

Of course, this will blow up in your face if the filenames contain newlines; see FAQ 20 for hints on dealing with such filenames.

On the other hand, if the file lacks a trailing newline (such as /proc/$$/cmdline on Linux), the line will not be printed by a while read ... loop, as read returns a failure that aborts the while loop, thus failing to print the ultimate line:

    # This does not work:
    echo -en 'line 1\ntruncated line 2' | while read line; do echo $line; done

    # This does not work either:
    echo -en 'line 1\ntruncated line 2' | while read line; do echo "$line"; done; echo "$line"

    # This works:
    echo -en 'line 1\ntruncated line 2' | (while read line; do echo "$line"; done; echo "$line")

For a discussion of why the second example above does not work as expected, see FAQ #24.

2. How can I store the return value of a command in a variable?

Well, that depends on exactly what you mean by that question. Some people want to store the command's output (either stdout, or stdout + stderr); and others want to store the command's exit status (0 to 255, with 0 typically meaning "success").

If you want to capture the output:

    var=$(command)      # stdout only; stderr remains uncaptured
    var=$(command 2>&1) # both stdout and stderr will be captured

If you want the exit status:

    command
    var=$?

If you want both:

    var1=$(command)
    var2=$?

The assignment to var1 has no effect on command's exit status, which is still in $?.

If you don't actually want the exit status, but simply want to take an action upon success or failure:

    if command
    then
        echo "it succeeded"
    else
        echo "it failed"
    fi

Or (shorter):

    command && echo "it succeeded" || echo "it failed"

What if you want the exit status of one command in a pipeline of several? Use the PIPESTATUS array (BASH only). Say you want the exit status of grep in the following:

    grep foo somelogfile | head -5
    result=${PIPESTATUS[0]}

Now, some trickier stuff. Let's say you want only the stderr, but not stdout. Well, then first you have to decide where you do want stdout to go:

    var=$(command 2>&1 >/dev/null)  # Save stderr, discard stdout.
    var=$(command 2>&1 >/dev/tty)   # Save stderr, send stdout to the terminal.
    var=$(command 3>&2 2>&1 1>&3-)  # Save stderr, send stdout to stderr

It's possible, although considerably harder, to let stdout "fall through" to wherever it would've gone if there hadn't been any redirection. This involves "saving" the current value of stdout, so that it can be used inside the command substitution:

    exec 3>&1                    # Save the place that stdout (1) points to.
    var=$(command 2>&1 1>&3)     # Run command.  stderr is captured.
    exec 3>&-                    # Close FD #3.

    # Or this alternative:
    { var=$(command 2>&1 1>&3-) ;} 3>&1 # Capture stderr, let stdout through.

In the last example above, note that 1>&3- duplicates FD 3 and stores a copy in FD 1, and then closes FD 3.

What you cannot do is capture stdout in one variable, and stderr in another, using only FD redirections. You must use a temporary file to achieve that one.
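
For example, a minimal sketch using mktemp(1) (the file and variable names here are arbitrary; see FAQ #62 for creating temporary files securely):

    errfile=$(mktemp) || exit 1       # create a temporary file
    var_out=$(command 2>"$errfile")   # capture stdout; stderr goes to the file
    var_err=$(<"$errfile")            # read the captured stderr back
    rm -f "$errfile"                  # clean up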

If a temporary file is unacceptable, you can use a horrible hack like:

   result=$( { stdout=$(cmd) ; } 2>&1; echo "this line is the separator"; echo "$stdout")
   var_out=${result#*this line is the separator$'\n'}
   var_err=${result%$'\n'this line is the separator*}

Obviously, this is not robust, because either the standard output or the standard error of the command could contain whatever separator string you employ.

3. How can I insert a blank character after each character?

    sed 's/./& /g'

Example:

    $ echo "testing" | sed 's/./& /g'
    t e s t i n g

For more examples of sed 1-liners, see sed 1-liners or the sed FAQ.

4. How can I check whether a directory is empty or not?

  • I just deleted three completely wrong answers from this question. Please, people, make sure that when you add to the FAQ, your answers

    • answer the question that was asked, and
    • actually work

    Thanks. -- GreyCat

Most modern systems have an "ls -A" which explicitly omits "." and ".." from the directory listing:

    if [ -n "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi

This can be shortened to:

    if [ "$(ls -A somedir)" ]
    then
        echo directory is non-empty
    fi

Another way, using Bash features, involves setting the nullglob shell option, which changes the behavior of globbing so that a pattern matching no files expands to nothing instead of to itself. Some people prefer to avoid this approach, because it's so drastically different and could severely alter the behavior of scripts.

Nevertheless, if you're willing to use this approach, it does greatly simplify this particular task:

    shopt -s nullglob
    if [[ -z $(echo *) ]]; then
        echo directory is empty
    fi
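
A variation on this sketch also counts hidden (dot) files, which a bare * would miss, by additionally enabling dotglob and using an array:

    shopt -s nullglob dotglob
    files=(*)
    if (( ${#files[@]} == 0 )); then
        echo directory is empty
    fi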

It also simplifies various other operations:

    shopt -s nullglob
    for i in *.zip; do
        blah blah "$i"  # No need to check $i is a file.
    done

Without the shopt, that would have to be:

    for i in *.zip; do
        [[ -f $i ]] || continue  # If no .zip files, i becomes *.zip
        blah blah "$i"
    done

(You may want to use the latter anyway, if there's a possibility that the glob may match directories in addition to files.)

Finally, you may wish to avoid the direct question altogether. Usually people want to know whether a directory is empty... because they want to do something involving the files therein, etc. Look to the larger question. For example, something like this may be an appropriate solution:

   find "$somedir" -type f -exec echo Found unexpected file {} \;

It's all a matter of addressing the program's actual requirements.

5. How can I use array variables?

BASH and KornShell already have one-dimensional arrays indexed by a numerical expression, e.g.

    host[0]="micky"
    host[1]="minnie"
    host[2]="goofy"
    i=0
    while (( i < ${#host[@]} ))
    do
        echo "host number $i is ${host[i++]}"
    done

The awkward expression  ${#host[@]}  returns the number of elements for the array host. Also noteworthy is the fact that inside the square brackets, i++ works as a C programmer would expect. The square brackets in an array reference force an ArithmeticExpression.

It's possible to assign multiple values to an array at once, but the syntax differs from BASH to KornShell:

    # BASH
    array=(one two three four)
    # KornShell
    set -A array -- one two three four

Using array elements en masse is one of the key features. Much like "$@" for the positional parameters, "${arr[@]}" expands the array to a list of words, one array element per word, even if the words contain internal whitespace. For example,

    for x in "${arr[@]}"; do
        echo "next element is '$x'"
    done

If one simply wants to dump the full array, "${arr[*]}" will cause the elements to be concatenated together, with the first character of IFS (a space by default) between them.

    arr=(x y z)
    IFS=/; echo "${arr[*]}"; unset IFS
    # prints x/y/z

BASH's arrays are also sparse. Elements may be added and deleted out of sequence.

    arr=(0 1 2 3)
    arr[42]="what was the question?"
    unset 'arr[2]'    # quoted, so that arr[2] is not subject to pathname expansion
    echo "${arr[*]}"
    # prints 0 1 3 what was the question?

BASH 3.0 added the ability to retrieve the list of index values in an array, rather than just iterating over the elements:

    echo "${!arr[*]}"
    # using the previous array, prints 0 1 3 42

Parameter Expansions may be performed on array elements en masse as well:

    arr=(abc def ghi jkl)
    echo "${arr[@]#?}"          # prints bc ef hi kl
    echo "${arr[@]/[aeiou]/}"   # prints bc df gh jkl

For examples of loading data into arrays, see FAQ #1. For examples of using arrays to hold complex shell commands, see FAQ #50 and FAQ #40.

6. How can I use associative arrays or variable variables?

Sometimes it's convenient to have associative arrays, arrays indexed by a string. Perl calls them "hashes", while Tcl simply calls them "arrays". KornShell93 already supports this kind of array:

    # KornShell93 script - does not work with BASH
    typeset -A homedir             # Declare KornShell93 associative array
    homedir[jim]=/home/jim
    homedir[silvia]=/home/silvia
    homedir[alex]=/home/alex

    for user in ${!homedir[@]}     # Enumerate all indices (user names)
    do
        echo "Home directory of user $user is ${homedir[$user]}"
    done

BASH does not support them as of version 3.x (they were added later, in BASH 4.0, via declare -A). However, we can simulate this kind of array by dynamically creating variables, as in the following example:

    for user in jim silvia alex
    do
        eval homedir_$user=/home/$user
    done

This creates the variables

    homedir_jim=/home/jim
    homedir_silvia=/home/silvia
    homedir_alex=/home/alex

with the corresponding content. Note the use of the eval command, which interprets a command line not just one time like the shell usually does, but twice. In the first step, the shell uses the input homedir_$user=/home/$user to create a new line homedir_jim=/home/jim. In the second step, caused by eval, this variable assignment is executed, actually creating the variable.

Print the variables using

    for user in jim silvia alex
    do
        varname=homedir_$user              # e.g. "homedir_jim"
        eval varcontent='$'$varname        # e.g. "/home/jim"
        echo "home directory of $user is $varcontent"
    done

The eval line needs some explanation. In the first step, the shell performs ordinary parameter expansion on the unquoted $varname, so that

    eval varcontent='$'$varname

becomes

    eval varcontent=$homedir_jim

In the second step, eval re-evaluates this line, converting it to

    varcontent=/home/jim

Before starting to use dynamically created variables, think again of a simpler approach. If it still seems to be the best thing to do, have a look at the following disadvantages:

  1. It's hard to read and to maintain.
  2. The variable names must match the regular expression ^[a-zA-Z_][a-zA-Z_0-9]* -- i.e., a variable name cannot contain arbitrary characters but only letters, digits, and underscores. In the example above we could not have processed the home directory of a user named hong-hu, because a dash '-' cannot be a valid part of a variable name.

  3. Quoting is hard to get right. If content strings (not variable name) can contain whitespace characters and quotes, it's hard to quote it right to preserve it.
  4. If the program handles unsanitized user input, it can be VERY dangerous!

Here is a summary. "var" is a constant prefix, "$index" contains the index string, and "$content" is the string to store. Note that quoting is absolutely essential here: a missing backslash \ or a wrong type of quote (e.g. apostrophes '...' instead of quotation marks "...") can (and probably will) cause the examples to fail:

  • Set variables
    •   eval "var$index=\"$content\""    # index must only contain characters from [a-zA-Z0-9_]
  • Print variable content
    •   eval "echo \"var$index=\$$varname\""
  • Check if a variable is empty
    •   if eval "[ -z \"\$var$index\" ]"
        then echo "variable is empty: var$index"
        fi

You've seen the examples. Now maybe you can go a step back and consider using AWK associative arrays, or a multi-line environment variable instead of dynamically created variables.
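
For comparison, here is a rough sketch of the same idea using AWK's associative arrays:

    awk 'BEGIN {
        homedir["jim"]    = "/home/jim"
        homedir["silvia"] = "/home/silvia"
        homedir["alex"]   = "/home/alex"
        for (user in homedir)
            print "home directory of " user " is " homedir[user]
    }'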

7. Is there a function to return the length of a string?

The fastest way, not requiring external programs (but usable only in BASH and KornShell):

    ${#varname}

or, portably, using the external expr utility:

    expr "$varname" : '.*'

(expr prints the number of characters matching the pattern .*, which is the length of the string)

or

    expr length "$varname"

(for a BSD/GNU version of expr. Do not use this, because it is not POSIX.)
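
For example, a quick sketch showing all three forms on the same string:

    varname="hello world"
    echo "${#varname}"          # prints 11
    expr "$varname" : '.*'      # prints 11
    expr length "$varname"      # prints 11 (GNU/BSD expr only)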

8. How can I recursively search all files for a string?

90% of the time, all you need is one of these:

    # Recurse and print matching lines (GNU grep):
    grep -r "$search" .

    # Recurse and print only the filenames (GNU grep):
    grep -r -l "$search" .

You can use find if your grep lacks a -r option:

    find . -type f -exec grep -l "$search" {} \;

The {} characters will be replaced with the current file name.

This command is slower than it needs to be, because find will call grep with only one file name, resulting in many grep invocations (one per file). Since grep accepts multiple file names on the command line, find can be instructed to call it with several file names at once:

    find . -type f -exec grep -l "$search" {} +

The trailing '+' character instructs find to call grep with as many file names as possible, saving processes and resulting in faster execution. This example works for POSIX find, e.g. with Solaris, as well as very recent GNU find.

On GNU (and recent BSD) systems, find can instead be combined with the helper program xargs for the same purpose:

    find . -type f -print0 | xargs -0 grep -l "$search"

The -print0 / -0 options ensure that any file name can be processed, even ones containing blanks, TAB characters, or newlines.

9. My command line produces no output: tail -f logfile | grep 'foo bar'

Most standard Unix commands buffer their output when used non-interactively. This means that they don't write each character (or even each line) as soon as it is ready, but instead collect a larger amount (e.g. 4 kilobytes) before printing it. In the case above, the tail command buffers its output, and therefore grep only gets its input in e.g. 4K blocks.

Unfortunately there's no easy general solution, because the behaviour of the standard programs would need to be changed (but see the options below and the workarounds at the end of this section before taking "no easy solution" to heart). Some programs provide special command line options for this purpose, e.g.:

  • grep (e.g. GNU version 2.5.1): --line-buffered

  • sed (e.g. GNU version 4.0.6): -u, --unbuffered

  • awk (some GNU versions): -W interactive, or use the fflush() function

  • tcpdump, tethereal: -l
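
For the original example, assuming GNU grep, that looks like this:

    tail -f logfile | grep --line-buffered 'foo bar'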

The expect package (http://expect.nist.gov/) has an unbuffer example program, which can help here. It disables buffering for the output of a program. Example usage:

    unbuffer tail -f logfile | grep 'foo bar'

There is another option when you have more control over the creation of the log file. If you would like to grep the real-time log of a text interface program which does buffered session logging by default (or you were using script to make a session log), then try this instead:

   $ program | tee -a program.log

   In another window:
   $ tail -f program.log | grep whatever

Apparently this works because tee produces unbuffered output. This has only been tested on GNU tee, YMMV.

If you simply wanted to highlight the search term, rather than filter out non-matching lines, you can use the 'less' program instead of grep:

   $ less program.log

Inside less, start a search with the '/' command (similar to searching in vi). This should highlight any instances of the search term. Now put less into "follow" mode, which by default is bound to shift+f. You should get an unfiltered tail of the specified file, with the search term highlighted.

10. How can I recreate a directory structure, without the files?

With the cpio program:

    cd "$srcdir"
    find . -type d -print | cpio -pdumv "$dstdir"

or with GNU tar, and more verbose syntax:

    cd "$srcdir"
    find . -type d -print | tar cf - --files-from - --no-recursion | tar xf - --directory "$dstdir"

This creates a list of directory names with find, non-recursively adds just the directories to an archive, and pipes it to a second tar instance to extract it at the target location.
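
If rsync is available, the same effect can be achieved with its filter rules (a sketch; the trailing slashes matter):

    rsync -a --include='*/' --exclude='*' "$srcdir/" "$dstdir/"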

11. How can I print the n'th line of a file?

The dirty (but not quick) way would be:

    sed -n ${n}p "$file"

but this reads the whole input file, even if you only wanted the third line.

This one avoids that problem:

    sed -n "$n{p;q;}" "$file"

At line $n the command "p" is run, printing the line, followed by "q", which quits the program.

Another way, more obvious to some, is to grab the last line from a listing of the first n lines:

    head -n "$n" "$file" | tail -n 1

Another approach, using AWK:

   awk "NR==$n{print;exit}" file

If you want more than one line, it's pretty easy to adapt any of the previous methods:

    x=3 y=4
    sed -n "$x,${y}p;${y}q;" "$file"                # Print lines $x to $y; quit after $y.
    head -n "$y" "$file" | tail -n $((y - x + 1))   # Same
    awk "NR>=$x{print} NR==$y{exit}" "$file"        # Same

12. A program (e.g. a file manager) lets me define an external command that an argument will be appended to - but I need that argument somewhere in the middle...

    sh -c 'echo "$1"' -- hello
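
The appended argument becomes $1 inside the single-quoted command string (the -- fills in $0), so it can be placed wherever it is needed. As a sketch, if a file manager appends a filename, a wrapper like this puts it in the middle (mycommand and its options are placeholders):

    sh -c 'mycommand --input "$1" --verbose' --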

13. How can I concatenate two variables? How do I append a string to a variable?

There is no concatenation operator for strings (either literal or variable dereferences) in the shell. The strings are just written one after the other:

    var=$var1$var2

If the right-hand side contains whitespace characters, it needs to be quoted:

    var="$var1 - $var2"

If you're appending a string that doesn't "look like" part of a variable name, you just smoosh it all together:

    var=$var1/.-

Otherwise, braces or quotes may be used to disambiguate the right-hand side:

    var=${var1}xyzzy
    # Without braces, var1xyzzy would be interpreted as a variable name

    var="$var1"xyzzy
    # Alternative syntax

CommandSubstitution can be used as well. The following line creates a log file name logname containing the current date, resulting in names like e.g. log.2004-07-26:

    logname="log.$(date +%Y-%m-%d)"

There's no difference when the variable name is reused, either:

    string="$string more data here"

Bash 3.1 has a new += operator that you may see from time to time:

    string+=" more data here"     # EXTREMELY non-portable!

It's generally best to use the portable syntax.

14. How can I redirect the output of multiple commands at once?

Redirecting the standard output of a single command is as easy as

    date > file

To redirect standard error:

    date 2> file

To redirect both:

    date > file 2>&1

In a loop or other larger code structure:

    for i in $list; do
        echo "Now processing $i"
        # more stuff here...
    done > file 2>&1

However, this can become tedious if the output of many programs should be redirected. If all output of a script should go into a file (e.g. a log file), the exec command can be used:

    # redirect both standard output and standard error to "log.txt"
    exec > log.txt 2>&1
    # all output including stderr now goes into "log.txt"

Otherwise command grouping helps:

    {
        date
        # some other command
        echo done
    } > messages.log 2>&1

In this example, the output of all commands within the curly braces is redirected to the file messages.log.

15. How can I run a command on all files with the extension .gz?

Often a command already accepts several files as arguments, e.g.

    zcat *.gz

(On some systems, you would use gzcat instead of zcat. If neither is available, or if you don't care to play guessing games, just use gzip -dc instead.) If an explicit loop is desired, or if your command does not accept multiple filename arguments in one invocation, the for loop can be used:

    for file in *.gz
    do
        echo "$file"
        # do something with "$file"
    done

To do it recursively, you should use a loop, plus the find command:

    while read file; do
        echo "$file"
        # do something with "$file"
    done < <(find . -name '*.gz' -print)

For more hints in this direction, see FAQ #20, below. To see why the find command comes after the loop instead of before it, see FAQ #24.

16. How can I use a logical AND in a shell pattern (glob)?

That can be achieved through the !() extglob operator. You'll need extglob set. It can be checked with:

$ shopt extglob

and set with:

$ shopt -s extglob

To warm up, we'll move all files starting with foo AND not ending with .d to directory foo_thursday.d:

$ mv foo!(*.d) foo_thursday.d

For the general case:

Delete all files containing Pink_Floyd AND not containing The_Final_Cut:

$ rm !(!(*Pink_Floyd*)|*The_Final_Cut*)

By the way: these kinds of patterns can be used with KornShell and KornShell93, too. They don't have to be enabled there; they are part of the default pattern syntax.

17. How can I group expressions, e.g. (A AND B) OR C?

The TestCommand [ uses parentheses () for expression grouping. Given that "AND" is "-a", and "OR" is "-o", the following expression

    (0<n AND n<=10) OR n=-1

can be written as follows:

    if [ \( $n -gt 0 -a $n -le 10 \) -o $n -eq -1 ]
    then
        echo "0 < $n <= 10, or $n=-1"
    else
        echo "invalid number: $n"
    fi

Note that the parentheses have to be quoted: \(, '(' or "(".

BASH and KornShell have different, more powerful comparison commands with slightly different (easier) quoting:

Examples:

    if (( (n>0 && n<10) || n == -1 ))
    then echo "0 < $n < 10, or n==-1"
    fi

or

    if [[ ( -f $localconfig && -f $globalconfig ) || -n $noconfig ]]
    then echo "configuration ok (or not used)"
    fi

Note that the distinction between numeric and string comparisons is strict. Consider the following example:

    n=3
    if [[ n>0 && n<10 ]]
    then echo "$n is between 0 and 10"
    else echo "ERROR: invalid number: $n"
    fi

The output will be "ERROR: ....", because in a string comparison "3" is bigger than "10": "3" comes after "1" in sorting order, and the next character "0" is never considered. Changing the square brackets to double parentheses (( makes the example work as expected.
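
For reference, the corrected version of that example:

    n=3
    if (( n > 0 && n < 10 ))
    then echo "$n is between 0 and 10"
    else echo "ERROR: invalid number: $n"
    fi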

18. How can I use numbers with leading zeros in a loop, e.g. 01, 02?

As always, there are different ways to solve the problem, each with its own advantages and disadvantages.

If there are not many numbers, BraceExpansion can be used:

    for i in 0{1,2,3,4,5,6,7,8,9} 10
    do
        echo $i
    done

Output:

   01
   02
   03
   [...]
   10

This gets tedious for large sequences, but there are other ways, too. If you have the printf command (which is a Bash builtin, and is also POSIX standard), it can be used to format a number:

    for ((i=1; i<=10; i++))     # Bash 2 for-loop syntax
    do
        printf "%02d " "$i"
    done

In Bash 3, you can use ranges inside brace expansion. Also, since printf will implicitly loop if given more arguments than format specifiers, you can simplify this enormously:

   printf "%03d\n" {1..300}     # Bash 3 brace expansion

The KornShell and KornShell93 have the typeset command to specify the number of leading zeros:

    $ typeset -Z3 i=4
    $ echo $i
    004

If the command seq(1) is available (it's part of GNU sh-utils/coreutils), you can use it as follows:

    seq -w 1 10

or, for arbitrary numbers of leading zeros (here: 3):

    seq -f "%03g" 1 10

Combining printf with seq(1), you can do things like this:

   printf "%03d\n" $(seq 300)

(That may be helpful if your version of seq(1) lacks printf-style format specifiers. Since it's a nonstandard external tool, it's good to keep your options open.)

Finally, the following example works with any BourneShell derived shell to zero-pad each line to three bytes:

   i=0
   while test $i -le 10
   do
       echo "00$i"
       i=`expr $i + 1`
   done |
       sed 's/.*\(...\)$/\1/g'

In this example, the number of '.' inside the parentheses in the sed command determines how many total bytes from the echo command (at the end of each line) will be kept and printed.

Now, since the number one reason this question is asked is for downloading images in bulk, you can use the printf command with xargs(1) and wget(1) to fetch files. Note that brace expansion does not work with variables, so seq is used here instead:

   printf "%03d\n" $(seq "$START" "$END") | xargs -I% wget "$LOCATION/%"

Or, in a slightly more general case:

   for i in {1..100}; do
      wget "$prefix$(printf %03d $i).jpg"
      # other commands
   done

19. How can I split a file into line ranges, e.g. lines 1-10, 11-20, 21-30?

Some Unix systems provide the split utility for this purpose:

    split --lines 10 --numeric-suffixes input.txt output-

For more flexibility you can use sed. The sed command can print e.g. the line number range 1-10:

    sed -n '1,10p'

This stops sed from printing each line (-n). Instead it only processes the lines in the range 1-10 ("1,10"), and prints them ("p"). sed still reads the input until the end, although we are only interested in lines 1 through 10. We can speed this up by making sed terminate immediately after printing line 10:

    sed -n -e '1,10p' -e '10q'

Now the command will quit after reading line 10 ("10q"). The -e arguments indicate a script (instead of a file name). The same can be written a little shorter:

    sed -n '1,10p;10q'

We can now use this to print an arbitrary range of a file (specified by line number):

    file=/etc/passwd
    range=10
    firstline=1
    maxlines=$(wc -l < "$file") # count number of lines
    while ((firstline <= maxlines))
    do
        ((lastline = firstline + range - 1))
        sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
        ((firstline = lastline + 1))
    done

This example uses BASH and KornShell ArithmeticExpressions, which older Bourne shells do not have. In that case the following example should be used instead:

    file=/etc/passwd
    range=10
    firstline=1
    maxlines=`wc -l < "$file"` # count number of lines
    while [ $firstline -le $maxlines ]
    do
        lastline=`expr $firstline + $range - 1`
        sed -n -e "$firstline,${lastline}p" -e "${lastline}q" "$file"
        firstline=`expr $lastline + 1`
    done

20. How can I find and deal with file names containing newlines, spaces or both?

The preferred method is still to use find(1):

    find ... -exec command {} \;

or, if you need to handle filenames en masse, with GNU and recent BSD tools:

    find ... -print0 | xargs -0 command

or with POSIX find:

    find ... -exec command {} +

Use that unless you really can't.

Another way to deal with files with spaces in their names is to use the shell's filename expansion (globbing). This has the disadvantage of not working recursively (except with zsh's extensions), but if you just need to process all the files in a single directory, it works fantastically well.

This example changes all the *.mp3 files in the current directory to use underscores in their names instead of spaces. It uses Parameter Expansions that will not work in the original BourneShell, but should be good in Korn and Bash.

    for file in *.mp3; do
        mv "$file" "${file// /_}"
    done

You could do the same thing for all files (regardless of extension) by using

    for file in *\ *; do

instead of *.mp3.

Another way to handle filenames recursively involves using the -print0 option of find (a GNU/BSD extension), together with bash's -d option for read:

    unset a i
    while IFS= read -r -d $'\0' file; do
        a[i++]="$file"        # or however you want to process each file
    done < <(find /tmp -type f -print0)

The preceding example reads all the files under /tmp (recursively) into an array, even if they have newlines or other whitespace in their names, by forcing read to use the NUL byte (\0) as its word delimiter. Since NUL is not a valid byte in Unix filenames, this is the safest approach besides using find -exec.

21. How can I replace a string with another string in all files?

sed is a good command to replace strings, e.g.

    sed 's/olddomain\.com/newdomain.com/g' input > output

To replace a string in all files of the current directory:

    for i in *; do
        sed 's/old/new/g' "$i" > atempfile && mv atempfile "$i"
    done

GNU sed 4.x has a special -i flag which makes the loop and temp file unnecessary:

      sed -i 's/old/new/g' *

On some (but not all) BSD systems, sed has a -i flag as well, but it takes a mandatory argument. The above example then becomes

      sed -i '' 's/old/new/g' *

which in turn does not work with GNU sed. Effectively, whenever portability matters, -i should be avoided.

Those of you who have perl 5 can accomplish the same thing using this code:

    perl -pi -e 's/old/new/g' *

Recursively (requires GNU or BSD find):

    find . -type f -print0 | xargs -0 perl -pi -e 's/old/new/g'

To replace for example all "unsigned" with "unsigned long", if it is not "unsigned int" or "unsigned long" ...:

    find . -type f -print0 | xargs -0 perl -i.bak -pne \
        's/\bunsigned\b(?!\s+(int|short|long|char))/unsigned long/g'

Finally, for those of you with none of the useful things above, here's a script that may be useful:

    #!/bin/sh
    # chtext - change text in several files

    # neither string may contain '|' unquoted
    old='olddomain\.com'
    new='newdomain\.com'

    # if no files were specified on the command line, use all files:
    [ $# -lt 1 ] && set -- *

    for file
    do
        [ -f "$file" ] || continue # do not process e.g. directories
        [ -r "$file" ] || continue # cannot read file - ignore it
        # Replace string, write output to temporary file. Terminate script in case of errors
        sed "s|$old|$new|g" "$file" > "$file"-new || exit
        # If the file has changed, overwrite original file. Otherwise remove copy
        if cmp "$file" "$file"-new >/dev/null 2>&1
        then rm "$file"-new              # file has not changed
        else mv "$file"-new "$file"      # file has changed: overwrite original file
        fi
    done

If the code above is put into a script file (e.g. chtext), the resulting script can be used to change a text e.g. in all HTML files of the current and all subdirectories:

    find . -type f -name '*.html' -exec chtext {} \;

Many optimizations are possible:

  • use a sed separator character other than '|', e.g. ^A (ASCII 1)

  • the find command above could use either xargs or POSIX find's built-in xargs-like feature, '-exec command {} +'

Note: set -- * in the code above is safe with respect to files whose names contain spaces. The expansion of * by set is the same as the expansion done by for, and filenames will be preserved properly as individual parameters, and not broken into words on whitespace.

A more sophisticated example of chtext is here: http://www.shelldorado.com/scripts/cmds/chtext

22. How can I calculate with floating point numbers instead of just integers?

BASH does not have built-in floating point arithmetic:

    $ echo $((10/3))
    3

Bash cannot do anything with floating point numbers, including compare them to each other(*). Instead, an external program must be used, e.g. bc, awk or dc:

    $ echo "scale=3; 10/3" | bc
    3.333

The "scale=3" command notifies bc that three digits of precision after the decimal point are required.

If you are trying to compare floating point numbers, be aware that a simple x < y is not supported by all versions of bc.

    # This would work with some versions, but not HP-UX 10.20.
    # The here string feature, inherited from rc->zsh->ksh93 was
    # introduced in bash 2.05b-alpha1
    imadev:~$ bc <<< '1 < 2'
    syntax error on line 1,

Alternatively, you could use this:

    if [[ $(bc <<< "1.4 - 2.5") = -* ]]; then
        echo "1.4 is less than 2.5."
    fi

This example subtracts 2.5 from 1.4, and checks the sign of the result. If it is negative, the former number is less than the latter.

Portable version:

    case "`echo "1.4 - 2.5" | bc`" in
      -*) echo "1.4 is less than 2.5";;
    esac

AWK can be used for calculations, too:

    $ awk 'BEGIN {printf "%.3f\n", 10 / 3}'
    3.333

There is a subtle but important difference between the bc and the awk solution here: bc reads commands and expressions from standard input. awk on the other hand evaluates the expression as part of the program. Expressions on standard input are not evaluated, i.e. echo 10/3 | awk '{print $0}' will print 10/3 instead of the evaluated result of the expression.
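
To evaluate an expression whose operands live in shell variables, pass them into the awk program with -v (a quick sketch):

    x=10 y=3
    awk -v x="$x" -v y="$y" 'BEGIN {printf "%.3f\n", x / y}'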

Newer versions of zsh and KornShell93 have built-in floating point arithmetic, together with mathematical functions like sin() or cos().

(*)Actually, I lied. It can print them, using printf and one of the %e or %f or %g format strings. But that's all.

23. I want to launch an interactive shell that has a special set of aliases and functions, not the ones in the user's ~/.bashrc.

bash --rcfile /my/custom/bashrc
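
For instance, to start a shell whose only customization is a single alias, a quick sketch using BASH's process substitution:

    bash --rcfile <(echo "alias ll='ls -l'")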

Variant question: I have a script that sets up an environment, and I want to give the user control at the end of it.

Put exec bash at the end of it to launch an interactive shell. This shell will inherit the environment (which does not include aliases, but that's OK, because aliases suck). Of course, you must also make sure that your script runs in a terminal -- otherwise, you must create one, for example, by using exec xterm -e bash.

24. I set variables in a loop. Why do they suddenly disappear after the loop terminates? Or, why can't I pipe data to read?

The following command always prints "total number of lines: 0", although the variable linecnt has a larger value in the while loop:

    linecnt=0
    cat /etc/passwd | while read line
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

The reason for this surprising behaviour is that a while/for/until loop runs in a subshell when it's part of a pipeline. For the while loop above, a new subshell with its own copy of the variable linecnt is created (initial value, taken from the parent shell: "0"). This copy then is used for counting. When the while loop is finished, the subshell copy is discarded, and the original variable linecnt of the parent (whose value has not changed) is used in the echo command.

Different shells behave differently when using redirection or pipes with a loop:

  • BourneShell creates a subshell when the input or output of a loop is redirected, either by using a pipeline or by a redirection operator ('<', '>').

  • BASH creates a new process only if the loop is part of a pipeline

  • KornShell creates it only if the loop is part of a pipeline, but not if the loop is the last part of it.

To solve this, either use a method that works without a subshell, or make sure you do all processing inside that subshell (a bit of a kludge, but often easier to work with):

    linecnt=0
    cat /etc/passwd |
    (
        while read line ; do
                linecnt="$((linecnt+1))"
        done
        echo "total number of lines: $linecnt"
    )

To avoid the subshell completely (not easily possible if the other part of the pipe is a command!), use redirection, which does not have this problem (at least for BASH and KornShell):

    linecnt=0
    while read line ; do
        linecnt="$((linecnt+1))"
    done < /etc/passwd
    echo "total number of lines: $linecnt"

For BASH, when the input of the pipe is a command rather than a file, you can use ProcessSubstitution:

    while read LINE; do
        echo "-> $LINE"
    done < <(grep PATH /etc/profile)

If you're reading from a plain file, a portable and common work-around is to redirect the standard input of the script using exec:

    linecnt=0
    exec < /etc/passwd    # redirect standard input from the file /etc/passwd
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done
    echo "total number of lines: $linecnt"

This works as expected, and prints a line count for the file /etc/passwd. But the input is redirected from that file permanently. What if we need to read the original standard input sometime later again? In that case we have to save a copy of the original standard input file descriptor, which we later can restore:

    exec 3<&0             # save original stdin file descriptor 0 as FD 3
    exec 0</etc/passwd    # redirect stdin from the file /etc/passwd

    linecnt=0
    while read line       # "read" gets its input from the file /etc/passwd
    do
        linecnt=`expr $linecnt + 1`
    done

    exec 0<&3             # restore saved stdin (FD 0) from FD 3
    exec 3<&-             # close the no-longer-needed FD 3

    echo "total number of lines: $linecnt"

Subsequent exec commands can be combined into one line, which is interpreted left-to-right:

    exec 3<&0
    exec 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3
    exec 3<&-

is equivalent to

    exec 3<&0 0</etc/passwd
    _...read redirected standard input..._
    exec 0<&3 3<&-

Another useful trick (using Bash syntax) is breaking a variable into words using read:

    echo "$foo" | read a b c      # this doesn't work
    read a b c <<< "$foo"         # but this does

Again, the pipeline causes the read command in the first example to run in a subshell, so its effect is never witnessed in the parent process. The second example does not create any subshells, so it works as we expect. The <<< operator is specific to bash (2.05b and later), and the input which follows it is usually called a "here string".

For more examples of how to break input into words, see FAQ #1.

25. How can I access positional parameters after $9?

Use ${10} instead of $10. This works for BASH and KornShell, but not for older BourneShell implementations. Another way to access arbitrary positional parameters after $9 is to use for, e.g. to get the last parameter:

    for last
    do
        : # nothing
    done

    echo "last argument is: $last"

To get an argument by number, we can use a counter:

    n=12        # This is the number of the argument we are interested in
    i=1
    for arg
    do
        if [ $i -eq $n ]
        then
            argn=$arg
            break
        fi
        i=`expr $i + 1`
    done
    echo "argument number $n is: $argn"

This has the advantage of not "consuming" the arguments. If that is no concern, the shift command discards a given number of leading positional parameters:

    shift 11
    echo "the 12th argument is: $1"

In addition, BASH treats the set of positional parameters as an array, and you may use parameter expansion syntax to address those elements in a variety of ways:

    for x in "${@:(-2)}"    # iterate over the last 2 parameters
    for y in "${@:2}"       # iterate over all parameters starting at $2
                            # which may be useful if we don't want to shift

Although direct access to any positional argument is possible this way, it's hardly needed. The common way is to use getopts to process command line options (e.g. "-l", or "-o filename"), and then use either for or while to process all arguments in turn. An explanation of how to process command line arguments is available in FAQ #35, and another is found at http://www.shelldorado.com/goodcoding/cmdargs.html

26. How can I randomize (shuffle) the order of lines in a file? (Or select a random line from a file, or select a random file from a directory.)

    randomize(){
        while read l ; do echo "0$RANDOM $l" ; done |
        sort -n |
        cut -d" " -f2-
    }

Note: the leading 0 is to make sure it doesn't break if the shell doesn't support $RANDOM. $RANDOM is supported by BASH, KornShell and KornShell93, but it is not required by POSIX and is absent from BourneShell.

The same idea (printing random numbers in front of a line, and sorting the lines on that column) using other programs:

    awk '
        BEGIN { srand() }
        { print rand() "\t" $0 }
    ' |
    sort -n |    # Sort numerically on first (random number) column
    cut -f2-     # Remove sorting column

This is faster than the previous solution, but will not work for very old AWK implementations (try "nawk", or "gawk", if available).

A related question we frequently see is, How can I print a random line from a file? The problem here is that you need to know in advance how many lines the file contains. Lacking that knowledge, you have to read the entire file through once just to count them -- or, you have to suck the entire file into memory. Let's explore both of these approaches.

   n=$(wc -l < "$file")        # Count number of lines.
   r=$((RANDOM % n + 1))       # Random number from 1..n.
   sed -n "$r{p;q;}" "$file"   # Print the r'th line.

(These examples use the answer from FAQ 11 to print the n'th line.) The first one's pretty straightforward -- we use wc to count the lines, choose a random number, and then use sed to print the line. If we already happened to know how many lines were in the file, we could skip the wc command, and this would be a very efficient approach.

The next example sucks the entire file into memory. This approach saves time reopening the file, but obviously uses more memory. (Arguably: on systems with sufficient memory and an effective disk cache, you've read the file into memory by the earlier methods, unless there's insufficient memory to do so, in which case you shouldn't, QED.)

   oIFS=$IFS IFS=$'\n' lines=($(<"$file")) IFS=$oIFS
   n=${#lines[@]}
   r=$((RANDOM % n))
   echo "${lines[r]}"

Note that we don't add 1 to the random number in this example, because the array of lines is indexed counting from 0.

Also, some people want to choose a random file from a directory (for a signature on an e-mail, or to chose a random song to play, or a random image to display, etc.). A similar technique can be used:

    files=(*.ogg)               # Or *.gif, or *
    n=${#files[@]}              # For aesthetics
    xmms "${files[RANDOM % n]}" # Choose a random element

... or just use shuf (man shuf).
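
Typical usage of shuf, where available:

    shuf "$file"           # print all lines in random order
    shuf -n 1 "$file"      # print one line chosen at random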

  • No man page for shuf on HP-UX 10.20, OpenBSD 4.0, or Debian unstable. apt-cache show shuf gives nothing. Searching for shuf in the http://freshmeat.net/ search box gives no results. Do you have a pointer to where this thing comes from?

    • On Debian 4.0, shuf is in the science/biosquid package

      shuf is a part of GNU Coreutils

      • Not in GNU coreutils 5.97, which is the newest available in Debian unstable as of 2007-06-20.

        • gnu.org clearly shows shuf in their Coreutils package. If only Debian would update their packages once a century.

Speaking of GNU coreutils, as of version 6.9 GNU sort has the -R (aka --random-sort) flag. Oddly enough, it only works for the generic locale:

     LC_ALL=C sort -R file     # output the lines in file in random order
     LC_ALL=POSIX sort -R file # output the lines in file in random order
     LC_ALL=en_US sort -R file # effectively ignores the -R option

You can seed the random value to sort with the --random-source flag, which expects a file with entropy.

     export LC_ALL=C
     # Keep in mind that seeding a random number generator with another RNG
     # only "lends" the original seed's entropy to the new RNG. sort -R will
     # not be "more random" than /dev/urandom!
     sort --random-source=/dev/urandom -R file

27. How can two processes communicate using named pipes (fifos)?

NamedPipes, also known as FIFOs ("First In First Out"), are well suited for inter-process communication. The advantage over using files as a means of communication is that processes are synchronized by pipes: a process writing to a pipe blocks if there is no reader, and a process reading from a pipe blocks if there is no writer.

Here is a small example of a server process communicating with a client process. The server sends commands to the client, and the client acknowledges each command:

Server

#! /bin/sh
# server - communication example

# Create a FIFO. Some systems don't have a "mkfifo" command, but use
# "mknod pipe p" instead

mkfifo pipe

while sleep 1
do
    echo "server: sending GO to client"

    # The following command will cause this process to block (wait)
    # until another process reads from the pipe
    echo GO > pipe

    # A client read the string! Now wait for its answer. The "read"
    # command again will block until the client wrote something
    read answer < pipe

    # The client answered!
    echo "server: got answer: $answer"
done

Client

#! /bin/sh
# client

# We cannot start working until the server has created the pipe...
until [ -p pipe ]
do
    sleep 1;    # wait for server to create pipe
done

# Now communicate...

while sleep 1
do
    echo "client: waiting for data"

    # Wait until the server sends us one line of data:
    read data < pipe

    # Received one line!
    echo "client: read <$data>, answering"

    # Now acknowledge that we got the data. This command
    # again will block until the server read it.
    echo ACK > pipe
done

Write both examples to files server and client respectively, and start them concurrently to see it working:

    $ chmod +x server client
    $ ./server & ./client &
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    client: read <GO>, answering
    server: got answer: ACK
    server: sending GO to client
    client: waiting for data
    [...]

28. How do I determine the location of my script? I want to read some config files from the same place.

This topic comes up frequently. This answer covers not only the expression used above ("configuration files"), but also several variant situations. If you've been directed here, please read this entire answer before dismissing it.

This is a complex question because there's no single right answer to it. Even worse: it's not possible to find the location reliably in 100% of all cases. All ways of finding a script's location depend on the name of the script, as seen in the predefined variable $0. But providing the script name in $0 is only a (very common) convention, not a requirement.

The tempting answer is "in some shells, $0 is always an absolute path, even if you invoke the script using a relative path, or no path at all". But this isn't reliable across shells; some of them (including BASH) put the actual command typed by the user into $0, rather than the fully qualified path. And this is just the tip of the iceberg!

Your script may not actually be on a locally accessible disk at all. Consider this:

  ssh remotehost bash < ./myscript

The shell running on remotehost is getting its commands from a pipe. There's no script anywhere on any disk that bash can see.

Moreover, even if your script is stored on a local disk and executed, it could move. Someone could mv the script to another location in between the time you type the command and the time your script checks $0. Or someone could have unlinked the script during that same time window, so that it doesn't actually have a link within a file system any more.

Even in the cases where the script is in a fixed location on a local disk, the $0 approach still has some major drawbacks. The most important is that the script name (as seen in $0) may not be relative to the current working directory, but relative to a directory from the program search path $PATH (this is often seen with KornShell). Or (and this is the most likely problem by far) there might be multiple links to the script from multiple locations, one of them being a simple symlink from a common PATH directory like /usr/local/bin, which is how it's being invoked. Your script might be in /opt/foobar/bin/script, but the naive approach of reading $0 won't tell you that -- it may say /usr/local/bin/script instead.

(For a more general discussion of the Unix file system and how symbolic links affect your ability to know where you are at any given moment, see this Plan 9 paper.)

Having said all that, if you still want to make a whole slew of naive assumptions, and all you want is the fully qualified version of $0, you can use something like this (BASH/KornShell; [[ is not available in the BourneShell, and is not part of POSIX sh):

  [[ $0 = /* ]] && echo "$0" || echo "$PWD/$0"

Or the BourneShell version:

  case $0 in /*) echo "$0";; *) echo "`pwd`/$0";; esac

Or a shell-independent variant (needs a readlink(1) supporting -f, though, so it's OS-dependent):

  readlink -f "$0"

If we want to account for the cases where the script's relative pathname (in $0) may be relative to a $PATH component instead of the current working directory (as mentioned above), we can still try to search the script like the shell would have done: in all directories from $PATH.

The following script shows how this could be done:

#!/bin/bash

myname=$0
if [ -s "$myname" ] && [ -x "$myname" ]; then
    # $myname is already a valid file name

    mypath=$myname
else
    case "$myname" in
    /*) exit 1;;             # absolute path - do not search PATH
    *)
        # Search all directories from the PATH variable. Take
        # care to interpret leading and trailing ":" as meaning
        # the current directory; the same is true for "::" within
        # the PATH.
    
        # Replace a leading : with .: in PATH, store in p
        p=${PATH/#:/.:}
        # Replace a trailing : with :.
        p=${p/%:/:.}
        # Replace each :: with :.:
        p=${p//::/:.:}
        # Temporary input field separator, see FAQ #1
        OFS=$IFS; IFS=:
        # Split the path on colons and loop through each of them
        for dir in $p; do
                [ -f "$dir/$myname" ] || continue # no file
                [ -x "$dir/$myname" ] || continue # not executable
                mypath=$dir/$myname
                break           # only return first matching file
        done
        # Restore old input field separator
        IFS=$OFS
        ;;
    esac
fi

if [ ! -f "$mypath" ]; then
    echo >&2 "cannot find full path name: $myname"
    exit 1
fi

echo >&2 "path of this script: $mypath"

Note that $mypath is not necessarily an absolute path name. It still can contain relative parts like ../bin/myscript.

Are you starting to see how ridiculously complex this problem is becoming? And this is still just the simplistic case where we've made a lot of assumptions about the script not moving and not being piped in!

Generally, storing data files in the same directory as their programs is bad practice. The Unix file system layout assumes that files in one place (e.g. /bin) are executable programs, while files in another place (e.g. /etc) are data files. (Let's ignore legacy Unix systems with programs in /etc for the moment, shall we....)

Here are some common sense alternatives you should consider, instead of attempting to perform the impossible:

  • It really makes the most sense to keep your script's configuration in a single, static location such as /etc/foobar.conf.

  • If you need to define multiple configuration files, then you can have a directory (say, /var/lib/foobar/ or /usr/local/lib/foobar/), and read that directory's location from a fixed place such as /etc/foobar.conf.

  • If you don't even want that much to be hard-coded, you could pass the location of foobar.conf (or of your configuration directory itself) as a parameter to the script.

  • If you need the script to assume certain defaults in the absence of /etc/foobar.conf, you can put the defaults in the script itself, or fall back to something like $HOME/.foobar.conf if /etc/foobar.conf is missing.

  • When you install the script on a target system, you could put the script's location into a variable in the script itself. The information is available at that point, and as long as the script doesn't move, it will always remain correct for each installed system. (See the sketch after this list.)

  • In most cases, it makes more sense to abort gracefully if your configuration data can't be found by obvious means, rather than going through arcane processes and possibly coming up with wrong answers.
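
For example, here is a minimal sketch of the install-time approach; the @INSTALLDIR@ placeholder, file names and paths are all hypothetical:

    # In the shipped source, myscript.in, use a placeholder:
    mydir=@INSTALLDIR@      # rewritten by the installer

    # In the installer, substitute the real location:
    sed 's|@INSTALLDIR@|/opt/foobar/bin|' myscript.in > /opt/foobar/bin/myscript
    chmod +x /opt/foobar/bin/myscript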

29. How can I display the value of a symbolic link on standard output?

The external command readlink can be used to display the value of a symbolic link.

$ readlink /bin/sh
bash

You can also use GNU find's %l directive, which is especially useful if you need to resolve links in batches:

$ find /bin/ -type l -printf '%p points to %l\n'
/bin/sh points to bash
/bin/bunzip2 points to bzip2
...

If your system lacks readlink, you can use a function like this one:

readlink() {
    local path=$1 ll

    if [ -L "$path" ]; then
        # Parse the link target out of "ls -l" output ("... -> target").
        # Caveat: this breaks if the target itself contains " -> ".
        ll="$(LC_ALL=C ls -l "$path" 2> /dev/null)" &&
        printf '%s\n' "${ll/* -> }"
    else
        return 1
    fi
}

30. How can I rename all my *.foo files to *.bar, or convert spaces to underscores, or convert upper-case file names to lower case?

Some GNU/Linux distributions have a rename(1) command, which you can use for the former; however, the syntax differs from one distribution to the next, so it's not a portable answer. Consult your system's man pages to learn how yours works, if you have one at all. It's often perfectly good for one-shot interactive renames, just not for portable scripts. We don't include any rename(1) examples here because there are two common versions of it which are totally incompatible with each other, making any example too confusing.

You can do mass renames in POSIX shells with Parameter Expansion, like this:

for f in *.foo; do mv "$f" "${f%.foo}.bar"; done

Here's a similar example, this time replacing spaces in filenames with underscores:

for f in *\ *; do mv "$f" "${f// /_}"; done

This invokes the external command mv once for each file, so it may not be as efficient as some of the rename implementations.

If you want to do it recursively, then it becomes much more challenging. This example for renaming *.foo to *.bar works (in BASH) as long as no files have newlines in their names:

find . -name '*.foo' -print | while IFS=$'\n' read -r f; do
  mv "$f" "${f%.foo}.bar"
done

For more techniques on dealing with files with inconvenient characters in their names, see FAQ #20.
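
If newlines in file names are a concern, one alternative (a sketch, assuming a POSIX find and sh) is to have find execute the shell itself, so that each name is passed as an argument rather than read from a pipe:

    find . -name '*.foo' -exec sh -c 'mv "$1" "${1%.foo}.bar"' _ {} \;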

To convert filenames to lower case:

# tolower - convert file names to lower case

for file in *
do
    [ -f "$file" ] || continue                # ignore non-existing names
    newname=$(echo "$file" | tr '[:upper:]' '[:lower:]')     # lower case
    [ "$file" = "$newname" ] && continue      # nothing to do
    [ -f "$newname" ] && continue             # don't overwrite existing files
    mv "$file" "$newname"
done

We use the [:upper:] and [:lower:] character classes instead of the A-Z and a-z ranges, because tr can behave very strangely when using ranges on some systems (typically, in non-C locales):

imadev:~$ echo Hello | tr A-Z a-z
hÉMMÓ

To make sure you aren't caught by surprise when using tr with ranges, either use the character class notation, or force the locale to C.

imadev:~$ echo Hello | LC_ALL=C tr A-Z a-z
hello
imadev:~$ echo Hello | tr '[:upper:]' '[:lower:]'
hello
# Either way is fine here.

Or, if you have the utility mmv(1) on your machine, you could simply do:

# convert all filenames to lowercase
mmv "*" "#l1"

This technique can also be used to replace all unwanted characters in a file name e.g. with '_' (underscore). The script is the same as above, only the "newname=..." line has changed.

# renamefiles - rename files whose name contain unusual characters
# Portable version.
for file in *
do
    [ -f "$file" ] || continue            # ignore non-regular files, etc.
    newname=$(echo "$file" | sed 's/[^a-zA-Z0-9_.]/_/g')
    [ "$file" = "$newname" ] && continue  # nothing to do
    [ -f "$newname" ] && continue         # do not overwrite existing files
    mv "$file" "$newname"
done

The character class in [] contains all allowed characters; modify it as needed.

Here's an example that does the same thing, but this time using Parameter Expansion instead:

# renamefiles (more efficient, less portable version)
for file in *; do
   [ -f "$file" ] || continue
   newname=${file//[^a-zA-Z0-9_.]/_}
   [ "$file" = "$newname" ] && continue
   [ -f "$newname" ] && continue
   mv "$file" "$newname"
done

31. What is the difference between the old and new test commands ([ and [[)?

[ ("test" command) and [[ ("new test" command) are both used to evaluate expressions. Some examples:

    if [ -z "$variable" ]
    then
        echo "variable is empty!"
    fi

    if [ ! -f "$filename" ]
    then
        echo "not a valid, existing file name: $filename"
    fi

and

    if [[ ! -e $file ]]
    then
        echo "directory entry does not exist: $file"
    fi

    if [[ $file0 -nt $file1 ]]
    then
        echo "file $file0 is newer than $file1"
    fi

To cut a long story short: [ ("test" command) implements the old, portable syntax of the command. Although all modern shells have built-in implementations, there usually still is an external executable of that name, e.g. /bin/[. [[ is a new, improved version of it, and it is a keyword, not a program. This has beneficial effects on the ease of use, shown below. [[ is understood by KornShell, KornShell93, and BASH (e.g. 2.03), but it is not part of the POSIX shell specification and is not understood by the older BourneShell.

Although [ and [[ have much in common and share many expression operators like "-f", "-s", "-n", and "-z", there are some notable differences. Here is a comparison table:

Feature                                  new test [[   old test [        Example
--------------------------------------------------------------------------------
string comparison                        >             \>                -
                                         <             \<                -
                                         == (or =)     =                 -
                                         !=            !=                -
expression grouping                      &&            -a                [[ -n $var && -f $var ]] && echo "$var is a file"
                                         ||            -o                -
pattern matching                         == (or =)     (not available)   [[ $name = a* ]] || echo "name does not start with an 'a': $name"
in-process regular expression matching   =~            (not available)   [[ $(date) =~ ^Fri\ ...\ 13 ]] && echo "It's Friday the 13th!"

Special primitives that [[ is defined to have, but [ may be lacking (depending on the implementation):

Description                           Primitive   Example
--------------------------------------------------------------------------------
entry (file or directory) exists      -e          [[ -e $config ]] && echo "config file exists: $config"
file is newer/older than other file   -nt / -ot   [[ $file0 -nt $file1 ]] && echo "$file0 is newer than $file1"
two files are the same                -ef         [[ $input -ef $output ]] && { echo "will not overwrite input file: $input"; exit 1; }
negation                              !           -

But there are more subtle differences.

  • No field splitting will be done for [[ (and therefore many arguments need not be quoted)

     file="file name"
     [[ -f $file ]] && echo "$file is a file"

    will work even though $file is not quoted and contains whitespace. With [ the variable needs to be quoted:

     file="file name"
     [ -f "$file" ] && echo "$file is a file"

    This makes [[ easier to use and less error-prone.

  • No file name generation will be done for [[. Therefore the following line tries to match the contents of the variable $path with the pattern /*

     [[ $path = /* ]] && echo "\$path starts with a forward slash /: $path"

    The next command most likely will result in an error, because /* is subject to file name generation:

     [ $path = /* ] && echo "this does not work"

    (If you need to do that using Bourne shells, use case instead.)

  • As a rule of thumb, [[ is used for strings and files. If you want to compare numbers, use an ArithmeticExpression, e.g.

     i=0
     while ((i<10)); do ...

When should the new test command [[ be used, and when the old one [? If portability to the BourneShell is a concern, the old syntax should be used. If on the other hand the script requires BASH or KornShell, the new syntax is much more flexible.

32. How can I redirect the output of 'time' to a variable or file?

Bash's time keyword uses special trickery, so that you can do things like

   time find ... | xargs ...

and get the execution time of the entire pipeline, rather than just the simple command at the start of the pipe. (This is different from the behavior of the external command time(1), for obvious reasons.)

Because of this, people who want to redirect time's output often encounter difficulty figuring out where all the file descriptors are going. It's not as hard as most people think, though -- the trick is to call time in a different shell or block, and redirect stderr of that shell or block (which will contain time's results). If you need to redirect the actual command's stdout or stderr, you do that inside the inner shell/block. For example:

  • File redirection:
       bash -c "time ls" 2>time.output      # Explicit, but inefficient.
       ( time ls ) 2>time.output            # Slightly more efficient.
       { time ls; } 2>time.output           # Most efficient.
    
       # The general case:
       { time some command >stdout 2>stderr; } 2>time.output
  • Command substitution:
       foo=$( bash -c "time ls" 2>&1 )       # Captures *everything*.
       foo=$( { time ls; } 2>&1 )            # More efficient version.
    
       # Keep stdout unmolested.
       exec 3>&1
       foo=$( { time bar 1>&3; } 2>&1 )      # Captures stderr and time.
       exec 3>&-
    
       # Keep both stdout and stderr unmolested.
       exec 3>&1 4>&2
       foo=$( { time bar 1>&3 2>&4; } 2>&1 )  # Captures time only.
       exec 3>&- 4>&-

33. How can I find a process ID for a process given its name?

Usually a process is referred to using its process ID (PID), and the ps command can display the information for any process given its process ID, e.g.

    $ echo $$         # my process id
    21796
    $ ps -p 21796
    PID TTY          TIME CMD
    21796 pts/5    00:00:00 ksh

But frequently the process ID for a process is not known; only its name is. Some operating systems, e.g. Solaris, BSD, and some versions of Linux, have a dedicated command to search for a process by name, called pgrep:

    $ pgrep init
    1

Often there is an even more specialized program available to not just find the process ID of a process given its name, but also to send a signal to it:

    $ pkill myprocess

Some systems also provide pidof. It differs from pgrep in that multiple output process IDs are only space separated, not newline separated.

    $ pidof cron
    5392

If these programs are not available, a user can search the output of the ps(1) command using grep.

The major problem when grepping the ps output is that grep may match its own ps entry (try: ps aux | grep init). To make matters worse, this does not happen every time; the technical name for this is a "race condition". There are several ways to avoid this:

  • Using grep -v at the end
         ps aux | grep name | grep -v grep
    will throw away all lines containing "grep" from the output. Disadvantage: the exit status you get is that of the final grep -v, so you can't use it to check whether a specific process exists.
  • Using grep -v in the middle
         ps aux | grep -v grep | grep name
    This does exactly the same thing, except that the exit status of "grep name" is now accessible, and represents "name is a process in ps" or "name is not a process in ps". It still has the disadvantage of starting an extra process (grep -v).
  • Using [] in grep
         ps aux | grep '[n]ame'

    This spawns only the needed grep process. The trick is the [] character class from regular expressions. Putting just one character in a character class normally makes no sense at all, because [c] always matches "c" -- and indeed, grep '[n]ame' still searches for "name". But grep's own entry in the process list contains the literal string "grep [n]ame", not "grep name", so it will not match itself. (Quoting the pattern also keeps the shell from glob-expanding [n]ame if a file called "name" happens to exist in the current directory.)

33.1. BEGIN greycat rant

Most of the time when someone asks a question like this, it's because they want to manage a long-running daemon using primitive shell scripting techniques. Common variants are "How can I get the PID of my foobard process.... so I can start one if it's not already running" or "How can I get the PID of my foobard process... because I want to prevent the foobard script from running if foobard is already active." Both of these questions will lead to seriously flawed production systems.

If what you really want is to restart your daemon whenever it dies, just do this:

#!/bin/sh
while true; do
   mydaemon --in-the-foreground
done

where --in-the-foreground is whatever switch, if any, you must give to the daemon to PREVENT IT from automatically backgrounding itself. (Often, -d does this and has the additional benefit of running the daemon with increased verbosity.) Self-daemonizing programs may or may not be the target of a future greycat rant....

If that's too simplistic, look into daemontools or runit, which are programs for managing services.

If what you really want is to prevent multiple instances of your program from running, then the only sure way to do that is by using a lock. For details on doing this, see ProcessManagement or FAQ 45.

ProcessManagement also covers topics like "I want to divide my batch job into 5 'threads' and run them all in parallel." Please read it.

33.2. END greycat rant

34. Can I do a spinner in Bash?

Sure.

    i=1
    sp="/-\|"
    echo -n ' '
    while true
    do
        echo -en "\b${sp:i++%${#sp}:1}"
    done

You can also use \r instead of \b. You can use pretty much any character sequence you want as well. If you want it to slow down, put a sleep command inside the loop.

To use as a function called from a loop on every iteration, for example:

sp="/-\|"
sc=0
spin() {
   echo -ne "\b${sp:sc++:1}"
   ((sc==4)) && sc=0
}

When printing the next output line (i.e. when the spinner is done), use:  echo -e "\r$line"  or:  echo -en '\r'; echo "$line" 

A similar technique can be used to build progress bars.
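
For instance, here is a minimal sketch of a plain-text progress bar built on the same idea (the total of 20 steps is arbitrary, and the sleep merely stands in for real work):

    total=20
    i=0
    while ((i++ < total)); do
        bar=$(printf '%*s' "$i" '' | tr ' ' '#')    # i hash marks
        printf '\r[%-20s] %3d%%' "$bar" $((100*i/total))
        sleep 1                                     # stand-in for real work
    done
    echo    # move past the progress bar when done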

35. How can I handle command-line arguments to my script easily?

Well, that depends a great deal on what you want to do with them. Here's a general template that might help for the simple cases:

    while [[ $1 == -* ]]; do
        case "$1" in
          -h|--help) show_help; exit 0;;
          -v) verbose=1; shift;;
          -f) output_file=$2; shift 2;;
          *) echo "unknown option: $1" >&2; exit 1;;  # avoid an endless loop on unknown options
        esac
    done
    # Now all of the remaining arguments are the filenames which followed
    # the optional switches.  You can process those with "for i" or "$@".

For more complex/generalized cases, or if you want things like "-xvf" to be handled as three separate flags, you can use getopts. (NEVER use getopt(1)!)

Here is a simplistic getopts example:

    x=1         # Avoids an error if we get no options at all.
    while getopts "abcf:g:h:" opt; do
      case "$opt" in
        a) echo "You said a";;
        b) echo "You said b";;
        c) echo "You said c";;
        f) echo "You said f, with argument $OPTARG";;
        g) echo "You said g, with argument $OPTARG";;
        h) echo "You said h, with argument $OPTARG";;
      esac
      x=$OPTIND
    done
    shift $((x-1))
    echo "Left overs: $@"

36. How can I get all lines that are: in both of two files (set intersection) or in only one of two files (set subtraction).

Use the comm(1) command.

  # intersection of file1 and file2
  comm -12 <(sort file1) <(sort file2)
  # subtraction of file1 from file2
  comm -13 <(sort file1) <(sort file2)

Read the comm(1) manpage for details.

If for some reason you lack the core comm(1) program, you can use these other methods:

  1. An amazingly simple and fast implementation, that took just 20 seconds to match a 30k line file against a 400k line file for me.
    • Note that it probably only works with GNU grep, and that the file specified with -f will be loaded into RAM, so it doesn't scale for very large files.
    • It has grep read one of the sets as a pattern list from a file (-f), and interpret the patterns as plain strings not regexps (-F), matching only whole lines (-x).
      # intersection of file1 and file2
      grep -xF -f file1 file2
      # subtraction of file1 from file2
      grep -vxF -f file1 file2
  2. An implementation using sort and uniq
      # intersection of file1 and file2
      # (assumes neither file1 nor file2 has repeated lines)
      sort file1 file2 | uniq -d
      # subtraction: file1 - file2
      sort file1 file2 file2 | uniq -u
      # subtraction: file2 - file1 (list file1 twice instead)
      sort file1 file2 file1 | uniq -u
  3. Another implementation of subtraction:
      cat file1 file1 file2 | sort | uniq -c |
      awk '{ if ($1 == 2) { $1 = ""; print; } }'
    • This may introduce an extra space at the start of the line; if that's a problem, just strip it away.
    • Also, this approach assumes that neither file1 nor file2 has any duplicates in it.
    • Finally, it sorts the output for you. If that's a problem, then you'll have to abandon this approach altogether. Perhaps you could use awk's associative arrays (or perl's hashes or tcl's arrays) instead; see the sketch after this list.
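
For the record, here is what such an awk approach might look like (a sketch; it preserves file2's order and requires no sorting, but holds all of file1 in memory, and the NR==FNR trick misbehaves if file1 is empty):

      # subtraction of file1 from file2
      awk 'NR==FNR { seen[$0]=1; next } !($0 in seen)' file1 file2
      # intersection of file1 and file2
      awk 'NR==FNR { seen[$0]=1; next } ($0 in seen)' file1 file2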

See also: http://www.pixelbeat.org/cmdline.html#sets

37. How can I print text in various colors?

Do not hard-code ANSI color escape sequences in your program! The tput command lets you interact with the terminal database in a sane way.

  tput setaf 1; echo this is red
  tput setaf 2; echo this is green
  tput setaf 0; echo now we are back in black

tput reads the terminfo database which contains all the escape codes necessary for interacting with your terminal, as defined by the $TERM variable. For more details, see the terminfo(5) man page.

If you don't know in advance what your user's terminal's default text color is, you can use tput sgr0 to reset the colors to their default settings. This also removes boldface (tput bold), etc.
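
For example, to print a bold colored warning and then restore the terminal's defaults (setaf 3 is conventionally yellow, though the terminfo entry for your terminal has the final say):

  tput bold; tput setaf 3
  echo "Warning: something needs attention"
  tput sgr0     # reset colors and attributes to the defaults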

See also http://bash-hackers.org/wiki/doku.php?id=scripting:terminalcodes for an overview.

38. How do Unix file permissions work?

See Permissions.

39. What are all the dot-files that bash reads?

See DotFiles.

40. How do I use dialog to get input from the user?

  foo=$(dialog --inputbox "text goes here" 8 40 2>&1 >/dev/tty)
  echo "The user typed '$foo'"

The redirection here is a bit tricky.

  1. The foo=$(command) is set up first, so the standard output of the command is being captured by bash.

  2. Inside the command, the 2>&1 causes standard error to be sent to where standard out is going -- in other words, stderr will now be captured.

  3. >/dev/tty sends standard output to the terminal, so the dialog box will be seen by the user. Standard error will still be captured, however.

Another common dialog(1)-related question is how to dynamically generate a dialog command that has items which must be quoted (either because they're empty strings, or because they contain internal white space). One can use eval for that purpose, but the cleanest way to achieve this goal is to use an array.

  unset m; i=0
  words=(apple banana cherry "dog droppings")
  for w in "${words[@]}"; do
    m[i++]=$w; m[i++]=""
  done
  dialog --menu "Which one?" 12 70 9 "${m[@]}"

In the previous example, the for loop that populates the m array could just as easily have been reading from a pipeline, a file, etc.

Recall that the construction "${m[@]}" expands to the entire contents of an array, but with each element implicitly quoted. It's analogous to the "$@" construct for handling positional parameters. For more details, see FAQ50 below.

Here's another example, using filenames:

    files=(*.mp3)       # These may contain spaces, apostrophes, etc.
    cmd=(dialog --menu "Select one:" 22 76 16)
    i=0 n=${#cmd[*]}
    for f in "${files[@]}"; do
        cmd[n++]=$((i++)); cmd[n++]="$f"
    done
    choice=$("${cmd[@]}" 2>&1 >/dev/tty)
    echo "Here's the file you chose:"
    ls -ld "${files[choice]}"

A separate but useful function of dialog is to track the progress of a process that produces output. Below is an example that uses dialog to track processes writing to a log file. The dialog window contains a tailbox where the output is shown, and a msgbox with a clickable OK button. Pressing OK causes the trap to execute, removing the tempfile and destroying the tail process.

  #you cannot tail a nonexistent file, so always ensure it pre-exists!
  rm -f dialog-tail.log; echo Initialize log >> dialog-tail.log
  date >> dialog-tail.log
  tempfile=`tempfile 2>/dev/null` || tempfile=/tmp/test$$
  trap "rm -f $tempfile" 0 1 2 5 15
  dialog --title "TAIL BOXES" \
        --begin 10 10 --tailboxbg dialog-tail.log 8 58 \
        --and-widget \
        --begin 3 10 --msgbox "Press OK " 5 30 \
        2>$tempfile &
  mypid=$!;
  for i in 1 2 3;  do echo $i >> dialog-tail.log; sleep 1; done
  echo Done. >> dialog-tail.log
  wait $mypid;

For an example of creating a progress bar using dialog --gauge, see FAQ #44.

41. How do I determine whether a variable contains a substring?

  if [[ $foo = *bar* ]]

The above works in virtually all versions of Bash. Bash version 3 also allows regular expressions:

  if [[ $foo =~ ab*c ]]   # bash 3, matches abbbbcde, or ac, etc.

If you are programming in the BourneShell instead of Bash, there is a more portable (but less pretty) syntax:

  case "$foo" in
    *bar*) .... ;;
  esac

case allows you to match variables against globbing-style patterns. If you need a portable way to match variables against regular expressions, use grep or egrep.

  if echo "$foo" | grep bar >/dev/null; then ...

42. How can I find out if a process is still running?

The kill command is used to send signals to a running process. As a convenience, the signal "0", which does not actually exist as a signal, can be used to find out whether a process is still running: kill -0 performs all of kill's error checking (including "does this process exist?") without sending anything:

  •  myprog &          # Start program in the background
     daemonpid=$!      # ...and save its process id
    
     while sleep 60
     do
         if kill -0 "$daemonpid" 2>/dev/null  # Is the process still alive?
         then
             echo >&2 "OK - process is still running"
         else
             echo >&2 "ERROR - process $daemonpid is no longer running!"
             break
         fi
     done

This is one of those questions that usually masks a much deeper issue. It's rare that someone wants to know whether a process is still running simply to display a red or green light to an operator. More often, there's some ulterior motive, such as the desire to ensure that some daemon which is known to crash frequently is still running, or to ensure mutually exclusive access to a resource, etc. For much better discussion of these issues, see ProcessManagement or FAQ #33.

43. Why does my crontab job fail? 0 0 * * * some command > /var/log/mylog.`date +%Y%m%d`

In many versions of crontab, the percent sign (%) is treated specially, and therefore must be escaped with backslashes.

0 0 * * * some command > /var/log/mylog.`date +\%Y\%m\%d` 

See your system's manual (crontab(5) or crontab(1)) for details. Note: on systems which split the crontab manual into two parts, you may have to type man 5 crontab or man -s 5 crontab to read the part you need.

44. How do I create a progress bar?

The easiest way is to use dialog --gauge. Here is an example, which relies heavily on BASH features:

   # We want to process all of the *.zip files in the current directory.
   files=(*.zip)
   dialog --gauge "Working..." 20 75 < <(
      n=${#files[*]}; i=0
      for f in "${files[@]}"; do
         # process "$f" in some way (for testing, "sleep 1")
         echo $((100*(++i)/n))
      done)

Here's an explanation of what it's doing:

  • An array named files is populated with all the files we want to process.

  • dialog is invoked, and its input is redirected from a process substitution. (A pipe could also be used here; we'd simply have to reverse the dialog command and the loop.)

  • The processing loop iterates over the array.
  • Every time a file is processed, it increments a counter (i), and writes the percent complete to stdout.

For more examples of using dialog, see FAQ #40.

45. How can I ensure that only one instance of a script is running at a time (mutual exclusion)?

We need some means of mutual exclusion. One easy way is to use a "lock": any number of processes can try to acquire the lock simultaneously, but only one of them will succeed.

How can we implement this using shell scripts? Some people suggest creating a lock file, and checking for its presence:

  •  # locking example -- WRONG
    
     lockfile=/tmp/myscript.lock
     if [ -f "$lockfile" ]
     then                      # lock is already held
         echo >&2 "cannot acquire lock, giving up: $lockfile"
         exit 0
     else                      # nobody owns the lock
         > "$lockfile"         # create the file
         #...continue script
     fi

This example does not work, because there is a time window between checking and creating the file. Assume two processes are running the code at the same time. Both check if the lockfile exists, and both get the result that it does not exist. Now both processes assume they have acquired the lock -- a disaster waiting to happen. We need an atomic check-and-create operation, and fortunately there is one: mkdir, the command to create a directory:

  •  # locking example -- CORRECT
    
     lockdir=/tmp/myscript.lock
     if mkdir "$lockdir"
     then    # directory did not exist, but was created successfully
         echo >&2 "successfully acquired lock: $lockdir"
         # continue script
     else
         echo >&2 "cannot acquire lock, giving up on $lockdir"
         exit 0
     fi

The advantage over using a lock file is that even when two processes call mkdir at the same time, at most one of them can succeed. This atomicity of check-and-create is ensured at the operating system kernel level.

Note that we cannot use "mkdir -p" to automatically create missing path components: "mkdir -p" does not return an error if the directory exists already, but that's the feature we rely upon to ensure mutual exclusion.
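
A quick demonstration of the difference (using a throwaway path):

     mkdir /tmp/lockdemo       # first call: succeeds
     mkdir /tmp/lockdemo       # second call: fails -- exactly the feature we rely on
     mkdir -p /tmp/lockdemo    # "succeeds" again -- useless for locking
     rmdir /tmp/lockdemo       # clean up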

Now let's spice up this example by automatically removing the lock when the script finishes:

  •  lockdir=/tmp/myscript.lock
     if mkdir "$lockdir"
     then
         echo >&2 "successfully acquired lock"
    
         # Remove lockdir when the script finishes, or when it receives a signal
         trap 'rm -rf "$lockdir"' 0    # remove directory when script finishes
         trap "exit 2" 1 2 3 15        # terminate script when receiving signal
    
         # Optionally create temporary files in this directory, because
         # they will be removed automatically:
         tmpfile=$lockdir/filelist
    
     else
         echo >&2 "cannot acquire lock, giving up on $lockdir"
         exit 0
     fi

This example provides reliable mutual exclusion. There is still the disadvantage that a stale lock file could remain when the script is terminated with a signal not caught (or signal 9, SIGKILL), but it's a good step towards reliable mutual exclusion. An example that remedies this (contributed by Charles Duffy) follows:

  • Are we sure this code's correct? There seems to be a discrepancy between the names LOCK_DEFAULT_NAME and DEFAULT_NAME; and it checks for processes in what looks to be a race condition; and it uses the Linux-specific /proc file system and the GNU-specific egrep -o to do so.... I don't trust it. It looks overly complex and fragile. And quite non-portable. -- GreyCat

     LOCK_DEFAULT_NAME=$0
     LOCK_HOSTNAME="$(hostname -f)"
    
     ## function to take the lock if free; will fail otherwise
     function grab-lock {
       local PROGRAMNAME="${1:-$DEFAULT_NAME}"
       local PID=${2:-$$}
       (
         umask 000;
         mkdir -p "/tmp/${PROGRAMNAME}-lock"
         mkdir "/tmp/${PROGRAMNAME}-lock/held" || return 1
         mkdir "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-${PID}" && return 0 || return 1
       ) 2>/dev/null
       return $?
     }
    
     ## function to nicely let go of the lock
     function release-lock {
       local PROGRAMNAME="${1:-$DEFAULT_NAME}"
       local PID=${2:-$$}
       (
         rmdir "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-${PID}" || true
         rmdir "/tmp/${PROGRAMNAME}-lock/held" && return 0 || return 1
       ) 2>/dev/null
       return $?
     }
    
     ## function to force anyone else off of the lock
     function break-lock {
       local PROGRAMNAME="${1:-$DEFAULT_NAME}"
       (
         [ -d "/tmp/${PROGRAMNAME}-lock/held" ] || return 0
         for DIR in "/tmp/${PROGRAMNAME}-lock/held/${LOCK_HOSTNAME}--pid-"* ; do
           OTHERPID="$(echo $DIR | egrep -o '[0-9]+$')"
           [ -d /proc/${OTHERPID} ] || rmdir $DIR
         done
         rmdir /tmp/${PROGRAMNAME}-lock/held && return 0 || return 1
       ) 2>/dev/null
       return $?
     }
    
     ## function to take the lock nicely, freeing it first if needed
     function get-lock {
       break-lock "$@" && grab-lock "$@"
     }
     

Instead of using mkdir we could also have used ln -s, the program that creates a symbolic link.

I believe using if (set -C; >$lockfile); then ... is equally safe, if not safer. The Bash source uses open(filename, flags|O_EXCL, mode), which should be atomic on almost all platforms (with the exception of some versions of NFS, where mkdir may not be atomic either). I haven't traced the path of the flags variable, which must contain O_CREAT, nor have I looked at any other shells. I wouldn't suggest using this until someone else can back up my claims. --Andy753421

Using set -C does not work with ksh88. Ksh88 does not use O_EXCL, when you set noclobber (-C). --jrw32982
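
For reference, a minimal sketch of the noclobber approach under discussion (subject to the NFS and ksh88 caveats above):

  lockfile=/tmp/myscript.lock
  if ( set -C; : > "$lockfile" ) 2>/dev/null; then   # O_EXCL-style create, in a subshell
      trap 'rm -f "$lockfile"' 0                     # remove the lock when the script finishes
      # ... critical section ...
  else
      echo >&2 "cannot acquire lock, giving up: $lockfile"
      exit 0
  fi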

For more discussion on these issues, see ProcessManagement.

46. I want to check to see whether a word is in a list (or an element is a member of a set).

The safest way to do this would be to loop over all elements in your set/list and check them for the element/word you are looking for. Say we are looking for the content of bar in the array foo:

  •    for element in "${foo[@]}"; do
          [[ $element = "$bar" ]] && echo "Found $bar."
       done

Or, to stop searching when you find it:

  •    for element in "${foo[@]}"; do
          [[ $element = "$bar" ]] && { echo "Found $bar."; break; }
       done

If for some reason your list/set is not in an array, but is a string of words, and the element you are searching for is also a word, you can use this:

  •    for element in $foo; do
          [[ $element = "$bar" ]] && echo "Found $bar."
       done

A less safe, but more clever version:

  •    if [[ " $foo " = *\ "$bar"\ * ]]; then
          echo "Found $bar."
       fi

And here's how to check your script's positional parameters for an element. For example, '-v':

  •    for element; do
          [[ $element = '-v' ]] && echo "Switching to verbose mode."
       done

GNU's grep has a \b feature which allegedly matches the edges of words. Using that, one may attempt to replicate the "clever" approach used above, but it is fraught with peril:

  •    # Is 'foo' one of the positional parameters? 
       egrep '\bfoo\b' <<<"$@" >/dev/null && echo yes
       # This is where it fails: is '-v' one of the positional parameters?
       egrep '\b-v\b' <<<"$@" >/dev/null && echo yes
       # Unfortunately, \b sees "v" as a separate word.
       # Nobody knows what the hell it's doing with the "-".
    
       # Is "someword" in the array 'array'?
       egrep '\bsomeword\b' <<<"${array[@]}"
       # Obviously, you can't use this if someword is '-v'!

Since this "feature" of GNU grep is both non-portable and poorly defined, we don't recommend using it.

47. How can I redirect stderr to a pipe?

A pipe can only carry stdout of a program. To pipe stderr through it, you need to redirect stderr to the same destination as stdout. Optionally you can close stdout or redirect it to /dev/null to only get stderr. Some sample code:

# Assume 'myprog' is a program that writes to both stdout and stderr.

# version 1: redirect stderr towards the pipe while stdout survives (both come
# mixed)
myprog 2>&1 | grep ...

# version 2: redirect stderr towards the pipe without getting stdout (it's
# redirected to /dev/null)
myprog 2>&1 >/dev/null | grep ...

# version 3: redirect stderr towards the pipe while the "original" stdout gets
# closed
myprog 2>&1 >&- | grep ...

For further explanation of how redirections and pipes interact, see FAQ #55.

This has an obvious application with programs like dialog, which draws (using ncurses) windows onto the screen (stdout), and returns results on stderr. One way to deal with this would be to redirect stderr to a temporary file. But this is not necessary -- see FAQ #40 for examples of using dialog specifically!

In the examples above (as well as FAQ #40), we either discarded stdout altogether, or sent it to a known device (/dev/tty for the user's terminal). One may also pipe stderr only but keep stdout intact (without a priori knowledge of where the script's output is going). This is a bit trickier.

# Redirect stderr to a pipe, keeping stdout unaffected.

exec 3>&1                       # Save current "value" of stdout.
myprog 2>&1 >&3 | grep ...      # Send stdout to FD 3.
exec 3>&-                       # Now close it for the remainder of the script.

# Thanks to http://www.tldp.org/LDP/abs/html/io-redirection.html

The same can be done without exec:

$ myfunc () { echo "I'm stdout"; echo "I'm stderr" >&2;}
$ { myfunc 2>&1 1>&3- | cat  > stderr.file 3>&- ; } 3>&1
I'm stdout
$ cat stderr.file
I'm stderr

Fd 3 is closed (1>&3- and 3>&-) so that the commands do not inherit it. You can see the difference on Linux by trying the following:

{ bash <<< 'lsof -a -p $$ -d1,2,3'   ;} 3>&1
{ bash <<< 'lsof -a -p $$ -d1,2,3' 3>&-  ;} 3>&1

To show a dialog one-liner:

exec 3>&1
dialog --menu Title 0 0 0 FirstItem FirstDescription 2>&1 >&3 | sed 's/First/Only/'
exec 3>&-

This will have the dialog window working properly, yet it will be the output of dialog (returned to stderr) being altered by the sed.

A similar effect can be achieved with process substitution:

# Bash only.
perl -e 'print "stdout\n"; warn "stderr\n"' 2> >(tr a-z A-Z)

This will pipe standard error through the tr command.


48. Eval command and security issues

48.1. Examples of bad use of eval

"eval" is a common misspelling of "evil". The section dealing with spaces in file names used to include the following quote "helpful tool (which is probably not as safe as the \0 technique)", end quote.

    Syntax : nasty_find_all [path] [command] <maxdepth>

    #This code is evil and must never be used
    export IFS=" "
    [ -z "$3" ] && set -- "$1" "$2" 1
    FILES=`find "$1" -maxdepth "$3" -type f -printf "\"%p\" "`
    #warning, evilness
    eval FILES=($FILES)
    for ((I=0; I < ${#FILES[@]}; I++))
    do
        eval "$2 \"${FILES[I]}\""
    done
    unset IFS

This script was supposed to recursively search for files with newlines and/or spaces in their names, the argument being that find -print0 | xargs -0 was unsuitable for some purposes, such as running multiple commands. It was followed by an instructional description of all the lines involved, which we'll skip here.

To its defense, it works:

$ ls -lR
.:
total 8
drwxr-xr-x  2 vidar users 4096 Nov 12 21:51 dir with spaces
-rwxr-xr-x  1 vidar users  248 Nov 12 21:50 nasty_find_all

./dir with spaces:
total 0
-rw-r--r--  1 vidar users 0 Nov 12 21:51 file?with newlines
$ ./nasty_find_all . echo 3
./nasty_find_all
./dir with spaces/file
with newlines
$

But consider this:

$ touch "\"); ls -l $'\x2F'; #"

You just created a file called  "); ls -l $'\x2F'; #

Now FILES will contain  ""); ls -l $'\x2F'; #. When we do eval FILES=($FILES), it becomes

FILES=(""); ls -l $'\x2F'; #"

Which becomes the two statements  FILES=("");  and  ls -l / . Congratulations, you just allowed execution of arbitrary commands.

$ touch "\"); ls -l $'\x2F'; #"
$ ./nasty_find_all . echo 3
total 1052
-rw-r--r--   1 root root 1018530 Apr  6  2005 System.map
drwxr-xr-x   2 root root    4096 Oct 26 22:05 bin
drwxr-xr-x   3 root root    4096 Oct 26 22:05 boot
drwxr-xr-x  17 root root   29500 Nov 12 20:52 dev
drwxr-xr-x  68 root root    4096 Nov 12 20:54 etc
drwxr-xr-x   9 root root    4096 Oct  5 11:37 home
drwxr-xr-x  10 root root    4096 Oct 26 22:05 lib
drwxr-xr-x   2 root root    4096 Nov  4 00:14 lost+found
drwxr-xr-x   6 root root    4096 Nov  4 18:22 mnt
drwxr-xr-x  11 root root    4096 Oct 26 22:05 opt
dr-xr-xr-x  82 root root       0 Nov  4 00:41 proc
drwx------  26 root root    4096 Oct 26 22:05 root
drwxr-xr-x   2 root root    4096 Nov  4 00:34 sbin
drwxr-xr-x   9 root root       0 Nov  4 00:41 sys
drwxrwxrwt   8 root root    4096 Nov 12 21:55 tmp
drwxr-xr-x  15 root root    4096 Oct 26 22:05 usr
drwxr-xr-x  13 root root    4096 Oct 26 22:05 var
./nasty_find_all
./dir with spaces/file
with newlines
./
$

It doesn't take much imagination to replace  ls -l  with  rm -rf  or worse.

One might think these circumstances are obscure, but don't be fooled. All it takes is one malicious user -- or, perhaps more likely, a benign user who left the terminal unlocked when going to the bathroom, who wrote a funny PHP upload script that doesn't sanity-check file names, who made the same mistake as oneself in allowing arbitrary code execution (now, instead of being limited to the www user, an attacker can use nasty_find_all to traverse chroot jails and/or gain additional privileges), or who uses an IRC or IM client that's too liberal in the file names it accepts for file transfers or conversation logs.

48.2. Examples of good use of eval

The eval command has other uses, especially when creating variables out of the blue. Here is an example of how to parse command line options that do not take parameters automatically:

#!/bin/bash
#
# Create option variables dynamically. Try call:
#
#    sh -x example.sh --verbose --test --debug

for i in "$@"
do
    case "$i" in
       --test|--verbose|--debug)
            shift                   # Remove option from command line
            name=${i#--}            # Delete option prefix
            eval "$name='$name'"    # make *new* variable
            ;;
    esac
done

echo "verbose: $verbose"
echo "test: $test"
echo "debug: $debug"

So, why is this version acceptable? It's acceptable because we have restricted the eval command so that it will only be executed when the input is one of a finite set of known values. Therefore, it can't ever be abused by the user to cause arbitrary command execution -- any input with funny stuff in it wouldn't match one of the three predetermined possible inputs. This variant would not be acceptable:

#!/bin/bash
# Dangerous code.  Do not use this.
for i in "$@"
do
    case "$i" in
       --test*|--verbose*|--debug*)
            shift                   # Remove option from command line
            name=${i#--}            # Delete option prefix
            eval "$name='$name'"    # make *new* variable
            ;;
    esac
done

All that's changed is that we attempted to make the previous "good" example (which doesn't do very much) useful in some way, by letting it take things like --test=foo. But look at what this enables:

$ ./foo --test='; ls -l /etc/passwd;x='
-rw-r--r-- 1 root root 943 2007-03-28 12:03 /etc/passwd

Once again: by permitting the eval command to be used on unfiltered user input, we've permitted arbitrary command execution.

49. How can I view periodic updates/appends to a file? (ex: growing log file)

tail -f will show you the growing log file. On some systems (e.g. OpenBSD), this will automatically track a rotated log file to the new file with the same name (which is usually what you want). To get the equivalent functionality on GNU systems, use tail -F instead.

This is helpful if you need to view only the updates to the file after your last view.

   n=1    # start by setting n=1
   tail -n "$n" testfile; n="+$(( $(wc -l < testfile) + 1 ))"

Every invocation of this gives the updates to the file since we last looked (tail -n +k prints the file starting from line k). If you know the line number from which you want to start, set n to that value.

50. I'm trying to construct a command dynamically, but I can't figure out how to deal with quoted multi-word arguments.

Some people attempt to do things like this:

    # Non-working example
    args="-s 'The subject' $address"
    mail $args < $body

This fails because of word-splitting. When $args is evaluated, it becomes four words: 'The is the second word, and subject' is the third word.

What's needed is a way to maintain each word as a separate item, even if that word contains multiple spaces. Quotes won't do it, but an array will.

    # Working example
    args=(-s "The subject" "$address")
    mail "${args[@]}" < "$body"

Often, this question arises when someone is trying to use dialog to construct a menu on the fly. For an example of how to do this properly, see FAQ #40.

Another reason people attempt to do this is because they want to echo "I am going to run this command: $command" before they actually run it. If that's all you want, then simply use the set -x command, or invoke your script with #!/bin/bash -x or bash -x ./myscript. Note that you can turn it off and back on inside the script with set +x and set -x.
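
For instance, with the array from the working example above (the exact trace format varies a bit between bash versions, and the address shown is of course whatever $address contained):

    set -x
    mail "${args[@]}" < "$body"
    # bash prints a trace line resembling:
    # + mail -s 'The subject' user@example.com
    set +x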

It's worth noting that you cannot put a pipeline command into an array variable and then execute it using the "${array[@]}" technique. The only way to store a pipeline in a variable would be to add (carefully!) a layer of quotes if necessary, store it in a string variable, and then use eval or sh to run the variable. This is not recommended, for security reasons.
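
If you really do need to reuse a whole pipeline, a shell function is usually the safer container for it; a hypothetical sketch:

    # Instead of storing a pipeline in a string and using eval:
    report() { sort "$1" | uniq -c | sort -rn; }
    report data.txt > report.out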

51. I want history-search just like in tcsh. How can I bind it to the up and down keys?

Just add the following to /etc/inputrc or your ~/.inputrc

"\e[A":history-search-backward
"\e[B":history-search-forward

52. How do I convert a file from DOS format to UNIX format (remove CRs from CR-LF line terminators)?

All these are from the sed one-liners page:

sed 's/.$//' dosfile              # assumes that all lines end with CR/LF
sed 's/^M$//' dosfile             # in bash/tcsh, press Ctrl-V then Ctrl-M
sed 's/\x0D$//' dosfile           # GNUism - does not work with Unix sed!

If you want to remove all CRs regardless of whether they are at the end of a line, you can use tr:

tr -d '\r' < dosfile

If you want to use the second sed example above, but without embedding a literal CR into your script:

sed $'s/\r$//' dosfile            # BASH only

Some distributions have a dos2unix command which can do this. In vim, you can use :set fileformat=unix to do it.

53. I have a fancy prompt with colors, and now bash doesn't seem to know how wide my terminal is. Lines wrap around incorrectly.

You must put \[ and \] around any non-printing escape sequences in your prompt. Thus:

BLUE=$(tput setaf 4)
PURPLE=$(tput setaf 5)
BLACK=$(tput setaf 0)
PS1='\[$BLUE\]\h:\[$PURPLE\]\w\[$BLACK\]\$ '

Without the \[ \], bash will think the bytes which constitute the escape sequences for the color codes will actually take up space on the screen, so bash won't be able to know where the cursor actually is.

Refer to the Wikipedia article for ANSI escape codes.

54. How can I tell whether a variable contains a valid number?

First, you have to define what you mean by "number". The most common case seems to be that, when people ask this, they actually mean "a non-negative integer, with no leading + sign". Or in other words, a string of all digits.

if [[ $foo = *[^0-9]* ]]; then
    echo "'$foo' has a non-digit somewhere in it"
else
    echo "'$foo' is strictly numeric"
fi

This can be done in Korn and legacy Bourne shells as well, using case:

case "$foo" in
    *[!0-9]*) echo "'$foo' has a non-digit somewhere in it" ;;
    *) echo "'$foo' is strictly numeric" ;;
esac

If what you actually mean is "a valid floating-point number" or something else more complex, then there are a few possible ways. One of them is to use Bash's extglob capability:

# Bash example; extended globs are disabled by default
shopt -s extglob
[[ $foo = *[0-9]* && $foo = ?([+-])*([0-9])?(.*([0-9])) ]] && echo "foo is a number"

The leading test of $foo is to ensure that it contains at least one digit. The extended glob, by itself, would match the empty string, or a lone + or -, which may not be desirable behavior.

The features enabled with extglob in Bash are also allowed in the Korn shell by default. The difference here is that Ksh lacks Bash's [[ and must use case instead:

# Ksh example using extended globs
case $foo in
  *[0-9]*)
    case $foo in
        ?([+-])*([0-9])?(.*([0-9]))) echo "foo is a number";;
    esac;;
esac

Note that this uses the same extended glob as the Bash example before it; the third closing parenthesis at the end of it is actually part of the case syntax.

If your definition of "a valid number" is even more complex, or if you need a solution that works in legacy Bourne shells, you might prefer to use a regular expression. Here is a portable version, using egrep:

if test "$foo" && echo "$foo" | egrep '^[-+]?[0-9]*(\.[0-9]*)?$' >/dev/null; then
    echo "'$foo' might be a number"
else
    echo "'$foo' might not be a number"
fi

(Like the extended globs, this extended regular expression matches a lone + or -, and the code may therefore require adjustment. The initial test command only requires a non-empty string. Closing the last "bug" is left as an exercise for the reader, mostly because GreyCat is too damned lazy to learn expr(1).)

Bash version 3 and above have regular expression support in the [[ command. However, due to serious bugs and syntax changes in Bash's [[ regex support, we do not recommend using it. Nevertheless, if I simply omit all Bash regex answers here, someone will come along and fill them in -- and they probably won't work, or won't contain all the caveats necessary. So, in the interest of preventing disasters, here are the Bash regex answers that you should not use.

if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*\(\.[0-9]*\)?$ ]]; then  # Bash 3.1 only!
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi

Unfortunately, Bash changed the syntax of its regular expression support after version 3.1, so the following may work in some patched versions of Bash 3.2:

if [[ $foo = *[0-9]* && $foo =~ ^[-+]?[0-9]*(\.[0-9]*)?$ ]]; then    # **PATCHED** Bash 3.2 only!
    echo "'$foo' looks rather like a number"
else
    echo "'$foo' doesn't look particularly numeric to me"
fi

It fails rather spectacularly in bash 3.1 and in bash 3.2 without patches.

Note that the parentheses in the egrep regular expression and the bash 3.2.patched regular expression don't require backslashes in front of them, whereas the ones in the bash 3.1 command do.

Stuffing the Bash regex into a variable, and then using [[ $foo =~ $bar ]], may also be an effective workaround in some cases. But this belongs in a separate FAQ....

55. Tell me all about 2>&1 -- what's the difference between 2>&1 >foo and >foo 2>&1, and when do I use which?

Bash processes all redirections from left to right, in order. And the order is significant. Moving them around within a command may change the results of that command.

For newbies who've somehow managed to miss the previous hundred or so examples, here's what you want:

foo >file 2>&1          # Sends both stdout and stderr to file.

Now for the rest of you, here's a simple demonstration of what's happening:

foo() {
  echo "This is stdout"
  echo "This is stderr" 1>&2
}
foo >/dev/null 2>&1             # produces no output
foo 2>&1 >/dev/null             # writes "This is stderr" on the screen

Why do the results differ? In the first case, >/dev/null is performed first, and therefore the standard output of the command is sent to /dev/null. Then, the 2>&1 is performed, which causes standard error to be sent to the same place that standard output is already going. So both of them are discarded.

In the second example, 2>&1 is performed first. This means standard error is sent to wherever standard output happens to be going -- in this case, the user's terminal. Then, standard output is sent to /dev/null and is therefore discarded. So when we run foo the second time, we see only its standard error, not its standard output.

There are times when we really do want 2>&1 to appear first -- for one example of this, see FAQ 40.

There are other times when we may use 2>&1 without any other redirections. Consider:

find ... 2>&1 | grep "some error"

In this example, we want to search find's standard error (as well as its standard output) for the string "some error". The 2>&1 in the piped command forces standard error to go into the pipe along with standard output. (When pipes and redirections are mixed in this way, remember: the pipe is done first, before any redirections. So find's standard output is already set to point to the pipe before we process the 2>&1 redirection.)

If we wanted to read only standard error in the pipe, and discard standard output, we could do it like this:

find ... 2>&1 >/dev/null | grep "some error"

The redirections in that example are processed thus:

  1. First, the pipe is created. find's output is sent to it.

  2. Next, 2>&1 causes find's standard error to go to the pipe as well.

  3. Finally, >/dev/null causes find's standard output to be discarded, leaving only stderr going into the pipe.

A related question is FAQ #47, which discusses how to send stderr to a pipeline.

56. How can I untar or unzip multiple tarballs at once?

As the tar command was originally designed to read from and write to tape devices (tar -- Tape ARchiver), filenames given on the command line name the members to put into, or extract from, the archive (e.g. tar x myfileonthe.tape extracts the member myfileonthe.tape). There is an option to tell tar that the archive is not on some tape, but in a file: -f. This option takes exactly one argument: the filename of the file containing the archive. All other (following) filenames are taken to be archive members:

    tar -x -f backup.tar myfile.txt
    # OR (more common syntax IMHO)
    tar xf backup.tar myfile.txt

Now here's a common mistake -- imagine a directory containing the following archive-files you want to extract all at once:

    $ ls
    backup1.tar backup2.tar backup3.tar

Maybe you think of tar xf *.tar. Let's see:

    $ tar xf *.tar
    tar: backup2.tar: Not found in archive
    tar: backup3.tar: Not found in archive
    tar: Error exit delayed from previous errors

What happened? The shell replaced your *.tar by the matching filenames. You really wrote:

    tar xf backup1.tar backup2.tar backup3.tar

And as we saw earlier, it means: "extract the files backup2.tar and backup3.tar from the archive backup1.tar", which will of course only succeed when there are such filenames stored in the archive.

The solution is relatively easy: extract the contents of all archives one at a time. As we use a UNIX shell and we are lazy, we do that with a loop:

    for tarname in *.tar; do
      tar xf "$tarname"
    done

What happens? The for-loop will iterate through all filenames matching *.tar and call tar xf for each of them. That way you extract all archives one-by-one and you even do it automagically.
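
If the archives are compressed, the same loop works with a decompression step; a sketch (the z option assumes GNU tar -- with other tar implementations, pipe through gzip as in the comment):

    for tarname in *.tar.gz; do
      tar xzf "$tarname"
      # or, without GNU tar:  gzip -dc "$tarname" | tar xf -
    done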

The second most common archive type these days is ZIP. The command to extract contents from a ZIP file is unzip (who would have guessed that!). The problem here is the very same: unzip accepts only one argument naming the ZIP file. So, you solve it the very same way:

    for zipfile in *.zip; do
      unzip "$zipfile"
    done

Not enough? OK. There's another option with unzip: it can take shell-like patterns to specify the ZIP file names. To avoid interpretation of those patterns by the shell, you need to quote them. In this case, unzip itself, and not the shell, will interpret *.zip:

    unzip "*.zip"
    # OR, to make more clear what we do:
    unzip \*.zip

(This feature of unzip derives mainly from its origins as an MS-DOS program. MS-DOS's command interpreter does not perform glob expansions, so every MS-DOS program must be able to expand wildcards into a list of filenames. This feature was left in the Unix version, and as we just demonstrated, it can occasionally be useful.)

57. How can I group entries in a file by common prefixes?

As in, one wants to convert:

    foo: entry1
    bar: entry2
    foo: entry3
    baz: entry4

to

    foo: entry1 entry3
    bar: entry2
    baz: entry4

There are two simple general methods for this:

  1. sort the file, and then iterate over it, collecting entries until the prefix changes, and then print the collected entries with the previous prefix
  2. iterate over the file, collect entries for each prefix in an array indexed by the prefix

A basic implementation of method 1 in bash:

old= ; stuff=
# The sentinel line flushes the final group; it must not match any real prefix.
(sort file ; echo 'EOF.') | while read -r prefix line ; do
        if [[ $prefix = "$old" ]] ; then
                stuff="$stuff $line"
        else
                [[ $old ]] && echo "$old $stuff"    # skip the initial empty group
                old=$prefix
                stuff=$line
        fi
done

And a basic implementation of method 2 in awk:

    {
        a[$1] = a[$1] " " $2
    }
    END{
        for (x in a) print x a[x]
    }

Written out as a shell command:

    awk '{a[$1] = a[$1] " " $2}END{for (x in a) print x a[x]}' file

58. Can bash handle binary data?

The answer is, basically, no.

While bash won't have as many problems with it as older shells, it still can't process arbitrary binary data, and more specifically, shell variables are not 100% binary clean, so you can't store binary files in them.

One instance where this would be handy is storing small temporary bitmaps while working with netpbm... there, I resorted to adding an extra pnmnoraw to the pipe, creating (larger) ASCII files that bash has no problem storing.

If you are feeling adventurous, consider this experiment:

    # bindec.bash, attempt to decode binary data to ascii decimals
    IFS=
    while read -n1 x ;do
        case "$x" in
            '') echo empty ;;
            # insert the 256 lines generated by the following oneliner here:
            # for x in $(seq 0 255) ;do echo "        $'\\$(printf %o $x)') echo $x;;" ;done
        esac
    done

and then pipe binary data into it, maybe like so:

    for x in $(seq 0 255) ;do echo -ne "\\$(printf %o $x)" ;done | bash bindec.bash | nl | less

This experiment shows that the NUL (0) character is skipped entirely -- it cannot be read into, or stored in, a bash variable. Losing even one byte value is enough to corrupt most binary files you might try to process.

  • Yes, Bash is written in C, and uses C semantics for handling strings -- including the NUL byte as string terminator -- in its variables. You cannot store NUL in a Bash variable sanely. It simply was never intended to be used for this. - GreyCat

Note that this refers to storing them in variables... moving data between programs using pipes is always binary clean. Temporary files are also safe, as long as appropriate precautions are taken when creating them.
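
A quick way to observe the NUL problem for yourself (a minimal sketch; bash drops NUL bytes during command substitution):

    x=$(printf 'a\0b')    # the NUL byte is silently discarded
    echo "${#x}"          # prints 2, not 3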

59. I saw this command somewhere: :(){ :|:& } (fork bomb). How does it work?

First of all -- and this is important -- please do not run this command. I've actually omitted the trigger from the question above, and left only the part that sets up the function.

Here is that part, but written out in normal shell coding style, rather than rammed all together:

:() {  
 : | : &
}

What this does is create a function named : which calls itself recursively. Twice. In the background. Since the function keeps calling itself over and over (forking new processes), forever, this quickly consumes a lot of system resources. That's why it's called a "fork bomb".

If you still don't see how it works, here is an equivalent, which creates a function named bomb instead of :

bomb() { 
 bomb | bomb &
}

A more verbose explanation:

Inside the function, a pipeline is created which forks two more instances of the function (each of which will be a whole process) in the background. Then the function exits. However, for every instance of the function which exits in this manner, two more have already been created. The result is a vast number of processes, extremely quickly.

Theoretically, anybody who has shell access to your computer can use such a technique to consume all the resources to which he/she has access. A chroot(2) won't help here. If the user's resources are unlimited, then in a matter of seconds all the resources of your system (processes, virtual memory, open files, etc.) will be used up, and it will probably deadlock itself. Any attempt made by the kernel to free more resources will just allow more instances of the function to be created.

As a result, the only way to protect yourself from such abuse is by limiting the maximum allowed use of resources for your users. Such resources are governed by the setrlimit(2) system call. The interface to this functionality in Bash and KornShell is the ulimit command. Your operating system may also have special configuration files to help manage these resources (for example, /etc/security/limits.conf in Debian, or /etc/login.conf in OpenBSD). Consult your documentation for details.
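
For example, a per-user process limit can be set from the shell with ulimit; a sketch (the right number, and where to set it persistently, depend on your system):

ulimit -u 100    # at most 100 processes for this shell and its children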

60. I'm trying to write a script that will change directory (or set a variable), but after the script finishes, I'm back where I started (or my variable isn't set)!

Consider this:

   #!/bin/sh
   cd /tmp

If one executes this simple script, what happens? Bash forks, and the parent waits. The child executes the script, including the chdir(2) system call, and then exits. The parent, which was waiting for the child, harvests the child's exit status (presumably 0 for success), and then bash carries on with the next command.

Since the chdir was done by a child process, it has no effect on the parent.

Moreover, there is no conceivable way you can ever have a child process affect any part of the parent's environment, which includes its variables as well as its current working directory.

So, how does one go about it? You can still have the cd command in an external file, but you can't run it as a script. Instead, you must source it (or "dot it in", using the . command, which is a synonym for source).

   echo 'cd /tmp' > "$HOME/mycd"
   source "$HOME/mycd"
   pwd                          # Now, we're in /tmp

The same thing applies to setting variables. source the file that contains the commands; don't try to run it.

Functions are run in the same shell, so it is possible to put

   mycd() { cd /tmp; }

in .bashrc or similar, and then use mycd to change the directory.

61. Is there a list of which features were added to specific releases (versions) of Bash?

  • NEWS: a file tersely listing the notable changes between the current and previous versions

  • CHANGES: a complete bash change history

  • COMPAT: compatibility issues between bash3 and previous versions

Here's a partial list of the changes, in a more compact format:

Feature                                Added in version
-------                                ----------------
x+=string                              3.1-alpha1
{x..y}                                 3.0-alpha
${!array[@]}                           3.0-alpha
[[ =~                                  3.0-alpha
<<<                                    2.05b-alpha1
i++                                    2.04-devel
for ((;;))                             2.04-devel
/dev/fd/N, /dev/tcp/host/port, etc.    2.04-devel
a=(*.txt) file expansion               2.03-alpha
extglob                                2.02-alpha1
[[                                     2.02-alpha1
builtin printf                         2.02-alpha1
$(< filename)                          2.02-alpha1
** (exponentiation)                    2.02-alpha1
\xNNN                                  2.02-alpha1
(( ))                                  2.0-beta2

62. How do I create a temporary file in a secure manner?

Good question. To be filled in later. (Interim hints: tempfile is not portable. mktemp exists more widely, but it may require a -c switch to create the file in advance; or it may create the file by default and barf if -c is supplied. There does not appear to be any single command that simply works everywhere, without testing various arguments.)

Suggestion (remove if not universal): A temporary file or directory can be created that is unlikely to collide with an existing file or directory by using the RANDOM shell variable, as follows:

   TEMP_DIR=/tmp/$RANDOM
   mkdir "$TEMP_DIR"

This will make a directory of the form: /tmp/20445/. To decrease the chance of collision with an existing file, the RANDOM variable can be used a number of times:

   TEMP_DIR=/tmp/$RANDOM-$RANDOM-$RANDOM
   mkdir "$TEMP_DIR"

This will make a directory of the form: /tmp/24953-2875-2182/

  • Hmmm... this has potential, if you check the exit status of mkdir to be sure it actually created the directory. And set umask to something fairly restrictive as well. It could use some more peer review, though. -- GreyCat
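
If your system's mktemp can create directories (GNU and most BSD versions accept -d), a sketch that also checks for failure:

   TEMP_DIR=$(mktemp -d /tmp/mydir.XXXXXX) || exit 1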

63. My ssh client hangs when I try to run a remote background job!

The following will not do what you expect:

   ssh me@remotehost 'sleep 120 &'
   # Client hangs for 120 seconds

This is a "feature" of OpenSSH. The client will not close the connection as long as the remote end's terminal is still in use -- and in the case of sleep 120 &, stdout and stderr are still connected to the terminal.

The immediate answer to your question -- "How do I get the client to disconnect so I can get my shell back?" -- is to kill the ssh client. You can do this with the kill or pkill commands, of course; or by sending the INT signal (usually Ctrl-C) for a non-interactive ssh session (as above); or by pressing <Enter><~><.> (Enter, Tilde, Period) in the client's terminal window for an interactive remote shell.

The long-term workaround for this is to ensure that all the file descriptors are redirected to a log file (or /dev/null) on the remote side:

   ssh me@remotehost 'sleep 120 >/dev/null 2>&1 &'
   # Client should return immediately

This also applies to restarting daemons on some legacy Unix systems.

   ssh root@hp-ux-box   # Interactive shell
   ...                  # Discover that the problem is stale NFS handles
   /sbin/init.d/nfs.client stop   # autofs is managed by this script and
   /sbin/init.d/nfs.client start  # killing it on HP-UX is OK (unlike Linux)
   exit
   # Client hangs -- use Enter ~ . to kill it.

The legacy Unix /sbin/init.d/nfs.client script runs daemons in the background but leaves their stdout and stderr attached to the terminal (and they don't fully self-daemonize). The solution is either to fix the Unix vendor's broken init script, or to kill the ssh client process after this happens. The author of this article uses the latter approach.
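
If fixing the vendor script isn't practical, the restart can at least be issued non-interactively using the same redirection workaround shown above; a sketch:

   ssh root@hp-ux-box '/sbin/init.d/nfs.client stop; /sbin/init.d/nfs.client start >/dev/null 2>&1'
   # Client should return once the script finishes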

64. Why is it so hard to get an answer to the question that I asked in #bash ?

  • #bash aphorism #1 "The questioner's first description of the problem/question will be misleading."
  • corollary 1.1 "The questioner's second description of the problem/question will also be misleading"
  • corollary 1.2 "The questioner is never precise." For example, they will say "print the file" when they mean printing the file's name rather than the file's contents.
  • #bash aphorism #2, "The questioner will keep changing their original question until it drives the helpers in the channel insane."
  • #bash aphorism #3, "The data is never formatted in the way that makes it easiest to manipulate :-)"
  • #bash aphorism #4, "30 to 40 percent of the conversations in #bash will be about aphorisms #1 and #2"

65. Is there a "PAUSE" command in bash like there is in MSDOS batch scripts? To prompt the user to press any key to continue?

No, but you can use these:

echo press enter to continue; read

echo press any key to continue; read -n 1

read -p 'press enter to continue'
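
These can be wrapped up in a small function; a bash sketch (read's -s option keeps the pressed key from being echoed):

pause() {
  read -r -s -n 1 -p 'Press any key to continue...'
  echo
}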

66. I want to check if [[ $var == foo || $var == bar || $var == more ]] without repeating $var n times.

Here's a portable solution:

   case $var in
      foo|bar|more) ... ;;
   esac

And here's one that uses =~ (which requires bash 3.0 or higher):

   regex='^(foo|bar|more)$'
   if [[ $var =~ $regex ]]; then
      ...
   fi

This one only works in bash 3.1 and some 3.2 revisions (it is untested in 3.0):

   if [[ $var =~ '^(foo|bar|more)$' ]]; then
      ...
   fi

The =~ operator's behavior changes drastically between 3.1 and 3.2, so be careful with it. The above expression has been tested to work in bash 3.1 and 3.2.{13,15,17}; it does not work in 3.2.0. Please also note that the regex does not need to be quoted in the 3.2 revisions where it works. --redondos

Normally I would never advocate sticking code into a variable and attempting to use it -- lots of people have enormous trouble because they try to do that. In the case of =~, though, it seems to be required. Personally, I'd just stick with the case. --GreyCat
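
Another bash-only alternative, shown here as a sketch, is an extended glob on the right-hand side of = (it requires extglob to be enabled, and sidesteps the =~ version headaches):

   shopt -s extglob
   if [[ $var = @(foo|bar|more) ]]; then
      ...
   fi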

67. How can I trim leading/trailing white space from one of my variables?

There are a few ways to do this -- none of them elegant.

First, the most portable way would be to use sed:

   x=$(echo "$x" | sed -e 's/^ *//' -e 's/ *$//')
   # Note: this only removes spaces.  For tabs too:
   x=$(echo "$x" | sed -e $'s/^[ \t]*//' -e $'s/[ \t]*$//')
   # Or possibly, with some systems:
   x=$(echo "$x" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')

One can achieve the goal using builtins, although at the moment I'm not sure which shells support the following syntax:

   # Remove leading whitespace:
   while [[ $x = [$' \t\n']* ]]; do x=${x#[$' \t\n']}; done
   # And now trailing:
   while [[ $x = *[$' \t\n'] ]]; do x=${x%[$' \t\n']}; done

Of course, the preceding example is pretty slow, because it removes one character at a time, in a loop (although it's good enough in practice for most purposes). If you want something a bit fancier, there's a bash-only solution using extglob:

   shopt -s extglob
   x=${x##+([$' \t\n'])}; x=${x%%+([$' \t\n'])}
   shopt -u extglob

Rather than specify each type of space character yourself, you can use character classes. Two character classes that are useful for matching whitespace are space and blank.

More info: ctype/wctype(3), re_format/regex(7), isspace(3).

   shopt -s extglob
   x=${x##+([[:space:]])}; x=${x%%+([[:space:]])}
   shopt -u extglob

There are many, many other ways to do this. These are not necessarily the most efficient, but they're known to work.
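
One more approach worth knowing: with the default IFS, read itself strips leading and trailing whitespace. A sketch (it assumes $x contains no newlines, since read stops at the first one):

   read -r x <<< "$x"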

68. How do I run a command, and have it abort (timeout) after N seconds?

There are two C programs that can do this: doalarm, and timeout. (Compiling them is beyond the scope of this document; suffice to say, it'll be trivial on GNU/Linux systems, easy on most BSDs, and painful on anything else....)

If you don't have or don't want one of the above two programs, you can use a perl one-liner to set an ALRM and then exec the program you want to run under a time limit. In any case, you must understand what your program does with SIGALRM.

doalarm() { perl -e 'alarm shift; exec @ARGV' "$@"; }

doalarm ${NUMBER_OF_SECONDS_BEFORE_ALRMING} program arg arg ...

If you can't or won't install one of these programs (which really should have been included with the basic core Unix utilities 30 years ago!), then the best you can do is an ugly hack like:

   command & pid=$!
   { sleep 10; kill $pid; } &

This will, as you will soon discover, produce quite a mess regardless of whether the timeout condition kicked in or not. Cleaning it up is not something worth my time -- just use doalarm or timeout instead. Really.

69. I want to automate an ssh (or scp, or sftp) connection, but I don't know how to send the password....

STOP!

First of all, if you actually were to embed your password in a script somewhere, it would be visible to the entire world (or at least, anyone who can read files on your system). This would defeat the entire purpose of having a password on your remote account.

If you understand this and still want to continue, then the next thing you need to do is read and understand the man page for ssh-keygen(1). This will tell you how to generate a public/private key pair (in either RSA or DSA format), and how to use these keys to authenticate to the remote system without sending a password at all.

Here is a brief summary of the procedure:

ssh-keygen -t rsa
ssh me@remote "cat >> ~/.ssh/authorized_keys" < ~/.ssh/id_rsa.pub
ssh me@remote date     # should not prompt for passWORD,
                       # but your key may have a passPHRASE
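
Many OpenSSH installations also ship a helper script, ssh-copy-id, which automates the second step (if it's available on your system):

ssh-copy-id me@remote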

If your key has a passphrase on it, and you want to avoid typing it every time, look into ssh-agent(1). It's beyond the scope of this document, though.

If you're being prompted for a password even with the public key inserted into the remote authorized_keys file, chances are you have a permissions problem on the remote system. See SshKeys for a discussion of such problems.

If that's not it, then make sure you didn't spell it authorised_keys. SSH uses the US spelling, authorized_keys.

If you really want to use a password instead of public keys, first have your head examined. Then, if you still want to use a password, use expect(1). But don't ask us for help with it.

70. How do I convert Unix (epoch) timestamps to human-readable values?

The only sane way to handle time values within a program is to convert them into a linear scale. You can't store "January 17, 2005 at 5:37 PM" in a variable and expect to do anything with it. Therefore, any competent program is going to use time stamps with semantics such as "the number of seconds since point X". These are called epoch timestamps. If the epoch is January 1, 1970 at midnight UTC, then it's also called a "Unix timestamp", because this is how Unix stores all times (such as file modification times).

Standard Unix, unfortunately, has no tools to work with Unix timestamps. (Ironic, eh?) GNU date, and later BSD date, has a %s extension to generate output in Unix timestamp format:

    date +%s    # Prints the current time in Unix format, e.g. 1164128484

This is commonly used in scripts when one requires the interval between two events:

   start=$(date +%s)
   ...
   end=$(date +%s)
   echo "Operation took $((end - start)) seconds."

Now, to convert those Unix timestamps back into human-readable values, one needs to use an external tool. One method is to trick GNU date using:

   date -d "1970-01-01 UTC + 1164128484 seconds"
   # Prints "Tue Nov 21 12:01:24 EST 2006" in the US/Eastern time zone.

Reading the source code(!!) of GNU date's date parser reveals that it accepts Unix timestamps prefixed with '@', so:

   $ date -d "@1164128484"
   # Prints "Tue Nov 21 18:01:24 CET 2006" in the central European time zone

However, this undocumented feature only appears to work in extremely new versions of GNU date.
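
On BSD systems, date's -r option takes an epoch timestamp directly:

   date -r 1164128484
   # Prints the same moment in your local time zone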

If you don't have GNU date available, an external language such as Perl can be used:

   perl -le "print scalar localtime 1164128484"
   # Prints "Tue Nov 21 12:01:24 2006"

I used double quotes in these examples so that the time constant could be replaced with a variable reference. See the documentation for date(1) and Perl for details on changing the output format.

Newer versions of Tcl (such as 8.5) have very good support for date and clock functions. See the tclsh man page for usage details. For example:

   echo 'puts [clock format [clock scan "today"]]' | tclsh
   # Prints today's date (the format can be adjusted with parameters to "clock format").
   
   echo 'puts [clock format [clock scan "fortnight"]]' | tclsh
   # Prints the date two weeks from now.
   
   echo 'puts [clock format [clock scan "5 years + 6 months ago"]]' | tclsh
   # Five and a half years ago, compensating for leap days and daylight savings time.

71. How do I convert an ASCII character to its decimal (or hexadecimal) value and back?

This task is quite easy using the printf builtin. You can either write two simple functions as shown below, or use the plain printf constructs alone.

   # chr() - converts decimal value to its ASCII character representation
   # ord() - converts ASCII character to its decimal value
 
   chr() {
     printf \\$(printf '%03o' $1)
   }
 
   ord() {
     printf '%d' "'$1"
   }

   # hex() - converts ASCII character to a hexadecimal value
   # unhex() - converts a hexadecimal value to an ASCII character

   hex() { 
      printf '%x' "'$1"
   }

   unhex() {
      printf \\x"$1"
   }

   # examples:
 
   chr $(ord A)    # -> A
   ord $(chr 65)   # -> 65
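
The hexadecimal pair works the same way:

   hex A           # -> 41
   unhex 41        # -> A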

The ord function above is quite tricky. It can be re-written in several other ways (use the one that best suits your coding style or your actual needs).

  • Q: Tricky? Rather, it's using a feature that I can't find documented anywhere -- putting a single quote in front of an integer. Neat effect, but how on earth did you find out about it? Source diving? -- GreyCat

    A: It validates The Single Unix Specification: "If the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote." (see printf() to know more) -- mjf

   ord() {
     printf '%d' \"$1\"
   }

Or:

   ord() {
     printf '%d' \'$1\'
   }

Or, rather:

   ord() {
     printf '%d' "'$1'"
   }

Etc. All of the above ord functions should work properly. Which one you choose depends on your particular situation.

An alternative way to print a character by its ASCII value is to use an escape sequence:

   echo $'\x27'

which prints a literal ' (here, 27 is the hexadecimal ASCII value of the character).

72. How can I ensure my environment is configured for cron, batch, and at jobs?

If a shell or other script calling shell commands runs fine interactively but fails due to environment configurations (say: a complex $PATH) when run noninteractively, you'll need to force your environment to be properly configured.

You can write a shell wrapper around your script which configures your environment. You may also want to have a "testenv" script (bash or other scripting language) which tests what shell and environment are present when running under different configurations.
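
A minimal sketch of such a "testenv" script (the name and output location are only illustrative):

  #!/bin/sh
  # testenv - dump the environment so interactive and cron runs can be compared
  { echo "shell: $0"; echo "PATH: $PATH"; env | sort; } > /tmp/testenv.$$.out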

In cron, you can invoke Bash (or the Bourne shell) with the '-c' option, source your init script, then invoke your command, eg:

  * * * * *  /bin/bash -c ". myconfig.bashrc; myscript"

Another approach would be to have myscript dot in the configuration file itself, if it's a rather static configuration. (Or, conditionally dot it in, if you find a certain variable to be missing from your environment... the possibilities are numerous.)

The at and batch utilities copy the current environment (except for the variables TERM, DISPLAY and _) as part of the job metadata, and should recreate it when the job is executed. If this isn't the case you'll want to test the environment and/or explicitly initialize it similarly to cron above.

73. How can I use parameter expansion? How can I get substrings? How can I get a file without its extension, or get just a file's extension?

Parameter Expansion is a separate section of the bash manpage (also man bash -P 'less -p "^   Parameter Expansion"' or see the reference or the bash hackers article about it). It can be hard to understand parameter expansion without actually using it. (DO NOT think about parameter expansion like a regex. It is different and distinct.)

Here's an example of how to use parameter expansion with something akin to a hostname (dot-separated components):

parameter     result
-----------   ------------------------------
${NAME}       polish.ostrich.racing.champion
${NAME#*.}           ostrich.racing.champion
${NAME##*.}                         champion
${NAME%%.*}   polish                        
${NAME%.*}    polish.ostrich.racing         

And, here's an example of the parameter expansions for a typical filename.

parameter     result
-----------   --------------------------------------------------------
${FILE}       /usr/share/java-1.4.2-sun/demo/applets/Clock/Clock.class
${FILE#*/}     usr/share/java-1.4.2-sun/demo/applets/Clock/Clock.class
${FILE##*/}                                                Clock.class
${FILE%%/*}                                                           
${FILE%/*}    /usr/share/java-1.4.2-sun/demo/applets/Clock            

You cannot nest parameter expansions. If you need to perform two separate expansions, use a temporary variable to hold the result of the first expansion.
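
For example, to get a filename without its directory and without its extension, do it in two steps:

tmp=${FILE##*/}      # Clock.class
base=${tmp%.*}       # Clock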

A helpful mnemonic: on a US keyboard, the "#" symbol is to the left of the "$" symbol and the "%" symbol is to its right; this corresponds to their acting upon the left (beginning) and right (end) parts of the parameter, respectively.

Here are a few more examples (but please see the real documentation for a list of all the features!). I include these mostly so people won't break the wiki again, trying to add new questions that answer this stuff.

${string:2:1}   # The third character of string (0, 1, 2 = third)
${string:1}     # The string starting from the second character
                # Note: this is equivalent to ${string#?}
${string%?}     # The string with its last character removed.
${string: -1}   # The last character of string
${string:(-1)}  # The last character of string, alternate syntax
                # Note: string:-1 means something entirely different.

${file%.mp3}    # The filename without the .mp3 extension
                # Very useful in loops of the form: for file in *.mp3; do ...
${file%.*}      # The filename without its extension (assuming there was
                # only one extension in the first place...).
${file%%.*}     # The filename without all of its extensions
${file##*.}     # The extension only.

74. How do I get the effects of those nifty Bash Parameter Expansions in older shells?

The extended forms of ParameterSubstitution work with BASH, KornShell, KornShell93, but not with the older BourneShell. If the code needs to be portable to that shell as well, sed and expr can often be used.

For example, to remove the filename extension part:

    for file in *.doc
    do
        base=`echo "$file" | sed 's/\.[^.]*$//'`    # remove everything starting with last '.'
        mv "$file" "$base".txt
    done

Another example, this time to remove the last character of a variable:

    var=`expr "$var" : '\(.*\).'`

or (using sed):

    var=`echo "$var" | sed 's/.$//'`

75. How do I use 'find'? I can't understand the man page at all!

See UsingFind.

76. How do I get the sum of all the numbers in a column?

This and all similar questions are best answered with an AWK one-liner.

awk '{sum += $1} END {print sum}' myfile 

A small bit of effort can adapt this to most similar tasks (finding the average, skipping lines with the wrong number of fields, etc.).
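
For instance, an average; a sketch that also guards against an empty file (to avoid dividing by zero):

awk '{sum += $1; n++} END {if (n) print sum / n}' myfile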

For more examples of using awk, see handy one-liners for awk.

77. How do I log history or "secure" bash against history removal?

This is a question which has no answer applicable to bash. You are here because you want to find out what a user executed after unsetting or /dev/null'ing their shell history. There are several problems with this.

The first issue is:

  • kill -9 $$

This innocuous-looking command does exactly what you would presume it to: it kills off the current shell. However, .bash_history is ONLY written when bash is allowed to exit cleanly. As such, sending SIGKILL to bash will prevent logging to .bash_history.

Users may also set variables that disable shell history, or simply make their .bash_history a symlink to /dev/null. All of these will defeat any attempt to spy on them through their .bash_history file.

The second issue is permissions. The bash shell is executed as a user. This means that the user can read or write all content produced by or handled by the shell. Any location you would try to log to MUST be writable by the user, and not only by a privileged user, because the shell specifically tries to ensure that the user does not exceed his privileges. Imagine a regular user being able to write to a history file that root reads and trusts: that is an open invitation to exploits and privilege escalation on the server, and thus an extremely bad idea.

The third issue is location. Assume that you pursue a chroot jail for your bash users. This is a fantastic idea, and a good step towards securing your server. However, placing your users in a chroot jail also limits your ability to log their actions. Once jailed, your user can only write to content within that specific jail. This makes finding user-writable extraneous logs a simple matter, and enables the attacker to find your hidden logs much more easily than would otherwise be the case.

Where does this leave you? Unfortunately, nowhere good, and definitely not what you wanted to know. If you want to record all of the commands issued to BASH by a user, your best bet is to modify BASH so that it actually records them, in real time, as they are executed -- not when the user logs off. This is still not reliable, though, because end users may simply upload their own shell and run that instead of your hacked BASH. Or they may use one of the other shells already on your system, instead of your hacked BASH. But, for those who absolutely must have some form of patch available, you can use the patch located at http://wooledge.org/~greg/bash_logging.txt (patch submitted by _sho_ -- use at your own risk. The results of a code-review with improvements are here: http://phpfi.com/220302 -- Heiner).

For a more serious approach to this problem, consider BSD process accounting (kernel-based) instead of focusing on shells.

78. I want to set a user's password using the Unix passwd command, but how do I script that? It doesn't read standard input!

OK, first of all, I know there are going to be some people reading this, right now, who don't even understand the question. Here, this does not work:

{ echo oldpass; echo newpass; echo newpass; } | passwd
# This DOES NOT WORK!

Nothing you can do in bash can possibly work. passwd(1) does not read from standard input. This is intentional. It is for your protection. Passwords were never intended to be put into programs, or generated by programs. They were intended to be entered only by the fingers of an actual human being, with a functional brain, and never, ever written down anywhere.

Nonetheless, we get hordes of users asking how they can circumvent 35 years of Unix security.

You have three choices. The first is to manually generate your own hashed password strings (for example, using http://wooledge.org/~greg/crypt/ or a similar tool) and then write them to your system's local password-hash file (which may be /etc/passwd, or /etc/shadow, or /etc/master.passwd, or /etc/security/passwd, or ...). This requires that you read the relevant man pages on your system, find out where the password hash goes, what formatting the file requires, and then construct code that writes it out in that format.

The second is to use expect. I think it even has this exact problem as one of its canonical examples.

The third is to use some system-specific tools which may or may not exist on your platform. For example, some GNU/Linux systems have a chpasswd(8) tool which can be coerced into doing these sorts of things.
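
With chpasswd, the usage is along these lines (a sketch; it must be run as root, and the username and password are placeholders):

echo 'username:newpassword' | chpasswd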

A fourth option, which works at least on some Linux systems, is echo "password" | passwd --stdin username. The --stdin option is not universal; check your passwd man page before using it.

See also FAQ #69.

79. How can I grep for lines containing foo AND bar, foo OR bar? Or for files containing foo AND bar, possibly on separate lines?

The easiest way to match lines that contain both foo AND bar is to use two grep commands:

grep foo | grep bar
grep foo "$myfile" | grep bar   # for those who need the hand-holding

It can also be done with one egrep, although (as you can probably guess) this doesn't really scale well to more than two patterns:

egrep 'foo.*bar|bar.*foo'

If you prefer, you can achieve this in one sed or awk statement. (The awk example is probably the most scalable.)

sed -n '/foo/{/bar/p;}'
awk '/foo/ && /bar/'

To match lines containing foo OR bar, egrep is the natural choice, but it can also be done with sed, awk, etc.

egrep 'foo|bar'
# some people prefer grep -E 'foo|bar'

# This is another option, some people prefer:
grep -e 'foo' -e 'bar'

egrep is the oldest and most portable form of the grep command using Extended Regular Expressions (EREs). -E is a POSIX-required switch.

If you want to match files (rather than lines) that contain both "foo" and "bar", there are several possible approaches. The simplest (although not necessarily the most efficient) is to read the file twice:

grep -q foo "$myfile" && grep -q bar "$myfile" && echo "Found both"

Another approach is to read the file once, keeping track of what you've seen as you go along. There are several ways to do this in awk - the first example reads the whole file, and, after it reads the whole file, it checks if both were found:

awk '/foo/ { foo=1 } /bar/ { bar=1 } END { if (foo && bar) print "found both" }'

The second, more efficient one avoids reading the whole file by checking if the other string was already matched, and, if so, exiting:

awk 'function found() { print "Found both!"; exit } /foo/ { a=1; if (b) found() } /bar/ { b=1; if (a) found() }'

The double grep -q solution has the advantage of stopping each read whenever it finds a match; so if you have a huge file, but the matched words are both near the top, it will only read the first part of the file. The first awk solution reads the whole file one time, while the second one stops reading the file at the second match; if you want to do additional checking of the file contents, the awk solution can be adapted far more readily.
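
To scan many files at once for the pair, the double-grep test fits naturally into a loop; a sketch (adjust the glob to your files):

for f in *.txt; do
  grep -q foo "$f" && grep -q bar "$f" && echo "$f"
done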

80. How can I make an alias that takes an argument?

You can't. Aliases in bash are extremely rudimentary, and not really suitable to any serious purpose. The bash man page even says so explicitly:

  • There is no mechanism for using arguments in the replacement text. If arguments are needed, a shell function should be used (see FUNCTIONS below).

Use a function instead. For example,

settitle() { case $TERM in *xterm*|*rxvt*) echo -en "\e]2;$1\a";; esac; }

Oh, by the way: aliases are not allowed in scripts. They're only allowed in interactive shells, and that's simply because users would cry too loudly if they were removed altogether. If you're writing a script, always use a function instead.

81. How can I determine whether a command exists anywhere in my PATH?

In BASH, there are a couple of builtins that are suitable for this purpose: hash and type. Here's an example using hash:

# This one works in bash
if hash qwerty 2>/dev/null; then
  echo qwerty exists
else
  echo qwerty does not exist
fi

Or, if you prefer type:

# Also a bash solution
if type -P qwerty >/dev/null; then
  echo qwerty exists
else
  echo qwerty does not exist
fi

Korn shell has whence instead:

# Here's a ksh solution
if whence -p qwerty >/dev/null; then
  echo qwerty exists
else
  echo qwerty does not exist
fi
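
POSIX shells specify a command -v builtin that can serve the same purpose, although some older shells may lack it:

# POSIX solution (where command -v is supported)
if command -v qwerty >/dev/null 2>&1; then
  echo qwerty exists
else
  echo qwerty does not exist
fi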

If these builtins are not available (because you're in a Bourne shell, or whatever), then you may have to rely on the external command which (which is often a csh script, although sometimes a compiled binary). Unfortunately, which may not set a useful exit code, and it may not even write errors to stderr. In those cases, one must parse its output.

# Last resort -- using which(1)
tmpval=`LC_ALL=C which qwerty 2>&1`
rc=$?
if test $rc -ne 0; then
  # FOR NOW, we'll assume that if this machine's which(1) sets a nonzero
  # exit status, that it actually failed.  I've yet to see any case where
  # which(1) sets an erroneous failure -- just erroneous "successes".
  echo "qwerty is not installed.  Please install it."

else
    case "$tmpval" in
      *no\ *\ in\ *|*not\ found*|'')
        echo "qwerty is not installed.  Please install it."
        ;;
      *)
        echo "Congratulations -- it seems you have qwerty (in $tmpval)."
        ;;
    esac
fi

Note that which(1)'s output when a command is not found is not consistent across platforms. On HP-UX 10.20, for example, it prints no qwerty in /path /path /path ...; on OpenBSD 4.1, it prints qwerty: Command not found.; on Debian (3.1 and 4.0 at least) and SuSE, it prints nothing at all; on Red Hat 5.2, it prints which: no qwerty in (/path:/path:...); on Red Hat 6.2, it writes the same message, but on standard error instead of standard output; and on Gentoo, it writes something on stderr.

So our best portable solution is to match the words no and in, or the phrase not found, in the combined stdout + stderr and pray.

Note to the person who tried to put a function in this FAQ as a legacy Bourne shell example: legacy Bourne shells don't have functions, at all. POSIX shells have them, but the syntax is:

foo() {
  commands
}

You may not use the word function, and you may especially not use the combination of the word function and the () symbols. No shell but bash accepts that.

The approach used in configure scripts is usually to iterate over the components of $PATH and explicitly test for the presence of the command in each directory. That should tell you how unreliable which(1) is. Here's a simplified version of such a test:

save_IFS=$IFS
IFS=:
found=no
for dir in $PATH; do
  if test -x "$dir/qwerty"; then
    echo "qwerty is installed (in $dir)"
    found=yes
    break
  fi
done
IFS=$save_IFS
if test $found = no; then
  echo "qwerty is not installed"
fi

Real configure scripts are generally much more complicated than this, since they may deal with systems where $PATH is not delimited by colons; or systems where executable programs may have optional extensions like .EXE; or $PATH variables that have the current working directory included in them as an empty string; etc. If you're interested in such things, I suggest reading an actual GNU autoconf-generated configure script. They're far too large and complicated to include in this FAQ.

82. Why is $(...) preferred over `...` (backticks)?

For several reasons:

  • It's easier to read. The character ` is difficult to read with small or unusual fonts.

  • It's easier to type. The physical key to produce the character may be located in an obscure place on non-US keyboards.
  • The backtick is easily confused with a single quote. People who see $() don't normally press the wrong keys. On the other hand, some people who see `cmd` may mangle it into 'cmd' because they don't know what a backtick is.

  • It makes nesting command substitutions easier. Compare:
    •   x=$(grep $(dirname "$path") file)
        x=`grep \`dirname "$path"\` file`
    It just gets uglier and uglier after two levels.
  • Backslashes (\) inside backticks are handled in a non-obvious manner:
    •   $ echo "`echo \\a`" "$(echo \\a)"
        a \a
        $ echo "`echo \\\\a`" "$(echo \\\\a)"
        \a \\a
  • Nested quoting inside $() is far more convenient.

    •   echo "x is $(echo "$y" | sed ...)"

    In this example, the quotes around $y are treated as a pair, because they are inside $(). This is confusing at first glance, because most C programmers would expect the quote before x and the quote before $y to be treated as a pair; but that isn't correct in shells. On the other hand,

    •   echo "x is `echo \"$y\" | sed ...`"
    requires backslashes around the internal quotes in order to be portable. Bourne and Korn shells require these backslashes, while Bash and dash don't.

The only time backticks are preferred is when writing code for the oldest Bourne shells, which do not know about $().

83. How do I determine whether a variable is already defined? Or a function?

There are several ways to determine whether a variable is defined to have a non-empty value. Here are the most common ones, in order from most portable to least portable:

if test -n "$var"
if [ -n "$var" ]
if test "$var"
if [ "$var" ]
if [[ -n $var ]]
if [[ $var ]]

If you need to distinguish between a variable that is undefined and one that is defined but empty, then it becomes much trickier. There is no explicit shell command to test for existence of a variable, but there are some parameter expansion tricks that can be used. Here is the simplest:

if [[ ${foo+defined} ]]
# This expansion results in nothing if foo is undefined.  Therefore [[ returns false.
# If foo is defined (to either "" or something longer), the expansion returns "defined",
# and therefore [[ returns true.
# You could use any non-empty string in place of "defined", but readability is nice.

For determining whether a function with a given name is already defined, there are several answers, all of which require Bash (or at least, non-Bourne) commands:

# These two are best:
if [[ $(declare -f foo) ]]     # it prints nothing, if undefined
if declare -f foo >/dev/null   # it also sets the exit status

# These are a little more obvious, but...
if [[ $(type foo 2>&1) = *\ is\ a\ function* ]]
if type foo >/dev/null 2>&1 && ! type -f foo >/dev/null 2>&1
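
Bash's declare also accepts a capital -F, which prints only the function's name and sets the exit status accordingly:

if declare -F foo >/dev/null; then echo "foo is defined"; fi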

A related question is, Why on earth does anyone want this? Why not just define the function already?

I don't know. I think it has something to do with reflection. But people keep asking it, so....

84. How do I return a string from a function? "return" only lets me give a number.

Functions in Bash (as well as all the other Bourne-family shells) work like commands; that is, they only "return" an exit status, which is a number from 0 to 255 inclusive. This is intended to be used only for signaling errors, not for returning the results of computations, or other data.

If you need to send back arbitrary data from a function to its caller, there are at least three methods by which this can be achieved:

  • You may have your function write the data to stdout, and then have the caller capture stdout.
    •   foo() {
           echo "this is my data"
        }
        x=$(foo)
        echo "foo returned '$x'"

    The drawback of this method is that the function is executed in a subshell, which means that any variable assignments, etc. performed in the function will not take effect in the caller's environment. This may or may not be a problem, depending on the needs of your program and your function.

  • You may assign data to global variables, and then refer to those variables in the caller.
    •   foo() {
           RETURN="this is my data"
        }
        foo
        echo "foo returned '$RETURN'"

    The drawback of this method is that, if the function is executed in a subshell, then the assignment to a global variable inside the function will not be seen by the caller. This means you would not be able to use the function in a pipeline, for example.

  • Your function may write its data to a file, from which the caller can read it.
    •   foo() {
           echo "this is my data" > "$1"
        }
        # This is NOT solid code for handling temp files!
        TMPFILE=$(mktemp)   # GNU/Linux
        foo "$TMPFILE"
        echo "foo returned '$(<"$TMPFILE")'"
        rm "$TMPFILE"
        # In the event that this were a real program, there
        # would have been error checking, and a trap.
    The drawbacks of this method should be obvious: you need to manage a temporary file, which is always inconvenient; there must be a writable directory somewhere, and sufficient space to hold the data therein; etc. On the positive side, it will work regardless of whether your function is executed in a subshell.

    For more information about handling temporary files within a shell script, see FAQ 62.

85. How to write several times to a fifo without having to reopen it?

The basic use of named pipes is:

mkfifo myfifo
cat < myfifo &
echo 'a' > myfifo

This works, but cat dies after reading one line. (In fact, what happens is each time the named pipe is closed by the writer, this signals an end of file condition for the reader. So cat, the reader, terminates because it saw the end of its input.)

What if we want to write several times to the pipe without having to restart the reader? We have to arrange for all our data to be sent without opening and closing the pipe multiple times.

If the commands are consecutive, they can be grouped:

cat < myfifo &
{ echo 'a'; echo 'b'; echo 'c'; } > myfifo

But if they can't be grouped for some reason, a better way is to assign a file descriptor to the pipe and write there:

cat < myfifo &

# assigning fd 3 to the pipe
exec 3>myfifo

# writing to fd 3 instead of reopening the pipe
echo 'a' >&3
echo 'b' >&3
echo 'c' >&3

# closing the fd
exec 3>&-

Closing the fd causes the pipe's reader to receive the end of file indication.

86. How to ignore aliases or functions when running a command?

Sometimes it's useful to bypass aliases (and functions, including functions that override shell builtins). For example, on your system you might have this set:

alias grep='grep --color=auto'

But sometimes, you need to do a one-liner with pipes where the colors mess things up. You could use any of the following:

unalias grep; grep ...    #1
unalias -a; grep ...      #2
"grep" ...                #3
\grep ...                 #4
command grep ...          #5

#1 unaliases grep before using it, doing nothing if grep wasn't aliased. However, the alias is then gone for the rest of that shell session.

#2 is similar, but removing all aliases.

#3 and #4 are the same, allowing you to run grep once while ignoring the grep alias.

#5 is different from the others in that it ignores both aliases and functions. It has a few options which you might want to use; see help command.

Option #6 would be to write your function so that it does not produce the undesirable behavior when standard output is not a terminal. Thus:

ls() {
  if test -t 1; then
    command ls -FC "$@"
  else
    command ls "$@"
  fi
}

Using this instead of alias ls='ls -FC' will turn off the special flags when the function is being used in a pipeline (or any other case where stdout isn't a terminal).

See FAQ #80 for more discussion of using functions instead of aliases.

87. How can I get the permissions of a file without parsing ls -l output?

There are several potential ways, most of which are system-specific. They also depend on precisely why you want the permissions.

The majority of the cases where you might ask this question -- such as I want to find any files with the setuid bit set -- can be answered by the information in UsingFind#permissions. As the page name implies, those answers are based on the find(1) command.

For some questions, such as I want to make sure this file has 0644 permissions, you don't actually need to check what the permissions are. You can just use chmod 0644 myfile and set them directly.

If your needs aren't met by any of those, then we can look at a few alternatives:

  • On GNU/Linux systems, and possibly others, there is a command called stat(1). On older GNU/Linux systems, this command takes no options -- just a filename -- and you will have to parse its output.

    $ stat / 
       File: "/"
       Size: 1024         Filetype: Directory
       Mode: (0755/drwxr-xr-x)         Uid: (    0/    root)  Gid: (    0/    root)
     Device:  8,0   Inode: 2         Links: 25   
     Access: Wed Oct 17 14:58:02 2007(00000.00:00:01)
     Modify: Wed Feb 28 15:42:14 2007(00230.22:15:49)
     Change: Wed Feb 28 15:42:14 2007(00230.22:15:49)

    In this case, one could extract the 0755 from the Mode: line, using awk, sed, or similar commands (see the sketch after this list).

  • On newer GNU/Linux systems, the stat command takes arguments which allow you to specify which information you want:

    $ stat -c %a / 
     755
    That's obviously a lot easier to parse.
  • On systems with perl 5, you can use:
     perl -e 'printf "%o\n", 07777 & (stat $ARGV[0])[2]' "$filename"

    This returns the same octal string that the stat -c %a example does, but is far more portable. (And slower.)
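
To make that parsing concrete, here is a sketch that extracts the octal mode from the older stat output shown above with sed:

    mode=$(stat / | sed -n 's/.*Mode: (\([0-7]*\).*/\1/p')
    echo "$mode"      # 0755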
