I asked “What do you want” and you said scripting. Which is good, because I have felt like scripting lately!

I help a website hosting company, Idologic, on the weekends. (Side note: I highly recommend Idologic. I have worked with and been a customer of many other hosting companies. I really doubt you will find better customer service elsewhere.) Like many businesses these days, Idologic has quite a few Linux servers. When presented with many servers, I typically want to parallelize my work.

As such, I have written a script called dssh (previous version), which allows you to execute commands on n hosts, in parallel. This can be used to find information on the hosts, such as load average, number of processes by user, number of processes by process name, etc.

There are other options such as pssh and p-run; however, I wanted a shell solution that could be easily and simply “installed”. Dssh reads standard input and expects one host per line. Host-specific ssh options are supported. Here is my sample hosts file:

$ cat hosts
mojito
-l noland kodiak
mojito
kodiak
-C mojito
-i /home/noland/.ssh/id_rsa kodiak

There is nothing restricting you from generating this host list from some type of metadata store (e.g., a database). Here are some examples of output:

$ ./dssh.sh "uptime" < hosts
First time huh? Think your cmd over and then try again.
$ ./dssh.sh "uptime" < hosts
mojito:O:0:19:16:45 up 3 days, 14 min,  5 users,  load average: 0.22, 0.22, 0.20
kodiak:O:0:13:24:00 up 20:00,  1 user,  load average: 0.42, 0.16, 0.05
mojito:O:0:19:16:45 up 3 days, 14 min,  5 users,  load average: 0.22, 0.22, 0.20
kodiak:O:0:13:24:00 up 20:00,  1 user,  load average: 0.42, 0.16, 0.05
mojito:O:0:19:16:45 up 3 days, 14 min,  5 users,  load average: 0.22, 0.22, 0.20
kodiak:O:0:13:24:00 up 20:00,  1 user,  load average: 0.42, 0.16, 0.05
$ ./dssh.sh "pgrep -u noland | wc -w" < hosts
mojito:O:0:60
kodiak:O:0:5
mojito:O:0:60
kodiak:O:0:5
mojito:O:0:60
kodiak:O:0:5
$ ./dssh.sh "ls not_a_file" < hosts
mojito:E:2:ls: not_a_file: No such file or directory
kodiak:E:2:ls: not_a_file: No such file or directory
mojito:E:2:ls: not_a_file: No such file or directory
kodiak:E:2:ls: not_a_file: No such file or directory
mojito:E:2:ls: not_a_file: No such file or directory
kodiak:E:2:ls: not_a_file: No such file or directory
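
Because every line carries the host, the stream indicator, and the exit status, the output is easy to post-process. Here is a minimal sketch, assuming the host:O|E:exit:text layout shown above and hostnames without colons. The function name failed_hosts is hypothetical and not part of dssh.

```shell
# Hypothetical helper, not part of dssh: list each host that reported a
# non-zero exit status. Reads dssh-style lines (host:O|E:exit:text) on
# standard input and assumes hostnames contain no colons.
failed_hosts() {
    # Field 3 is the exit status; skip lines where it is not a number.
    awk -F: '$3 ~ /^[0-9]+$/ && $3 != 0 { print $1 ": exit " $3 }' | sort -u
}

# Usage: ./dssh.sh "ls not_a_file" < hosts | failed_hosts
```

Fed the "ls not_a_file" output above, this would print one line each for kodiak and mojito with exit 2, however many times a host appears in the hosts file.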

Notes:

  1. With great power comes even greater responsibility. Running rm -rf / as root with this script would do exactly that.
  2. I don’t recommend doing anything with this script that “changes state”.
  3. I make no warranties or promises.
  4. You need ssh keys to use this. I recommend using ssh-agent.
  5. By default dssh will execute 10 children in parallel. If you have a large host, increase this.
  6. When looping through the hosts, if the maximum number of children are still processing, the script will sleep 500ms. If your version of sleep does not support fractional seconds, you will need to change this.
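
For note 6, one possible workaround is a small wrapper that falls back to a whole second when sleep rejects a fractional argument. This is only a sketch: short_sleep is a hypothetical name, and it assumes your sleep exits non-zero on an argument it cannot parse.

```shell
# Hypothetical helper, not part of dssh: pause about half a second when
# sleep accepts fractional seconds, otherwise fall back to one second.
short_sleep() {
    sleep 0.5 2>/dev/null || sleep 1
}
```

If your sleep silently truncates 0.5 to 0 instead of failing, the wait would become a busy loop, so test the behavior on your platform first.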

Here is an outline of the script:

  1. Read from standard input a list of hosts
  2. Configure trap to remove temporary files on exit
  3. For each host
    1. Sleep while we have more children than the maximum number of children
    2. Generate three temporary files, one for each of
      1. Standard Output
      2. Standard Error
      3. Exit value
    3. Create a child process, saving stdout, stderr, and the exit value in their respective files.
  4. Wait for all children to exit
  5. For each host
    1. If the standard output or error files are of size greater than zero, print the content, prefacing each line with the hostname, standard error/output indicator, and exit status.
    2. Else print something to indicate we executed a process and have an exit value.
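
The outline above can be sketched in bash roughly as follows. This is not the dssh script itself. The function name dssh_sketch, the DSSH_RUNNER variable (a hypothetical hook that defaults to ssh), and the line printed for a host with no output are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the outline above, NOT the author's dssh script.
# Assumptions: the function name, DSSH_RUNNER (a hypothetical hook that
# defaults to ssh), and the format of the "no output" line.
dssh_sketch() (   # body runs in a subshell, so the EXIT trap stays local
    cmd=$1
    max_children=10
    runner=${DSSH_RUNNER:-ssh}
    hosts=()

    # 1. Read the host list: one host per line, ssh options first.
    while IFS= read -r line; do
        if [[ -n $line ]]; then
            hosts[${#hosts[@]}]=$line
        fi
    done

    # 2. Remove the temporary files on exit.
    tmpdir=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmpdir"' EXIT

    # 3. Start one child per host, at most max_children at a time.
    for (( i = 0; i < ${#hosts[@]}; i++ )); do
        while (( $(jobs -pr | wc -l) >= max_children )); do
            sleep 0.5   # needs a sleep that accepts fractional seconds
        done
        (
            # Unquoted on purpose: "-l noland kodiak" must become 3 words.
            $runner ${hosts[i]} "$cmd" </dev/null \
                >"$tmpdir/$i.out" 2>"$tmpdir/$i.err"
            echo $? >"$tmpdir/$i.rc"
        ) &
    done

    # 4. Wait for all children to exit.
    wait

    # 5. Print each host's results, prefixed with host:stream:exit.
    for (( i = 0; i < ${#hosts[@]}; i++ )); do
        name=${hosts[i]##* }            # the hostname is the last word
        rc=$(cat "$tmpdir/$i.rc")
        if [[ -s $tmpdir/$i.out || -s $tmpdir/$i.err ]]; then
            for stream in O:out E:err; do
                while IFS= read -r text; do
                    printf '%s:%s:%s:%s\n' "$name" "${stream%%:*}" "$rc" "$text"
                done <"$tmpdir/$i.${stream##*:}"
            done
        else
            printf '%s:O:%s:\n' "$name" "$rc"   # assumed placeholder line
        fi
    done
)
```

It would be called like the real script, for example dssh_sketch "uptime" < hosts. Keeping all per-host files in a single temporary directory means the trap only has one thing to remove.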

Once again, here is the script I am calling dssh.

13 Responses to “dssh - executing an arbitrary command in parallel on an arbitrary number of hosts”

  1. Douglas Says:

    That could be pretty useful I think.

  2. Pages tagged "arbitrary" Says:

    […] bookmarks tagged arbitrary dssh - executing an arbitrary command in parallel …saved by 1 others: firecracker1995 bookmarked on 02/02/08 | […]

  3. Kevin Burton Says:

    Hey.

    I wrote a similar script for Tailrank.

    The main difference is that the first argument is the name of a profile for the hosts.

    I use ‘www’ or ‘robot’ or any arbitrary profile name.

    This is then loaded from /etc/dssh/robot or /etc/dssh/www

    Every other argument is passed to ssh. This can be done with ‘shift’.

    Mine works synchronously, which has pros and cons. I have an async one too, which uses the same model you have.

    Async support isn’t such a big deal for me because 99% of the time I’m pushing a new build so running the file copy from a central server to multiple nodes in parallel would actually be slower.

    One thing I added was dscp so that you could copy files to N hosts.

    I have been MEANING to put this on Google Code and will probably do so tonight…

    The concept itself is very simple. It would be nice to get this stuff into Debian at some point.

    Kevin

  4. Randy Says:

    This script is not compatible with Bash 2.0
    I get the following error:
    + [[ ! -z mybox ]]
    + ‘hosts[’ 0 ‘]=mybox’
    ./dssh.sh: hosts[: command not found

  5. admin Says:

    @Randy,

    I don’t think bash 2.0 supports arrays…

    Brock

  6. admin Says:

    I’ll check the bash version and exit if arrays are not supported.

  7. Randy Says:

    I fixed it by removing the ‘spaces’ for the hosts variable:
    [[ ! -z "${host}" ]] && hosts[ ${#hosts[@]} ]=${host}

    replaced by

    [[ ! -z "${host}" ]] && hosts[${#hosts[@]}]=${host}

  8. Randy Says:

    BTW it works really nice in Bash 2.0 after the change.

  9. admin Says:

    @Randy,

    Thanks! I will update the script tonight.

  10. dssh version 0.2 Says:

    […] updated dssh (see executing an arbitrary command in parallel on an arbitrary number of hosts) or download the new version. […]

  11. admin Says:

    I updated dssh, see the changes.

  12. Andrew McNabb Says:

    Frank Sorenson has a multi-threaded program that does a great job of running commands on many machines. Enjoy:

    http://www.tuxrocks.com/Projects/p-run/

  13. admin Says:

    @Andrew,

    Thanks for the link. I will add that to the article where I mention pssh.

    Also…I just read your page on Mrs. I think I might be in love.
