dssh - executing an arbitrary command in parallel on an arbitrary number of hosts
January 21st, 2008
I asked “What do you want” and you said scripting. Which is good, because I have felt like scripting lately!
I help a website hosting company, Idologic, on the weekends. (Side note: I highly recommend Idologic. I have worked with and been a customer of many other hosting companies. I really doubt you will find better customer service elsewhere.) Like many businesses these days, Idologic has quite a few Linux servers. When presented with many servers, I typically want to parallelize my work.
As such, I have written a script called dssh (previous version), which allows you to execute commands on n hosts, in parallel. This can be used to find information on the hosts, such as load average, number of processes by user, number of processes by process name, etc.
There are other options, such as pssh and p-run; however, I wanted to create a shell solution that could be easily and simply “installed”. Dssh reads standard input and expects one host per line. Host-specific ssh options are supported. Here is my sample hosts file:
$ cat hosts mojito -l noland kodiak mojito kodiak -C mojito -i /home/noland/.ssh/id_rsa kodiak
There is nothing restricting you from generating this host list from some type of metadata (e.g., a database). Here are some examples of output:
$ ./dssh.sh "uptime" < hosts
First time huh? Think your cmd over and then try again.
$ ./dssh.sh "uptime" < hosts
mojito:O:0:19:16:45 up 3 days, 14 min, 5 users, load average: 0.22, 0.22, 0.20
kodiak:O:0:13:24:00 up 20:00, 1 user, load average: 0.42, 0.16, 0.05
mojito:O:0:19:16:45 up 3 days, 14 min, 5 users, load average: 0.22, 0.22, 0.20
kodiak:O:0:13:24:00 up 20:00, 1 user, load average: 0.42, 0.16, 0.05
mojito:O:0:19:16:45 up 3 days, 14 min, 5 users, load average: 0.22, 0.22, 0.20
kodiak:O:0:13:24:00 up 20:00, 1 user, load average: 0.42, 0.16, 0.0
$ ./dssh.sh "pgrep -u noland | wc -w" < hosts
mojito:O:0:60
kodiak:O:0:5
mojito:O:0:60
kodiak:O:0:5
mojito:O:0:60
kodiak:O:0:5
$ ./dssh.sh "ls not_a_file" < hosts
mojito:E:2:ls: not_a_file: No such file or directory
kodiak:E:2:ls: not_a_file: No such file or directory
mojito:E:2:ls: not_a_file: No such file or directory
kodiak:E:2:ls: not_a_file: No such file or directory
mojito:E:2:ls: not_a_file: No such file or directory
kodiak:E:2:ls: not_a_file: No such file or directory
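Because every output line carries the host, a stream indicator (O or E) and the exit status, the results are easy to post-process. Here is a small sketch of a filter that keeps only the failures. The field layout is inferred from the sample output above, and the function name dssh_failures is made up for this example; it is not part of dssh.

```shell
#!/bin/sh
# Hypothetical helper, not part of dssh: read dssh output on stdin and print
# "host exit_status message" for each line that came from standard error
# (stream field "E") or that carries a non-zero exit status.
# Assumes the layout host:stream:exit:message seen in the samples.
dssh_failures() {
    awk -F: '$2 == "E" || $3 != 0 {
        # The message itself may contain colons, so rebuild it
        # from field 4 onward instead of printing $4 alone.
        msg = $4
        for (i = 5; i <= NF; i++) msg = msg ":" $i
        print $1, $3, msg
    }'
}
```

With the function loaded in your shell, something like ./dssh.sh "ls not_a_file" < hosts | dssh_failures should list only the hosts where the command failed.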
Notes:
- With great power comes even greater responsibility. Running rm -rf / as root with this script would do exactly that.
- I don’t recommend doing anything with this script that “changes state”.
- I make no warranties or promises.
- You need ssh keys to use this. I recommend using ssh-agent.
- By default dssh will execute 10 children in parallel. If you have a large host, increase this.
- When looping through the hosts, if the maximum number of children are still processing, the script will sleep 500ms. If your version of sleep does not support fractional seconds, you will need to change this.
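On the last note: if you are not sure whether your sleep accepts fractional seconds, one possible workaround is to try the fractional value first and fall back to a whole second. This is only a sketch, and the name short_sleep is mine. It assumes that a sleep without fractional support exits non-zero when given 0.5, which may not hold everywhere.

```shell
#!/bin/sh
# Hypothetical helper: pause for about half a second where the local sleep
# supports fractional seconds (GNU coreutils does), otherwise pause for one
# whole second. Assumes an unsupported fractional argument makes sleep fail.
short_sleep() {
    sleep 0.5 2>/dev/null || sleep 1
}
```

The cost of the fallback is simply that the script checks on its children less often.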
Here is an outline of the script:
- Read from standard input a list of hosts
- Configure trap to remove temporary files on exit
- For each host
  - Sleep while we have more children than the maximum number of children
  - Generate three temporary files, one for each of
    - Standard Output
    - Standard Error
    - Exit value
  - Create a child process saving stdout, stderr, and the exit value in their respective files.
- Wait for all children to exit
- For each host
  - If the standard output or error files are of size greater than zero, print the content, prefacing each line with the hostname, standard error/output indicator, and exit status.
  - Else print something to indicate we executed a process and have an exit value.
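To make the outline concrete, here is a compact sketch of the same steps in plain sh. It is not the dssh script itself. It assumes one bare hostname per line (host-specific ssh options are left out), and the names dssh_sketch, RUNNER and MAX_CHILDREN are invented for this example. RUNNER defaults to ssh but can be pointed at a stub for a dry run.

```shell
#!/bin/sh
# Sketch of the outline above; NOT the original dssh script.
# Assumptions: one bare hostname per line, no per-host ssh options.
# RUNNER and MAX_CHILDREN are hypothetical knobs added for this example.
RUNNER=${RUNNER:-ssh}
MAX_CHILDREN=${MAX_CHILDREN:-10}

dssh_sketch() {
    cmd=$1
    tmpdir=$(mktemp -d) || return 1
    trap 'rm -rf "$tmpdir"' EXIT          # remove temporary files on exit
    # Read the host list from standard input, skipping blank lines.
    grep -v '^[[:space:]]*$' >"$tmpdir/hosts" || true

    started=0
    while read -r host; do
        # Sleep while the maximum number of children are still running.
        # A child counts as finished once its exit-value file exists.
        while :; do
            finished=$(ls "$tmpdir" | grep -c '\.rc$' || true)
            [ $((started - finished)) -lt "$MAX_CHILDREN" ] && break
            sleep 0.5 2>/dev/null || sleep 1
        done
        started=$((started + 1))
        # One child per host; stdout, stderr and the exit value each get a file.
        (
            rc=0
            "$RUNNER" "$host" "$cmd" </dev/null \
                >"$tmpdir/$started.out" 2>"$tmpdir/$started.err" || rc=$?
            echo "$rc" >"$tmpdir/$started.rc"
        ) &
    done <"$tmpdir/hosts"
    wait                                   # wait for all children to exit

    n=0
    while read -r host; do
        n=$((n + 1))
        rc=$(cat "$tmpdir/$n.rc")
        if [ -s "$tmpdir/$n.out" ] || [ -s "$tmpdir/$n.err" ]; then
            # Prefix each line with hostname, stream indicator and exit status.
            sed "s/^/$host:O:$rc:/" "$tmpdir/$n.out"
            sed "s/^/$host:E:$rc:/" "$tmpdir/$n.err"
        else
            # No output at all: still report that the command ran
            # (this placeholder line is this sketch's own choice).
            echo "$host:O:$rc:"
        fi
    done <"$tmpdir/hosts"
    rm -rf "$tmpdir"
    trap - EXIT
}
```

Saving the host list to a file before starting any children, and giving each child /dev/null as standard input, keeps ssh from swallowing the rest of the list. Once the function is loaded, it can be tried with something like dssh_sketch uptime < hosts.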
Once again, here is the script I am calling dssh.


January 21st, 2008 at 9:55 pm
That could be pretty useful I think.
February 10th, 2008 at 3:22 pm
Hey.
I wrote a similar script for Tailrank.
The main difference is that the first argument is the name of a profile for the hosts.
I use ‘www’ or ‘robot’ or any arbitrary profile name.
This is then loaded from /etc/dssh/robot or /etc/dssh/www
Every other argument is passed to ssh. This can be done with ‘shift’.
Mine works synchronously, which has pros and cons. I also have an async one that uses the same model you have.
Async support isn’t such a big deal for me because 99% of the time I’m pushing a new build so running the file copy from a central server to multiple nodes in parallel would actually be slower.
One thing I added was dscp so that you could copy files to N hosts.
I have been MEANING to put this on Google Code and will probably do so tonight…
The concept itself is very simple. It would be nice to get this stuff into Debian at some point.
Kevin
February 11th, 2008 at 5:18 pm
This script is not compatible with Bash 2.0
I get the following error:
+ [[ ! -z mybox ]]
+ 'hosts[' 0 ']=mybox'
./dssh.sh: hosts[: command not found
February 12th, 2008 at 12:22 pm
@Randy,
I don’t think bash 2.0 supports arrays…
Brock
February 12th, 2008 at 12:23 pm
I’ll check the bash version and exit if arrays are not supported.
February 13th, 2008 at 1:29 pm
I fixed it by removing the ‘spaces’ for the hosts variable:
[[ ! -z "${host}" ]] && hosts[ ${#hosts[@]} ]=${host}
replaced by
[[ ! -z "${host}" ]] && hosts[${#hosts[@]}]=${host}
February 13th, 2008 at 1:31 pm
BTW it works really nice in Bash 2.0 after the change.
February 13th, 2008 at 2:23 pm
@Randy,
Thanks! I will update the script tonight.
February 23rd, 2008 at 12:43 am
[…] updated dssh (see executing an arbitrary command in parallel on an arbitrary number of hosts) or download the new version. […]
February 23rd, 2008 at 12:45 am
I updated dssh, see the changes.
March 12th, 2008 at 11:41 am
Frank Sorenson has a multi-threaded program that does a great job of running commands on many machines. Enjoy:
http://www.tuxrocks.com/Projects/p-run/
March 12th, 2008 at 11:49 am
@Andrew,
Thanks for the link. I will add that to the article where I mention pssh.
Also…I just read your page on Mrs. I think I might be in love.