Yesterday, I wrote a post titled “Why do CGI scripts and shell scripts fail when they contain carriage returns?” I got a comment and a few emails saying, in the words of Stan, “I’d be interested in the process by which you narrowed it down to the exec.c file. You jumped straight to the catch, but I wanted to see the chase.”

This is the chase. If you were not interested in yesterday’s post, this will probably interest you even less. However, I do have two posts on command line tips in the works. Please stay tuned.

I didn’t believe (mistakenly) that BASH and Apache simply passed the name of the shell script into a system call. Surely there had to be more magic involved! Thus, I downloaded BASH 3.2 and unpacked the source. I grep’ed for “bad interpreter”, which is printed when you try to execute a script with a carriage return after the interpreter. That found a few matches, but only one match in a C source file.

noland@mojito:~/bash-3.2$ grep -R 'bad interpreter' .
Binary file ./po/en@quot.gmo matches
./po/ru.po:msgid "%s: %s: bad interpreter"
./po/en@boldquot.po:msgid "%s: %s: bad interpreter"
./po/en@boldquot.po:msgstr "%s: %s: bad interpreter"
./po/en@quot.po:msgid "%s: %s: bad interpreter"
./po/en@quot.po:msgstr "%s: %s: bad interpreter"
./po/bash.pot:msgid "%s: %s: bad interpreter"
Binary file ./po/en@boldquot.gmo matches
./execute_cmd.c:              sys_error (_("%s: %s: bad interpreter"), command, interp ? interp : "");

I opened “execute_cmd.c” with less and searched for “bad interpreter”. A few lines above the error message, I saw the following call:

          execve (command, args, env);

I placed the following code directly above that statement. NOTE: If you copy this code from my blog, you have to replace the quotes as WP does something stupid with them:

          /* Debug aid: print command inside braces, making any '\r' visible */
          printf("{");
          int z = 0;
          for(; z < strlen(command); z++) {
                  if(command[z] == '\r') {
                          printf("(carriage return)");
                  } else {
                          printf("%c", command[z]);
                  }
          }
          printf("}\n");
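
If you want to play with the same trick outside the bash source tree, here is a minimal standalone sketch. The function name show_cr and the choice to write into a caller-supplied buffer are my own assumptions for illustration; nothing like it exists in bash.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of bash): copy s into out, wrapped in
   braces, replacing each '\r' with the text "(carriage return)".
   Returns 0 on success, -1 if out is too small. */
int show_cr(const char *s, char *out, size_t outlen)
{
    static const char marker[] = "(carriage return)";
    size_t used = 0;

    if (outlen < 3)                     /* need room for "{}" and '\0' */
        return -1;
    out[used++] = '{';

    for (; *s != '\0'; s++) {
        size_t need = (*s == '\r') ? sizeof marker - 1 : 1;

        if (used + need + 2 > outlen)   /* keep room for '}' and '\0' */
            return -1;
        if (*s == '\r')
            memcpy(out + used, marker, need);
        else
            out[used] = *s;
        used += need;
    }

    out[used++] = '}';
    out[used] = '\0';
    return 0;
}
```

Calling show_cr("/bin/bash\r", buf, sizeof buf) with a 64-byte buf leaves {/bin/bash(carriage return)} in buf, which you can then hand to puts().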

And then compiled bash:

noland@mojito:~/bash-3.2$ ./configure && make
checking build system type... i686-pc-linux-gnu
.....
output removed
.....
ls -l bash
-rwxr-xr-x 1 noland noland 2070786 2008-01-08 18:40 bash
size bash
   text    data     bss     dec     hex filename
 681571   19288   19848  720707   aff43 bash

I created the test file and made it executable:

noland@mojito:~/bash-3.2$ echo -e '#!/bin/bash\r\n/usr/bin/id' >id.sh
noland@mojito:~/bash-3.2$ chmod +x id.sh

Then I switched to my new shell and ran the executable:

noland@mojito:~/bash-3.2$ ./bash
noland@mojito:~/bash-3.2$ ./id.sh
{./id.sh}
bash: ./id.sh: /bin/bash^M: bad interpreter: No such file or directory

Rats, the shell is just passing the file name straight to the execve() call, with no carriage return in sight. I then noticed the following comment after the execve() call:

          /* The file has the execute bits set, but the kernel refuses to
             run it for some reason.

Score! Time to do some kernel modifications…. When I first started using Linux, I used to compile the kernel for lack of anything better to do. (I lived 50 miles from civilization and my winmodem didn’t work with Linux.) As such, I knew I could get a kernel up and running, but I doubted my ability to intercept the actual execve() call.
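
A cheaper sanity check than patching a kernel is to call execve() yourself and look at errno. The sketch below is not part of the chase, just an illustration: the helper name exec_errno_for_script, the /tmp path template, and the pipe used to report errno back to the parent are all my own assumptions. It writes a script to a temporary file, tries to execve() it in a child process, and returns the errno the child saw.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the errno from execve() on a temporary
   script holding `contents`, 0 if the exec succeeded, or -1 if the
   setup itself failed. */
int exec_errno_for_script(const char *contents)
{
    char path[] = "/tmp/crlf-demo-XXXXXX";   /* assumed location */
    size_t len = strlen(contents);
    int pipefd[2];
    int err = 0;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, contents, len) != (ssize_t)len || fchmod(fd, 0700) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);   /* a file still open for writing cannot be exec'ed */

    if (pipe(pipefd) != 0) {
        unlink(path);
        return -1;
    }
    /* Close-on-exec: a successful exec shows up as EOF in the parent. */
    fcntl(pipefd[1], F_SETFD, FD_CLOEXEC);

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        char *argv[] = { path, NULL };
        char *envp[] = { NULL };

        close(pipefd[0]);
        execve(path, argv, envp);
        err = errno;                          /* only reached on failure */
        if (write(pipefd[1], &err, sizeof err) < 0)
            _exit(126);
        _exit(127);
    }

    close(pipefd[1]);
    ssize_t n = read(pipefd[0], &err, sizeof err);
    close(pipefd[0]);
    waitpid(pid, NULL, 0);
    unlink(path);

    if (n == 0)
        return 0;                             /* EOF: the exec worked */
    return n == (ssize_t)sizeof err ? err : -1;
}
```

Passing "#!/bin/sh\r\nexit 0\n" should give a non-zero errno. On a typical Linux box I would expect ENOENT, which matches the “No such file or directory” text above, though a /tmp mounted noexec would fail with a different error before the interpreter line is even looked at. Since no shell is involved here, the carriage return has to be getting picked up on the kernel side of the call.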

Regardless, I downloaded the kernel and ran the following grep:

noland@mojito:~/linux-2.6.23.9$ grep -R 'execve' .
./fs/compat.c: * compat_do_execve() is mostly a copy of do_execve(), with the exception
./fs/compat.c:int compat_do_execve(char * filename,
./fs/compat.c:          /* execve success */
./fs/exec.c: * sys_execve() executes a new program.
./fs/exec.c:int do_execve(char * filename,
....output removed

Jackpot! I then set up a VMware instance and compiled the kernel with the default config from the distribution. On boot, a few things didn’t start, but I didn’t care. I booted it without any modifications first so I would know when and if my modifications were the cause of a kernel panic. I’ll post how I did this later this week.

First off, I wanted to know what was coming into do_execve() as the filename parameter. I didn’t think I would have printf() available to me. (Maybe it is, I don’t know.) So I grep’ed for print to see what the kernel used:

noland@mojito:~/linux-2.6.23.9$ grep -R ' print' . | grep 'c.:' | head -n 4
./fs/reiserfs/prints.c:    printk ("reiserfs_put_super: session statistics: balances %d, fix_nodes %d, \
./net/decnet/dn_nsp_out.c:              /* printk(KERN_DEBUG "ack: %s %04x %04x\n", ack ? "ACK" : "SKIP", (int)cb2->segnum, (int)acknum); */
./arch/um/kernel/sysrq.c:        printk("Call Trace: \n");
./arch/ppc/platforms/residual.c:                if ( did.BusId & PNPISADEVICE ) printk("PNPISA Device:");

It looked like printk() was available and worked much like printf(). Giddy up. I placed a variation of the code I used above in do_execve() to print filename. When I rebooted, a TON of stuff was printed to the terminal, but when I executed my script, the output was exactly the same as it was before. Then this portion of the do_execve() function caught my eye:

        retval = search_binary_handler(bprm,regs);
        if (retval >= 0) {
                /* execve success */
                free_arg_pages(bprm);
                security_bprm_free(bprm);
                acct_update_integrals(current);
                kfree(bprm);
                return retval;
        }

The function signature of search_binary_handler() is:

int search_binary_handler(struct linux_binprm *bprm,struct pt_regs *regs);

I found the declaration of linux_binprm in “./include/linux/binfmts.h”:

noland@mojito:~/linux-2.6.23.9$ grep -R linux_binprm . | grep '\.h'
./arch/ia64/ia32/ia32priv.h:struct linux_binprm;
./arch/ia64/ia32/ia32priv.h:extern int ia32_setup_arg_pages (struct linux_binprm *bprm, int exec_stack);
./security/selinux/include/objsec.h:    struct linux_binprm *bprm;     /* back pointer to bprm object */
./include/linux/binfmts.h:/* sizeof(linux_binprm->buf) */
./include/linux/binfmts.h:struct linux_binprm{
....output removed

The following variables caught my eye in that structure declaration:

struct linux_binprm{
        char buf[BINPRM_BUF_SIZE];
...stuff removed....
        char * filename;        /* Name of binary as seen by procps */
        char * interp;          /* Name of the binary really executed. Most
                                   of the time same as filename, but could be
                                   different for binfmt_{misc,script} */

The comment for interp seemed to be hinting at the answer! I tried the same code for interp and the results were interesting, but not what I wanted. I then tried printing buf immediately before the last return statement and boom, it worked! Pick up the “Rest of the Story”.
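
For a feel of why buf was the place to look, here is a user-space sketch of how an interpreter name might be pulled out of the first line of a script. It is not the kernel’s code: the function parse_interp and its signature are hypothetical, and the only assumption carried over from the investigation is that space, tab and newline end the name while a carriage return does not.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space sketch, not kernel code: copy the interpreter
   path from a "#!" line into out. Only space, tab and newline end the
   name, so a trailing '\r' stays part of it.
   Returns 0 on success, -1 if head is not a usable "#!" line. */
int parse_interp(const char *head, char *out, size_t outlen)
{
    if (outlen == 0 || head[0] != '#' || head[1] != '!')
        return -1;

    const char *p = head + 2;
    while (*p == ' ' || *p == '\t')     /* skip blanks after "#!" */
        p++;

    size_t n = strcspn(p, " \t\n");     /* name ends at blank or newline */
    if (n == 0 || n >= outlen)
        return -1;

    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```

Feeding it the contents of id.sh, "#!/bin/bash\r\n/usr/bin/id\n", yields the ten-byte name "/bin/bash\r" rather than "/bin/bash", and no file with that name exists.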

2 Responses to “On the case of carriage returns and kernel exec system calls”

  1. Stan Says:

    Nice sleuthing. I like seeing the process others follow to track down issues such as this one.

    You’ve got me in a kernel compiling mood, all of a sudden ;)

  2. Why do CGI scripts and shell scripts fail when they contain carriage returns? Says:

    […] In response to reader requests, I wrote an explanation of my investigation: On the case of carriage returns and kernel exec system calls. Category: Debugging, Shell, kernel, way too much information, […]
