Sorry for the long posting hiatus. I have enough readers now that it’s hard to write the usual random nonsense.

I was recently reminded of an old problem using CVS over SSH, which is an interesting example of several individually reasonable behaviours adding up to a bug. It’s possible that this bug has been fixed, but I’ll assume that it hasn’t. The bug is that “cvs diff 2>&1 | less” will sometimes drop data, leaving you looking at an incomplete diff.

When CVS invokes SSH, it sets up file descriptors 0 (standard input) and 1 (standard output), but not 2 (standard error). Thus SSH inherits file descriptor 2 from CVS. This means that any SSH errors get reported to the standard error passed to CVS, which is what you want to have happen. However, when using 2>&1, this means that SSH’s file descriptor 2 will be the same as CVS’s file descriptors 1 and 2.
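
As a rough illustration of that setup, here is a minimal sketch of a parent starting a child with fresh pipes on descriptors 0 and 1 while leaving descriptor 2 alone. The helper name spawn_with_pipes is made up for this example; it is not taken from the CVS source, which may do this differently.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not CVS's actual code: run argv[0] with new pipes on
 * descriptors 0 and 1. Descriptor 2 is not touched, so the child inherits
 * the caller's standard error. Returns the child's pid, or -1 on error. */
pid_t spawn_with_pipes(char *const argv[], int *to_child, int *from_child)
{
    int in_pipe[2];   /* parent writes in_pipe[1], child reads in_pipe[0] */
    int out_pipe[2];  /* child writes out_pipe[1], parent reads out_pipe[0] */

    if (pipe(in_pipe) < 0)
        return -1;
    if (pipe(out_pipe) < 0) {
        close(in_pipe[0]);
        close(in_pipe[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(in_pipe[0]);
        close(in_pipe[1]);
        close(out_pipe[0]);
        close(out_pipe[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: rewire 0 and 1 only. Descriptor 2 still refers to
         * whatever the parent's descriptor 2 refers to. */
        dup2(in_pipe[0], 0);
        dup2(out_pipe[1], 1);
        close(in_pipe[0]);
        close(in_pipe[1]);
        close(out_pipe[0]);
        close(out_pipe[1]);
        execvp(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }

    /* Parent: keep only its own ends of the two pipes. */
    close(in_pipe[0]);
    close(out_pipe[1]);
    *to_child = in_pipe[1];
    *from_child = out_pipe[0];
    return pid;
}
```

The sketch assumes the caller's descriptors 0, 1, and 2 are already open, so the new pipe descriptors never land on those numbers.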

SSH puts its file descriptors 0, 1, and 2 into nonblocking mode, so that it can use select to send data back and forth without blocking. This means that SSH puts CVS’s file descriptor 2 into nonblocking mode. When using 2>&1, file descriptors 1 and 2 are the same, so this puts CVS’s file descriptor 1 into nonblocking mode.
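
The effect can be reproduced without CVS or SSH. In this sketch a child process stands in for SSH and the parent for CVS; the function name is invented for the example, and a pipe plays the role of the one feeding less.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if O_NONBLOCK set by a child on an inherited descriptor shows
 * up on the parent's other descriptor for the same pipe, 0 if it does not,
 * and -1 on error. */
int child_nonblock_visible_in_parent(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int out = p[1];      /* plays the role of CVS's descriptor 1 */
    int err = dup(out);  /* what 2>&1 sets up: a second descriptor, same pipe */
    if (err < 0) {
        close(p[0]);
        close(out);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        /* Child ("SSH"): make its inherited standard error nonblocking. */
        int flags = fcntl(err, F_GETFL);
        if (flags < 0 || fcntl(err, F_SETFL, flags | O_NONBLOCK) < 0)
            _exit(1);
        _exit(0);
    }

    int result = -1;
    int status;
    if (pid > 0 && waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        /* Parent ("CVS"): it never asked for nonblocking mode on out. */
        int flags = fcntl(out, F_GETFL);
        if (flags >= 0)
            result = (flags & O_NONBLOCK) != 0;
    }

    close(p[0]);
    close(out);
    close(err);
    return result;
}
```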

CVS uses stdio to write data to standard output. When the output is a pipe, the pipe buffer can fill up. CVS itself never puts the descriptor into nonblocking mode, but once SSH has done so indirectly, a write to the full pipe fails with EAGAIN instead of blocking, and the data being written is discarded. This is what causes the bug: the discarded data is never seen by the user.
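
To see the failure at the system call level, the following sketch fills a nonblocking pipe that nobody reads and reports the error from the first write that fails. It uses write directly rather than stdio, and the function name is made up for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Fills a nonblocking pipe with no reader and returns the errno of the
 * first write that fails, or 0 if no write failed within the loop. */
int errno_when_pipe_full(void)
{
    int p[2];
    char chunk[4096];
    int result = 0;

    if (pipe(p) < 0)
        return errno;
    memset(chunk, 'x', sizeof chunk);

    int flags = fcntl(p[1], F_GETFL);
    if (flags < 0 || fcntl(p[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        result = errno;
    } else {
        /* A blocking descriptor would hang here once the pipe is full. */
        for (int i = 0; i < 100000; i++) {
            if (write(p[1], chunk, sizeof chunk) < 0) {
                result = errno;
                break;
            }
        }
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

A caller that treats this error like any other write failure, and does not retry, loses whatever it was trying to write.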

So what’s the fix? It’s reasonable for SSH to put its descriptors into nonblocking mode. It’s reasonable for CVS to pass its file descriptor 2 to SSH. It’s reasonable for CVS to use stdio to output data. It’s reasonable for stdio to not specially handle a nonblocking file descriptor—any program which wants to use a nonblocking descriptor needs to handle I/O retries itself. It’s reasonable for 2>&1 to mean that file descriptors 1 and 2 refer to the same underlying pipe. It’s reasonable for the user to use 2>&1 when piping cvs diff output to less.
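
For comparison, this is roughly what handling the retries yourself looks like. It is a sketch of one common approach (wait with poll, then try again), not something stdio or CVS does; write_all is a name chosen for the example.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Writes all len bytes to fd. If fd is nonblocking and not ready, waits
 * with poll() and retries. Returns 0 on success, -1 on any other error. */
int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            struct pollfd pfd;
            pfd.fd = fd;
            pfd.events = POLLOUT;
            pfd.revents = 0;
            /* Sleep until the descriptor is writable again. */
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;
    }
    return 0;
}
```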

I think the only remaining link in the sequence leading to the bug is that when SSH sets its file descriptor 2 to nonblocking mode, this affects the file in CVS. This is a consequence of the Unix file model, in which file descriptors refer to underlying files. A file descriptor has only one flag of its own: whether it is closed when the exec system call is run. All the other information is attached to the underlying file. Using 2>&1 means that two file descriptors point to the same file. Forking and execing SSH does not change this; in fact, it adds two more file descriptors, in the SSH process, which point to the same file. Any change in the flags associated with that file is seen by all the associated file descriptors.
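
The difference between the two kinds of flag is easy to check. This sketch sets the close-on-exec flag and O_NONBLOCK through one descriptor and reads them back through a dup of it; the struct and function names are made up for the example.

```c
#include <fcntl.h>
#include <unistd.h>

struct flag_report {
    int cloexec_shared;   /* 1 if FD_CLOEXEC set on one descriptor shows on its dup */
    int nonblock_shared;  /* 1 if O_NONBLOCK set on one descriptor shows on its dup */
};

/* Fills in *r and returns 0, or returns -1 if the pipe or dup failed. */
int compare_flag_sharing(struct flag_report *r)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;
    int a = p[1];
    int b = dup(a);  /* a and b now refer to the same underlying file */
    if (b < 0) {
        close(p[0]);
        close(a);
        return -1;
    }

    /* Descriptor flag: set on a, read back from b. */
    fcntl(a, F_SETFD, FD_CLOEXEC);
    r->cloexec_shared = (fcntl(b, F_GETFD) & FD_CLOEXEC) != 0;

    /* File status flag: set through a, read back through b. */
    fcntl(a, F_SETFL, fcntl(a, F_GETFL) | O_NONBLOCK);
    r->nonblock_shared = (fcntl(b, F_GETFL) & O_NONBLOCK) != 0;

    close(p[0]);
    close(a);
    close(b);
    return 0;
}
```

On a system that follows the model described above, cloexec_shared comes back 0 and nonblock_shared comes back 1.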

This separation of file descriptor and file is what makes 2>&1 work. It’s also what makes >> work; >> opens a file in append mode, and the append flag is inherited by other processes which refer to that file. In any case, what really counts here is not the exec, but the fork; forking a process should not change the flags associated with a file. Further, I’m sure there are programs which depend on the fact that changing the flags on a file after a fork affects the file as seen by the parent process.
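
The append case can be checked the same way. This sketch opens a scratch file with O_APPEND, dups the descriptor, seeks the dup back to the start, and writes through it; the function name and the file contents are arbitrary choices for the example.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if a write through a dup'ed descriptor still appends after
 * seeking to offset 0, 0 if it landed somewhere else, and -1 on error.
 * path names a scratch file that is created or truncated. */
int append_survives_dup(const char *path)
{
    char buf[16];
    ssize_t n;
    int result = -1;

    /* Start with a file containing "abc". */
    int f = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (f < 0)
        return -1;
    n = write(f, "abc", 3);
    close(f);
    if (n != 3)
        return -1;

    /* Reopen in append mode, as >> does, and duplicate the descriptor. */
    f = open(path, O_RDWR | O_APPEND);
    if (f < 0)
        return -1;
    int g = dup(f);
    if (g >= 0) {
        if (lseek(g, 0, SEEK_SET) == 0 && write(g, "x", 1) == 1 &&
            lseek(f, 0, SEEK_SET) == 0) {
            n = read(f, buf, sizeof buf - 1);
            if (n >= 0) {
                buf[n] = '\0';
                result = strcmp(buf, "abcx") == 0;
            }
        }
        close(g);
    }
    close(f);
    return result;
}
```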

It’s possible to imagine that file descriptors point to a new shared structure which then points to the underlying file. The file position and some flags would stay with the underlying file. The new shared structure would just hold some flags which need not always be shared: O_NONBLOCK, O_ASYNC, etc. Calling fork would not create a new shared structure, but calling exec would, copying the existing structure. That would let some flags not be copied across exec process boundaries.
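
Purely as a thought experiment, the imagined arrangement might look something like the sketch below. Nothing here corresponds to a real kernel; every name is invented, and reference counting and cleanup are left out.

```c
#include <stdlib.h>

/* State that stays shared, as today: position, append mode, and so on. */
struct open_file {
    long position;
    int append;
};

/* The imagined intermediate structure that descriptors would point to. */
struct fd_view {
    struct open_file *file;  /* always shared */
    int nonblock;            /* stands in for O_NONBLOCK */
    int async;               /* stands in for O_ASYNC */
};

/* fork: the child's descriptor points at the very same view. */
struct fd_view *view_on_fork(struct fd_view *v)
{
    return v;
}

/* exec: the new program gets its own copy of the view, but the same file.
 * Returns NULL if allocation fails. */
struct fd_view *view_on_exec(const struct fd_view *v)
{
    struct fd_view *copy = malloc(sizeof *copy);
    if (copy != NULL)
        *copy = *v;
    return copy;
}
```

With this layout, a program started by exec could set nonblock on its copy without the change being seen by the process that started it, while the file position would still be shared.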

However, that would be a significant change to the Unix file model, a model which has lasted for decades and is not clearly broken. Absent that change, we are left with a complex bug, in which all choices are reasonable.

The workaround for the problem is to invoke SSH with 2>/dev/null, and assume that SSH never writes anything useful to standard error. The 2>/dev/null disassociates SSH file descriptor 2 from CVS file descriptor 2, so CVS file descriptor 1 is not accidentally set into nonblocking mode.
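
In code, the redirection amounts to opening /dev/null and copying it onto the descriptor in question before the exec. This is a minimal sketch with an invented function name; a shell wrapper that runs ssh with 2>/dev/null achieves the same thing.

```c
#include <fcntl.h>
#include <unistd.h>

/* Points fd at /dev/null, as 2>/dev/null does for descriptor 2.
 * Returns 0 on success, -1 on error. */
int redirect_to_devnull(int fd)
{
    int null_fd = open("/dev/null", O_WRONLY);
    if (null_fd < 0)
        return -1;
    if (null_fd != fd) {
        /* dup2 closes whatever fd referred to and makes it refer to
         * /dev/null instead, breaking the link to the shared pipe. */
        if (dup2(null_fd, fd) < 0) {
            close(null_fd);
            return -1;
        }
        close(null_fd);
    }
    return 0;
}
```

Called as redirect_to_devnull(2) in the child between fork and exec, this leaves the parent's descriptors 1 and 2 untouched, so nothing the child later does to its descriptor 2 can change their flags.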


  1. fche said,

    July 19, 2011 @ 4:06 am

    Interesting UNIX idiosyncrasy!

    “SSH puts its file descriptors 0, 1, and 2 into nonblocking mode, so that it can use select to send data back and forth without blocking”

    Knowing that ssh is used in such nested-process circumstances, perhaps its authors would consider switching to blockable threads instead of file handle flagging.

  2. lev said,

    July 19, 2011 @ 12:14 pm

    It’s not really true that the O_NONBLOCK is attached “to the underlying file”. I.e., if I re-open() the same file in a different process, I won’t inherit the same flags. Rather, in the language of SUSv3, the flags, etc., attach to an “open file description”, of which several can refer to the same file, and several “file descriptors” can refer to the same “open file description”.

    Interestingly, CYGWIN applies flags such as O_NONBLOCK to file descriptors, not file descriptions (at least it did last time I checked the source, which was some years ago). I’m not aware of any misbehaviour associated with this breaking of POSIX behaviour.

  3. Ian Lance Taylor said,

    July 19, 2011 @ 2:08 pm

    I was just using language sloppily. When I talk about the underlying file I mean the structure representing the open file within the kernel. Of course opening a new file gets an entirely new set of flags.

    Thanks for the note on cygwin, interesting that it works. I would have thought it would fail on O_APPEND, at least, but perhaps that state is being recorded in the Windows handle somehow.

  4. lev said,

    July 19, 2011 @ 2:56 pm

    I just checked, and it seems the O_APPEND and O_NONBLOCK status are kept with the file descriptor (along with the FD_CLOEXEC, which is supposed to be per-descriptor). This is at least as far as fcntl( …, F_GETFL, …) is concerned — I didn’t check whether the blocking/nonblocking and append/nonappend *behaviour* is set on the underlying open file description, but I’m fairly sure that is not the case.

    Do you have some specific situation where you think this would fail? I’d be interested to test.

  5. Simetrical said,

    July 19, 2011 @ 4:13 pm

    If you have a significant number of readers now, it’s because they liked what you posted before. It doesn’t make sense to change what you post on your blog for fear of alienating people who only came here because they liked what you posted to start with. Apparently people like your idea of random nonsense.

  6. Ian Lance Taylor said,

    July 20, 2011 @ 9:48 pm

    lev: the kind of case where I would check for failure would be
    prog >>foo.txt 2>&1
    where prog writes to both standard output and standard error. On Unix all the output should be appended to foo.txt, because the O_APPEND flag, and, for that matter, the file position, will apply to the underlying file structure referenced by both file descriptors. If the O_APPEND flag does not apply, then it seems possible that prog’s output would overwrite itself.

    If the file position is shared, but the O_APPEND flag is not, then try a program which opens an existing non-empty file with O_APPEND, dups the file descriptor, calls lseek to change the file position to 0, and then writes. On Unix O_APPEND means that the file position is reset to the end of the file before each write, and I assume cygwin does the same. The question is whether this also applies to the dup’ed descriptor, as it would on Unix.

  7. lev said,

    July 20, 2011 @ 11:08 pm

    I tested (see below) and cygwin really doesn’t follow Unix. Nevertheless, cygwin has about 2000 packages and *many* users and I’m not aware of any bugs resulting from this misbehaviour. prog >>foo.txt 2>&1 works because the dup()d descriptor 2 inherits the append flag from 1 at the time of the dup(). Given the lack of bugs on cygwin, I would suggest that it could be a relatively safe change to make O_APPEND a per-descriptor flag.


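    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>
    int main(void) {
        int f, g;
        f = open("testfile", O_RDWR, S_IRUSR | S_IWUSR);
        g = dup(f); /* reconstructed: the posted listing never assigns g */
        fcntl(f, F_SETFL, fcntl(f, F_GETFL) | O_APPEND); /* O_APPEND set through f only */
        write(g, "x", 1); /* write through the dup; the output below shows three x's, so the original presumably wrote more than once */
    }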

    $ gcc test.c && echo 'no overwrite plz!' > testfile && ./a.exe && cat testfile
    xxxoverwrite plz!
