Here is a very simplified example (I'm not sure it is reproducible in every environment). First, there's a socket pair:
func SocketPair() (*os.File, *os.File, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	f0 := os.NewFile(uintptr(fds[0]), "socket-0")
	f1 := os.NewFile(uintptr(fds[1]), "socket-1")
	return f0, f1, nil
}
And a simple cmd call:
func main() {
	ctx := context.Background() // assumed; ctx was not defined in the original snippet
	f0, f1, err := utils.SocketPair()
	if err != nil {
		panic(err)
	}
	cmd := exec.CommandContext(ctx, "cat")
	cmd.Stdin = f0
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	// pipe routine: copy our stdin into the socket, then close the write end
	go func() {
		size, err := io.Copy(f1, os.Stdin)
		fmt.Printf("------res %d, %v", size, err)
		f1.Close()
	}()
	err = cmd.Run() // plain assignment: err is already declared above
	if err != nil {
		panic(err)
	}
}
So calling this with something like
echo "abc\ndef\nghi" | app
makes output similar to
------res 11, <nil>
abc
def
ghi
and hangs. This shows that the pipe routine successfully delivered the stdin data to the socket pair, yet the cmd's input still never sees EOF.
For this exact simple example, the issue (in my env) can be solved in two ways.
The first is to move the pipe routine to just before the cmd := exec. line.
The second is to leave the pipe routine at its initial position, but wait for it to start executing, as follows:
started := make(chan byte)
go func() {
	close(started)
	size, err := io.Copy(f1, os.Stdin)
	fmt.Printf("------res %d, %v", size, err)
	f1.Close()
}()
<-started
Both of these solutions resolve the issue, and the application exits gracefully.
Still, in more complex cases with deeper goroutine chains, even this doesn't help. Instead, a simple time.Sleep(time.Second) call just before cmd.Run() works.
It really looks like there's a race condition: the moment at which reading starts within io.Copy / cmd.Run matters a lot.
I don't want to solve the issue by playing with time.Sleep and searching for the optimal interval (which is a bad idea if this really is a race condition).
My crucial question is: what is the root cause of this behavior? Why does it matter who starts reading first?
Thanks
When searching for more details on syscall.Socketpair, I stumbled on this gist:
func Socketpair() (net.Conn, net.Conn, error) {
	fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	c1, err := fdToFileConn(fds[0])
	if err != nil {
		return nil, nil, err
	}
	c2, err := fdToFileConn(fds[1])
	if err != nil {
		c1.Close()
		return nil, nil, err
	}
	return c1, c2, err
}
func fdToFileConn(fd int) (net.Conn, error) {
	f := os.NewFile(uintptr(fd), "")
	defer f.Close()
	return net.FileConn(f)
}
Plugging this into your code sample fixes the issue on my Linux system.
Complete playground sample: https://go.dev/play/p/B24cowycU1G
Note: running it on the playground does not give the same behavior as on my machine (either the time package is tweaked in a way that hinders the timeouts, or interprocess signalling is just forbidden ...). If you copy/paste the code into a Go file on your machine, you should get:
$ go run foo.go
===== net.Conn pair
Hello World!
===== *os.File pair
Hello World!
panic: signal: killed # <- timeout triggered
goroutine 1 [running]:
main.main()
/tmp/foo.go:105 +0xdb
exit status 2
I haven't looked in detail at the differences between os.NewFile() and net.FileConn(). The first obvious difference I spotted is that os.NewFile() wraps the file descriptor using os.newFile(), while net.FileConn() uses net.newFileFD(), and the two functions have very different initialization sequences.
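One initialization difference that might be relevant, offered as a hypothesis rather than something I verified against your setup: as far as I can tell, net.FileConn() works on a duplicate of the descriptor that is marked close-on-exec, whereas descriptors returned by a plain syscall.Socketpair() call are not. If the child started by cmd.Run() inherits the write end (f1), closing f1 in the parent alone would not produce EOF on the read end, and the outcome would depend on whether f1 is closed before or after the child is forked. The sketch below only checks the flag; socketPairCloexec and hasCloexec are hypothetical helper names, and SOCK_CLOEXEC and the raw fcntl call are Linux-specific.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// socketPairCloexec is a hypothetical variant of SocketPair that asks the
// kernel to mark both descriptors close-on-exec (Linux-specific flag).
func socketPairCloexec() (*os.File, *os.File, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM|syscall.SOCK_CLOEXEC, 0)
	if err != nil {
		return nil, nil, err
	}
	f0 := os.NewFile(uintptr(fds[0]), "socket-0")
	f1 := os.NewFile(uintptr(fds[1]), "socket-1")
	return f0, f1, nil
}

// hasCloexec reports whether FD_CLOEXEC is set on the given descriptor.
func hasCloexec(fd uintptr) (bool, error) {
	flags, _, errno := syscall.Syscall(syscall.SYS_FCNTL, fd, syscall.F_GETFD, 0)
	if errno != 0 {
		return false, errno
	}
	return flags&syscall.FD_CLOEXEC != 0, nil
}

func main() {
	// Plain pair, created exactly as in the question.
	plain, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		panic(err)
	}
	defer syscall.Close(plain[0])
	defer syscall.Close(plain[1])
	plainFlag, err := hasCloexec(uintptr(plain[0]))
	if err != nil {
		panic(err)
	}

	// Pair created with SOCK_CLOEXEC requested up front.
	f0, f1, err := socketPairCloexec()
	if err != nil {
		panic(err)
	}
	defer f0.Close()
	defer f1.Close()
	cloexecFlag, err := hasCloexec(f0.Fd())
	if err != nil {
		panic(err)
	}

	fmt.Printf("plain pair close-on-exec: %v\n", plainFlag)
	fmt.Printf("SOCK_CLOEXEC pair close-on-exec: %v\n", cloexecFlag)
}
```

If the two lines differ on your machine, it may be worth trying the question's SocketPair() with SOCK_CLOEXEC added and seeing whether the hang goes away without any sleep or channel.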
My guess is that the hang you're seeing is a race condition in when the reading and writing ends of the pipe get set up. It's tricky because sometimes it works and sometimes it doesn't!
Here's why both of your solutions work:
Moving the pipe routine earlier: this gives the goroutine time to start up before cmd.Run() tries to read from it.
Using the channel sync:
started := make(chan byte)
go func() {
	close(started)
	// your copy code
}()
<-started
This makes sure your goroutine is actually running before moving on.
Instead of using time.Sleep(), you can coordinate the goroutine with a sync.WaitGroup, for example:
// Set up a WaitGroup to coordinate everything
var wg sync.WaitGroup
wg.Add(1)
go func() {
	defer wg.Done()
	defer f1.Close()
	size, err := io.Copy(f1, os.Stdin)
	fmt.Printf("------res %d, %v", size, err)
}()
err := cmd.Run()
wg.Wait()
The root issue is that we need to make sure the goroutine handling the pipe is ready before the command starts trying to read from it.
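For reference, here is a self-contained sketch of the WaitGroup pattern that can be run as-is. It rests on several assumptions that are not in your code: it runs on Linux with cat on the PATH, it reads from a string instead of os.Stdin so the result can be checked, and it creates the socket pair with SOCK_CLOEXEC so the child cannot inherit the write end. pipeThroughCat is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
	"strings"
	"sync"
	"syscall"
)

// pipeThroughCat is a hypothetical helper: it feeds input to `cat` over a
// socket pair and returns cat's output plus the number of bytes copied in.
func pipeThroughCat(input string) (string, int64, error) {
	// Assumption: SOCK_CLOEXEC (Linux) keeps the child from inheriting f1.
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM|syscall.SOCK_CLOEXEC, 0)
	if err != nil {
		return "", 0, err
	}
	f0 := os.NewFile(uintptr(fds[0]), "socket-0")
	f1 := os.NewFile(uintptr(fds[1]), "socket-1")

	var out bytes.Buffer
	cmd := exec.Command("cat")
	cmd.Stdin = f0
	cmd.Stdout = &out
	cmd.Stderr = os.Stderr

	var (
		wg      sync.WaitGroup
		size    int64
		copyErr error
	)
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer f1.Close() // closing the write end is what lets cat see EOF
		size, copyErr = io.Copy(f1, strings.NewReader(input))
	}()

	runErr := cmd.Run()
	f0.Close() // the parent's copy of the read end is no longer needed
	wg.Wait()  // size and copyErr are safe to read only after this returns
	if runErr != nil {
		return out.String(), size, runErr
	}
	return out.String(), size, copyErr
}

func main() {
	got, n, err := pipeThroughCat("abc\ndef\nghi\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("copied %d bytes, cat echoed %q\n", n, got)
}
```

The WaitGroup here guarantees that the copy goroutine has finished before its results are read; it does not by itself decide which side starts first. To feed real input, replace strings.NewReader(input) with os.Stdin.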