Rsync system() execution in Perl fails randomly with protocol error

276 Views Asked by At

We commonly call rsync in (Mod)Perl (and PHP) and haven't run into too many issues but we are running into an occassional protocol error when running a command that when run on a subsequent request goes through fine. The funny thing is even if I retry in code, for the same http request, it will fail every time, but if you make another http request it will likely succeed.

Code looks like this:

$cmd = sprintf('rsync -auvvv --rsync-path="rsync --log-file=/tmp/ui-rsync.log" %s %s', "$fromDir/$fromBase/", "$path/$toBase/");
$exitCode = system($cmd);

The --rsync-path argument was added in there later for debugging. It was unhelpful. It fails either way.

The errors look like this:

rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6]

or like this:

unexpected tag 93 [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(1134) [receiver=3.0.6]
rsync: connection unexpectedly closed (9 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(600) [sender=3.0.6]

I debug the actual generated commands and I can run them by-hand fine.

The http user can run the commands fine.

Again, a programmatic retry never works, but a manually retry (hitting the same http endpoint that triggers it), works almost always.

Appreciate any help as this has been driving us crazy for...a long time, with many fixes tried.

2

There are 2 best solutions below

0
Shovas On BEST ANSWER

The solution was to change system() to backticks. Seriously. I don't know why it works.

The change is literally this:

# BAD:
$exitCode = system($cmd);
# GOOD:
`$cmd`;

If I had to guess I'd say there's some subtle difference with how the shell is being initialized, maybe some environment variables or memory locations not being cleaned properly. I really don't know, though.

1
Kjetil S. On

If this is a real heisenbug, you could retry rsync maybe three times with some sleep between:

for my $n (1..3){
   my $exitCode = system($cmd);
   my_log_sub("SUCCESS: rsync succeded on try $n") + last if $exitCode==0;
   my_log_sub("ERROR: rsync $n of 3 failed: $cmd $! $?");
   sleep(1) if $n<3;
}

Have you checked your local and remote logs? Try sudo ls -rtl /var/log/ or sudo ls -rtl /var/log/httpd/ right after a fail and tail -f /var/log/one_of_the_newest_logs while retrying.

Have you checked if the remote disk is full or if the directory exists? Firewall issue? Remote and local rsync or ssh versions are (very) different? (although I guess that should show a clearer error message)