Fix bug 12252: updxlrator - handle \ in filename

Message ID 9fdfa42e-cb04-be00-3605-1284d294f5c5@mail.com
State Dropped
Headers
Series Fix bug 12252: updxlrator - handle \ in filename |

Commit Message

Justin Luth Dec. 5, 2019, 1:07 p.m. UTC
  Recently some Microsoft patches are alternating between using
\ and / as directory separators, leaving files behind in
/var/updatecache/download that can only be removed in a terminal.

What was happening was that the lock file was created using the
URL name containing %5c while wget was actually writing to a filename
with a real backslash. The zero-sized lockfile was moved into the uuid
folder, and the actual .cab file was orphaned in the download folder,
indicated in the GUI as a stranded file that couldn't be deleted.

An alternate way to fix the problem could be to force wget to
use --restrict-file-names=windows - in which case \ | : ? " * < >
would all be saved as escaped codes. That would seem to better
match the URL mangling that apparently is happening.
Cons:
-removing sourceurl mangling means re-downloading existing caches.
-less human-readable URLs / filesnames
-conceptual alternative: I didn't prove/test this theory since it
means re-downloading all oddly-named files.

P.S. I didn't change perl's utime command to use updatefile_shellEscaped
since the filepath is in quotes - the space didn't actually need to be
escaped.

---

  config/updxlrator/download   | 20 ++++++++++++++++----
  config/updxlrator/updxlrator |  2 ++
  2 files changed, 18 insertions(+), 4 deletions(-)


--
2.17.1
  

Comments

Michael Tremer Dec. 6, 2019, 6:55 a.m. UTC | #1
Hi,


Thank you very much for looking into this.

This seems to be a tricky one.

> On 5 Dec 2019, at 13:07, Justin Luth <jluth@mail.com> wrote:
> 
> Recently some Microsoft patches are alternating between using
> \ and / as directory separators, leaving files behind in
> /var/updatecache/download that can only be removed in a terminal.
> 
> What was happening was that the lock file was created using the
> URL name containing %5c while wget was actually writing to a filename
> with a real backslash. The zero-sized lockfile was moved into the uuid
> folder, and the actual .cab file was orphaned in the download folder,
> indicated in the GUI as a stranded file that couldn't be deleted.
> 
> An alternate way to fix the problem could be to force wget to
> use --restrict-file-names=windows - in which case \ | : ? " * < >
> would all be saved as escaped codes. That would seem to better
> match the URL mangling that apparently is happening.
> Cons:
> -removing sourceurl mangling means re-downloading existing caches.
> -less human-readable URLs / filesnames
> -conceptual alternative: I didn't prove/test this theory since it
> means re-downloading all oddly-named files.
> 
> P.S. I didn't change perl's utime command to use updatefile_shellEscaped
> since the filepath is in quotes - the space didn't actually need to be
> escaped.
> 
> ---
> 
>  config/updxlrator/download   | 20 ++++++++++++++++----
>  config/updxlrator/updxlrator |  2 ++
>  2 files changed, 18 insertions(+), 4 deletions(-)
> 
> diff --git a/config/updxlrator/download b/config/updxlrator/download
> index afa6e6cb9..df4a58725 100644
> --- a/config/updxlrator/download
> +++ b/config/updxlrator/download
> @@ -26,6 +26,7 @@ my @http_header=();
>  my $remote_size=0;
>  my $remote_mtime=0;
>  my $updatefile='';
> +my $updatefile_shellEscaped='';  #double escaped for re-processing by
> shell/system calls
>  my $unique=0;
>  my $mirror=1;
> 
> @@ -38,11 +39,24 @@ my $restartdl = defined($ARGV[3]) ? $ARGV[3] : 0;
> 
>  umask(0002);
> 
> +# Hopefully correct documentation: based on debugging a "\" problem...
> +# WGET doesn't use the filename from the URL, but gets the filename
> from the server info.
> +#   -mangles that name according to --restrict-file-names.
> +#   -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
> +#      -(which suggests the entry for %2f is incorrect BTW)
> +# The URL passed to this module is mangled (correct? - perhaps by
> http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
> +#   -but $updatefile needs to match wget's download name, regardless of
> the URL's encoding.
> +#       -achievable either by mangling the URL and/or the filename.
> +#       -*make sure "updxlrator" has the same adjustments*
>  $sourceurl =~ s@\%2b@+@ig;
>  $sourceurl =~ s@\%2f@/@ig;
>  $sourceurl =~ s@\%7e@~@ig;
>  $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
> +$updatefile_shellEscaped = $updatefile;
> +$updatefile_shellEscaped =~ s@\%20@\\ @ig;
> +$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
>  $updatefile =~ s@\%20@ @ig;
> +$updatefile =~ s@\%5c@\\@ig;
>  $vendorid =~ tr/A-Z/a-z/;

This is really messy code. Not yours. What has been here before.

It is quite hard to figure out what was actually supposed to happen here. The replacements look more or less like fixes instead of thinking ahead and decoding any special characters.

>  unless (-d "$repository/download/$vendorid")
> @@ -58,7 +72,7 @@ if($restartdl == 0)
> 
>      # hinder multiple downloads from starting simultaneously. Create
> empty "lock" file.
>      # TODO: Another thread may sneak in between these two commands -
> so not fool-proof, but good enough?
> -    system("touch $repository/download/$vendorid/$updatefile");
> +    system("touch
> $repository/download/$vendorid/$updatefile_shellEscaped");

However, this is all needed because we are calling shell commands here to move a file.

I suppose that that is quite dangerous because any forged filenames could be used to run any shell commands.

If we would use perl’s move() function, then we could avoid escaping the characters here. They should all be allowed to be used in the file system:

  https://perldoc.perl.org/File/Copy.html#move

Would you like to try that option?

> 
>  }
>  else
> @@ -169,11 +183,9 @@ if ($_ == 0)
>      }
> 
>      &writelog("Moving file to the cache directory: $vendorid/$uuid");
> -    $updatefile =~ s@ @\\ @ig;
> -    system("mv $repository/download/$vendorid/$updatefile
> $repository/$vendorid/$uuid");
> +    system("mv $repository/download/$vendorid/$updatefile_shellEscaped
> $repository/$vendorid/$uuid");

Same here.

>      # Workaround for IPCop's mv bug:
>      utime time,$remote_mtime,"$repository/$vendorid/$uuid/$updatefile";
> -    $updatefile =~ s@\\ @ @ig;
> 
> &UPDXLT::setcachestatus("$repository/$vendorid/$uuid/source.url",$sourceurl);
> &UPDXLT::setcachestatus("$repository/$vendorid/$uuid/status",$UPDXLT::sfOk);
> diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator
> index cdc7eeb50..1febae51e 100644
> --- a/config/updxlrator/updxlrator
> +++ b/config/updxlrator/updxlrator
> @@ -338,11 +338,13 @@ sub check_cache
>      my $sourceurl=$_[0];
>      my $cfmirror=$_[4];
> 
> +    # Any changes made here must be mirrored in "download"
>      $sourceurl =~ s@\%2b@+@ig;
>      $sourceurl =~ s@\%2f@/@ig;
>      $sourceurl =~ s@\%7e@~@ig;
>      $updfile = substr($sourceurl,rindex($sourceurl,"/")+1);
>      $updfile =~ s@\%20@ @ig;
> +    $updfile =~ s@\%5c@\\@ig;

I think we will still need this one then, don’t we?

> 
>      if ($cfmirror)
>      {
> 
> —
> 2.17.1
> 

-Michael
  
Justin Luth Dec. 9, 2019, 5:57 p.m. UTC | #2
My minimally updated patch is re-included at the end of the message, and
comments are responded to inline with the original context. I just
tweaked some comment messages, and replaced a system call that I
introduced recently with a perl-native implementation.

On 12/6/19 9:55 AM, Michael Tremer wrote:
> This seems to be a tricky one.
>
> +# Hopefully correct documentation: based on debugging a "\" problem...
> +# WGET doesn't use the filename from the URL, but gets the filename
> from the server info.
> +#   -mangles that name according to --restrict-file-names.
> +#   -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
> +#      -(which suggests the entry for %2f is incorrect BTW)
> +# The URL passed to this module is mangled (correct? - perhaps by
> http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
> +#   -but $updatefile needs to match wget's download name, regardless of
> the URL's encoding.
> +#       -achievable either by mangling the URL and/or the filename.
> +#       -*make sure "updxlrator" has the same adjustments*
>   $sourceurl =~ s@\%2b@+@ig;
>   $sourceurl =~ s@\%2f@/@ig;
>   $sourceurl =~ s@\%7e@~@ig;
>   $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
> +$updatefile_shellEscaped = $updatefile;
> +$updatefile_shellEscaped =~ s@\%20@\\ @ig;
> +$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
>   $updatefile =~ s@\%20@ @ig;
> +$updatefile =~ s@\%5c@\\@ig;
>   $vendorid =~ tr/A-Z/a-z/;
>
> It is quite hard to figure out what was actually supposed to happen here. The replacements look more or less like fixes instead of thinking ahead and decoding any special characters.
Maybe, but most of the escaped characters are really weird for use in
filenames, so likely the approach was "instead of decoding every
possibility, lets just do the ones that are actually in use."  And in 7
years I haven't noticed backslashes being a problem - but suddenly a new
patch URL has popped up (wrongly) requesting a backslash. So now we need
to deal with that situation or else have our ipfire cache look
increasingly messy.
>
>>   unless (-d "$repository/download/$vendorid")
>> @@ -58,7 +72,7 @@ if($restartdl == 0)
>>
>>       # hinder multiple downloads from starting simultaneously. Create
>> empty "lock" file.
>>       # TODO: Another thread may sneak in between these two commands -
>> so not fool-proof, but good enough?
>> -    system("touch $repository/download/$vendorid/$updatefile");
>> +    system("touch
>> $repository/download/$vendorid/$updatefile_shellEscaped");
> However, this is all needed because we are calling shell commands here to move a file.
Not ALL of this is needed because of shell commands.  Only the duplicate
_shellEscaped variable is needed for the shell commands. The addition of

$updatefile =~ s@\%5c@\\@ig;

is required regardless of using shell commands or not.
> If we would use perl’s move() function, then we could avoid escaping the characters here. They should all be allowed to be used in the file system:
>
>    https://perldoc.perl.org/File/Copy.html#move
>
> Would you like to try that option?
Not really, since I'm just a bugfixer and not a perl programmer. But I
did try it, and it (silently) failed to work.  I also tried to use
make_path instead of mkdir -p and that also silently failed. So I
abandoned the attempt to remove all system calls. I'm also not eager to
introduce regressions, and that is exactly what would happen if I
started playing with things that I don't understand and don't know how
to debug.
>
> -Michael

Recently some Microsoft patches are alternating between using
\ and / as directory separators, leaving files behind in
updatecache/download that can only be removed in a terminal.

What was happening was that the lock file was created using the
URL name containing %5c while wget was actually writing to a filename
with a real backslash. The zero-sized lockfile was moved into the uuid
folder, and the actual .cab file was orphaned in the download folder,
indicated in the GUI as a stranded file that couldn't be deleted.

An alternate way to fix the problem could be to force wget to
use --restrict-file-names=windows - in which case \ | : ? " * < >
would all be saved as escaped codes. That would seem to better
match the URL mangling that apparently is happening.
Cons:
-removing sourceurl mangling means re-downloading existing caches.
-less human-readable URLs / filenames
-conceptual alternative: I didn't prove/test this theory since it
means re-downloading all oddly-named files.

P.S. I didn't change perl's utime command to use updatefile_shellEscaped
since the filepath is in quotes - the space didn't actually need to be
escaped.
---
  config/updxlrator/download   | 21 +++++++++++++++++----
  config/updxlrator/updxlrator |  2 ++
  2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/config/updxlrator/download b/config/updxlrator/download
index afa6e6cb9..ab89171e4 100644
--- a/config/updxlrator/download
+++ b/config/updxlrator/download
@@ -26,6 +26,7 @@ my @http_header=();
  my $remote_size=0;
  my $remote_mtime=0;
  my $updatefile='';
+my $updatefile_shellEscaped='';  #double escaped for re-processing by shell/system calls
  my $unique=0;
  my $mirror=1;

@@ -38,11 +39,24 @@ my $restartdl = defined($ARGV[3]) ? $ARGV[3] : 0;

  umask(0002);

+# Hopefully correct documentation: based on debugging a "\" problem...
+# WGET doesn't use the filename from the URL, but gets the filename from the server info.
+#   -mangles that name according to --restrict-file-names.
+#   -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
+#      -(which suggests the entry for %2f is incorrect BTW)
+# The URL passed to this module is mangled (correct? - perhaps by http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
+#   -but $updatefile needs to match wget's download name, regardless of the URL's encoding.
+#   	-achievable either by mangling the URL and/or the filename.
+#   	-*make sure "updxlrator" has the same adjustments*
  $sourceurl =~ s@\%2b@+@ig;
  $sourceurl =~ s@\%2f@/@ig;
  $sourceurl =~ s@\%7e@~@ig;
  $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
+$updatefile_shellEscaped = $updatefile;
+$updatefile_shellEscaped =~ s@\%20@\\ @ig;
+$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
  $updatefile =~ s@\%20@ @ig;
+$updatefile =~ s@\%5c@\\@ig;    #added to fix ipfire bug 12252
  $vendorid =~ tr/A-Z/a-z/;

  unless (-d "$repository/download/$vendorid")
@@ -58,8 +72,9 @@ if($restartdl == 0)

  	# hinder multiple downloads from starting simultaneously. Create empty "lock" file.
  	# TODO: Another thread may sneak in between these two commands - so not fool-proof, but good enough?
-	system("touch $repository/download/$vendorid/$updatefile");

+	#use a native perl implementation of:  system("touch $repository/download/$vendorid/$updatefile_shellEscaped");
+	{ open my $lock_file, ">>", "$repository/download/$vendorid/$updatefile" }
  }
  else
  {
@@ -169,11 +184,9 @@ if ($_ == 0)
  	}

  	&writelog("Moving file to the cache directory: $vendorid/$uuid");
-	$updatefile =~ s@ @\\ @ig;
-	system("mv $repository/download/$vendorid/$updatefile $repository/$vendorid/$uuid");
+	system("mv $repository/download/$vendorid/$updatefile_shellEscaped $repository/$vendorid/$uuid");
  	# Workaround for IPCop's mv bug:
  	utime time,$remote_mtime,"$repository/$vendorid/$uuid/$updatefile";
-	$updatefile =~ s@\\ @ @ig;

  	&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/source.url",$sourceurl);
  	&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/status",$UPDXLT::sfOk);
diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator
index cdc7eeb50..1febae51e 100644
--- a/config/updxlrator/updxlrator
+++ b/config/updxlrator/updxlrator
@@ -338,11 +338,13 @@ sub check_cache
  	my $sourceurl=$_[0];
  	my $cfmirror=$_[4];

+	# Any changes made here must be mirrored in "download"
  	$sourceurl =~ s@\%2b@+@ig;
  	$sourceurl =~ s@\%2f@/@ig;
  	$sourceurl =~ s@\%7e@~@ig;
  	$updfile = substr($sourceurl,rindex($sourceurl,"/")+1);
  	$updfile =~ s@\%20@ @ig;
+	$updfile =~ s@\%5c@\\@ig;

  	if ($cfmirror)
  	{
--
2.17.1
  
Michael Tremer Dec. 13, 2019, 10:45 p.m. UTC | #3
Hi,

Sorry for not replying earlier.

It is the run-up to Christmas and I am quite busy with a thousand little things. So stay tuned if you do not hear from me very swiftly.

> On 9 Dec 2019, at 17:57, Justin Luth <jluth@mail.com> wrote:
> 
> My minimally updated patch is re-included at the end of the message, and
> comments are responded to inline with the original context. I just
> tweaked some comment messages, and replaced a system call that I
> introduced recently with a perl-native implementation.
> 
> On 12/6/19 9:55 AM, Michael Tremer wrote:
>> This seems to be a tricky one.
>> 
>> +# Hopefully correct documentation: based on debugging a "\" problem...
>> +# WGET doesn't use the filename from the URL, but gets the filename
>> from the server info.
>> +#   -mangles that name according to --restrict-file-names.
>> +#   -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
>> +#      -(which suggests the entry for %2f is incorrect BTW)
>> +# The URL passed to this module is mangled (correct? - perhaps by
>> http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
>> +#   -but $updatefile needs to match wget's download name, regardless of
>> the URL's encoding.
>> +#       -achievable either by mangling the URL and/or the filename.
>> +#       -*make sure "updxlrator" has the same adjustments*
>>  $sourceurl =~ s@\%2b@+@ig;
>>  $sourceurl =~ s@\%2f@/@ig;
>>  $sourceurl =~ s@\%7e@~@ig;
>>  $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
>> +$updatefile_shellEscaped = $updatefile;
>> +$updatefile_shellEscaped =~ s@\%20@\\ @ig;
>> +$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
>>  $updatefile =~ s@\%20@ @ig;
>> +$updatefile =~ s@\%5c@\\@ig;
>>  $vendorid =~ tr/A-Z/a-z/;
>> 
>> It is quite hard to figure out what was actually supposed to happen here. The replacements look more or less like fixes instead of thinking ahead and decoding any special characters.
> Maybe, but most of the escaped characters are really weird for use in
> filenames, so likely the approach was "instead of decoding every
> possibility, lets just do the ones that are actually in use."  And in 7
> years I haven't noticed backslashes being a problem - but suddenly a new
> patch URL has popped up (wrongly) requesting a backslash. So now we need
> to deal with that situation or else have our ipfire cache look
> increasingly messy.

Yeah it looks like it, but try and error isn’t always the best approach for software development.

>> 
>>>  unless (-d "$repository/download/$vendorid")
>>> @@ -58,7 +72,7 @@ if($restartdl == 0)
>>> 
>>>      # hinder multiple downloads from starting simultaneously. Create
>>> empty "lock" file.
>>>      # TODO: Another thread may sneak in between these two commands -
>>> so not fool-proof, but good enough?
>>> -    system("touch $repository/download/$vendorid/$updatefile");
>>> +    system("touch
>>> $repository/download/$vendorid/$updatefile_shellEscaped");
>> However, this is all needed because we are calling shell commands here to move a file.
> Not ALL of this is needed because of shell commands.  Only the duplicate
> _shellEscaped variable is needed for the shell commands. The addition of
> 
> $updatefile =~ s@\%5c@\\@ig;
> 
> is required regardless of using shell commands or not.
>> If we would use perl’s move() function, then we could avoid escaping the characters here. They should all be allowed to be used in the file system:
>> 
>>   https://perldoc.perl.org/File/Copy.html#move
>> 
>> Would you like to try that option?
> Not really, since I'm just a bugfixer and not a perl programmer. But I
> did try it, and it (silently) failed to work.  I also tried to use
> make_path instead of mkdir -p and that also silently failed. So I
> abandoned the attempt to remove all system calls. I'm also not eager to
> introduce regressions, and that is exactly what would happen if I
> started playing with things that I don't understand and don't know how
> to debug.

LOL, nobody is eager to introduce any regressions :)

If you have IO operations like mkdir() you can catch any errors with the “die” keyword:

  mkdir(“$path/file”) or die “Could not create directory: $!”;

That will cause the script to abort and log the message.

>> 
>> -Michael
> 
> Recently some Microsoft patches are alternating between using
> \ and / as directory separators, leaving files behind in
> updatecache/download that can only be removed in a terminal.
> 
> What was happening was that the lock file was created using the
> URL name containing %5c while wget was actually writing to a filename
> with a real backslash. The zero-sized lockfile was moved into the uuid
> folder, and the actual .cab file was orphaned in the download folder,
> indicated in the GUI as a stranded file that couldn't be deleted.
> 
> An alternate way to fix the problem could be to force wget to
> use --restrict-file-names=windows - in which case \ | : ? " * < >
> would all be saved as escaped codes. That would seem to better
> match the URL mangling that apparently is happening.
> Cons:
> -removing sourceurl mangling means re-downloading existing caches.
> -less human-readable URLs / filenames
> -conceptual alternative: I didn't prove/test this theory since it
> means re-downloading all oddly-named files.

I would consider that that is a change that has more implications for the rest of the code which is not robust enough against it.

Best,
-Michael

> 
> P.S. I didn't change perl's utime command to use updatefile_shellEscaped
> since the filepath is in quotes - the space didn't actually need to be
> escaped.
> ---
> config/updxlrator/download   | 21 +++++++++++++++++----
> config/updxlrator/updxlrator |  2 ++
> 2 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/config/updxlrator/download b/config/updxlrator/download
> index afa6e6cb9..ab89171e4 100644
> --- a/config/updxlrator/download
> +++ b/config/updxlrator/download
> @@ -26,6 +26,7 @@ my @http_header=();
> my $remote_size=0;
> my $remote_mtime=0;
> my $updatefile='';
> +my $updatefile_shellEscaped='';  #double escaped for re-processing by shell/system calls
> my $unique=0;
> my $mirror=1;
> 
> @@ -38,11 +39,24 @@ my $restartdl = defined($ARGV[3]) ? $ARGV[3] : 0;
> 
> umask(0002);
> 
> +# Hopefully correct documentation: based on debugging a "\" problem...
> +# WGET doesn't use the filename from the URL, but gets the filename from the server info.
> +#   -mangles that name according to --restrict-file-names.
> +#   -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
> +#      -(which suggests the entry for %2f is incorrect BTW)
> +# The URL passed to this module is mangled (correct? - perhaps by http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
> +#   -but $updatefile needs to match wget's download name, regardless of the URL's encoding.
> +#   	-achievable either by mangling the URL and/or the filename.
> +#   	-*make sure "updxlrator" has the same adjustments*
> $sourceurl =~ s@\%2b@+@ig;
> $sourceurl =~ s@\%2f@/@ig;
> $sourceurl =~ s@\%7e@~@ig;
> $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
> +$updatefile_shellEscaped = $updatefile;
> +$updatefile_shellEscaped =~ s@\%20@\\ @ig;
> +$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
> $updatefile =~ s@\%20@ @ig;
> +$updatefile =~ s@\%5c@\\@ig;    #added to fix ipfire bug 12252
> $vendorid =~ tr/A-Z/a-z/;
> 
> unless (-d "$repository/download/$vendorid")
> @@ -58,8 +72,9 @@ if($restartdl == 0)
> 
> 	# hinder multiple downloads from starting simultaneously. Create empty "lock" file.
> 	# TODO: Another thread may sneak in between these two commands - so not fool-proof, but good enough?
> -	system("touch $repository/download/$vendorid/$updatefile");
> 
> +	#use a native perl implementation of:  system("touch $repository/download/$vendorid/$updatefile_shellEscaped");
> +	{ open my $lock_file, ">>", "$repository/download/$vendorid/$updatefile" }
> }
> else
> {
> @@ -169,11 +184,9 @@ if ($_ == 0)
> 	}
> 
> 	&writelog("Moving file to the cache directory: $vendorid/$uuid");
> -	$updatefile =~ s@ @\\ @ig;
> -	system("mv $repository/download/$vendorid/$updatefile $repository/$vendorid/$uuid");
> +	system("mv $repository/download/$vendorid/$updatefile_shellEscaped $repository/$vendorid/$uuid");
> 	# Workaround for IPCop's mv bug:
> 	utime time,$remote_mtime,"$repository/$vendorid/$uuid/$updatefile";
> -	$updatefile =~ s@\\ @ @ig;
> 
> 	&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/source.url",$sourceurl);
> 	&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/status",$UPDXLT::sfOk);
> diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator
> index cdc7eeb50..1febae51e 100644
> --- a/config/updxlrator/updxlrator
> +++ b/config/updxlrator/updxlrator
> @@ -338,11 +338,13 @@ sub check_cache
> 	my $sourceurl=$_[0];
> 	my $cfmirror=$_[4];
> 
> +	# Any changes made here must be mirrored in "download"
> 	$sourceurl =~ s@\%2b@+@ig;
> 	$sourceurl =~ s@\%2f@/@ig;
> 	$sourceurl =~ s@\%7e@~@ig;
> 	$updfile = substr($sourceurl,rindex($sourceurl,"/")+1);
> 	$updfile =~ s@\%20@ @ig;
> +	$updfile =~ s@\%5c@\\@ig;
> 
> 	if ($cfmirror)
> 	{
> --
> 2.17.1
>
  

Patch

diff --git a/config/updxlrator/download b/config/updxlrator/download
index afa6e6cb9..df4a58725 100644
--- a/config/updxlrator/download
+++ b/config/updxlrator/download
@@ -26,6 +26,7 @@  my @http_header=();
  my $remote_size=0;
  my $remote_mtime=0;
  my $updatefile='';
+my $updatefile_shellEscaped='';  #double escaped for re-processing by
shell/system calls
  my $unique=0;
  my $mirror=1;

@@ -38,11 +39,24 @@  my $restartdl = defined($ARGV[3]) ? $ARGV[3] : 0;

  umask(0002);

+# Hopefully correct documentation: based on debugging a "\" problem...
+# WGET doesn't use the filename from the URL, but gets the filename
from the server info.
+#   -mangles that name according to --restrict-file-names.
+#   -for unix, only mangles "/", so +, ~, \ etc. are not escaped/mangled.
+#      -(which suggests the entry for %2f is incorrect BTW)
+# The URL passed to this module is mangled (correct? - perhaps by
http://search.cpan.org/dist/URI/lib/URI/Escape.pm)
+#   -but $updatefile needs to match wget's download name, regardless of
the URL's encoding.
+#       -achievable either by mangling the URL and/or the filename.
+#       -*make sure "updxlrator" has the same adjustments*
  $sourceurl =~ s@\%2b@+@ig;
  $sourceurl =~ s@\%2f@/@ig;
  $sourceurl =~ s@\%7e@~@ig;
  $updatefile = substr($sourceurl,rindex($sourceurl,"/")+1);
+$updatefile_shellEscaped = $updatefile;
+$updatefile_shellEscaped =~ s@\%20@\\ @ig;
+$updatefile_shellEscaped =~ s@\%5c@\\\\@ig;
  $updatefile =~ s@\%20@ @ig;
+$updatefile =~ s@\%5c@\\@ig;
  $vendorid =~ tr/A-Z/a-z/;

  unless (-d "$repository/download/$vendorid")
@@ -58,7 +72,7 @@  if($restartdl == 0)

      # hinder multiple downloads from starting simultaneously. Create
empty "lock" file.
      # TODO: Another thread may sneak in between these two commands -
so not fool-proof, but good enough?
-    system("touch $repository/download/$vendorid/$updatefile");
+    system("touch
$repository/download/$vendorid/$updatefile_shellEscaped");

  }
  else
@@ -169,11 +183,9 @@  if ($_ == 0)
      }

      &writelog("Moving file to the cache directory: $vendorid/$uuid");
-    $updatefile =~ s@ @\\ @ig;
-    system("mv $repository/download/$vendorid/$updatefile
$repository/$vendorid/$uuid");
+    system("mv $repository/download/$vendorid/$updatefile_shellEscaped
$repository/$vendorid/$uuid");
      # Workaround for IPCop's mv bug:
      utime time,$remote_mtime,"$repository/$vendorid/$uuid/$updatefile";
-    $updatefile =~ s@\\ @ @ig;

&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/source.url",$sourceurl);
&UPDXLT::setcachestatus("$repository/$vendorid/$uuid/status",$UPDXLT::sfOk);
diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator
index cdc7eeb50..1febae51e 100644
--- a/config/updxlrator/updxlrator
+++ b/config/updxlrator/updxlrator
@@ -338,11 +338,13 @@  sub check_cache
      my $sourceurl=$_[0];
      my $cfmirror=$_[4];

+    # Any changes made here must be mirrored in "download"
      $sourceurl =~ s@\%2b@+@ig;
      $sourceurl =~ s@\%2f@/@ig;
      $sourceurl =~ s@\%7e@~@ig;
      $updfile = substr($sourceurl,rindex($sourceurl,"/")+1);
      $updfile =~ s@\%20@ @ig;
+    $updfile =~ s@\%5c@\\@ig;

      if ($cfmirror)
      {