Fix bug 10504: match download's sourceurl mangling in updxlrator

Message ID 287ec1cd-00b7-b6b8-2fcd-13206ef08d5b@mail.com
State Dropped

Commit Message

Justin Luth Dec. 30, 2017, 1:12 a.m. UTC
Updatexlrator stores its files under a hash of the URL.

The download utility mangles the URL for [+/~], but
updxlrator only does it for [/]. Thus, download
stores the result under one hash, and updxlrator looks for it
under a different hash. The result is that the file is
re-downloaded every time by both the client and updxlrator.

This is fixed by making updxlrator mangle the url in the
same way as the downloader. apt-get install g++ would
be a good test for this.
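
To illustrate the mismatch described above, here is a minimal sketch. It
assumes a SHA-1 hex digest as the hash (the real hash function is not shown
in this thread), and the helper names and the example URL are made up for
illustration; only the three substitutions come from the patch itself.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);   # assumption: SHA-1 stands in for the real hash

# Hypothetical stand-in for the download utility's mangling: [+/~].
sub mangle_download {
    my ($url) = @_;
    $url =~ s@\%2b@+@ig;
    $url =~ s@\%2f@/@ig;
    $url =~ s@\%7e@~@ig;
    return $url;
}

# Hypothetical stand-in for updxlrator's mangling before this patch: [/] only.
sub mangle_updxlrator_old {
    my ($url) = @_;
    $url =~ s@\%2f@/@ig;
    return $url;
}

# Made-up URL for a package whose name contains an escaped "+".
my $url = 'http://mirror.example/pool/main/g/gcc/g%2b%2b_7.2.0_amd64.deb';

my $stored = sha1_hex(mangle_download($url));        # key used when storing
my $lookup = sha1_hex(mangle_updxlrator_old($url));  # key used when looking up

print "stored under: $stored\n";
print "looked up as: $lookup\n";
print $stored eq $lookup ? "cache hit\n" : "cache miss\n";
```

Because the two mangled strings differ, the two digests differ and the
lookup misses. Once both sides apply the same three substitutions, the keys
agree.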

Signed-off-by: Justin Luth  <jluth@mail.com>

---
I submitted the bug report and attached this patch three years
ago, but the maintainer of updxlrator - although he
incorporated it into his own ipcop package - has never
released it (for ipcop) or any of the other promised updates that
he has been working on (in ipfire).

I have a few more fixes for updxlrator that I want to submit,
if this process goes well.
---
  config/updxlrator/updxlrator | 2 ++
  1 file changed, 2 insertions(+)
  

Comments

Michael Tremer Dec. 31, 2017, 1:27 a.m. UTC | #1
Hello Justin,

and welcome to the team.

Thanks for your submission. Indeed we have not been touching update accelerator
much in the last years. Some people have been working on forks but never
submitted their changes and just uploaded them somewhere. Therefore I am happy
that you had a look.

On Fri, 2017-12-29 at 17:12 +0300, Justin Luth wrote:
> Updatexlrator stores its files in a hash of the URL.
> 
> The download utility mangles the URL for [+/~], but
> the updxlrator only does it for [/]. Thus, download
> stores the result as one hash, and updxlrator looks for it
> with a different hash. The result is that the file is
> re-downloaded every time by both the client, and updxlrator.

Wouldn't it be a better idea to generally escape/unescape the URLs? There is a
perl module that does that:

  http://search.cpan.org/dist/URI/lib/URI/Escape.pm

Your changes certainly make sense, but there are more characters that could
cause the same problem here.

Let me know if that would make sense, too.

Best,
-Michael

  
Justin Luth Dec. 31, 2017, 2:57 a.m. UTC | #2
On 12/30/2017 05:27 PM, Michael Tremer wrote:
> On Fri, 2017-12-29 at 17:12 +0300, Justin Luth wrote:
>> Updatexlrator stores its files in a hash of the URL.
>>
>> The download utility mangles the URL for [+/~], but
>> the updxlrator only does it for [/]. Thus, download
>> stores the result as one hash, and updxlrator looks for it
>> with a different hash. The result is that the file is
>> re-downloaded every time by both the client, and updxlrator.
> Wouldn't it be a better idea to generally escape/unescape the URLs? There is a
> perl module that does that:
>
>    http://search.cpan.org/dist/URI/lib/URI/Escape.pm
>
> Your changes certainly make sense, but there are more characters that could
> cause the same problem here.
>
> Let me know if that would make sense, too.
>
> Best,
> -Michael

I don't know what the impact of unescaping more characters would be. It
might negatively affect cached downloads - causing them to be
redownloaded. So I am reluctant to do anything other than fix the
obvious bug, especially since I'm not at all a Perl programmer.

Your portrayal of "my patch fixes some things, but others could cause
the same problem" isn't exactly accurate. I'm only fixing the "one
module does it one way, but the other module does it another way"
problem. In that sense, the two modules are now identical, and so there
are no more similar changes that can be made.

I'm not sure what "problem" was caused by these characters in the first
place, and why they needed to be unescaped. The safest thing is to
simply match the download mangling and leave it at that. That's all
that I'm comfortable changing - I'm happy if a real programmer takes
over from here and looks at the "duplicates" like bug 10344.

Updatexlrator is the main tool that keeps us using IPFire, so thanks for
looking at my patches. I have one more big one (bug 11558) that I've
seriously reworked today and want to test on my IPFire to confirm that
it still works as expected.

One consideration is that many other people may have made changes to
updxlrator (for adding additional caches, for example), and since it has
been unchanged for so long they might unexpectedly lose those
customizations. So a warning, at least, is due in the release notes. It
might also be good to just hold off on these patches until I get my 4th
one approved. In any case, I hope to test it heavily next week and get
it submitted.

Justin
  
Michael Tremer Jan. 8, 2018, 6:53 a.m. UTC | #3
Hello,

I just merged all the patches into next so they are queued for Core
Update 118 now.

Please have a look that everything got applied correctly.

And of course please test :)

Best,
-Michael

  

Patch

diff --git a/config/updxlrator/updxlrator b/config/updxlrator/updxlrator
index 2ddc6d8e4..b728902f6 100644
--- a/config/updxlrator/updxlrator
+++ b/config/updxlrator/updxlrator
@@ -345,7 +345,9 @@  sub check_cache
      my $sourceurl=$_[0];
      my $cfmirror=$_[4];

+    $sourceurl =~ s@\%2b@+@ig;
      $sourceurl =~ s@\%2f@/@ig;
+    $sourceurl =~ s@\%7e@~@ig;
      $updfile = substr($sourceurl,rindex($sourceurl,"/")+1);
      $updfile =~ s@\%20@ @ig;
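
For comparison, the general unescaping suggested in comment #1 could be
sketched roughly as follows. This is not part of the patch. The helper name
unescape_all and the example URL are hypothetical, and the single
substitution is a hand-written equivalent of what URI::Escape's
uri_unescape is understood to do, so that the sketch needs no non-core
module.

```perl
use strict;
use warnings;

# Hypothetical helper, not present in updxlrator: decode every %XX escape
# in one pass instead of listing individual characters.
sub unescape_all {
    my ($url) = @_;
    $url =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $url;
}

# Made-up URL containing escaped "~", "+" and space.
my $escaped = 'http://mirror.example/%7Euser/g%2B%2B%20docs.deb';

print unescape_all($escaped), "\n";
```

Unlike the patch, this also decodes escapes such as %20 inside $sourceurl
itself, so the string that gets hashed can change for URLs that are already
cached. That is the re-download risk raised in comment #2, and it would only
be safe if the download utility applied the identical decoding.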