connscheduler.cgi: Remove cleanhtml command from Remark

Message ID 20240305184418.3431741-1-adolf.belka@ipfire.org
State Dropped
Series connscheduler.cgi: Remove cleanhtml command from Remark

Commit Message

Adolf Belka March 5, 2024, 6:44 p.m. UTC
- Using cleanhtml on Remarks means that all characters with diacritical marks, such as
  umlauts or grave accents, get encoded into other characters.
- If Freifunk München e.V. is entered as a remark it gets converted to
  Freifunk MÃ¼nchen e.V.
- cleanhtml is only removed from Remarks or Comment fields. In other places it has been
  left in place.

Tested-by: Adolf Belka <adolf.belka@ipfire.org>
Signed-off-by: Adolf Belka <adolf.belka@ipfire.org>
---
 html/cgi-bin/connscheduler.cgi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Michael Tremer March 6, 2024, 9:28 p.m. UTC | #1
Hello Adolf,

I believe that I cannot merge these patches.

The reason simply is that it would create a stored cross-site scripting attack vector, because someone could store some <script> tags with some JS which will be executed if another admin opens the same page.

That is why we escape the content so that if there are any special characters like <> for HTML tags they won’t be interpreted by the browser.

We might have problems where we accidentally call the cleanhtml function twice which should show garbage. We might also have a problem where the function is not giving us the output that we want.
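
As a rough illustration of both points (escaping stored markup, and what a second escaping pass does), here is a minimal Perl sketch. It calls HTML::Entities::encode_entities directly as a stand-in for cleanhtml, which is an assumption based on the description later in this thread; the remark string is hypothetical.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical remark submitted by an attacker.
my $remark = '<script>alert(1)</script>';

# Escaped once: the browser renders the markup as text instead of running it.
my $once = encode_entities($remark);
print "$once\n";     # &lt;script&gt;alert(1)&lt;/script&gt;

# Escaped twice: the ampersands produced by the first pass are escaped again,
# so the page would show literal "&lt;" sequences instead of "<".
my $twice = encode_entities($once);
print "$twice\n";    # &amp;lt;script&amp;gt;alert(1)&amp;lt;/script&amp;gt;
```

The second line of output is the kind of garbage a double call would produce for markup characters.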

Which strings have been causing problems? Just German umlauts like “äöü”?

-Michael

  
Adolf Belka March 6, 2024, 10:23 p.m. UTC | #2
Hi Michael,

On 06/03/2024 22:28, Michael Tremer wrote:
> Hello Adolf,
> 
> I believe that I cannot merge these patches.
Then you also need to look back at the dns.cgi patch, the bug fix for
German umlauts being changed. The acceptance of that patch is what made
me create these patches, as they all had the same problem with remarks.
If this one can't be accepted as is, then that patch needs to be
reverted.

https://git.ipfire.org/?p=ipfire-2.x.git;a=commit;h=7c6ff5ff12331a53f416080a44c8d6145e78bfac
> 
> The reason simply is that it would create a stored cross-site scripting attack vector, because someone could store some <script> tags with some JS which will be executed if another admin opens the same page.
> 
> That is why we escape the content so that if there are any special characters like <> for HTML tags they won’t be interpreted by the browser.
> 
> We might have problems where we accidentally call the cleanhtml function twice which should show garbage. We might also have a problem where the function is not giving us the output that we want.
> 
> Which strings have been causing problems? Just German umlauts like “äöü”?
It is any character that has an accent or other diacritical mark.

I just tried entering the following into the remark section for a dns 
server entry as an example
Ä ã ö â á à

and the remark was changed to
Ä ã ö â á Ã

and if I edit the entry but don't change the remark the new characters 
above get changed again into
Ä ã ö â á ÃÂ


I would have expected that one run through cleanhtml would produce
characters that are considered safe. Instead it seems that the encoding
creates characters that cleanhtml treats as unsafe on the next run and
encodes again, and the result of that is in turn still considered unsafe.
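
The growth described above can be reproduced outside the WUI. The following is only a sketch under assumptions: it uses HTML::Entities::encode_entities in place of cleanhtml, and it models the browser round trip with decode_entities plus utf8::encode, which is a simplification of what a real browser does.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities decode_entities);

# UTF-8 bytes of "à" as an undecoded CGI parameter: two bytes, not one character.
my $bytes = "\xC3\xA0";

# First save: each byte is treated as a Latin-1 character and gets its own entity.
my $first = encode_entities($bytes);
print "first save:  $first\n";    # &Atilde;&nbsp;

# The browser shows those two entities as two characters. When the entry is
# saved again unchanged, they come back as UTF-8, which is now four bytes.
my $resent = decode_entities($first);
utf8::encode($resent);

# Second save: four bytes in, four entities out, so the remark keeps growing.
my $second = encode_entities($resent);
print "second save: $second\n";
printf "bytes sent: %d then %d\n", length($bytes), length($resent);    # 2 then 4
```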

If cleanhtml has to stay in place for the remark entries as well, then
I have no idea how to get German umlauts and other accented characters
displayed correctly. They are all non-ASCII (high-bit) characters, and
the cleanhtml process encodes those by default as unsafe. So I would
either need some help on how to deal with this, or maybe someone else
needs to pick up the original bug #12395.

The only thing I found is that cleanhtml calls the escape function,
which in turn calls HTML::Entities::encode_entities. That function
accepts an optional second argument defining which characters are
considered unsafe. To use it I would need some guidance on which
characters count as unsafe, and on whether that set applies to every
invocation of cleanhtml. That is, should a modified cleanhtml that
passes the extra argument to HTML::Entities::encode_entities be used
for only some of the cleanhtml calls, and if so, would that be only the
remark/comment entries?
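
For reference, the optional second argument mentioned above is a string in regex character-class syntax. The sketch below compares the default behaviour with an explicit set; the set '<>&"' is the one used in the HTML::Entities documentation and is only an illustration here, not a recommendation for which characters IPFire should treat as unsafe.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Undecoded UTF-8 bytes for "München <b>", standing in for a submitted remark.
my $remark = "M\xC3\xBCnchen <b>";

# Default set: markup characters and every high-bit byte are encoded.
my $default = encode_entities($remark);
print "$default\n";     # M&Atilde;&frac14;nchen &lt;b&gt;

# Explicit set: only the listed markup characters are encoded;
# the two UTF-8 bytes of the umlaut pass through unchanged.
my $explicit = encode_entities($remark, '<>&"');
print "$explicit\n";
```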

Regards,
Adolf.

  
Adolf Belka March 7, 2024, 11:18 a.m. UTC | #3
Hi Michael,

I think I know how to solve the problem.

I tested out using HTML::Entities::encode_entities in a very simple Perl 
program and found I got the same type of entity encoding as in the WUI 
CGI pages.

However, if I treated the string of characters as utf8 then the 
HTML::Entities::encode_entities gave the results expected.

So I need to figure out how to treat the remark strings as utf8, and
hopefully that will fix the problem. At least I now have a path forward
on this issue that keeps the protection of the cleanhtml command while
also allowing characters with diacritical marks, plus other characters
such as the Cyrillic alphabet and the German Eszett, all of which
currently get mangled.

Will let you know how I get on.

Additionally I will also later on create patches for the WUI CGI pages 
for the Firewall Groups and for WIO as they do not use the cleanhtml 
command at all yet they also have many Remark entries. I will also check 
out the other WUI pages that don't use the cleanhtml command to see if 
they have remarks etc that should use it.

Regards,

Adolf.

  
Adolf Belka March 7, 2024, 1:19 p.m. UTC | #4
Hi Michael,

On 07/03/2024 12:18, Adolf Belka wrote:
> Hi Michael,
>
> I think I know how to solve the problem.
>
> I tested out using HTML::Entities::encode_entities in a very simple Perl program and found I got the same type of entity encoding as in the WUI CGI pages.
>
> However, if I treated the string of characters as utf8 then the HTML::Entities::encode_entities gave the results expected.
>
> So I need to figure out how to treat the remark strings as utf8, and hopefully that will fix the problem. At least I now have a path forward on this issue that keeps the protection of the cleanhtml command while also allowing characters with diacritical marks, plus other characters such as the Cyrillic alphabet and the German Eszett, all of which currently get mangled.
>
> Will let you know how I get on.
>
I got it to work. I used the dns.cgi page with the cleanhtml line still in it. I then ran decode("UTF-8", "Remark string") before running the cleanhtml command on the same string.

I entered ß Ф Ч < > Ӧ ü £ μ ô ò ó õ å ä ã â á à and after it was accepted the WUI page still showed the same characters, so it looks to have worked. In the servers file the characters are all stored as encoded entities, with the names to be expected:

&szlig; &#x424; &#x427; &lt; &gt; &#x4E6; &uuml; &pound; &mu; &ocirc; &ograve; &oacute; &otilde; &aring; &auml; &atilde; &acirc; &aacute; &agrave;

To make that work I had to add "use Encode;" at the top of the dns.cgi page.

So unless there is any indication back that this approach is not a good one I will start to work on new patch updates for the various pages that will keep the existing cleanhtml commands but decode the strings from UTF-8 to enable the HTML::Entities command to work correctly.
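
A minimal standalone sketch of the approach described above: decode the UTF-8 bytes into a character string, then entity-encode. It uses HTML::Entities::encode_entities directly rather than cleanhtml, and the input is a shortened, hard-coded version of the test string, so treat it as an illustration rather than the actual dns.cgi change.

```perl
use strict;
use warnings;
use Encode qw(decode);
use HTML::Entities qw(encode_entities);

# Raw UTF-8 bytes for "ß Ф < ü", standing in for the submitted remark.
my $raw = "\xC3\x9F \xD0\xA4 < \xC3\xBC";

# Turn the bytes into a Perl character string first ...
my $text = decode('UTF-8', $raw);

# ... so that each character, not each byte, is mapped to an entity.
my $safe = encode_entities($text);
print "$safe\n";    # &szlig; &#x424; &lt; &uuml;
```

The output uses the same entity names as the entries seen in the servers file, and markup characters such as < are still escaped.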

Regards,

Adolf


  

Patch

diff --git a/html/cgi-bin/connscheduler.cgi b/html/cgi-bin/connscheduler.cgi
index cc78cbc1b..817247cc4 100644
--- a/html/cgi-bin/connscheduler.cgi
+++ b/html/cgi-bin/connscheduler.cgi
@@ -2,7 +2,7 @@ 
 ###############################################################################
 #                                                                             #
 # IPFire.org - A linux based firewall                                         #
-# Copyright (C) 2007  Michael Tremer & Christian Schmidt                      #
+# Copyright (C) 2007-2024  IPFire Team  <info@ipfire.org>                     #
 #                                                                             #
 # This program is free software: you can redistribute it and/or modify        #
 # it under the terms of the GNU General Public License as published by        #
@@ -186,7 +186,7 @@  if ( ($cgiparams{'ACTION'} eq 'add') || ($cgiparams{'ACTION'} eq 'update') )
   $CONNSCHED::config[$i]{'DAYSTYPE'} = lc($cgiparams{'ACTION_DAYSTYPE'});
   $CONNSCHED::config[$i]{'DAYS'} = $l_days;
   $CONNSCHED::config[$i]{'WEEKDAYS'} = $l_weekdays;
-  $CONNSCHED::config[$i]{'COMMENT'} = &Header::cleanhtml($cgiparams{'ACTION_COMMENT'});
+  $CONNSCHED::config[$i]{'COMMENT'} = $cgiparams{'ACTION_COMMENT'};
 
   &CONNSCHED::WriteConfig;
 }