Stop unbound-dhcp-leases-bridge from continually restarting unbound

Message ID f4f78133-2e8b-4115-9eb6-d7035968efe0@howitts.co.uk
State New
Headers
Series Stop unbound-dhcp-leases-bridge from continually restarting unbound |

Commit Message

Nick Howitt March 24, 2024, 3:25 p.m. UTC
  Hi all,

Please bear with me as I am new to IPFire and not at all used to this 
style of development, so any guidance would be appreciated.

There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, where 
unbound-dhcp-leases-bridge restarts unbound a lot when it is totally 
unnecessary. It appears to be because when a lease is renewed, the 
script gets triggered, created a new /etc/unbound/dhcp-leases.conf then 
restarts Unbound to read the file. Generally this seems to be 
unnecessary as, with a renewed lease, the old and new 
/etc/unbound/dhcp-leases.conf files are the same. With 5 leases in that 
file (one for a machine not active) I am getting 3-4 restarts an hour of 
Unbound when I have a min/max lease time of 60/120min.

Looking at the code, it is fairly easy to fix. The current code creates 
a temp file then copies it into place then restarts unbound. All I am 
doing is doing a file comparison before the copy and skipping the 
restart if the files are the same.

There were a couple of gotchas because setting the file attributes and 
copying the file were done inside the "with" block for generating the 
temporary file. This meant a file comparison always returned False as 
the temp file was still open and so never the same is the old file.

I moved those two statements outside the "with". This forced me to 
change the fchmod to chmod.

It could be argued that the file copy should not be done if the files 
are not different, but it would take an extra "if" and you'd have to 
remember to delete the temp file. If required, I can make that change.

Also, one small thing I noticed once is that the old and new 
dhcp-leases.conf files could occasionally contain the same leases but in 
a different order. I have been unable to reproduce, but to sidestep it, 
instead of stepping through the leases variable directly, I sort it and 
step through that. It should make the resulting file completely 
deterministic and make the file comparison more effective.

My patch is:

 From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
From: Nick Howitt <nick@howitts.co.uk>
Date: Sun, 24 Mar 2024 15:17:19 +0000
Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbound 
every
  time it write dhcp-leases.conf as it is very often unchanged and does not
  require a restart.

---
  config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
  1 file changed, 19 insertions(+), 9 deletions(-)

  		command = ["unbound-control"]
  

Comments

Adolf Belka March 24, 2024, 3:38 p.m. UTC | #1
Hi Nick,

You need to send the patch, as you have it below to the ipfire 
development list.

With PATCH in the subject line it will be recognised by the IPFire 
development mailing list as a patch and will also be picked up in the 
IPFire Patchwork system.

https://patchwork.ipfire.org/project/ipfire/list/

As this email has a line that contains Subject: [PATCH] it has already 
been picked up by Patchwork. You can check via the above link./

Then the patch will be reviewed on this mailing list. If accepted, it 
will be merged by one of the devs and if not accepted you will get a 
response as a reply to the patch so a history will be kept of the whole 
patch review/discussion etc.

There is a wiki page on patch submission for more details.

https://www.ipfire.org/docs/devel/submit-patches

Regards,

Adolf.


On 24/03/2024 16:25, Nick Howitt wrote:
> Hi all,
>
> Please bear with me as I am new to IPFire and not at all used to this 
> style of development, so any guidance would be appreciated.
>
> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, 
> where unbound-dhcp-leases-bridge restarts unbound a lot when it is 
> totally unnecessary. It appears to be because when a lease is renewed, 
> the script gets triggered, created a new /etc/unbound/dhcp-leases.conf 
> then restarts Unbound to read the file. Generally this seems to be 
> unnecessary as, with a renewed lease, the old and new 
> /etc/unbound/dhcp-leases.conf files are the same. With 5 leases in 
> that file (one for a machine not active) I am getting 3-4 restarts an 
> hour of Unbound when I have a min/max lease time of 60/120min.
>
> Looking at the code, it is fairly easy to fix. The current code 
> creates a temp file then copies it into place then restarts unbound. 
> All I am doing is doing a file comparison before the copy and skipping 
> the restart if the files are the same.
>
> There were a couple of gotchas because setting the file attributes and 
> copying the file were done inside the "with" block for generating the 
> temporary file. This meant a file comparison always returned False as 
> the temp file was still open and so never the same is the old file.
>
> I moved those two statements outside the "with". This forced me to 
> change the fchmod to chmod.
>
> It could be argued that the file copy should not be done if the files 
> are not different, but it would take an extra "if" and you'd have to 
> remember to delete the temp file. If required, I can make that change.
>
> Also, one small thing I noticed once is that the old and new 
> dhcp-leases.conf files could occasionally contain the same leases but 
> in a different order. I have been unable to reproduce, but to sidestep 
> it, instead of stepping through the leases variable directly, I sort 
> it and step through that. It should make the resulting file completely 
> deterministic and make the file comparison more effective.
>
> My patch is:
>
> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
> From: Nick Howitt <nick@howitts.co.uk>
> Date: Sun, 24 Mar 2024 15:17:19 +0000
> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting 
> unbound every
>  time it write dhcp-leases.conf as it is very often unchanged and does 
> not
>  require a restart.
>
> ---
>  config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
>  1 file changed, 19 insertions(+), 9 deletions(-)
>
> diff --git a/config/unbound/unbound-dhcp-leases-bridge 
> b/config/unbound/unbound-dhcp-leases-bridge
> index e9f022aff..d22772066 100644
> --- a/config/unbound/unbound-dhcp-leases-bridge
> +++ b/config/unbound/unbound-dhcp-leases-bridge
> @@ -22,6 +22,7 @@
>  import argparse
>  import datetime
>  import daemon
> +import filecmp
>  import functools
>  import ipaddress
>  import logging
> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object):
>
>      def update_dhcp_leases(self, leases):
>          # Write out all leases
> -        self.write_dhcp_leases(leases)
> +        if self.write_dhcp_leases(leases):
>
> -        log.debug("Reloading Unbound...")
> +            log.debug("Reloading Unbound...")
>
> -        # Reload the configuration without dropping the cache
> -        self._control("reload_keep_cache")
> +            # Reload the configuration without dropping the cache
> +            self._control("reload_keep_cache")
> +
> +        else:
> +
> +            log.debug("Not reloading Unbound. Leases not changed.")
>
>      def write_dhcp_leases(self, leases):
>          log.debug("Writing DHCP leases...")
>
>          with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
> -            for l in leases:
> +            for l in sorted(leases):
>                  for rr in l.rrset:
>                      f.write("local-data: \"%s\"\n" % " ".join(rr))
>
> -            # Make file readable for everyone
> -            os.fchmod(f.fileno(), 
> stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
> +        filecmp.clear_cache()
> +        RequireUnboundReload = not filecmp.cmp(f.name, self.path, 
> shallow=False)
> +
> +        # Make file readable for everyone
> +        os.chmod(f.name, 
> stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
> +
> +        # Move the file to its destination
> +        os.rename(f.name, self.path)
>
> -            # Move the file to its destination
> -            os.rename(f.name, self.path)
> +        return RequireUnboundReload
>
>      def _control(self, *args):
>          command = ["unbound-control"]
  
Nick Howitt March 24, 2024, 3:51 p.m. UTC | #2
Hi Adolf,

Thanks. I'd read that link before but missed changing the Subject line 
when I created the email. As you see, I've resubmitted it.

Regards,

Nick

On 24/03/2024 15:38, Adolf Belka wrote:
> 
> Hi Nick,
> 
> You need to send the patch, as you have it below to the ipfire 
> development list.
> 
> With PATCH in the subject line it will be recognised by the IPFire 
> development mailing list as a patch and will also be picked up in the 
> IPFire Patchwork system.
> 
> https://patchwork.ipfire.org/project/ipfire/list/
> 
> As this email has a line that contains Subject: [PATCH] it has already 
> been picked up by Patchwork. You can check via the above link./
> 
> Then the patch will be reviewed on this mailing list. If accepted, it 
> will be merged by one of the devs and if not accepted you will get a 
> response as a reply to the patch so a history will be kept of the whole 
> patch review/discussion etc.
> 
> There is a wiki page on patch submission for more details.
> 
> https://www.ipfire.org/docs/devel/submit-patches
> 
> Regards,
> 
> Adolf.
> 
> 
> On 24/03/2024 16:25, Nick Howitt wrote:
>> Hi all,
>>
>> Please bear with me as I am new to IPFire and not at all used to this 
>> style of development, so any guidance would be appreciated.
>>
>> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, 
>> where unbound-dhcp-leases-bridge restarts unbound a lot when it is 
>> totally unnecessary. It appears to be because when a lease is renewed, 
>> the script gets triggered, created a new /etc/unbound/dhcp-leases.conf 
>> then restarts Unbound to read the file. Generally this seems to be 
>> unnecessary as, with a renewed lease, the old and new 
>> /etc/unbound/dhcp-leases.conf files are the same. With 5 leases in 
>> that file (one for a machine not active) I am getting 3-4 restarts an 
>> hour of Unbound when I have a min/max lease time of 60/120min.
>>
>> Looking at the code, it is fairly easy to fix. The current code 
>> creates a temp file then copies it into place then restarts unbound. 
>> All I am doing is doing a file comparison before the copy and skipping 
>> the restart if the files are the same.
>>
>> There were a couple of gotchas because setting the file attributes and 
>> copying the file were done inside the "with" block for generating the 
>> temporary file. This meant a file comparison always returned False as 
>> the temp file was still open and so never the same is the old file.
>>
>> I moved those two statements outside the "with". This forced me to 
>> change the fchmod to chmod.
>>
>> It could be argued that the file copy should not be done if the files 
>> are not different, but it would take an extra "if" and you'd have to 
>> remember to delete the temp file. If required, I can make that change.
>>
>> Also, one small thing I noticed once is that the old and new 
>> dhcp-leases.conf files could occasionally contain the same leases but 
>> in a different order. I have been unable to reproduce, but to sidestep 
>> it, instead of stepping through the leases variable directly, I sort 
>> it and step through that. It should make the resulting file completely 
>> deterministic and make the file comparison more effective.
>>
>> My patch is:
>>
>> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
>> From: Nick Howitt <nick@howitts.co.uk>
>> Date: Sun, 24 Mar 2024 15:17:19 +0000
>> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting 
>> unbound every
>>  time it write dhcp-leases.conf as it is very often unchanged and does 
>> not
>>  require a restart.
>>
>> ---
>>  config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
>>  1 file changed, 19 insertions(+), 9 deletions(-)
>>
>> diff --git a/config/unbound/unbound-dhcp-leases-bridge 
>> b/config/unbound/unbound-dhcp-leases-bridge
>> index e9f022aff..d22772066 100644
>> --- a/config/unbound/unbound-dhcp-leases-bridge
>> +++ b/config/unbound/unbound-dhcp-leases-bridge
>> @@ -22,6 +22,7 @@
>>  import argparse
>>  import datetime
>>  import daemon
>> +import filecmp
>>  import functools
>>  import ipaddress
>>  import logging
>> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object):
>>
>>      def update_dhcp_leases(self, leases):
>>          # Write out all leases
>> -        self.write_dhcp_leases(leases)
>> +        if self.write_dhcp_leases(leases):
>>
>> -        log.debug("Reloading Unbound...")
>> +            log.debug("Reloading Unbound...")
>>
>> -        # Reload the configuration without dropping the cache
>> -        self._control("reload_keep_cache")
>> +            # Reload the configuration without dropping the cache
>> +            self._control("reload_keep_cache")
>> +
>> +        else:
>> +
>> +            log.debug("Not reloading Unbound. Leases not changed.")
>>
>>      def write_dhcp_leases(self, leases):
>>          log.debug("Writing DHCP leases...")
>>
>>          with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
>> -            for l in leases:
>> +            for l in sorted(leases):
>>                  for rr in l.rrset:
>>                      f.write("local-data: \"%s\"\n" % " ".join(rr))
>>
>> -            # Make file readable for everyone
>> -            os.fchmod(f.fileno(), 
>> stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
>> +        filecmp.clear_cache()
>> +        RequireUnboundReload = not filecmp.cmp(f.name, self.path, 
>> shallow=False)
>> +
>> +        # Make file readable for everyone
>> +        os.chmod(f.name, 
>> stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
>> +
>> +        # Move the file to its destination
>> +        os.rename(f.name, self.path)
>>
>> -            # Move the file to its destination
>> -            os.rename(f.name, self.path)
>> +        return RequireUnboundReload
>>
>>      def _control(self, *args):
>>          command = ["unbound-control"]
>
  
Michael Tremer March 25, 2024, 11:21 a.m. UTC | #3
Hello,

> On 24 Mar 2024, at 15:25, Nick Howitt <nick@howitts.co.uk> wrote:
> 
> Hi all,
> 
> Please bear with me as I am new to IPFire and not at all used to this style of development, so any guidance would be appreciated.

No worries, feel free to ask any questions that you have… There is documentation around all of this here:

  https://www.ipfire.org/docs/devel

Especially here:

  https://www.ipfire.org/docs/devel/submit-patches

> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, where unbound-dhcp-leases-bridge restarts unbound a lot when it is totally unnecessary. It appears to be because when a lease is renewed, the script gets triggered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound to read the file. Generally this seems to be unnecessary as, with a renewed lease, the old and new /etc/unbound/dhcp-leases.conf files are the same. With 5 leases in that file (one for a machine not active) I am getting 3-4 restarts an hour of Unbound when I have a min/max lease time of 60/120min.

I am still not entirely sure why it is such an issue to reload Unbound that often, but I am happy to avoid doing this whenever possible.

> Looking at the code, it is fairly easy to fix. The current code creates a temp file then copies it into place then restarts unbound. All I am doing is doing a file comparison before the copy and skipping the restart if the files are the same.

Yeah, I think this is a workable solution to mitigate the problem slightly.

Note that Jon has sent a couple of ideas a while back solving this problem by removing the bridge and running shell commands instead, but I did not have the time to look into how robust that solution would be. With the bridge, we have at least a certain amount of confidence.

> There were a couple of gotchas because setting the file attributes and copying the file were done inside the "with" block for generating the temporary file. This meant a file comparison always returned False as the temp file was still open and so never the same is the old file.
> 
> I moved those two statements outside the "with". This forced me to change the fchmod to chmod.

In theory you could have left it there, because even if the file is being removed later, there is no harm in changing permissions before.

> It could be argued that the file copy should not be done if the files are not different, but it would take an extra "if" and you'd have to remember to delete the temp file. If required, I can make that change.

I think I can live with that. This while thing is very far from efficient anyways and serves as some glue between Unbound and ISC DHCP. The latter will definitely go soon as it has become EOL and whenever we make that transition we will have to come up with something new which hopefully does not have the shortcomings that the bridge has.

> Also, one small thing I noticed once is that the old and new dhcp-leases.conf files could occasionally contain the same leases but in a different order. I have been unable to reproduce, but to sidestep it, instead of stepping through the leases variable directly, I sort it and step through that. It should make the resulting file completely deterministic and make the file comparison more effective.

I think we are evaluating where to do the maths. I always considered it perfectly safe to just throw the new file into the Unbound configuration, have Unbound read it again and then everything is good. I would also argue that that works without any problems even in places where I have 2000 active leases with a rather short lease time. However, there is the log message…

Sorting creates some extra work, but I am sure that this is a lot cheaper than reloading Unbound. So once again, I am happy to go down this route.

> 
> My patch is:
> 
> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
> From: Nick Howitt <nick@howitts.co.uk>
> Date: Sun, 24 Mar 2024 15:17:19 +0000
> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbound every
> time it write dhcp-leases.conf as it is very often unchanged and does not
> require a restart.
> 
> ---
> config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
> 1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/unbound-dhcp-leases-bridge
> index e9f022aff..d22772066 100644
> --- a/config/unbound/unbound-dhcp-leases-bridge
> +++ b/config/unbound/unbound-dhcp-leases-bridge
> @@ -22,6 +22,7 @@
> import argparse
> import datetime
> import daemon
> +import filecmp

This module does however not seem very fit for purpose for me. It tries to check the metadata of the file using stat() calls which will not do at all what we want them to do and it will cache any results which is also not at all helpful.

I would suggest to simply implement this function yourself like so:

def compare(f1, f2):
	buffersize = 16 * 1024

	# Reset file handles
	f1.seek(0)
	f2.seek(0)

	while True:
		b1 = f1.read(buffersize)
		b2 = f2.read(buffersize)

		if not b1 == b2:
			return False

		elif not b1 or not b2:
			break

	return True

You can pass the file descriptors to each of the files and since we are now changing a couple of things, former assumptions can be dropped:

* Unbound used to reload the configuration file, but there could have been the case of concurrent reads. Therefore the entire atomic file replacement can be removed.
* With that, you could remove the delete=False attribute from the temporary file. You already have the target file open and you can just copy the content from the temporary file. Because of the former assumption I would consider that safe.

That way, you will be able to avoid copying anything if the files don’t match and you will not have to introduce any extra code to remove the temporary file. And the chmod statement can go, too :)

> import functools
> import ipaddress
> import logging
> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object):
> 
> def update_dhcp_leases(self, leases):
> # Write out all leases
> - self.write_dhcp_leases(leases)
> + if self.write_dhcp_leases(leases):
> 
> - log.debug("Reloading Unbound...")
> + log.debug("Reloading Unbound...")
> 
> - # Reload the configuration without dropping the cache
> - self._control("reload_keep_cache")
> + # Reload the configuration without dropping the cache
> + self._control("reload_keep_cache")
> +
> + else:
> +
> + log.debug("Not reloading Unbound. Leases not changed.")
> 
> def write_dhcp_leases(self, leases):
> log.debug("Writing DHCP leases...")
> 
> with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
> - for l in leases:
> + for l in sorted(leases):
> for rr in l.rrset:
> f.write("local-data: \"%s\"\n" % " ".join(rr))
> 
> - # Make file readable for everyone
> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
> + filecmp.clear_cache()
> + RequireUnboundReload = not filecmp.cmp(f.name, self.path, shallow=False)

Oh and please use snake case for variables. It is Python after all.

> + # Make file readable for everyone
> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
> +
> + # Move the file to its destination
> + os.rename(f.name, self.path)
> 
> - # Move the file to its destination
> - os.rename(f.name, self.path)
> + return RequireUnboundReload
> 
> def _control(self, *args):
> command = ["unbound-control"]
> -- 
> 2.43.0
> 
> I have this locally on a branch on my firewall. If it is OK, do I need some sort of rights to push my branch and then how I get it merged?

I can create your own Git repository for you to backup/manage your own changes, but we do not allow anyone apart from two people to push into the master repository.

I am also very happy to see your first code contribution to IPFire!

-Michael

> 
> Regards,
> 
> Nick
  
Nick Howitt March 25, 2024, 11:46 a.m. UTC | #4
On 25/03/2024 11:21, Michael Tremer wrote:
> 
> Hello,
> 
>> On 24 Mar 2024, at 15:25, Nick Howitt <nick@howitts.co.uk> wrote:
>>
>> Hi all,
>>
>> Please bear with me as I am new to IPFire and not at all used to this style of development, so any guidance would be appreciated.
> 
> No worries, feel free to ask any questions that you have… There is documentation around all of this here:
> 
>    https://www.ipfire.org/docs/devel
> 
> Especially here:
> 
>    https://www.ipfire.org/docs/devel/submit-patches
> 
>> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, where unbound-dhcp-leases-bridge restarts unbound a lot when it is totally unnecessary. It appears to be because when a lease is renewed, the script gets triggered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound to read the file. Generally this seems to be unnecessary as, with a renewed lease, the old and new /etc/unbound/dhcp-leases.conf files are the same. With 5 leases in that file (one for a machine not active) I am getting 3-4 restarts an hour of Unbound when I have a min/max lease time of 60/120min.
> 
> I am still not entirely sure why it is such an issue to reload Unbound that often, but I am happy to avoid doing this whenever possible.
> 
I am seeing odd pauses in browsing and I was wondering if it was related 
to having to wait on DNS lookups as Unbound was reloading. That is when 
I bumped into the bugzilla report and noticed others possibly 
experiencing something similar.
>> Looking at the code, it is fairly easy to fix. The current code creates a temp file then copies it into place then restarts unbound. All I am doing is doing a file comparison before the copy and skipping the restart if the files are the same.
> 
> Yeah, I think this is a workable solution to mitigate the problem slightly.
> 
> Note that Jon has sent a couple of ideas a while back solving this problem by removing the bridge and running shell commands instead, but I did not have the time to look into how robust that solution would be. With the bridge, we have at least a certain amount of confidence.
> 
I like what Jon has done and want to evaluate it. Also, as it is in Bash 
I have a better chance of understanding all of it. I would consider my 
solution as a short term sticking plaster and Jon's as the proper way to 
go longer term.
>> There were a couple of gotchas because setting the file attributes and copying the file were done inside the "with" block for generating the temporary file. This meant a file comparison always returned False as the temp file was still open and so never the same is the old file.
>>
>> I moved those two statements outside the "with". This forced me to change the fchmod to chmod.
> 
> In theory you could have left it there, because even if the file is being removed later, there is no harm in changing permissions before.
> 
Agreed. It just didn't seem to be the right place inside the with 
statement. It also meant using a file descriptor rather than a file name 
which I find harder to read/understand as I am not a programmer, just a 
hacker. As I was in that area, I thought it a good opportunity to change it.
>> It could be argued that the file copy should not be done if the files are not different, but it would take an extra "if" and you'd have to remember to delete the temp file. If required, I can make that change.
> 
> I think I can live with that. This while thing is very far from efficient anyways and serves as some glue between Unbound and ISC DHCP. The latter will definitely go soon as it has become EOL and whenever we make that transition we will have to come up with something new which hopefully does not have the shortcomings that the bridge has.
> 
>> Also, one small thing I noticed once is that the old and new dhcp-leases.conf files could occasionally contain the same leases but in a different order. I have been unable to reproduce, but to sidestep it, instead of stepping through the leases variable directly, I sort it and step through that. It should make the resulting file completely deterministic and make the file comparison more effective.
> 
> I think we are evaluating where to do the maths. I always considered it perfectly safe to just throw the new file into the Unbound configuration, have Unbound read it again and then everything is good. I would also argue that that works without any problems even in places where I have 2000 active leases with a rather short lease time. However, there is the log message…
> 
> Sorting creates some extra work, but I am sure that this is a lot cheaper than reloading Unbound. So once again, I am happy to go down this route.
> 
>>
>> My patch is:
>>
>>  From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
>> From: Nick Howitt <nick@howitts.co.uk>
>> Date: Sun, 24 Mar 2024 15:17:19 +0000
>> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbound every
>> time it write dhcp-leases.conf as it is very often unchanged and does not
>> require a restart.
>>
>> ---
>> config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
>> 1 file changed, 19 insertions(+), 9 deletions(-)
>>
>> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/unbound-dhcp-leases-bridge
>> index e9f022aff..d22772066 100644
>> --- a/config/unbound/unbound-dhcp-leases-bridge
>> +++ b/config/unbound/unbound-dhcp-leases-bridge
>> @@ -22,6 +22,7 @@
>> import argparse
>> import datetime
>> import daemon
>> +import filecmp
> 
> This module does however not seem very fit for purpose for me. It tries to check the metadata of the file using stat() calls which will not do at all what we want them to do and it will cache any results which is also not at all helpful.
> 
> I would suggest to simply implement this function yourself like so:
> 
> def compare(f1, f2):
> 	buffersize = 16 * 1024
> 
> 	# Reset file handles
> 	f1.seek(0)
> 	f2.seek(0)
> 
> 	while True:
> 		b1 = f1.read(buffersize)
> 		b2 = f2.read(buffersize)
> 
> 		if not b1 == b2:
> 			return False
> 
> 		elif not b1 or not b2:
> 			break
> 
> 	return True
> 
> You can pass the file descriptors to each of the files and since we are now changing a couple of things, former assumptions can be dropped:
> 
> * Unbound used to reload the configuration file, but there could have been the case of concurrent reads. Therefore the entire atomic file replacement can be removed.
> * With that, you could remove the delete=False attribute from the temporary file. You already have the target file open and you can just copy the content from the temporary file. Because of the former assumption I would consider that safe.
> 
> That way, you will be able to avoid copying anything if the files don’t match and you will not have to introduce any extra code to remove the temporary file. And the chmod statement can go, too :)
> 
If you ise filecmp.cmp with "shallow=False" it does not use stat() and 
does a proper file comparison. It that is OK with you, I'd like to keep 
using filecmp as it is a one-liner.
>> import functools
>> import ipaddress
>> import logging
>> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object):
>>
>> def update_dhcp_leases(self, leases):
>> # Write out all leases
>> - self.write_dhcp_leases(leases)
>> + if self.write_dhcp_leases(leases):
>>
>> - log.debug("Reloading Unbound...")
>> + log.debug("Reloading Unbound...")
>>
>> - # Reload the configuration without dropping the cache
>> - self._control("reload_keep_cache")
>> + # Reload the configuration without dropping the cache
>> + self._control("reload_keep_cache")
>> +
>> + else:
>> +
>> + log.debug("Not reloading Unbound. Leases not changed.")
>>
>> def write_dhcp_leases(self, leases):
>> log.debug("Writing DHCP leases...")
>>
>> with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
>> - for l in leases:
>> + for l in sorted(leases):
>> for rr in l.rrset:
>> f.write("local-data: \"%s\"\n" % " ".join(rr))
>>
>> - # Make file readable for everyone
>> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
>> + filecmp.clear_cache()
>> + RequireUnboundReload = not filecmp.cmp(f.name, self.path, shallow=False)
> 
> Oh and please use snake case for variables. It is Python after all. >
Never heard of the expression, but then I'm a hacker, not a programmer! 
I'll change it.
>> + # Make file readable for everyone
>> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
>> +
>> + # Move the file to its destination
>> + os.rename(f.name, self.path)
>>
>> - # Move the file to its destination
>> - os.rename(f.name, self.path)
>> + return RequireUnboundReload
>>
>> def _control(self, *args):
>> command = ["unbound-control"]
>> -- 
>> 2.43.0
>>
>> I have this locally on a branch on my firewall. If it is OK, do I need some sort of rights to push my branch and then how I get it merged?
> 
> I can create your own Git repository for you to backup/manage your own changes, but we do not allow anyone apart from two people to push into the master repository.
> 
I only have basic knowledge of Git and am not sure how much I will do as 
contribs. I am happy to work locally. If I had my own git repo, would I 
then have to create a merge request as well as submit by email?
> I am also very happy to see your first code contribution to IPFire!
> 
> -Michael
> 
>>
>> Regards,
>>
>> Nick
>
  
Michael Tremer March 26, 2024, 10:44 a.m. UTC | #5
Hello,

> On 25 Mar 2024, at 11:46, Nick Howitt <nick@howitts.co.uk> wrote:
> 
> 
> 
> On 25/03/2024 11:21, Michael Tremer wrote:
>> Hello,
>>> On 24 Mar 2024, at 15:25, Nick Howitt <nick@howitts.co.uk> wrote:
>>> 
>>> Hi all,
>>> 
>>> Please bear with me as I am new to IPFire and not at all used to this style of development, so any guidance would be appreciated.
>> No worries, feel free to ask any questions that you have… There is documentation around all of this here:
>>   https://www.ipfire.org/docs/devel
>> Especially here:
>>   https://www.ipfire.org/docs/devel/submit-patches
>>> There is a bug, https://bugzilla.ipfire.org/show_bug.cgi?id=13254, where unbound-dhcp-leases-bridge restarts unbound a lot when it is totally unnecessary. It appears to be because when a lease is renewed, the script gets triggered, created a new /etc/unbound/dhcp-leases.conf then restarts Unbound to read the file. Generally this seems to be unnecessary as, with a renewed lease, the old and new /etc/unbound/dhcp-leases.conf files are the same. With 5 leases in that file (one for a machine not active) I am getting 3-4 restarts an hour of Unbound when I have a min/max lease time of 60/120min.
>> I am still not entirely sure why it is such an issue to reload Unbound that often, but I am happy to avoid doing this whenever possible.
> I am seeing odd pauses in browsing and I was wondering if it was related to having to wait on DNS lookups as Unbound was reloading. That is when I bumped into the bugzilla report and noticed others possibly experiencing something similar.

Hmm, the setup where I tested this initially and which was the reason for implementing some changes has about 2000 clients in a day. There is a lot of activity and we would noticed any problems if Unbound went to sleep for a little while every now and then. They have a reload pretty much every few seconds.

>>> Looking at the code, it is fairly easy to fix. The current code creates a temp file then copies it into place then restarts unbound. All I am doing is doing a file comparison before the copy and skipping the restart if the files are the same.
>> Yeah, I think this is a workable solution to mitigate the problem slightly.
>> Note that Jon has sent a couple of ideas a while back solving this problem by removing the bridge and running shell commands instead, but I did not have the time to look into how robust that solution would be. With the bridge, we have at least a certain amount of confidence.
> I like what Jon has done and want to evaluate it. Also, as it is in Bash I have a better chance of understanding all of it. I would consider my solution as a short term sticking plaster and Jon's as the proper way to go longer term.

There is really to way around modifying Unbounds configuration. If we can really rule out that there is no problems when it’s reloading I don’t think we need Jon’s approach. If we can prove that there is a small outage, then we do.

>>> There were a couple of gotchas because setting the file attributes and copying the file were done inside the "with" block for generating the temporary file. This meant a file comparison always returned False as the temp file was still open and so never the same is the old file.
>>> 
>>> I moved those two statements outside the "with". This forced me to change the fchmod to chmod.
>> In theory you could have left it there, because even if the file is being removed later, there is no harm in changing permissions before.
> Agreed. It just didn't seem to be the right place inside the with statement. It also meant using a file descriptor rather than a file name which I find harder to read/understand as I am not a programmer, just a hacker. As I was in that area, I thought it a good opportunity to change it.
>>> It could be argued that the file copy should not be done if the files are not different, but it would take an extra "if" and you'd have to remember to delete the temp file. If required, I can make that change.
>> I think I can live with that. This while thing is very far from efficient anyways and serves as some glue between Unbound and ISC DHCP. The latter will definitely go soon as it has become EOL and whenever we make that transition we will have to come up with something new which hopefully does not have the shortcomings that the bridge has.
>>> Also, one small thing I noticed once is that the old and new dhcp-leases.conf files could occasionally contain the same leases but in a different order. I have been unable to reproduce, but to sidestep it, instead of stepping through the leases variable directly, I sort it and step through that. It should make the resulting file completely deterministic and make the file comparison more effective.
>> I think we are evaluating where to do the maths. I always considered it perfectly safe to just throw the new file into the Unbound configuration, have Unbound read it again and then everything is good. I would also argue that that works without any problems even in places where I have 2000 active leases with a rather short lease time. However, there is the log message…
>> Sorting creates some extra work, but I am sure that this is a lot cheaper than reloading Unbound. So once again, I am happy to go down this route.
>>> 
>>> My patch is:
>>> 
>>> From 73873d4944944a2f02317a73074a6894726f36f7 Mon Sep 17 00:00:00 2001
>>> From: Nick Howitt <nick@howitts.co.uk>
>>> Date: Sun, 24 Mar 2024 15:17:19 +0000
>>> Subject: [PATCH] Stop unbound-dhcp-leases-bridge from restarting unbound every
>>> time it write dhcp-leases.conf as it is very often unchanged and does not
>>> require a restart.
>>> 
>>> ---
>>> config/unbound/unbound-dhcp-leases-bridge | 28 +++++++++++++++--------
>>> 1 file changed, 19 insertions(+), 9 deletions(-)
>>> 
>>> diff --git a/config/unbound/unbound-dhcp-leases-bridge b/config/unbound/unbound-dhcp-leases-bridge
>>> index e9f022aff..d22772066 100644
>>> --- a/config/unbound/unbound-dhcp-leases-bridge
>>> +++ b/config/unbound/unbound-dhcp-leases-bridge
>>> @@ -22,6 +22,7 @@
>>> import argparse
>>> import datetime
>>> import daemon
>>> +import filecmp
>> This module does however not seem very fit for purpose for me. It tries to check the metadata of the file using stat() calls which will not do at all what we want them to do and it will cache any results which is also not at all helpful.
>> I would suggest to simply implement this function yourself like so:
>> def compare(f1, f2):
>> buffersize = 16 * 1024
>> # Reset file handles
>> f1.seek(0)
>> f2.seek(0)
>> while True:
>> b1 = f1.read(buffersize)
>> b2 = f2.read(buffersize)
>> if not b1 == b2:
>> return False
>> elif not b1 or not b2:
>> break
>> return True
>> You can pass the file descriptors to each of the files and since we are now changing a couple of things, former assumptions can be dropped:
>> * Unbound used to reload the configuration file, but there could have been the case of concurrent reads. Therefore the entire atomic file replacement can be removed.
>> * With that, you could remove the delete=False attribute from the temporary file. You already have the target file open and you can just copy the content from the temporary file. Because of the former assumption I would consider that safe.
>> That way, you will be able to avoid copying anything if the files don’t match and you will not have to introduce any extra code to remove the temporary file. And the chmod statement can go, too :)
> If you ise filecmp.cmp with "shallow=False" it does not use stat() and does a proper file comparison. It that is OK with you, I'd like to keep using filecmp as it is a one-liner.

But then you will have to live with the cache storing useless information. I find the “clear_cache()” statement really quite unsettling.

>>> import functools
>>> import ipaddress
>>> import logging
>>> @@ -516,26 +517,35 @@ class UnboundConfigWriter(object):
>>> 
>>> def update_dhcp_leases(self, leases):
>>> # Write out all leases
>>> - self.write_dhcp_leases(leases)
>>> + if self.write_dhcp_leases(leases):
>>> 
>>> - log.debug("Reloading Unbound...")
>>> + log.debug("Reloading Unbound...")
>>> 
>>> - # Reload the configuration without dropping the cache
>>> - self._control("reload_keep_cache")
>>> + # Reload the configuration without dropping the cache
>>> + self._control("reload_keep_cache")
>>> +
>>> + else:
>>> +
>>> + log.debug("Not reloading Unbound. Leases not changed.")
>>> 
>>> def write_dhcp_leases(self, leases):
>>> log.debug("Writing DHCP leases...")
>>> 
>>> with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
>>> - for l in leases:
>>> + for l in sorted(leases):
>>> for rr in l.rrset:
>>> f.write("local-data: \"%s\"\n" % " ".join(rr))
>>> 
>>> - # Make file readable for everyone
>>> - os.fchmod(f.fileno(), stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
>>> + filecmp.clear_cache()
>>> + RequireUnboundReload = not filecmp.cmp(f.name, self.path, shallow=False)
>> Oh and please use snake case for variables. It is Python after all. >
> Never heard of the expression, but then I'm a hacker, not a programmer! I'll change it.

It is just that I like consistency. Yes really. I know that there is a lot of code that is anything but. It just helps with reading when your eyes can learn a pattern.

>>> + # Make file readable for everyone
>>> + os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
>>> +
>>> + # Move the file to its destination
>>> + os.rename(f.name, self.path)
>>> 
>>> - # Move the file to its destination
>>> - os.rename(f.name, self.path)
>>> + return RequireUnboundReload
>>> 
>>> def _control(self, *args):
>>> command = ["unbound-control"]
>>> -- 
>>> 2.43.0
>>> 
>>> I have this locally on a branch on my firewall. If it is OK, do I need some sort of rights to push my branch and then how I get it merged?
>> I can create your own Git repository for you to backup/manage your own changes, but we do not allow anyone apart from two people to push into the master repository.
> I only have basic knowledge of Git and am not sure how much I will do as contribs. I am happy to work locally. If I had my own git repo, would I then have to create a merge request as well as submit by email?

No, we don’t send merge requests here. Only very rarely. Maybe less than once a year.

We simply email patch files to the mailing list for a paper trail and because it is the best way to put comments on the code :)

-Michael

>> I am also very happy to see your first code contribution to IPFire!
>> -Michael
>>> 
>>> Regards,
>>> 
>>> Nick
  

Patch

diff --git a/config/unbound/unbound-dhcp-leases-bridge 
b/config/unbound/unbound-dhcp-leases-bridge
index e9f022aff..d22772066 100644
--- a/config/unbound/unbound-dhcp-leases-bridge
+++ b/config/unbound/unbound-dhcp-leases-bridge
@@ -22,6 +22,7 @@ 
  import argparse
  import datetime
  import daemon
+import filecmp
  import functools
  import ipaddress
  import logging
@@ -516,26 +517,35 @@  class UnboundConfigWriter(object):

  	def update_dhcp_leases(self, leases):
  		# Write out all leases
-		self.write_dhcp_leases(leases)
+		if self.write_dhcp_leases(leases):

-		log.debug("Reloading Unbound...")
+			log.debug("Reloading Unbound...")

-		# Reload the configuration without dropping the cache
-		self._control("reload_keep_cache")
+			# Reload the configuration without dropping the cache
+			self._control("reload_keep_cache")
+
+		else:
+
+			log.debug("Not reloading Unbound. Leases not changed.")

  	def write_dhcp_leases(self, leases):
  		log.debug("Writing DHCP leases...")

  		with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
-			for l in leases:
+			for l in sorted(leases):
  				for rr in l.rrset:
  					f.write("local-data: \"%s\"\n" % " ".join(rr))

-			# Make file readable for everyone
-			os.fchmod(f.fileno(), 
stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+		filecmp.clear_cache()
+		RequireUnboundReload = not filecmp.cmp(f.name, self.path, shallow=False)
+
+		# Make file readable for everyone
+		os.chmod(f.name, stat.S_IRUSR|stat.S_IWUSR|stat.S_IRGRP|stat.S_IROTH)
+
+		# Move the file to its destination
+		os.rename(f.name, self.path)

-			# Move the file to its destination
-			os.rename(f.name, self.path)
+		return RequireUnboundReload

  	def _control(self, *args):