[1/5] location-functions.pl: Use a single script-wide db_handle.

Message ID 20201107184724.3590-1-stefan.schantl@ipfire.org
State Accepted
Commit b62d7e0cc71cc1ff23d66dd8baf0f5f3c5c7a29b
Series [1/5] location-functions.pl: Use a single script-wide db_handle.

Commit Message

Stefan Schantl Nov. 7, 2020, 6:47 p.m. UTC
Create and use a single script-wide database handle for libloc to
avoid creating multiple ones.

This helps to save memory, especially on small systems.

Reference #12515.

Signed-off-by: Stefan Schantl <stefan.schantl@ipfire.org>
---
 config/cfgroot/location-functions.pl | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
  

Comments

Bernhard Bitsch Nov. 8, 2020, 7:36 p.m. UTC | #1
This means we stay with the unbalanced memory allocation in (Perl) libloc, which leaves a memory leak.

> Sent: Saturday, 07 November 2020 at 19:47
> From: "Stefan Schantl" <stefan.schantl@ipfire.org>
> To: development@lists.ipfire.org
> Subject: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>
> Create and use a single script-wide database handle for libloc to
> prevent from creating multiple ones.
>
> This helps saving memory, especially on small systems.
>
> Reference #12515.
>

The error is easy to reproduce on systems with little memory, but it is present on all systems.
Therefore I posted this solution as a work-around only!

- Bernhard


  
Michael Tremer Nov. 9, 2020, 11:35 a.m. UTC | #2
Hello,

Thank you Stefan for submitting this patchset.

> On 8 Nov 2020, at 19:36, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
> 
> This means, we stay with the unbalanced memory allocation in (Perl) libloc. Which leaves a memory leak.

Bernhard, could you please elaborate on how this memory leak still exists?

I also do not understand what you mean by unbalanced.

As far as I understand the code right now, the database is opened once and the handle is stored internally. The functions that are called no longer have to hold their own database handle. Therefore the maximum number of open handles is one.
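
The difference between the two patterns can be illustrated with a minimal, self-contained sketch. It does not use libloc: `init()` here is a hypothetical stand-in that only counts how often a handle is created, and `lookup_per_call` and `lookup_shared` are illustrative names, not functions from location-functions.pl.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real init(): it only counts how often a
# handle is created, so the effect of the change becomes visible.
my $init_calls = 0;
sub init {
    $init_calls++;
    return { id => $init_calls };
}

# Old pattern: every function call creates its own handle.
sub lookup_per_call {
    my ($address) = @_;
    my $db_handle = init();
    return "$address via handle $db_handle->{id}";
}

# New pattern: one file-scoped handle, created once at load time and
# shared by every function defined in the same file.
my $shared_handle = init();
sub lookup_shared {
    my ($address) = @_;
    return "$address via handle $shared_handle->{id}";
}

# Count the handles created by 100 lookups with each pattern.
my $calls_before = $init_calls;
lookup_shared("192.0.2.$_") for 1 .. 100;
my $shared_inits = $init_calls - $calls_before;

$calls_before = $init_calls;
lookup_per_call("192.0.2.$_") for 1 .. 100;
my $per_call_inits = $init_calls - $calls_before;

print "shared: $shared_inits new handles, per call: $per_call_inits new handles\n";
```

With the shared handle, the number of handles created no longer depends on the number of lookups, which is the property the patch relies on.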

> 
>> Sent: Saturday, 07 November 2020 at 19:47
>> From: "Stefan Schantl" <stefan.schantl@ipfire.org>
>> To: development@lists.ipfire.org
>> Subject: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>> 
>> Create and use a single script-wide database handle for libloc to
>> prevent from creating multiple ones.
>> 
>> This helps saving memory, especially on small systems.
>> 
>> Reference #12515.
>> 
> 
> The error can be produced easily with small memory, but it is present in all systems.
> Therefore I've posted this solution as work-around only!

Did you test this patchset or did you come to your conclusion by only reading the code?

Best,
-Michael

  
Bernhard Bitsch Nov. 9, 2020, 12:13 p.m. UTC | #3
Hello,

> Sent: Monday, 09 November 2020 at 12:35
> From: "Michael Tremer" <michael.tremer@ipfire.org>
> To: "Bernhard Bitsch" <Bernhard.Bitsch@gmx.de>
> Cc: development@lists.ipfire.org
> Subject: Re: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>
> Hello,
>
> Thank you Stefan for submitting this patchset.
>
> > On 8 Nov 2020, at 19:36, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
> >
> > This means, we stay with the unbalanced memory allocation in (Perl) libloc. Which leaves a memory leak.
>
> Bernhard, could you please elaborate on how this memory leak is still existing?
>
> I also do not understand what you mean by unbalanced.
>
> As far as I understand the code right now, the database is being opened once and the handle is being stored internally. All functions that are being called will no longer have to hold their own database handle. Therefore the maximum amount of handles open is one.
>

The init() function allocates the handle, which is never really destroyed. The END block is just a safety net. The memory allocated for the handle should be released the moment the reference count reaches zero.
Yes, the new code (inspired by my work-around) reduces the number of handles to one. But this handle is persistent as well (this doesn't hurt, because it stays in use for as long as the module is in use).
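
What "released the moment the reference count reaches zero" means in Perl can be shown with a small sketch. It makes no claim about where, or whether, the libloc binding mishandles a counter: `FakeHandle` is a hypothetical class that only tracks how many of its objects are alive, and the `@forgotten` array merely imitates a reference that is never dropped.

```perl
use strict;
use warnings;

# Hypothetical handle class; it only tracks how many objects are alive.
package FakeHandle;
my $live = 0;
sub new     { my ($class) = @_; $live++; return bless {}, $class; }
sub DESTROY { $live--; }
sub live    { return $live; }

package main;

# Balanced case: the only reference goes out of scope when the function
# returns, so Perl calls DESTROY right away.
sub balanced_lookup {
    my $handle = FakeHandle->new();
    return;
}

# Unbalanced case: an extra reference is stashed away and never dropped,
# which imitates a reference count that is never decremented.
my @forgotten;
sub unbalanced_lookup {
    my $handle = FakeHandle->new();
    push @forgotten, $handle;
    return;
}

balanced_lookup() for 1 .. 50;
my $live_after_balanced = FakeHandle->live();

unbalanced_lookup() for 1 .. 50;
my $live_after_unbalanced = FakeHandle->live();

print "balanced: $live_after_balanced live, unbalanced: $live_after_unbalanced live\n";
```

In the balanced case nothing accumulates, however often the function is called. In the unbalanced case every call leaves one object behind, which is the growth pattern a per-call init() would show if a handle were never released.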

> >
> >> Sent: Saturday, 07 November 2020 at 19:47
> >> From: "Stefan Schantl" <stefan.schantl@ipfire.org>
> >> To: development@lists.ipfire.org
> >> Subject: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
> >>
> >> Create and use a single script-wide database handle for libloc to
> >> prevent from creating multiple ones.
> >>
> >> This helps saving memory, especially on small systems.
> >>
> >> Reference #12515.
> >>
> >
> > The error can be produced easily with small memory, but it is present in all systems.
> > Therefore I've posted this solution as work-around only!
>
> Did you test this patchset or did you come to your conclusion by only reading the code?
>
> Best,
> -Michael
>

The sentence about "small systems" refers to the original comment. The patch does not really save memory; it cures a waste of memory. Clearly, massive use of malloc without free crashes a small system sooner than a system with plenty of memory.
The patchset is mainly my work-around, so it has been in testing since I found the error.

Best,
-Bernhard

  
Michael Tremer Nov. 9, 2020, 2:04 p.m. UTC | #4
Hello,

> On 9 Nov 2020, at 12:13, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
> 
> Hello,
> 
>> Sent: Monday, 09 November 2020 at 12:35
>> From: "Michael Tremer" <michael.tremer@ipfire.org>
>> To: "Bernhard Bitsch" <Bernhard.Bitsch@gmx.de>
>> Cc: development@lists.ipfire.org
>> Subject: Re: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>> 
>> Hello,
>> 
>> Thank you Stefan for submitting this patchset.
>> 
>>> On 8 Nov 2020, at 19:36, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
>>> 
>>> This means, we stay with the unbalanced memory allocation in (Perl) libloc. Which leaves a memory leak.
>> 
>> Bernhard, could you please elaborate on how this memory leak is still existing?
>> 
>> I also do not understand what you mean by unbalanced.
>> 
>> As far as I understand the code right now, the database is being opened once and the handle is being stored internally. All functions that are being called will no longer have to hold their own database handle. Therefore the maximum amount of handles open is one.
>> 
> 
> The init() function allocates the handle, which is not really destroyed. The END block is just a safety process. The allocated memory for the handle should be released in the moment of references==0.

The handle is being initialised when location-functions.pl is being loaded and it is freed when the Perl interpreter ends.

It remains referenced all the time.

> Yes, the new code ( inspired by my work-around ) minimizes the number of handles to one. But this handle is persistent also ( this doesn't hurt because it is used persistent for the time of module use ).

It is only being initialised when location-functions.pl is being loaded. That only happens in code that uses those functions.

I do not see the problem with this. Constantly closing and re-opening the database is not an option since the connection tracking list can have tens of thousands of lookups and this will make the page - simply - slow.

I cannot see an option that is more resource-friendly than this.

> 
>>> 
>>>> Sent: Saturday, 07 November 2020 at 19:47
>>>> From: "Stefan Schantl" <stefan.schantl@ipfire.org>
>>>> To: development@lists.ipfire.org
>>>> Subject: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>>>> 
>>>> Create and use a single script-wide database handle for libloc to
>>>> prevent from creating multiple ones.
>>>> 
>>>> This helps saving memory, especially on small systems.
>>>> 
>>>> Reference #12515.
>>>> 
>>> 
>>> The error can be produced easily with small memory, but it is present in all systems.
>>> Therefore I've posted this solution as work-around only!
>> 
>> Did you test this patchset or did you come to your conclusion by only reading the code?
>> 
>> Best,
>> -Michael
>> 
> 
> The sentence about "small systems" refers to the original comment. It is not a real memory saving, the patch cures a 'memory wasting'. It is clear, that a massively use of malloc without free crashes a small system faster than a system with big memory resources.

Opening the database requires pretty much nothing (maybe 4K) of RSS memory and about 50MB of VIRT memory. This is due to mmap(), which helps us read the database a lot faster. I think this is very reasonable. The more data is read from the database, the more memory it will use.

If you wish to change that (and that comes with a performance penalty), you could submit a patch that lets the user disable mmap().

But I do not see why this is a problem at all. The whole system won’t run well on systems with 128 or 256MB of memory. The Linux kernel won’t even boot properly on those any more. Memory is cheap. Our time isn’t.

We have now found the most resource-saving way to deal with this, and anything else will simply take a considerable amount of time with no benefit to 99% of our users.

Best,
-Michael

  
Bernhard Bitsch Nov. 9, 2020, 3:16 p.m. UTC | #5
Hello,

> Sent: Monday, 09 November 2020 at 15:04
> From: "Michael Tremer" <michael.tremer@ipfire.org>
> To: "Bernhard Bitsch" <Bernhard.Bitsch@gmx.de>
> Cc: development@lists.ipfire.org
> Subject: Re: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>
> Hello,
> 
> > On 9 Nov 2020, at 12:13, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
> > 
> > Hello,
> > 
> >> Sent: Monday, 09 November 2020 at 12:35
> >> From: "Michael Tremer" <michael.tremer@ipfire.org>
> >> To: "Bernhard Bitsch" <Bernhard.Bitsch@gmx.de>
> >> Cc: development@lists.ipfire.org
> >> Subject: Re: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
> >> 
> >> Hello,
> >> 
> >> Thank you Stefan for submitting this patchset.
> >> 
> >>> On 8 Nov 2020, at 19:36, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
> >>> 
> >>> This means, we stay with the unbalanced memory allocation in (Perl) libloc. Which leaves a memory leak.
> >> 
> >> Bernhard, could you please elaborate on how this memory leak is still existing?
> >> 
> >> I also do not understand what you mean by unbalanced.
> >> 
> >> As far as I understand the code right now, the database is being opened once and the handle is being stored internally. All functions that are being called will no longer have to hold their own database handle. Therefore the maximum amount of handles open is one.
> >> 
> > 
> > The init() function allocates the handle, which is not really destroyed. The END block is just a safety process. The allocated memory for the handle should be released in the moment of references==0.
> 
> The handle is being initialised when location-functions.pl is being loaded and it is freed when the Perl interpreter ends.
> 
> It remains referenced all the time.
> 

This is the new state!
The init() function builds a memory block for accessing the location database and returns a handle to it. This memory block is not released by the memory management. I suppose there are some deficiencies in the handling of the reference counters.
Calling init() in each function of Location::Functions leaves such a memory block behind every time. That was the original issue.

> > Yes, the new code ( inspired by my work-around ) minimizes the number of handles to one. But this handle is persistent also ( this doesn't hurt because it is used persistent for the time of module use ).
> 
> It is only being initialised when location-functions.pl is being loaded. That only happens in code that uses those functions.
> 
> I do not see the problem with this. Constantly closing and re-opening the database is not an option since the connection tracking list can have tens of thousands of lookups and this will make the page - simply - slow.
> 
> I cannot see an option that is more resource-friendly than this.
> 

It is not opening and closing the database that produces the memory leak, but the Perl interface (see above).
In terms of performance, you are right. So my "mini" system showed us the way to a better-performing system.

> > 
> >>> 
> >>>> Sent: Saturday, 07 November 2020 at 19:47
> >>>> From: "Stefan Schantl" <stefan.schantl@ipfire.org>
> >>>> To: development@lists.ipfire.org
> >>>> Subject: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
> >>>> 
> >>>> Create and use a single script-wide database handle for libloc to
> >>>> prevent from creating multiple ones.
> >>>> 
> >>>> This helps saving memory, especially on small systems.
> >>>> 
> >>>> Reference #12515.
> >>>> 
> >>> 
> >>> The error can be produced easily with small memory, but it is present in all systems.
> >>> Therefore I've posted this solution as work-around only!
> >> 
> >> Did you test this patchset or did you come to your conclusion by only reading the code?
> >> 
> >> Best,
> >> -Michael
> >> 
> > 
> > The sentence about "small systems" refers to the original comment. It is not a real memory saving, the patch cures a 'memory wasting'. It is clear, that a massively use of malloc without free crashes a small system faster than a system with big memory resources.
> 
> Opening the database requires pretty much nothing (maybe 4K) of RSS memory and about 50MB of VIRT memory. This due to mmap() which helps us reading the database a lot fast. I think this is very very reasonable. The more data is being read from the database the more memory it will use.
> 
> If you wish to change that (and that comes with a performance penalty) you could submit a patch that removes mmap() if the user wishes to.
> 
> But I do not see why this is a problem at all. The whole system won’t run well on systems with 128 or 256MB of memory. The Linux kernel won’t even boot properly on those any more. Memory is cheap. Our time isn’t.
> 
> We have found the most resource-saving way to deal with this now and anything else will simply use considerable amount of time with no benefit to 99% of our users.
> 
> Best,
> -Michael
> 

Furthermore, I think we have found the most straightforward way to interface with libloc (which is great work!). The first approach reminded me a bit of the good old times when M$ took MS-DOS, put some code on top and called it Windows. More and more layers added to this building demanded faster processors and more memory, without any real gain in functionality. Just the experience of an "old" computer scientist who saw each step of this evolution. ;)

Best,
-Bernhard
  
Michael Tremer Nov. 9, 2020, 5:03 p.m. UTC | #6
Hello,

> On 9 Nov 2020, at 15:16, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
> 
> Hello,
> 
>> Sent: Monday, 09 November 2020 at 15:04
>> From: "Michael Tremer" <michael.tremer@ipfire.org>
>> To: "Bernhard Bitsch" <Bernhard.Bitsch@gmx.de>
>> Cc: development@lists.ipfire.org
>> Subject: Re: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>> 
>> Hello,
>> 
>>> On 9 Nov 2020, at 12:13, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
>>> 
>>> Hello,
>>> 
>>>> Sent: Monday, 09 November 2020 at 12:35
>>>> From: "Michael Tremer" <michael.tremer@ipfire.org>
>>>> To: "Bernhard Bitsch" <Bernhard.Bitsch@gmx.de>
>>>> Cc: development@lists.ipfire.org
>>>> Subject: Re: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>>>> 
>>>> Hello,
>>>> 
>>>> Thank you Stefan for submitting this patchset.
>>>> 
>>>>> On 8 Nov 2020, at 19:36, Bernhard Bitsch <Bernhard.Bitsch@gmx.de> wrote:
>>>>> 
>>>>> This means, we stay with the unbalanced memory allocation in (Perl) libloc. Which leaves a memory leak.
>>>> 
>>>> Bernhard, could you please elaborate on how this memory leak is still existing?
>>>> 
>>>> I also do not understand what you mean by unbalanced.
>>>> 
>>>> As far as I understand the code right now, the database is being opened once and the handle is being stored internally. All functions that are being called will no longer have to hold their own database handle. Therefore the maximum amount of handles open is one.
>>>> 
>>> 
>>> The init() function allocates the handle, which is not really destroyed. The END block is just a safety process. The allocated memory for the handle should be released in the moment of references==0.
>> 
>> The handle is being initialised when location-functions.pl is being loaded and it is freed when the Perl interpreter ends.
>> 
>> It remains referenced all the time.
>> 
> 
> This the new state!
> The init() function builds a memory block to access the location data base and returns a handle to it. This memory block is not released by the memory management. I suppose there are some deficiencies in handling the reference counters.

Okay, where are those?

Libloc internally uses reference counting and we have some unit tests to check if those are working okay.

If you know that we are handling this somewhere incorrectly, let me know and we will be able to reduce the footprint of libloc even more.
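
One way to narrow down such a suspicion from the Perl side is a weak-reference probe. The sketch below is not one of libloc's unit tests and does not touch the binding. It uses the core module Scalar::Util, and `is_freed_after_release` is a hypothetical helper name. A plain hash and a self-referencing hash stand in for a handle that is released correctly and one that is not.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Returns true if the object built by $build is freed once the last
# strong reference to it is dropped.
sub is_freed_after_release {
    my ($build) = @_;
    my $object = $build->();
    my $probe  = $object;
    weaken($probe);          # $probe no longer keeps the object alive
    undef $object;           # drop the last strong reference
    return !defined $probe;  # a weak reference becomes undef once the target is freed
}

# A plain hash reference: nothing else refers to it, so it is freed.
my $clean_is_freed = is_freed_after_release(sub { return +{ name => "clean" }; });

# A self-referencing structure: the cycle keeps the count above zero.
my $cyclic_is_freed = is_freed_after_release(sub {
    my $node = { name => "cyclic" };
    $node->{self} = $node;
    return $node;
});

print "clean freed: ", ($clean_is_freed ? "yes" : "no"),
      ", cyclic freed: ", ($cyclic_is_freed ? "yes" : "no"), "\n";
```

Passing a builder that returns a real handle would show whether the Perl-level object goes away. It would not prove that memory allocated on the C side is released as well.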

> A call to init() in each function of Location::Functions leaves this memory block. That was the originatings issue.
> 
>>> Yes, the new code ( inspired by my work-around ) minimizes the number of handles to one. But this handle is persistent also ( this doesn't hurt because it is used persistent for the time of module use ).
>> 
>> It is only being initialised when location-functions.pl is being loaded. That only happens in code that uses those functions.
>> 
>> I do not see the problem with this. Constantly closing and re-opening the database is not an option since the connection tracking list can have tens of thousands of lookups and this will make the page - simply - slow.
>> 
>> I cannot see an option that is more resource-friendly than this.
>> 
> 
> It is not the open/close of the data base which produces the memory leak, but the Perl interface ( see above ). 

Okay, but where? In which function?

> In terms of performance, you are right. So my "mini" system showed as the way to a better performing system.
> 
>>> 
>>>>> 
>>>>>> Sent: Saturday, 07 November 2020 at 19:47
>>>>>> From: "Stefan Schantl" <stefan.schantl@ipfire.org>
>>>>>> To: development@lists.ipfire.org
>>>>>> Subject: [PATCH 1/5] location-functions.pl: Use a single script-wide db_handle.
>>>>>> 
>>>>>> Create and use a single script-wide database handle for libloc to
>>>>>> prevent from creating multiple ones.
>>>>>> 
>>>>>> This helps saving memory, especially on small systems.
>>>>>> 
>>>>>> Reference #12515.
>>>>>> 
>>>>> 
>>>>> The error can be produced easily with small memory, but it is present in all systems.
>>>>> Therefore I've posted this solution as work-around only!
>>>> 
>>>> Did you test this patchset or did you come to your conclusion by only reading the code?
>>>> 
>>>> Best,
>>>> -Michael
>>>> 
>>> 
>>> The sentence about "small systems" refers to the original comment. It is not a real memory saving, the patch cures a 'memory wasting'. It is clear, that a massively use of malloc without free crashes a small system faster than a system with big memory resources.
>> 
>> Opening the database requires pretty much nothing (maybe 4K) of RSS memory and about 50MB of VIRT memory. This due to mmap() which helps us reading the database a lot fast. I think this is very very reasonable. The more data is being read from the database the more memory it will use.
>> 
>> If you wish to change that (and that comes with a performance penalty) you could submit a patch that removes mmap() if the user wishes to.
>> 
>> But I do not see why this is a problem at all. The whole system won’t run well on systems with 128 or 256MB of memory. The Linux kernel won’t even boot properly on those any more. Memory is cheap. Our time isn’t.
>> 
>> We have found the most resource-saving way to deal with this now and anything else will simply use considerable amount of time with no benefit to 99% of our users.
>> 
>> Best,
>> -Michael
>> 
> 
> Further I think, we have found the most straight-forward way to interface to libloc ( which is a great work! ). The first approach remembered me a bit at good old times when M$ took MS-DOS and some code upon and called it Windows. More and more layers put on this building requested faster processors and more memory, without real gain in functionality. Just the experience of an "old" computer scientist, which saw each step in this evolution. ;)

To be totally honest, we wanted to keep the Perl interface as simple as possible. The Python interface is much easier to integrate and will gain many more users. Perl is on its way out, but of course the current web user interface is written in Perl. We started with only a few functions, but it grew quickly and therefore became messy.

The native Perl part is only there because the Perl C bindings are less fun to use than putting hot wax into my eyes.

-Michael

  

Patch

diff --git a/config/cfgroot/location-functions.pl b/config/cfgroot/location-functions.pl
index 2cfe7f908..9b1d0bfb5 100644
--- a/config/cfgroot/location-functions.pl
+++ b/config/cfgroot/location-functions.pl
@@ -55,6 +55,9 @@  our $keyfile = "$location_dir/signing-key.pem";
 # Directory which contains the exported databases.
 our $xt_geoip_db_directory = "/usr/share/xt_geoip/";
 
+# Create libloc database handle.
+my $db_handle = &init();
+
 #
 ## Tiny function to init the location database.
 #
@@ -86,7 +89,7 @@  sub verify ($) {
 ## Function to the the country code of a given address.
 #
 sub lookup_country_code($$) {
-	my ($db_handle, $address) = @_;
+	my ($address) = @_;
 
 	# Lookup the given address.
 	my $country_code = &Location::lookup_country_code($db_handle, $address);
@@ -174,9 +177,6 @@  sub get_full_country_name($) {
 
 # Function to get all available locations.
 sub get_locations() {
-	# Create libloc database handle.
-	my $db_handle = &init();
-
 	# Get locations which are stored in the location database.
 	my @database_locations = &Location::database_countries($db_handle);
 
@@ -197,9 +197,6 @@  sub address_has_flags($) {
 	# Array to store the flags of the address.
 	my @flags;
 
-	# Init libloc database handle.
-	my $db_handle = &init();
-
 	# Loop through the hash of possible network flags.
 	foreach my $flag (keys(%network_flags)) {
 		# Check if the address has the current flag.