[1/2] location-importer: Introduce auxiliary function to sanitise ASNs

Message ID 6e7024bb-71ec-66e6-252d-283c2a22666b@ipfire.org
State Superseded
Series [1/2] location-importer: Introduce auxiliary function to sanitise ASNs

Commit Message

Peter Müller Oct. 10, 2021, 4:16 p.m. UTC
  Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
---
 src/python/location-importer.in | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)
  

Comments

Michael Tremer Oct. 12, 2021, 11:23 a.m. UTC | #1
Hello,

> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller@ipfire.org> wrote:
> 
> Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
> ---
> src/python/location-importer.in | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
> index da058d3..c2b3e41 100644
> --- a/src/python/location-importer.in
> +++ b/src/python/location-importer.in
> @@ -574,6 +574,22 @@ class CLI(object):
> 		# be suitable for libloc consumption...
> 		return True
> 
> +	def _check_parsed_asn(self, asn):
> +		"""
> +			Assistive function to filter Autonomous System Numbers not being suitable
> +			for adding to our database. Returns False in such cases, and True otherwise.
> +		"""
> +
> +		if not asn or not isinstance(asn, int):
> +			return False

Does it ever happen that a non-integer is passed to this function?

You also return False for zero without logging the message.

I would suggest dropping the check above.

> +
> +		if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
> +			log.debug("Skipping invalid ASN: %s" % asn)
> +			return False

This works, but I do not consider this very Pythonic.

I would have written a tuple which contains one tuple for each range and then iterated over that until a match is found.
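
A minimal sketch of that suggestion, for illustration only. The names `VALID_ASN_RANGES` and `check_parsed_asn` and the logger name are assumptions made for this example; the bounds are copied from the condition in the patch, and this is not necessarily what a later revision of the patch looks like.

```python
import logging

# Hypothetical logger name, used only for this sketch
log = logging.getLogger("location-importer-sketch")

# Data in one place: inclusive (first, last) bounds taken from the patch
VALID_ASN_RANGES = (
    (1, 23455),
    (23457, 64495),
    (131072, 4199999999),
)

def check_parsed_asn(asn):
    """
        Returns True if asn lies within one of VALID_ASN_RANGES,
        and False (after logging a debug message) otherwise.
    """
    # A short algorithm that works on the data
    for first, last in VALID_ASN_RANGES:
        if first <= asn <= last:
            return True

    log.debug("Skipping invalid ASN: %s", asn)
    return False
```

Because zero falls outside every range, it is rejected and logged by the same code path as any other invalid ASN, without a separate check.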

> +
> +		# ASN is fine if we made it here...
> +		return True

Ellipses in comments are sometimes weird...

> +
> 	def _parse_block(self, block, source_key, validcountries = None):
> 		# Get first line to find out what type of block this is
> 		line = block[0]
> @@ -829,8 +845,8 @@ class CLI(object):
> 					log.debug("Skipping ARIN AS names line not containing an integer for ASN")
> 					continue
> 
> -				if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
> -					log.debug("Skipping ARIN AS names line not containing a valid ASN: %s" % asn)
> +				# Filter invalid ASNs...
> +				if not self._check_parsed_asn(asn):
> 					continue
> 
> 				# Skip any AS name that appears to be a placeholder for a different RIR or entity...
> -- 
> 2.26.2
  
Peter Müller Oct. 13, 2021, 4:33 p.m. UTC | #2
Hello Michael,

thanks for your reply.

> Hello,
> 
>> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller@ipfire.org> wrote:
>>
>> Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
>> ---
>> src/python/location-importer.in | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
>> index da058d3..c2b3e41 100644
>> --- a/src/python/location-importer.in
>> +++ b/src/python/location-importer.in
>> @@ -574,6 +574,22 @@ class CLI(object):
>> 		# be suitable for libloc consumption...
>> 		return True
>>
>> +	def _check_parsed_asn(self, asn):
>> +		"""
>> +			Assistive function to filter Autonomous System Numbers not being suitable
>> +			for adding to our database. Returns False in such cases, and True otherwise.
>> +		"""
>> +
>> +		if not asn or not isinstance(asn, int):
>> +			return False
> 
> Does it ever happen that a non-integer is passed to this function?

What's wrong with input validation? I _like_ input validation. :-)

Seriously: Anything other than an integer does not make sense for an ASN. Sure, this
function is not intended to get anything else, but we will never know. Better to be
safe than sorry.

> You also return False for zero without logging the message.

True. Since there will probably be a second version of this patchset, I will ensure it
logs something useful in this case.

> I would suggest dropping the check above.

Frankly, I don't see why.

>> +
>> +		if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
>> +			log.debug("Skipping invalid ASN: %s" % asn)
>> +			return False
> 
> This works, but I do not consider this very Pythonic.
> 
> I would have written a tuple which contains one tuple for each range and then iterated over that until a match is found.

Far from being a Python developer, this wouldn't have come to my mind. But if it's
Pythonic, I'll do so. When in Rome...

> 
>> +
>> +		# ASN is fine if we made it here...
>> +		return True
> 
> Ellipses in comments are sometimes weird...

???

Thanks, and best regards,
Peter Müller

> 
>> +
>> 	def _parse_block(self, block, source_key, validcountries = None):
>> 		# Get first line to find out what type of block this is
>> 		line = block[0]
>> @@ -829,8 +845,8 @@ class CLI(object):
>> 					log.debug("Skipping ARIN AS names line not containing an integer for ASN")
>> 					continue
>>
>> -				if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
>> -					log.debug("Skipping ARIN AS names line not containing a valid ASN: %s" % asn)
>> +				# Filter invalid ASNs...
>> +				if not self._check_parsed_asn(asn):
>> 					continue
>>
>> 				# Skip any AS name that appears to be a placeholder for a different RIR or entity...
>> -- 
>> 2.26.2
>
  
Michael Tremer Oct. 14, 2021, 6:19 p.m. UTC | #3
Hi,

> On 13 Oct 2021, at 17:33, Peter Müller <peter.mueller@ipfire.org> wrote:
> 
> Hello Michael,
> 
> thanks for your reply.
> 
>> Hello,
>> 
>>> On 10 Oct 2021, at 17:16, Peter Müller <peter.mueller@ipfire.org> wrote:
>>> 
>>> Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
>>> ---
>>> src/python/location-importer.in | 20 ++++++++++++++++++--
>>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/src/python/location-importer.in b/src/python/location-importer.in
>>> index da058d3..c2b3e41 100644
>>> --- a/src/python/location-importer.in
>>> +++ b/src/python/location-importer.in
>>> @@ -574,6 +574,22 @@ class CLI(object):
>>> 		# be suitable for libloc consumption...
>>> 		return True
>>> 
>>> +	def _check_parsed_asn(self, asn):
>>> +		"""
>>> +			Assistive function to filter Autonomous System Numbers not being suitable
>>> +			for adding to our database. Returns False in such cases, and True otherwise.
>>> +		"""
>>> +
>>> +		if not asn or not isinstance(asn, int):
>>> +			return False
>> 
>> Does it ever happen that a non-integer is passed to this function?
> 
> What's wrong with input validation? I _like_ input validation. :-)

There is nothing wrong with that. You are just checking up on the developer who calls this function, not on the imported data, and I am not sure whether you want that or not.

> Seriously: Anything other than an integer does not make sense for an ASN. Sure, this
> function is not intended to get anything else, but we will never know. Better to be
> safe than sorry.

Not entirely. You want code to perform. If you want to be 100% sure, Python isn’t the language this parser should be written in.

Nothing else but an integer makes sense. The question is how you want to treat zero.
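
To make the behaviour of the first check concrete, here is a small sketch. `passes_first_guard` mirrors the expression from the patch; `check_asn_type` is a hypothetical variant, with names and log messages invented for this example, that logs every rejection and treats zero explicitly.

```python
import logging

# Hypothetical logger name, used only for this sketch
log = logging.getLogger("location-importer-sketch")

def passes_first_guard(asn):
    """
        Mirrors "if not asn or not isinstance(asn, int): return False"
        from the patch. False means: rejected without any log message.
    """
    return not (not asn or not isinstance(asn, int))

def check_asn_type(asn):
    """
        Hypothetical variant: logs why a value was rejected.
    """
    # bool is a subclass of int in Python, so it has to be excluded explicitly
    if isinstance(asn, bool) or not isinstance(asn, int):
        log.debug("Skipping ASN that is not an integer: %r", asn)
        return False

    # Zero is an integer, but it is not a usable ASN
    if asn == 0:
        log.debug("Skipping ASN 0")
        return False

    return True
```

With the check as written in the patch, `0`, `None` and the string `"15169"` are all rejected silently, while `True` passes and would then be accepted as ASN 1 by the range check.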

>> You also return False for zero without logging the message.
> 
> True. Since there will probably be a second version of this patchset, I will ensure it
> logs something useful in this case.
> 
>> I would suggest dropping the check above.
> 
> Frankly, I don't see why.
> 
>>> +
>>> +		if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
>>> +			log.debug("Skipping invalid ASN: %s" % asn)
>>> +			return False
>> 
>> This works, but I do not consider this very Pythonic.
>> 
>> I would have written a tuple which contains one tuple for each range and then iterated over that until a match is found.
> 
> Far from being a Python developer, this wouldn't have come to my mind. But if it's
> Pythonic, I'll do so. When in Rome...

I don’t make the rules. That is just how I would do it:

* Data in one place

* A short algorithm that works on the data

In C I would hope that the compiler makes it fast.

> 
>> 
>>> +
>>> +		# ASN is fine if we made it here...
>>> +		return True
>> 
>> Ellipses in comments are sometimes weird...
> 
> ???

This one leaves the comment open-ended, which makes reaching that line sound kind of unlikely.

> 
> Thanks, and best regards,
> Peter Müller
> 
>> 
>>> +
>>> 	def _parse_block(self, block, source_key, validcountries = None):
>>> 		# Get first line to find out what type of block this is
>>> 		line = block[0]
>>> @@ -829,8 +845,8 @@ class CLI(object):
>>> 					log.debug("Skipping ARIN AS names line not containing an integer for ASN")
>>> 					continue
>>> 
>>> -				if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
>>> -					log.debug("Skipping ARIN AS names line not containing a valid ASN: %s" % asn)
>>> +				# Filter invalid ASNs...
>>> +				if not self._check_parsed_asn(asn):
>>> 					continue
>>> 
>>> 				# Skip any AS name that appears to be a placeholder for a different RIR or entity...
>>> -- 
>>> 2.26.2
  

Patch

diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index da058d3..c2b3e41 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -574,6 +574,22 @@  class CLI(object):
 		# be suitable for libloc consumption...
 		return True
 
+	def _check_parsed_asn(self, asn):
+		"""
+			Assistive function to filter Autonomous System Numbers not being suitable
+			for adding to our database. Returns False in such cases, and True otherwise.
+		"""
+
+		if not asn or not isinstance(asn, int):
+			return False
+
+		if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
+			log.debug("Skipping invalid ASN: %s" % asn)
+			return False
+
+		# ASN is fine if we made it here...
+		return True
+
 	def _parse_block(self, block, source_key, validcountries = None):
 		# Get first line to find out what type of block this is
 		line = block[0]
@@ -829,8 +845,8 @@  class CLI(object):
 					log.debug("Skipping ARIN AS names line not containing an integer for ASN")
 					continue
 
-				if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
-					log.debug("Skipping ARIN AS names line not containing a valid ASN: %s" % asn)
+				# Filter invalid ASNs...
+				if not self._check_parsed_asn(asn):
 					continue
 
 				# Skip any AS name that appears to be a placeholder for a different RIR or entity...