[v2] location-importer: Fix Spamhaus ASN-DROP parsing

Message ID 4add526f-913d-4f16-ac80-7642ff9800e0@ipfire.org
State Accepted
Commit 5c7cfeb21f59d44b2b12bcfdce85b3a61049fb4f
Series [v2] location-importer: Fix Spamhaus ASN-DROP parsing

Commit Message

Peter Müller Feb. 17, 2024, 10:31 p.m. UTC
The format of this list has changed from a plain-text file with a
custom schema to JSON. Adjust our routines accordingly to make use of
this list again.

The second version of this patch incorporates Michael's feedback on the
first version, and adds AS names to the autnums table in case they are
not there already, which closes some gaps on rogue ASNs in the LACNIC
area.
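For reference, the new asndrop.json feed is line-delimited JSON: one object per ASN entry, plus a trailing metadata record. A minimal parsing sketch of that structure (the sample lines below are illustrative; the field names "asn", "asname" and "type" are taken from the patch):

```python
import json

# Illustrative sample mimicking the Spamhaus ASN-DROP JSON feed:
# one JSON object per line, with a metadata record at the end.
raw = b"""{"asn": 3399, "asname": "EXAMPLE-AS"}
{"type": "metadata", "timestamp": 1700000000}
"""

asns = []
for sline in raw.splitlines():
	# The response is assumed to be encoded in UTF-8
	sline = sline.decode("utf-8")

	# Load every line as a JSON object
	try:
		lineobj = json.loads(sline)
	except json.decoder.JSONDecodeError:
		continue

	# Skip the metadata record
	if lineobj.get("type") == "metadata":
		continue

	# Collect ASN and AS name, skipping malformed entries
	try:
		asns.append((lineobj["asn"], lineobj["asname"]))
	except KeyError:
		continue

print(asns)  # [(3399, 'EXAMPLE-AS')]
```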

Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
Tested-by: Peter Müller <peter.mueller@ipfire.org>
---
 src/scripts/location-importer.in | 46 ++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 11 deletions(-)
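The autnums insert in the patch relies on ON CONFLICT (number) DO NOTHING, so a name from ASN-DROP never overwrites one already imported from a registry. The semantics can be illustrated with SQLite as a stand-in for the importer's PostgreSQL backend (table layout simplified):

```python
import sqlite3

# Stand-in for the autnums table (the importer itself targets PostgreSQL).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE autnums(number INTEGER PRIMARY KEY, name TEXT, source TEXT)")

# A name already known from a registry import...
db.execute("INSERT INTO autnums VALUES (?, ?, ?)", (65551, "REGISTRY-NAME", "ARIN"))

# ... so the ASN-DROP insert is silently skipped, keeping the existing name.
db.execute(
	"INSERT INTO autnums(number, name, source) VALUES (?, ?, ?) "
	"ON CONFLICT (number) DO NOTHING",
	(65551, "DROP-NAME", "SPAMHAUS-ASNDROP"),
)

row = db.execute("SELECT name FROM autnums WHERE number = 65551").fetchone()
print(row[0])  # REGISTRY-NAME
```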
  

Patch

diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
index 28a4f6c..ac7249d 100644
--- a/src/scripts/location-importer.in
+++ b/src/scripts/location-importer.in
@@ -3,7 +3,7 @@ 
 #                                                                             #
 # libloc - A library to determine the location of someone on the Internet     #
 #                                                                             #
-# Copyright (C) 2020-2022 IPFire Development Team <info@ipfire.org>           #
+# Copyright (C) 2020-2024 IPFire Development Team <info@ipfire.org>           #
 #                                                                             #
 # This library is free software; you can redistribute it and/or               #
 # modify it under the terms of the GNU Lesser General Public                  #
@@ -1686,7 +1686,7 @@  class CLI(object):
 				]
 
 		asn_lists = [
-					("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.txt")
+					("SPAMHAUS-ASNDROP", "https://www.spamhaus.org/drop/asndrop.json")
 				]
 
 		for name, url in ip_lists:
@@ -1759,22 +1759,32 @@  class CLI(object):
 
 				# Iterate through every line, filter comments and add remaining ASNs to
 				# the override table in case they are valid...
-				for sline in f.readlines():
+				for sline in fcontent:
 					# The response is assumed to be encoded in UTF-8...
 					sline = sline.decode("utf-8")
 
-					# Comments start with a semicolon...
-					if sline.startswith(";"):
+					# Load every line as a JSON object and try to obtain an ASN from it...
+					try:
+						lineobj = json.loads(sline)
+					except json.decoder.JSONDecodeError:
+						log.error("Unable to parse line as a JSON object: %s" % sline)
 						continue
 
-					# Throw away anything after the first space...
-					sline = sline.split()[0]
+					# Skip lines containing file metadata
+					try:
+						type = lineobj["type"]
 
-					# ... strip the "AS" prefix from it ...
-					sline = sline.strip("AS")
+						if type == "metadata":
+							continue
+					except KeyError:
+						pass
 
-					# ... and convert it into an integer. Voila.
-					asn = int(sline)
+					try:
+						asn = lineobj["asn"]
+						as_name = lineobj["asname"]
+					except KeyError:
+						log.warning("Unable to extract necessary information from line: %s" % sline)
+						continue
 
 					# Filter invalid ASNs...
 					if not self._check_parsed_asn(asn):
@@ -1795,6 +1805,20 @@  class CLI(object):
 						True
 					)
 
+					# In case we do not have a name for this AS already, update
+					# autnums table accordingly
+					self.db.execute("""
+						INSERT INTO autnums(
+							number,
+							name,
+							source
+						) VALUES (%s, %s, %s)
+						ON CONFLICT (number) DO NOTHING""",
+						"%s" % asn,
+						as_name,
+						name
+					)
+
 	@staticmethod
 	def _parse_bool(block, key):
 		val = block.get(key)