location-importer: Only delete override data if we are sure to have a valid replacement

Message ID 1c84d2fc-c061-80eb-4624-288c263b78bb@ipfire.org
State New

Commit Message

Peter Müller June 5, 2022, 10:04 a.m. UTC
The current way of truncating all override data straight away leaves us
with no data at all, should a source turn out to be unreachable or
return bogus files (yes, Cloudflare, I _am_ looking at you).

It is therefore better to only delete data we know to have a valid
replacement for, rather than just dropping the source altogether.

Signed-off-by: Peter Müller <peter.mueller@ipfire.org>
---
 src/scripts/location-importer.in | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)
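
The idea from the commit message, deleting a source's rows only once a usable replacement is in hand, can be sketched in isolation. This is a minimal illustration and not the importer's actual code: it assumes `sqlite3` as a stand-in for the real database, a simplified hypothetical `network_overrides` schema, and a hypothetical helper name `replace_source`.

```python
import sqlite3

def replace_source(conn, source, rows):
    """Replace all overrides from `source`, but only if `rows` is non-empty.

    Hypothetical helper for illustration; `rows` is a list of
    (network, country) tuples parsed from a downloaded feed.
    """
    if not rows:
        # Nothing usable was downloaded: keep whatever is already stored
        return False

    # `with conn` commits on success and rolls back if an exception is raised,
    # so a failed insert never leaves the source without any data
    with conn:
        conn.execute("DELETE FROM network_overrides WHERE source = ?", (source,))
        conn.executemany(
            "INSERT INTO network_overrides(network, country, source) VALUES (?, ?, ?)",
            [(network, country, source) for network, country in rows],
        )
    return True

# Simplified schema (an assumption, not the real table definition)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE network_overrides(network TEXT, country TEXT, source TEXT)")
conn.execute("INSERT INTO network_overrides VALUES ('192.0.2.0/24', 'DE', 'feed-a')")
conn.execute("INSERT INTO network_overrides VALUES ('198.51.100.0/24', 'FR', 'manual')")
conn.commit()

replace_source(conn, "feed-a", [])                           # failed download: old rows stay
replace_source(conn, "feed-a", [("203.0.113.0/24", "US")])   # valid data: rows are swapped
```

Rows belonging to other sources, such as `manual` here, are never touched by a per-source delete, which is the main difference from truncating the whole table.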

Patch

diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
index bee9186..bde92ce 100644
--- a/src/scripts/location-importer.in
+++ b/src/scripts/location-importer.in
@@ -1168,10 +1168,11 @@  class CLI(object):
 
 	def handle_update_overrides(self, ns):
 		with self.db.transaction():
-			# Drop all data that we have
+			# Only drop manually created overrides, as we can be reasonably sure to have them,
+			# and preserve the rest, which the corresponding functions delete where appropriate.
 			self.db.execute("""
-				TRUNCATE TABLE autnum_overrides;
-				TRUNCATE TABLE network_overrides;
+				DELETE FROM autnum_overrides WHERE source = 'manual';
+				DELETE FROM network_overrides WHERE source = 'manual';
 			""")
 
 			# Update overrides for various cloud providers big enough to publish their own IP
@@ -1267,6 +1268,11 @@  class CLI(object):
 			log.error("unable to preprocess Amazon AWS IP ranges: %s" % e)
 			return
 
+		# At this point, we can assume the downloaded file to be valid
+		self.db.execute("""
+			DELETE FROM network_overrides WHERE source = 'Amazon AWS IP feed';
+		""")
+
 		# XXX: Set up a dictionary for mapping a region name to a country. Unfortunately,
 		# there seems to be no machine-readable version available of this other than
 		# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
@@ -1387,6 +1393,16 @@  class CLI(object):
 				log.error("Unable to download Spamhaus DROP URL %s: %s" % (url, e))
 				return
 
+			# Conduct a very basic sanity check to rule out CDN issues causing bogus DROP
+			# downloads.
+			if len(fcontent) > 10:
+				self.db.execute("""
+					DELETE FROM autnum_overrides WHERE source = 'Spamhaus ASN-DROP list';
+					DELETE FROM network_overrides WHERE source = 'Spamhaus DROP lists';
+				""")
+			else:
+				log.error("Spamhaus DROP URL %s returned likely bogus file, ignored" % url)
+
 			# Iterate through every line, filter comments and add remaining networks to
 			# the override table in case they are valid...
 			with self.db.transaction():
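
The Spamhaus hunk guards the deletion with a simple line-count check. A minimal sketch of that guard on its own, assuming the downloaded file is available as a list of lines; the helper names `is_plausible_feed` and `import_feeds` and the sample data are hypothetical, and only the threshold of 10 lines is taken from the patch:

```python
def is_plausible_feed(lines, min_lines=10):
    """Return True if the download has more than `min_lines` lines."""
    return len(lines) > min_lines

def import_feeds(feeds):
    """Return the names of feeds that passed the check; others are skipped."""
    imported = []
    for name, lines in feeds.items():
        if not is_plausible_feed(lines):
            # Likely a CDN error page or a truncated file: keep the old data
            continue
        # Here the old rows for `name` would be deleted and the new ones inserted
        imported.append(name)
    return imported

# Made-up sample data for illustration only
feeds = {
    "good": ["; comment"] + ["192.0.2.%d/32 ; SBL" % i for i in range(20)],
    "bogus": ["<html>error</html>"],
}
result = import_feeds(feeds)
```

A line count is a coarse heuristic: it catches empty responses and short error pages, but not a long file with the wrong content.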