From patchwork Sat Apr 10 12:28:06 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Peter_M=C3=BCller?=
X-Patchwork-Id: 4139
To: "IPFire: Location"
From: =?utf-8?q?Peter_M=C3=BCller?=
Subject: [PATCH] location-importer.in: import additional IP information for Amazon AWS IP networks
Message-ID: <7e30f16b-687e-62f2-bf1f-1a6f17616919@ipfire.org>
Date: Sat, 10 Apr 2021 14:28:06 +0200
Content-Language: en-US

Amazon publishes information regarding some of their IP networks
primarily used for AWS cloud services in a machine-readable format. To
improve libloc lookup results for these, we have little choice other
than importing and parsing them.

Unfortunately, there seems to be no machine-readable list of the
locations of their data centers or availability zones available.
If there _is_ any, please let the author know.

Fixes: #12594
Signed-off-by: Peter Müller
---
 src/python/location-importer.in | 110 ++++++++++++++++++++++++++++++++
 1 file changed, 110 insertions(+)

diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index 1e08458..5be1d61 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -19,6 +19,7 @@
 
 import argparse
 import ipaddress
+import json
 import logging
 import math
 import re
@@ -931,6 +932,10 @@ class CLI(object):
 				TRUNCATE TABLE network_overrides;
 			""")
 
+			# Update overrides for various cloud providers big enough to publish their own IP
+			# network allocation lists in a machine-readable format...
+			self._update_overrides_for_aws()
+
 			for file in ns.files:
 				log.info("Reading %s..." % file)
 
@@ -998,6 +1003,111 @@ class CLI(object):
 						else:
 							log.warning("Unsupported type: %s" % type)
 
+	def _update_overrides_for_aws(self):
+		# Download Amazon AWS IP allocation file to create overrides...
+		downloader = location.importer.Downloader()
+
+		try:
+			with downloader.request("https://ip-ranges.amazonaws.com/ip-ranges.json", return_blocks=False) as f:
+				aws_ip_dump = json.load(f.body)
+		except Exception as e:
+			log.error("unable to preprocess Amazon AWS IP ranges: %s" % e)
+			return
+
+		# XXX: Set up a dictionary for mapping a region name to a country. Unfortunately,
+		# there seems to be no machine-readable version available of this other than
+		# https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html
+		# (worse, it seems to be incomplete :-/ ); https://www.cloudping.cloud/endpoints
+		# was helpful here as well.
+		aws_region_country_map = {
+				"af-south-1": "ZA",
+				"ap-east-1": "HK",
+				"ap-south-1": "IN",
+				"ap-south-2": "IN",
+				"ap-northeast-3": "JP",
+				"ap-northeast-2": "KR",
+				"ap-southeast-1": "SG",
+				"ap-southeast-2": "AU",
+				"ap-southeast-3": "ID",
+				"ap-northeast-1": "JP",
+				"ca-central-1": "CA",
+				"eu-central-1": "DE",
+				"eu-central-2": "CH",
+				"eu-west-1": "IE",
+				"eu-west-2": "GB",
+				"eu-south-1": "IT",
+				"eu-south-2": "ES",
+				"eu-west-3": "FR",
+				"eu-north-1": "SE",
+				"me-south-1": "BH",
+				"sa-east-1": "BR"
+				}
+
+		# Fetch all valid country codes to check parsed networks against...
+		rows = self.db.query("SELECT * FROM countries ORDER BY country_code")
+		validcountries = []
+
+		for row in rows:
+			validcountries.append(row.country_code)
+
+		with self.db.transaction():
+			for snetwork in aws_ip_dump["prefixes"] + aws_ip_dump["ipv6_prefixes"]:
+				try:
+					network = ipaddress.ip_network(snetwork.get("ip_prefix") or snetwork.get("ipv6_prefix"), strict=False)
+				except ValueError:
+					log.warning("Unable to parse line: %s" % snetwork)
+					continue
+
+				# Sanitize parsed networks...
+				if not self._check_parsed_network(network):
+					continue
+
+				# Determine region of this network...
+				region = snetwork["region"]
+				cc = None
+				is_anycast = False
+
+				# Any region name starting with "us-" will get "US" country code assigned straight away...
+				if region.startswith("us-"):
+					cc = "US"
+				elif region.startswith("cn-"):
+					# ... same goes for China ...
+					cc = "CN"
+				elif region == "GLOBAL":
+					# ... funny region name for anycast-like networks ...
+					is_anycast = True
+				elif region in aws_region_country_map:
+					# ... assign looked up country code otherwise ...
+					cc = aws_region_country_map[region]
+				else:
+					# ... and bail out if we are missing something here
+					log.warning("Unable to determine country code for line: %s" % snetwork)
+					continue
+
+				# Skip networks with unknown country codes
+				if not is_anycast and validcountries and cc not in validcountries:
+					log.warning("Skipping Amazon AWS network with bogus country '%s': %s" % \
+						(cc, network))
+					continue
+
+				# Conduct SQL statement...
+				self.db.execute("""
+					INSERT INTO network_overrides(
+						network,
+						country,
+						is_anonymous_proxy,
+						is_satellite_provider,
+						is_anycast
+					) VALUES (%s, %s, %s, %s, %s)
+					ON CONFLICT (network) DO NOTHING""",
+					"%s" % network,
+					cc,
+					None,
+					None,
+					is_anycast,
+				)
+
+
 	@staticmethod
 	def _parse_bool(block, key):
 		val = block.get(key)
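
Note for reviewers: the region handling in _update_overrides_for_aws() can be tried without a database or network access. The following is only a minimal sketch under assumptions, not part of the patch. resolve_region(), build_overrides(), the shortened REGION_COUNTRY_MAP and the SAMPLE data are hypothetical names that do not exist in libloc, and the prefixes are documentation ranges rather than real AWS allocations. The sketch skips a single entry with an unknown country code instead of aborting the whole run.

```python
import ipaddress

# Shortened, illustrative excerpt of the region map (hypothetical name).
REGION_COUNTRY_MAP = {
    "eu-central-1": "DE",
    "eu-west-1": "IE",
    "sa-east-1": "BR",
}

# Hypothetical sample in the shape of ip-ranges.json; documentation prefixes only.
SAMPLE = {
    "prefixes": [
        {"ip_prefix": "192.0.2.0/24", "region": "eu-central-1"},
        {"ip_prefix": "198.51.100.0/24", "region": "GLOBAL"},
        {"ip_prefix": "203.0.113.0/24", "region": "xx-nowhere-1"},
        {"ip_prefix": "not-a-network", "region": "eu-west-1"},
    ],
    "ipv6_prefixes": [
        {"ipv6_prefix": "2001:db8::/32", "region": "us-east-1"},
    ],
}


def resolve_region(region, region_map=REGION_COUNTRY_MAP):
    """Return (country_code, is_anycast), or None if the region is unknown."""
    if region.startswith("us-"):
        return ("US", False)
    if region.startswith("cn-"):
        return ("CN", False)
    if region == "GLOBAL":
        # Anycast-like networks carry no country code.
        return (None, True)
    if region in region_map:
        return (region_map[region], False)
    return None


def build_overrides(dump, valid_countries):
    """Collect (network, country_code, is_anycast) tuples from a parsed dump."""
    overrides = []
    for entry in dump.get("prefixes", []) + dump.get("ipv6_prefixes", []):
        prefix = entry.get("ip_prefix") or entry.get("ipv6_prefix")
        try:
            network = ipaddress.ip_network(prefix, strict=False)
        except ValueError:
            continue  # unparsable prefix: skip this entry

        resolved = resolve_region(entry["region"])
        if resolved is None:
            continue  # unknown region: skip this entry

        cc, is_anycast = resolved
        if not is_anycast and valid_countries and cc not in valid_countries:
            continue  # unknown country code: skip this entry only

        overrides.append((str(network), cc, is_anycast))
    return overrides


if __name__ == "__main__":
    for row in build_overrides(SAMPLE, {"DE", "US"}):
        print(row)
```

Keeping the lookup in a pure function like this would let the mapping be unit-tested separately from the SQL insert, though whether that split suits the importer is a design decision for the maintainers.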