From patchwork Sun Sep 20 19:20:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Peter_M=C3=BCller?=
X-Patchwork-Id: 3476
Subject: [PATCH v2 2/3] importer: Import raw sources for inetnum's again
To: location@lists.ipfire.org
From: =?utf-8?q?Peter_M=C3=BCller?=
Message-ID: <54b93c52-9578-3390-d8d4-e889766dcb84@ipfire.org>
Date: Sun, 20 Sep 2020 19:20:18 +0000

The extended feeds do not contain enough detail for us, so we need to
import inetnums from the RIRs where possible.

Filtering private networks is necessary as RIR data may contain
0.0.0.0/0 or similar entries for administrative purposes or due to
misfilings.

Special thanks go to Michael for spending numerous hours on this,
setting up a testing environment and providing helpful advice while
debugging.

Partially fixes: #12458

Cc: Michael Tremer
Signed-off-by: Peter Müller
---
 src/python/importer.py          | 14 ++++----
 src/python/location-importer.in | 63 +++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/src/python/importer.py b/src/python/importer.py
index de20f37..f19db4b 100644
--- a/src/python/importer.py
+++ b/src/python/importer.py
@@ -30,8 +30,8 @@ WHOIS_SOURCES = (
 	"https://ftp.afrinic.net/pub/pub/dbase/afrinic.db.gz",
 
 	# Asia Pacific Network Information Centre
-	#"https://ftp.apnic.net/apnic/whois/apnic.db.inet6num.gz",
-	#"https://ftp.apnic.net/apnic/whois/apnic.db.inetnum.gz",
+	"https://ftp.apnic.net/apnic/whois/apnic.db.inet6num.gz",
+	"https://ftp.apnic.net/apnic/whois/apnic.db.inetnum.gz",
 	#"https://ftp.apnic.net/apnic/whois/apnic.db.route6.gz",
 	#"https://ftp.apnic.net/apnic/whois/apnic.db.route.gz",
 	"https://ftp.apnic.net/apnic/whois/apnic.db.aut-num.gz",
@@ -45,8 +45,8 @@ WHOIS_SOURCES = (
 	# XXX ???
 
 	# Réseaux IP Européens
-	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inet6num.gz",
-	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inetnum.gz",
+	"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inet6num.gz",
+	"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inetnum.gz",
 	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.route6.gz",
 	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.route.gz",
 	"https://ftp.ripe.net/ripe/dbase/split/ripe.db.aut-num.gz",
@@ -55,10 +55,10 @@ WHOIS_SOURCES = (
 
 EXTENDED_SOURCES = (
 	# African Network Information Centre
-	"https://ftp.afrinic.net/pub/stats/afrinic/delegated-afrinic-extended-latest",
+	#"https://ftp.afrinic.net/pub/stats/afrinic/delegated-afrinic-extended-latest",
 
 	# Asia Pacific Network Information Centre
-	"https://ftp.apnic.net/apnic/stats/apnic/delegated-apnic-extended-latest",
+	#"https://ftp.apnic.net/apnic/stats/apnic/delegated-apnic-extended-latest",
 
 	# American Registry for Internet Numbers
 	"https://ftp.arin.net/pub/stats/arin/delegated-arin-extended-latest",
@@ -67,7 +67,7 @@ EXTENDED_SOURCES = (
 	"http://ftp.lacnic.net/pub/stats/lacnic/delegated-lacnic-extended-latest",
 
 	# Réseaux IP Européens
-	"https://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-extended-latest",
+	#"https://ftp.ripe.net/pub/stats/ripencc/delegated-ripencc-extended-latest",
 )
 
 class Downloader(object):
diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index 77952f2..e3a07a0 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -393,6 +393,10 @@ class CLI(object):
 		if line.startswith("aut-num:"):
 			return self._parse_autnum_block(block)
 
+		# inetnum
+		if line.startswith("inet6num:") or line.startswith("inetnum:"):
+			return self._parse_inetnum_block(block)
+
 		# organisation
 		elif line.startswith("organisation:"):
 			return self._parse_org_block(block)
@@ -422,6 +426,65 @@ class CLI(object):
 			autnum.get("asn"), autnum.get("org"),
 		)
 
+	def _parse_inetnum_block(self, block):
+		logging.debug("Parsing inetnum block:")
+
+		inetnum = {}
+		for line in block:
+			logging.debug(line)
+
+			# Split line
+			key, val = split_line(line)
+
+			if key == "inetnum":
+				start_address, delim, end_address = val.partition("-")
+
+				# Strip any excess space
+				start_address, end_address = start_address.rstrip(), end_address.strip()
+
+				# Convert to IP address
+				try:
+					start_address = ipaddress.ip_address(start_address)
+					end_address = ipaddress.ip_address(end_address)
+				except ValueError:
+					logging.warning("Could not parse line: %s" % line)
+					return
+
+				# Set prefix to default
+				prefix = 32
+
+				# Count number of addresses in this subnet
+				num_addresses = int(end_address) - int(start_address)
+				if num_addresses:
+					prefix -= math.log(num_addresses, 2)
+
+				inetnum["inetnum"] = "%s/%.0f" % (start_address, prefix)
+
+			elif key == "inet6num":
+				inetnum[key] = val
+
+			elif key == "country":
+				if val == "UNITED STATES":
+					val = "US"
+
+				inetnum[key] = val.upper()
+
+		# Skip empty objects
+		if not inetnum:
+			return
+
+		network = ipaddress.ip_network(inetnum.get("inet6num") or inetnum.get("inetnum"), strict=False)
+
+		# Bail out in case we have processed a non-public IP network
+		if network.is_private:
+			logging.warning("Skipping non-globally routable network: %s" % network)
+			return
+
+		self.db.execute("INSERT INTO networks(network, country) \
+			VALUES(%s, %s) ON CONFLICT (network) DO UPDATE SET country = excluded.country",
+			"%s" % network, inetnum.get("country"),
+		)
+
 	def _parse_org_block(self, block):
 		org = {}
 		for line in block:
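
The logarithm-based prefix calculation in _parse_inetnum_block() derives a single prefix from the difference between the end and start address, which works when the range is one aligned CIDR block. As a possible alternative, here is a minimal sketch that uses only the standard library's ipaddress.summarize_address_range() to turn an inetnum range into one or more networks, then applies the same is_private check as the patch. The helper names inetnum_to_networks and public_networks are hypothetical and not part of this patch.

```python
import ipaddress

def inetnum_to_networks(value):
    """Convert an inetnum value such as "192.0.2.0 - 192.0.2.255"
    into a list of CIDR networks covering exactly that range.

    Hypothetical helper for illustration; raises ValueError if either
    address cannot be parsed or if the end lies before the start.
    """
    start, _, end = value.partition("-")

    first = ipaddress.ip_address(start.strip())
    last = ipaddress.ip_address(end.strip())

    # summarize_address_range() yields the smallest set of networks that
    # covers first..last, so unaligned ranges become several networks
    return list(ipaddress.summarize_address_range(first, last))

def public_networks(value):
    """Return only the networks of a range that are not flagged private
    (the same is_private test the patch applies before the INSERT)."""
    return [n for n in inetnum_to_networks(value) if not n.is_private]

if __name__ == "__main__":
    for sample in ("192.0.2.0 - 192.0.2.2", "10.0.0.0 - 10.255.255.255", "8.8.8.0 - 8.8.8.255"):
        print(sample, "->", [str(n) for n in inetnum_to_networks(sample)])
```

Returning a list rather than a single prefix means a caller would insert one row per network; whether that fits the networks table layout used here is an assumption that would need checking against the schema.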