From patchwork Tue Sep 1 19:44:37 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Peter_M=C3=BCller?=
X-Patchwork-Id: 3422
Return-Path:
Received: from mail01.ipfire.org (mail01.haj.ipfire.org [172.28.1.202])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384
 client-signature ECDSA (P-384) client-digest SHA384)
 (Client CN "mail01.haj.ipfire.org", Issuer "Let's Encrypt Authority X3" (verified OK))
 by web04.haj.ipfire.org (Postfix) with ESMTPS id 4BgyGd3R5Xz3x3R
 for ; Tue, 1 Sep 2020 19:45:29 +0000 (UTC)
Received: from mail02.haj.ipfire.org (mail02.haj.ipfire.org [172.28.1.201])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384
 client-signature ECDSA (P-384) client-digest SHA384)
 (Client CN "mail02.haj.ipfire.org", Issuer "Let's Encrypt Authority X3" (verified OK))
 by mail01.ipfire.org (Postfix) with ESMTPS id 4BgyGd20lDzky;
 Tue, 1 Sep 2020 19:45:29 +0000 (UTC)
Received: from mail02.haj.ipfire.org (localhost [127.0.0.1])
 by mail02.haj.ipfire.org (Postfix) with ESMTP id 4BgyGd0WLDz2xd3;
 Tue, 1 Sep 2020 19:45:29 +0000 (UTC)
Received: from mail01.ipfire.org (mail01.haj.ipfire.org [172.28.1.202])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384
 client-signature ECDSA (P-384) client-digest SHA384)
 (Client CN "mail01.haj.ipfire.org", Issuer "Let's Encrypt Authority X3" (verified OK))
 by mail02.haj.ipfire.org (Postfix) with ESMTPS id 4BgyGb6glBz2xWn
 for ; Tue, 1 Sep 2020 19:45:27 +0000 (UTC)
Received: from [127.0.0.1] (localhost [127.0.0.1])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384)
 (Client did not present a certificate)
 by mail01.ipfire.org (Postfix) with ESMTPSA id 4BgyGX1L4jzky;
 Tue, 1 Sep 2020 19:45:23 +0000 (UTC)
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=ipfire.org;
 s=202003ed25519; t=1598989527;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding;
 bh=lty8NC7SAw5Y2nPsflIduWtRjUzx0lthKDa5SSFC9Ok=;
 b=eh5IzF8uiLux45ynFV/kfvbpS4vaDqdO1FmDwC2D7yPFn2BuCVnRAHOOI/A4tP4JZyYSpE
 GYGkq0BVtj5RQIBw==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ipfire.org;
 s=202003rsa; t=1598989527;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding;
 bh=lty8NC7SAw5Y2nPsflIduWtRjUzx0lthKDa5SSFC9Ok=;
 b=H5HRBRC07WayrYbG7E6e4/1NLSDNKnMRT25seIhbLCxM/mCaQKc4FVnpvQfhc1O5VR7aJR
 wRDK9hCiIfHYQTmewWZWqByUPodOupKJzMPNbUqVV8sDm4iI/OynvARGLcoppSJ4ccUACJ
 sOGUy73ZkHeVl7hS/pqhgntqfabqMxnBPkGxS2hHJ0SLXrNf+iPI+pt9T0rYvdPgslUG+O
 YbQZ4c+LWxtlyFbMuFwupcRTfy3q6RdVdulIb451RD1lVv7npQ0wUJUictlRj+75ztAzjg
 nCkixM7rbU9c9gfKJRR/0kUgp4x4gM5cI7E7bZqQ76Xnf1Tqz2NYqg6spqPgfQ==
To: Michael Tremer
From: =?utf-8?q?Peter_M=C3=BCller?=
Subject: Re-introducing inetnum parser, first attempt
Message-ID: <46ac5f3c-36e4-c1a5-a9f6-fd9f6f4d21e9@ipfire.org>
Date: Tue, 1 Sep 2020 19:44:37 +0000
MIME-Version: 1.0
Content-Language: en-US
Authentication-Results: mail01.ipfire.org;
 auth=pass smtp.mailfrom=peter.mueller@ipfire.org
X-BeenThere: location@lists.ipfire.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: location@lists.ipfire.org
Errors-To: location-bounces@lists.ipfire.org
Sender: "Location"

Good evening Michael,

below is the diff of a rather hacky attempt to bring back the inetnum parser we've had in the past.
Since I am not quite sure about that INSERT INTO networks() SQL statement and handling
conflicts with extended source files there, I thought letting you have a look at it might
be a good idea. :-)

What do you think? Is this the right direction?

Thanks, and best regards,
Peter Müller

diff --git a/src/python/importer.py b/src/python/importer.py
index de20f37..586bd97 100644
--- a/src/python/importer.py
+++ b/src/python/importer.py
@@ -30,8 +30,8 @@ WHOIS_SOURCES = (
 	"https://ftp.afrinic.net/pub/pub/dbase/afrinic.db.gz",
 
 	# Asia Pacific Network Information Centre
-	#"https://ftp.apnic.net/apnic/whois/apnic.db.inet6num.gz",
-	#"https://ftp.apnic.net/apnic/whois/apnic.db.inetnum.gz",
+	"https://ftp.apnic.net/apnic/whois/apnic.db.inet6num.gz",
+	"https://ftp.apnic.net/apnic/whois/apnic.db.inetnum.gz",
 	#"https://ftp.apnic.net/apnic/whois/apnic.db.route6.gz",
 	#"https://ftp.apnic.net/apnic/whois/apnic.db.route.gz",
 	"https://ftp.apnic.net/apnic/whois/apnic.db.aut-num.gz",
@@ -45,8 +45,8 @@ WHOIS_SOURCES = (
 	# XXX ???
 
 	# Réseaux IP Européens
-	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inet6num.gz",
-	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inetnum.gz",
+	"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inet6num.gz",
+	"https://ftp.ripe.net/ripe/dbase/split/ripe.db.inetnum.gz",
 	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.route6.gz",
 	#"https://ftp.ripe.net/ripe/dbase/split/ripe.db.route.gz",
 	"https://ftp.ripe.net/ripe/dbase/split/ripe.db.aut-num.gz",
diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index f5ae4a9..4d7cec4 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -393,6 +393,10 @@ class CLI(object):
 		if line.startswith("aut-num:"):
 			return self._parse_autnum_block(block)
 
+		# inetnum
+		if line.startswith("inet6num:") or line.startswith("inetnum:"):
+			return self._parse_inetnum_block(block)
+
 		# organisation
 		elif line.startswith("organisation:"):
 			return self._parse_org_block(block)
@@ -422,6 +426,78 @@ class CLI(object):
 			autnum.get("asn"), autnum.get("org"),
 		)
 
+	def _parse_inetnum_block(self, block):
+		logging.debug("Parsing inetnum block:")
+
+		inetnum = {}
+		for line in block:
+			logging.debug(line)
+
+			# Split line
+			key, val = split_line(line)
+
+			if key == "inetnum":
+				start_address, delim, end_address = val.partition("-")
+
+				# Strip any excess space
+				start_address, end_address = start_address.rstrip(), end_address.strip()
+
+				# Skip invalid blocks
+				if start_address in ["0.0.0.0", "::/0", "0::/0",]:
+					return
+
+				# Convert to IP address
+				try:
+					start_address = ipaddress.ip_address(start_address)
+					end_address = ipaddress.ip_address(end_address)
+				except ValueError:
+					logging.warning("Could not parse line: %s" % line)
+					return
+
+				# Set prefix to default
+				prefix = 32
+
+				# Count number of addresses in this subnet
+				num_addresses = int(end_address) - int(start_address)
+				if num_addresses:
+					prefix -= math.log(num_addresses, 2)
+
+				inetnum["inetnum"] = "%s/%.0f" % (start_address, prefix)
+
+			elif key == "inet6num":
+				# Skip invalid blocks
+				if val in ["0.0.0.0", "::/0", "0::/0",]:
+					return
+
+				inetnum[key] = val
+
+			elif key == "netname":
+				inetnum[key] = val
+
+			elif key == "country":
+				if val == "UNITED STATES":
+					val = "US"
+
+				inetnum[key] = val.upper()
+
+			elif key == "descr":
+				if key in inetnum:
+					inetnum[key] += "\n%s" % val
+				else:
+					inetnum[key] = val
+
+		# Skip empty objects
+		if not inetnum:
+			return
+
+		network = ipaddress.ip_network(inetnum.get("inet6num") or inetnum.get("inetnum"), strict=False)
+
+		self.db.execute("INSERT INTO networks(network, country) \
+			VALUES(%s, %s) ON CONFLICT (network) DO \
+			UPDATE SET country = excluded.country",
+			"%s" % str(network), inetnum.get("country"),
+		)
+
 	def _parse_org_block(self, block):
 		org = {}
 		for line in block:
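
A note on the range-to-prefix step in _parse_inetnum_block(): the prefix is derived from math.log(end - start, 2) and rounded by "%.0f". That gives the expected result for ranges such as 192.0.2.0 - 192.0.2.255 (/24), but a two-address range has end - start = 1, so the logarithm is 0 and the result is a /32. Ranges that do not line up with a single prefix are squeezed into one approximate network. The diff also does not show whether math is imported in location-importer.in. One possible alternative, sketched here purely as an illustration, is to let the standard library's ipaddress.summarize_address_range() produce the exact CIDR blocks. The helper name inetnum_to_networks is hypothetical and not part of the patch; the caller would then insert one row per returned network.

```python
import ipaddress

def inetnum_to_networks(val):
    """Convert an inetnum value such as "192.0.2.0 - 192.0.2.255"
    into the list of CIDR networks that exactly cover the range.

    Hypothetical helper for illustration; not part of the patch.
    """
    start, delim, end = val.partition("-")
    if not delim:
        raise ValueError("Not an address range: %s" % val)

    # ip_address() raises ValueError on malformed input
    first = ipaddress.ip_address(start.strip())
    last = ipaddress.ip_address(end.strip())

    # summarize_address_range() yields one or more networks;
    # it raises ValueError if last < first
    return list(ipaddress.summarize_address_range(first, last))


if __name__ == "__main__":
    # A range aligned to one prefix collapses to a single network
    print([str(n) for n in inetnum_to_networks("192.0.2.0 - 192.0.2.255")])
    # → ['192.0.2.0/24']

    # Two addresses give a /31 rather than a /32
    print([str(n) for n in inetnum_to_networks("192.0.2.0 - 192.0.2.1")])
    # → ['192.0.2.0/31']

    # An unaligned range is split into several exact networks
    print([str(n) for n in inetnum_to_networks("192.0.2.0 - 192.0.2.2")])
    # → ['192.0.2.0/31', '192.0.2.2/32']
```

Because one inetnum object may then map to several networks, the INSERT ... ON CONFLICT statement would run once per network instead of once per block. Whether that is desirable for the networks table is a design decision this sketch does not settle.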