From patchwork Tue Jun 8 12:10:36 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Peter Müller
X-Patchwork-Id: 4409
From: Peter Müller
To: location@lists.ipfire.org
Subject: [PATCH] location-importer.in: Import (technical) AS names from ARIN
Date: Tue, 8 Jun 2021 12:10:36 +0000
Message-Id: <20210608121036.16242-1-peter.mueller@ipfire.org>
X-Mailer: git-send-email 2.20.1

ARIN and LACNIC, unfortunately, do not seem to publish data containing
human-readable AS names. For the former, we at least have a list of
technical names, which this patch fetches and inserts into the autnums
table.

While some of them do not seem to be suitable for human consumption
(i.e. they are very cryptic), providing this data might be helpful
nevertheless.

Signed-off-by: Peter Müller
---
 src/python/location-importer.in | 62 +++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/src/python/location-importer.in b/src/python/location-importer.in
index aa3b8f7..2a9bf33 100644
--- a/src/python/location-importer.in
+++ b/src/python/location-importer.in
@@ -505,6 +505,9 @@ class CLI(object):
 					for line in f:
 						self._parse_line(line, source_key, validcountries)
 
+		# Download and import (technical) AS names from ARIN
+		self._import_as_names_from_arin()
+
 	def _check_parsed_network(self, network):
 		"""
 			Assistive function to detect and subsequently sort out parsed
@@ -775,6 +778,65 @@ class CLI(object):
 			"%s" % network, country, [country], source_key,
 		)
 
+	def _import_as_names_from_arin(self):
+		downloader = location.importer.Downloader()
+
+		# XXX: Download AS names file from ARIN (note that these names appear to be quite
+		# technical, not intended for human consumption, as description fields in
+		# organisation handles for other RIRs are - however, this is what we have got,
+		# and in some cases, it might be still better than nothing)
+		try:
+			with downloader.request("https://ftp.arin.net/info/asn.txt", return_blocks=False) as f:
+				arin_as_names_file = f.body
+		except Exception as e:
+			log.error("failed to download and preprocess AS name file from ARIN: %s" % e)
+			return
+
+		# Split downloaded body into lines and parse each of them...
+		for sline in arin_as_names_file.readlines():
+
+			# ... valid lines start with a space, followed by the number of the Autonomous System ...
+			if not sline.startswith(b" "):
+				continue
+
+			# Split line and check if there is a valid ASN in it...
+			scontents = sline.split()
+			try:
+				asn = int(scontents[0])
+			except ValueError:
+				log.debug("Skipping ARIN AS names line not containing an integer for ASN")
+				continue
+
+			if not ((1 <= asn and asn <= 23455) or (23457 <= asn and asn <= 64495) or (131072 <= asn and asn <= 4199999999)):
+				log.debug("Skipping ARIN AS names line not containing a valid ASN: %s" % asn)
+				continue
+
+			# Skip any AS name that appears to be a placeholder for a different RIR or entity...
+			as_name = scontents[1].decode("ascii")
+
+			if re.match(r"^(ASN-BLK|)(AFCONC|AFRINIC|APNIC|ASNBLK|DNIC|LACNIC|RIPE|IANA)\d{0,1}-*", as_name):
+				continue
+
+			# Bail out in case the AS name contains anything we do not expect here...
+			if re.search(r"[^a-zA-Z0-9-_]", as_name):
+				log.debug("Skipping ARIN AS name for %s containing invalid characters: %s" % \
+					(asn, as_name))
+				continue
+
+			# Things look good here, run INSERT statement and skip this one if we already have
+			# a (better?) name for this Autonomous System...
+			self.db.execute("""
+				INSERT INTO autnums(
+					number,
+					name,
+					source
+				) VALUES (%s, %s, %s)
+				ON CONFLICT (number) DO NOTHING""",
+				asn,
+				as_name,
+				"ARIN",
+			)
+
 	def handle_update_announcements(self, ns):
 		server = ns.server[0]
 
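
The line-filtering rules in _import_as_names_from_arin() can be tried out without a download or a database. The following is a minimal sketch under stated assumptions, not the importer's code: parse_arin_as_names() and is_valid_asn() are hypothetical helpers, the sample lines are invented for illustration rather than taken from ARIN's file, and lines with fewer than two fields or non-ASCII names are skipped here as an extra precaution.

```python
import re

# Hypothetical standalone helpers mirroring the checks in the patch above.
# Nothing here is part of location-importer.in.
PLACEHOLDER = re.compile(
    r"^(ASN-BLK|)(AFCONC|AFRINIC|APNIC|ASNBLK|DNIC|LACNIC|RIPE|IANA)\d{0,1}-*"
)
INVALID_CHARS = re.compile(r"[^a-zA-Z0-9-_]")


def is_valid_asn(asn):
    """Same ranges as in the patch: skips 0, 23456, and 64496-131071."""
    return (
        1 <= asn <= 23455
        or 23457 <= asn <= 64495
        or 131072 <= asn <= 4199999999
    )


def parse_arin_as_names(lines):
    """Yield (asn, name) for every byte line that passes all checks."""
    for line in lines:
        # Data lines start with a space, followed by the AS number
        if not line.startswith(b" "):
            continue

        fields = line.split()
        if len(fields) < 2:
            continue  # extra precaution, not in the patch

        try:
            asn = int(fields[0])
        except ValueError:
            continue

        if not is_valid_asn(asn):
            continue

        try:
            name = fields[1].decode("ascii")
        except UnicodeDecodeError:
            continue  # extra precaution, not in the patch

        # Placeholder entries for other RIRs or entities
        if PLACEHOLDER.match(name):
            continue

        # Anything outside letters, digits, "-" and "_"
        if INVALID_CHARS.search(name):
            continue

        yield asn, name


# Invented sample input (assumption: whitespace-separated columns)
SAMPLE = [
    b"Some header text without a leading space\n",
    b" 64000   EXAMPLE-ONE     HANDLE1-ARIN\n",
    b" 23456   AS-TRANS        HANDLE2-ARIN\n",
    b" 64001   RIPE-ASNBLOCK   HANDLE3-ARIN\n",
    b" 64002   BAD.NAME        HANDLE4-ARIN\n",
    b" abc     NOT-A-NUMBER    HANDLE5-ARIN\n",
    b" 131072  EXAMPLE_TWO     HANDLE6-ARIN\n",
]

if __name__ == "__main__":
    for asn, name in parse_arin_as_names(SAMPLE):
        print(asn, name)
```

Keeping the checks in a generator separates parsing from the INSERT, so the filtering can be tested in isolation. The importer itself performs the same checks inline.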