Message ID | 20220927164847.3409646-1-michael.tremer@ipfire.org |
---|---|
State | Accepted |
Commit | 183b2f7477068a68e8b439754487565945899052 |
Headers |
From: Michael Tremer <michael.tremer@ipfire.org> To: location@lists.ipfire.org Cc: Michael Tremer <michael.tremer@ipfire.org> Subject: [PATCH 01/10] importer: Store geofeed URLs from RIR data Date: Tue, 27 Sep 2022 16:48:38 +0000 Message-Id: <20220927164847.3409646-1-michael.tremer@ipfire.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit List-Id: <location.lists.ipfire.org> List-Archive: <http://lists.ipfire.org/pipermail/location/> |
Series |
[01/10] importer: Store geofeed URLs from RIR data
|
|
Commit Message
Michael Tremer
Sept. 27, 2022, 4:48 p.m. UTC
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
---
src/scripts/location-importer.in | 40 +++++++++++++++++++++++++++++++-
1 file changed, 39 insertions(+), 1 deletion(-)
Comments
Hello Michael,

above all, thank you very much for the patchset and all the work behind it.

Unfortunately, as briefly discussed via the phone already, I have some general
concerns regarding geofeeds:

(a) In contrast to RIRs, I do not see geofeed providers as a trustworthy source.
While the former are not trustworthy in terms of the data they provide (since
no vetting or QA of database changes is usually conducted, and it does not look
to me like this is going to change soon), at least their infrastructure is:
It seems reasonable to me to trust, for example, RIPE's FTP server to serve
the same database files regardless of the client requesting it. For some of
them, we could even verify that through file signature validation, assuming that
it is too costly to do live GPG-signing at scale.

Geofeed URLs, in contrast, can lead to anywhere, and I would not be surprised
at all to see dubious ISPs serving different geofeeds to different clients.
Given that our IP address ranges are public and static, and libloc reveals itself
through the User-Agent HTTP header, it would be quite easy to serve us a geofeed
that tampers with data, while playing innocent to other clients.

In addition, many of the 215 geofeed URLs that are currently live (attached) point
to services such as Google Docs or GitHub - both don't strike me as reliable sources
in terms of persistence. Generally, we have the full problem of URL/domain rot again. :-(

One could argue that these points (to a certain extent) hold true for RIRs as
well. However, if we cannot trust them, it's curtains for libloc either way. :-)
Some random ISPs trying to make us consume geolocation data from random URLs,
on the other hand, pose a greater risk than benefit to the quality of the
location database.

Which brings me directly to the next point...

(b) Presuming we still agree on not being more precise than /24 or /48, all
the information geofeeds provide could (should?) have been in the RIR databases
as well.

The only exception is ARIN, but since we do not get their raw database, we won't
be able to consume any geofeed URLs in it. So, for the area where we lack accuracy
of geolocation information most, geofeed won't help us. And for all the other RIRs
(LACNIC included, for which we process an additional geolocation database feed
already), the geofeeds ideally should not contain any new information to us.

Earlier today, I created a location database text dump on location02 with and without
the geofeed patchset applied. The diff can be retrieved from https://people.ipfire.org/~pmueller/location-database-geofeed-diff.tar.gz,
and is rather massive, partly because CIDRs smaller than /24 resp. /48 are yet to
be ignored by the geofeed processing routines.

I have yet to assess the diff closely, but for a superficial analysis, it appears
like geofeed introduces a lot of changes that could have been in the respective RIR
databases as well. The fact that they are not there does not inspire confidence.

Apologies for this rather disappointing feedback, and best regards,
Peter Müller

id | url | status | updated_at
------+-----+--------+----------------------------
803 | https://128bit.ee/geofeed.csv | | 2022-09-27 08:48:26.590283
570 | https://1581710f-1ced-4a06-8390-7cc61076f103.selcdn.net/geofeed.csv | | 2022-09-27 08:48:25.051701
132 | https://207941.xyz/static/geo.csv | | 2022-09-27 08:48:20.414238
317 | https://39office.co.uk/geofeed/geofeed.csv | 406 |
555 | https://abramcevo.net/geofeeds.csv | | 2022-09-27 08:48:24.871628
1053 | https://akrn.net/geofeed/ | | 2022-09-27 08:48:30.103967
357 | https://altairline.net/ipgeofeed.csv | | 2022-09-27 08:48:24.184789
246 | https://api.ewpratten.com/network/geofeed.csv | | 2022-09-27 08:48:22.676312
278 | https://as203333.jonasbusch.de/geofeed.csv | | 2022-09-27 08:48:23.171395
256 | https://as204406.net/geofeed.csv | | 2022-09-27 08:48:23.042018
52 | https://as205479.net/geofeed.csv | 404 |
207 | https://as206016.ezdomain.ru/geofeed.csv | | 2022-09-27 08:48:21.961518
191 | https://as207941.net/static/geo.csv | | 2022-09-27 08:48:29.101466
128 | https://as209114.net/geofeed.csv | | 2022-09-27 08:48:20.541015
142 | https://as211398.net/geofeed.csv | | 2022-09-27 08:48:20.695646
8 | https://as47869.net/geofeed.csv | | 2022-09-27 08:48:16.969164
459 | https://as56655.net/geofeed.csv | | 2022-09-27 08:48:24.426049
842 | https://assets.ip-only.net/documents/geoipfeed/AS12552.txt | | 2022-09-27 08:48:47.301489
206 | https://assigned.network/geofeed.csv | | 2022-09-27 08:48:21.642988
163 | https://august.tw/rfc8805.csv | | 2022-09-27 08:48:21.286217
133 | https://b4rn.org.uk/geofeed.csv | | 2022-09-27 08:48:21.189017
318 | https://bakergroup.uk/geofeed.csv | | 2022-09-27 08:48:23.679303
185 | https://bisv.ru/geofeed.csv | | 2022-09-27 08:48:21.167518
493 | https://blade-public-static.blade-group.com/network/geofeed.csv | | 2022-09-27 08:48:25.0396
91 | https://chesskuo.tw/geofeeds.csv | | 2022-09-27 08:48:20.384467
1192 | https://cilix.co.uk/geofeed.csv | | 2022-09-27 08:48:31.21659
76 | https://city-telekom.ru/geo/google.csv | | 2022-09-27 08:48:19.094937
408 | https://clouvider.com/geoipfeed.csv | | 2022-09-27 08:48:24.893566
4 | https://connectivia.it/wp-content/uploads/2022/08/geofeed.csv | | 2022-09-27 08:48:27.052104
824 | https://convergenze.it/geofeeds.csv | | 2022-09-27 08:48:26.868773
1122 | https://cposo.ru/geofeed.csv | | 2022-09-27 08:48:31.070979
24 | https://data-source.elastx.cloud/geofeed.csv | | 2022-09-27 08:48:17.833537
326 | https://deltatelesystems.ru/geofeed.csv | | 2022-09-27 08:48:24.164182
902 | https://docs.google.com/spreadsheets/d/1mPvsx7AzQx8yqFhyoYK6K9viEJNEpAZ3ICGKXG3Us5o/edit?usp=sharing/geofeed.csv | | 2022-09-27 08:48:28.139785
833 | https://docs.google.com/spreadsheets/d/e/2PACX-1vR2cb06ptOKwF8GLQwz0AAPJDjEi3L1Y0cWGlbB5pfyHFc_CFiW3tYTbrI5q3_Iuak2iiMB0htwLZPE/pub?output=csv | | 2022-09-27 08:48:27.655495
1015 | https://docs.google.com/spreadsheets/d/e/2PACX-1vSrczJhjW18HlapiISgmf6_52-Qc3fDA280OxzxjEEUAXnZxZgPEvh_qehOE_tvq-ZOVmICiYDr544P/pub?gid=0&single=true&output=csv | | 2022-09-27 08:48:29.876872
673 | https://docs.google.com/spreadsheets/d/e/2PACX-1vTfAIimGjXRLgfJd85VvVj5-fLa6a8kREezGUE8lLffEyW1i-mQCV8qPuqxIeyRzen1iYbaIDcEazaK/pub?output=csv | | 2022-09-27 08:48:26.520855
674 | https://docs.google.com/spreadsheets/d/e/2PACX-1vTfJX6jAMdyLkgqFPj9aoO5k9TEsFegn0KDWX4gfnK2XGbC1S3hZ1EHZOMMPJ0kchky7ElZr_Uh2vI6/pub?output=csv | | 2022-09-27 08:48:26.805053
197 | https://docs.google.com/spreadsheets/d/e/2PACX-1vTm5F9a_HLGR-FPR7LWut672RX4nihGTovrmtVd8asQq4N81_JRJEoVQzF9YWQdYBAccsc_yLp9Pi9K/pub?gid=0&single=true&output=csv | | 2022-09-27 08:48:21.869312
305 | https://etelecom.ru/files/geofeed.csv | | 2022-09-27 08:48:23.672911
53 | https://exoscale-prefixes.sos.exo.io/exoscale_geofeed | | 2022-09-27 08:48:19.077892
43 | https://files.as209861.net/geofeed.csv | | 2022-09-27 08:48:18.536718
998 | https://files.delta.nl/geofeed-dfn.csv | | 2022-09-27 08:48:28.699714
636 | https://fj38gd6m2pd73464fjqw.maloco.ru/geofeed.csv | | 2022-09-27 08:48:25.445304
245 | https://fr89.uk/geofeed.csv | | 2022-09-27 08:48:22.658833
328 | https://gblnet.net/ipaddress/geofeed.csv | | 2022-09-27 08:48:24.145082
1069 | https://geo.blacknight.ie/ripe-geofeed.csv | | 2022-09-27 08:48:30.140024
39 | https://geo.daknob.net/as210312.csv | | 2022-09-27 08:48:18.423616
204 | https://geofeed.186526.net/geofeed.csv | | 2022-09-27 08:48:21.427929
106 | https://geofeed.as207111.net/geofeed.csv | | 2022-09-27 08:48:20.243992
101 | https://geofeed.as210546.net/subnet.csv | | 2022-09-27 08:48:19.671305
303 | https://geofeed.azedunet.com/geofeed.csv | | 2022-09-27 08:48:23.662716
236 | https://geofeed.b00b.eu/feed.txt | | 2022-09-27 08:48:22.48175
74 | https://geofeed.bluevps.com/geofeed.csv | | 2022-09-27 08:48:18.806875
482 | https://geofeed.bsonetwork.net/geofeed.csv | | 2022-09-27 08:48:25.421007
15 | https://geofeed.constant.com/ | | 2022-09-27 08:48:19.238399
125 | https://geofeed.cynthia.re/geofeed/primary.csv | | 2022-09-27 08:48:20.07647
669 | https://geofeeddata.s3.eu-west-2.amazonaws.com/geofeed.csv | | 2022-09-27 08:48:25.590025
1112 | https://geofeed.disney.com | | 2022-09-27 08:48:32.368274
1085 | https://geofeed.disney.com/ | | 2022-09-27 08:48:32.455889
10 | https://geofeed.hostzealot.com/geofeed.csv | | 2022-09-27 08:48:17.675103
90 | https://geofeed.huize.asia/geofeed.csv | 523 |
354 | https://geofeed.ip-max.net/ip-max.csv | | 2022-09-27 08:48:24.150327
695 | https://geofeed.keepit.com/geofeed.csv | | 2022-09-27 08:48:25.840645
18 | https://geofeed.kviknet.dk/geofeed.csv | | 2022-09-27 08:48:17.255633
6 | https://geofeed.llnw.net/ | | 2022-09-27 08:48:18.41984
162 | https://geofeed.netfiretec.com/ | | 2022-09-27 08:48:22.236797
244 | https://geofeed.noc.vanvik.ax/geofeed.csv | | 2022-09-27 08:48:22.385053
1 | https://geofeed.rapidseedbox.com/geofeed.csv | | 2022-09-27 08:48:17.381663
211 | https://geofeed.rejecty.com/geofeed.csv | | 2022-09-27 08:48:21.839197
697 | https://geofeed.servercore.com/prefixes.csv | 404 |
796 | https://geofeed.servercore.com/subnets.csv | | 2022-09-27 08:48:26.582391
111 | https://geofeeds.ihcb-group.com/ | 526 |
23 | https://geofeed.snapserv.net/v1.csv | | 2022-09-27 08:48:17.633081
103 | https://geofeeds.speedypage.com/geofeed.csv | | 2022-09-27 08:48:19.843224
621 | https://geofeeds.surfshark.com/geofeed.csv | | 2022-09-27 08:48:25.237156
330 | https://geofeed.timeweb.net/geofeed.csv | | 2022-09-27 08:48:24.032876
78 | https://geofeed.transatel.com/geoloc.csv | | 2022-09-27 08:48:19.238399
20 | https://geofeed.tunenet.dk/geofeed.csv | | 2022-09-27 08:48:17.656902
706 | https://geofeed.wgtwo.com/geofeed.csv | | 2022-09-27 08:48:26.116829
117 | https://geofeed.zhiccc.net/2a0d-2587-geofeed.csv | | 2022-09-27 08:48:20.163871
127 | https://geofeed.zhiccc.net/2a0e-b107-geofeed.csv | | 2022-09-27 08:48:20.147434
44 | https://geo.imaster.ru/geofeeds.csv | | 2022-09-27 08:48:18.936465
80 | https://geoip.51178.ru/geofeed.csv | | 2022-09-27 08:48:19.535356
12 | https://geolocation.itm8.com | | 2022-09-27 08:48:17.612405
658 | https://geolocation.misaka.io/as969/r4g.csv | | 2022-09-27 08:48:25.610238
31 | https://geolocation.misaka.io/as969/r6g.csv | | 2022-09-27 08:48:18.054779
295 | https://geo.telehouse-rechenzentrum.de/geofeed.txt | | 2022-09-27 08:48:23.186858
198 | https://gf.hrvoje.org | | 2022-09-27 08:48:20.99458
164 | https://gf.hrvoje.org/ | | 2022-09-27 08:48:20.682033
397 | https://github.com/aossia/geolocation/blob/main/geofeed.csv | | 2022-09-27 08:48:24.52641
3 | https://github.com/datis-geoip/geofeed/blob/main/geofeed.csv | | 2022-09-27 08:48:17.461629
1062 | https://github.com/geocheck/geofeed/blob/main/185.108.207.0-24_madrid.csv | | 2022-09-27 08:48:29.918208
996 | https://github.com/geocheck/geofeed/blob/main/185.167.234.0-24_zurich.csv | | 2022-09-27 08:48:28.924989
689 | https://github.com/geocheck/geofeed/blob/main/193.31.62.0-24_helsinki.csv | | 2022-09-27 08:48:25.967151
1077 | https://github.com/geocheck/geofeed/blob/main/193.37.196.0-24_kishinev.csv | | 2022-09-27 08:48:30.292457
1002 | https://github.com/geocheck/geofeed/blob/main/194.150.210.0-24_ankara.csv | | 2022-09-27 08:48:28.970679
1003 | https://github.com/geocheck/geofeed/blob/main/194.150.211.0-24_bratislava.csv | | 2022-09-27 08:48:29.029244
1065 | https://github.com/geocheck/geofeed/blob/main/194.233.8.0-22_kiev.csv | | 2022-09-27 08:48:30.061457
828 | https://github.com/geocheck/geofeed/blob/main/195.34.78.0-24_athens.csv | | 2022-09-27 08:48:27.226006
1049 | https://github.com/geocheck/geofeed/blob/main/212.69.18.0-24_luxembourg.csv | | 2022-09-27 08:48:29.631977
1114 | https://github.com/geocheck/geofeed/blob/main/213.139.64.0-22_paris.csv | | 2022-09-27 08:48:30.552134
1116 | https://github.com/geocheck/geofeed/blob/main/213.225.238.0-24_riga.csv | | 2022-09-27 08:48:30.737163
1125 | https://github.com/geocheck/geofeed/blob/main/45.140.195.0-24_oslo.csv | | 2022-09-27 08:48:31.092659
1118 | https://github.com/geocheck/geofeed/blob/main/45.153.125.0-24_warsaw.csv | | 2022-09-27 08:48:30.836871
1094 | https://github.com/geocheck/geofeed/blob/main/45.88.10.0-24_jerusalem.csv | | 2022-09-27 08:48:30.320103
1063 | https://github.com/geocheck/geofeed/blob/main/5.182.34.0-24_roma.csv | | 2022-09-27 08:48:30.073939
1044 | https://github.com/geocheck/geofeed/blob/main/5.249.188.0-22_amsterdam.csv | | 2022-09-27 08:48:29.435688
1117 | https://github.com/geocheck/geofeed/blob/main/62.72.179.0-24_stockholm.csv | | 2022-09-27 08:48:30.650348
1026 | https://github.com/geocheck/geofeed/blob/main/88.216.184.0-24_london.csv | | 2022-09-27 08:48:29.425145
990 | https://github.com/geocheck/geofeed/blob/main/89.116.172.0-23_vilnius.csv | | 2022-09-27 08:48:28.647311
823 | https://github.com/geocheck/geofeed/blob/main/89.116.200.0-24_vilnius.csv | | 2022-09-27 08:48:27.109292
991 | https://github.com/geocheck/geofeed/blob/main/89.117.36.0-23_vilnius.csv | | 2022-09-27 08:48:28.668863
932 | https://github.com/geocheck/geofeed/blob/main/91.208.73.0-24_dublin.csv | | 2022-09-27 08:48:28.595083
1078 | https://github.com/geocheck/geofeed/blob/main/91.228.168.0-24_tbilisi.csv | | 2022-09-27 08:48:30.180117
27 | https://github.com/Gnetwork-networkteam/GeoLocation_RIPE/blob/main/Geoloc.csv | | 2022-09-27 08:48:17.995401
911 | https://github.com/luakst/geofeed/blob/main/45.140.244.0-23_edirne.csv | | 2022-09-27 08:48:27.925962
1023 | https://github.com/luakst/geofeed/blob/main/91.186.194.0-23_edirne.csv | | 2022-09-27 08:48:29.30541
874 | https://github.com/luakst/geofeed/blob/main/91.186.212.0-22_Helsinki.csv | | 2022-09-27 08:48:27.476827
903 | https://github.com/tognetwork/Geofeed/blob/6b6f7b1a0dd05b779b84c3a6dec29c41d2581335/GNetwork_geofeed.csv | 404 |
339 | https://github.com/welltelecom/geofeeds/blob/main/geofeeds.csv | | 2022-09-27 08:48:24.051408
853 | https://globalsecurelayer.com/google-data/GSL-geoip-feed.csv | | 2022-09-27 08:48:27.234621
238 | https://hatsnet.work/geofeed/2a0d-2587-8800-geofeed.csv | | 2022-09-27 08:48:22.339861
77 | https://hellomouse.net/api/geofeed | | 2022-09-27 08:48:19.656768
5 | https://info.net.deic.dk/deic-geofeed.csv | | 2022-09-27 08:48:17.481572
783 | https://ipbroker.info/geofeeds/ipbroker.csv | | 2022-09-27 08:48:26.303243
296 | https://ip-geolocation.fastly.com/ | | 2022-09-27 08:48:57.027704
280 | https://ip-geolocation.xenode.app/ | |
1045 | https://ipocean.ru/geofeed.csv | | 2022-09-27 08:48:29.825817
129 | https://itldc.com/ipgeo.csv | | 2022-09-27 08:48:20.404382
322 | https://jcs.jo/geotag.csv | | 2022-09-27 08:48:23.948845
220 | https://kagl.me/rfc8805.csv | | 2022-09-27 08:48:22.776661
7 | https://keanu.bahnhof.net/geofeed.csv | | 2022-09-27 08:48:17.234445
194 | https://kitten.network/geofeed.csv | | 2022-09-27 08:48:20.812368
690 | https://kreationnext.com/location/geofeed.csv | | 2022-09-27 08:48:25.805146
35 | https://lds.online/geofeed.csv | | 2022-09-27 08:48:18.375832
757 | https://massresponse.com/geofeed.csv | | 2022-09-27 08:48:26.131257
248 | https://minicdn.as211233.net/geofeed.csv | | 2022-09-27 08:48:23.154443
1140 | https://n.ceisn.co/rfc8805.csv | | 2022-09-27 08:48:31.158014
662 | https://netspeed.com.tr/geofeed.csv | 406 |
102 | https://noc.livecomm.net/geofeed.csv | | 2022-09-27 08:48:19.827172
281 | https://northlayer.com/geofeed.csv | | 2022-09-27 08:48:23.679303
384 | https://openfactory.net/geofeed.csv | | 2022-09-27 08:48:24.293772
89 | https://opengeofeed.org/feed/as142289.csv | | 2022-09-27 08:48:19.344074
255 | https://opengeofeed.org/feed/as203145.csv | | 2022-09-27 08:48:23.049411
279 | https://opengeofeed.org/feed/as203199.csv | | 2022-09-27 08:48:23.062945
40 | https://opengeofeed.org/feed/as208175.csv | | 2022-09-27 08:48:18.567422
41 | https://opengeofeed.org/feed/as208187.csv | | 2022-09-27 08:48:18.567422
42 | https://opengeofeed.org/feed/as212710.csv | | 2022-09-27 08:48:18.554637
9 | https://opengeofeed.org/feed/as49605.csv | | 2022-09-27 08:48:17.210073
112 | https://open.okita.network/feed.csv | 521 |
917 | https://packetwall.org/geofeed.csv | | 2022-09-27 08:48:28.620797
95 | https://peering.pudu.be/geofeed.csv | | 2022-09-27 08:48:19.71252
30 | https://profitserver.ru/php/geofeed.php | | 2022-09-27 08:48:18.375832
1144 | https://quickhost.uk/assets/geoip/geofeed.csv | | 2022-09-27 08:48:30.927672
1086 | https://raw.githubusercontent.com/ahdpik1/geofeeds/main/geofeed.csv | | 2022-09-27 08:48:30.298753
214 | https://raw.githubusercontent.com/Alexander-Berry-Roe/geofeed/main/geofeed.csv | | 2022-09-27 08:48:21.978083
136 | https://raw.githubusercontent.com/ChrisMacNaughton/geofeed-as207420/main/geofeed.csv | | 2022-09-27 08:48:20.597619
88 | https://raw.githubusercontent.com/c-nico/AS39792/main/geofeed.csv | | 2022-09-27 08:48:19.524228
811 | https://raw.githubusercontent.com/evoxt/geofeed/main/geofeed.csv | | 2022-09-27 08:48:26.781036
1173 | https://raw.githubusercontent.com/HeyKuxo/geofeed/main/geofeed.csv | | 2022-09-27 08:48:31.358659
1005 | https://raw.githubusercontent.com/Hoasted/geofeed/master/geofeed.csv | | 2022-09-27 08:48:28.768182
337 | https://raw.githubusercontent.com/jppol-noc/geoip/main/geoip.txt | | 2022-09-27 08:48:23.720605
205 | https://raw.githubusercontent.com/leo10ui/ripe/main/geofeeed.csv | | 2022-09-27 08:48:21.596527
209 | https://raw.githubusercontent.com/MrMoreira/geofeed/main/geofeed.csv | | 2022-09-27 08:48:21.699892
431 | https://raw.githubusercontent.com/navarino/geofeed/main/geofeed.csv | | 2022-09-27 08:48:24.338294
202 | https://raw.githubusercontent.com/ngarafol/geofeed/main/geofeed.csv | | 2022-09-27 08:48:21.353898
786 | https://raw.githubusercontent.com/notyourcommy/veesp-geo/main/geofeed.csv | | 2022-09-27 08:48:26.749152
216 | https://raw.githubusercontent.com/null31/geofeed/master/geofeed.csv | | 2022-09-27 08:48:22.118851
235 | https://raw.githubusercontent.com/rapdodge/AS203868-Geofeeds/main/geofeeds.csv | | 2022-09-27 08:48:22.264681
430 | https://raw.githubusercontent.com/servinga/geofeed/main/geofeed.csv | | 2022-09-27 08:48:24.411605
1033 | https://raw.githubusercontent.com/Simonadascalu/Freedomtech-Geofeed/main/Freedomtech%20solutions%20-%20ALL?token=GHSAT0AAAAAABQKY2PBBELKSXZL6TVYIS7SYP3WQZA | | 2022-09-27 08:48:29.60595
1149 | https://raw.githubusercontent.com/supplierstechpay/geofeed/main/geofeed.csv | | 2022-09-27 08:48:31.226972
224 | https://raw.githubusercontent.com/tomas347/geofeed/main/geofeed.csv | | 2022-09-27 08:48:22.217225
1035 | https://raw.githubusercontent.com/visnetwork/geofeed/main/geofeed.csv | | 2022-09-27 08:48:29.131521
1013 | https://raw.githubusercontent.com/vtainc/geofeeds/main/geofeeds.csv | | 2022-09-27 08:48:28.970679
87 | https://raw.githubusercontent.com/Web1-Oy/geofeed/main/geofeed.csv | | 2022-09-27 08:48:19.494054
114 | https://red-panda.be/geofeed.csv | | 2022-09-27 08:48:19.908639
213 | https://ripe-ariutk.onrender.com/geofeed.csv | | 2022-09-27 08:48:22.366702
645 | https://ripe.unyc.io | | 2022-09-27 08:48:25.476598
758 | https://rose.dsh-mirror.de/geofeed/geofeed.csv | | 2022-09-27 08:48:26.241661
25 | https://s3.wifirst.net/geofeed/AS52075_Geofeed.csv | | 2022-09-27 08:48:17.805775
712 | https://secure.wireline.com.au/geo/feed.csv | | 2022-09-27 08:48:28.120107
26 | https://self.bbanda.it/site.cgi?action=download_document&docnum=92120 | | 2022-09-27 08:48:18.34809
96 | https://server-factory.com/geofeed.csv | | 2022-09-27 08:48:19.6206
804 | https://service.wienenergie.at/media/files/geoip.csv | | 2022-09-27 08:48:26.856748
190 | https://smishcraft.com/geofeed.csv | | 2022-09-27 08:48:20.704988
554 | https://static.cloud.konicaminolta.eu/geofeed/204839.csv | | 2022-09-27 08:48:24.75275
532 | https://static.cloud.konicaminolta.eu/geofeed/205287.csv | | 2022-09-27 08:48:24.659096
201 | https://storage.pwn.blue/assets/geofeed.csv | | 2022-09-27 08:48:21.5807
653 | https://telecu.net/geofeed.csv | | 2022-09-27 08:48:26.055104
1080 | https://teploset.org/geofeed.csv | | 2022-09-27 08:48:30.760928
45 | https://tktelecom.ru/geoloc.csv | | 2022-09-27 08:48:19.215348
159 | https://v6only.host/geofeed.csv | | 2022-09-27 08:48:20.422941
475 | https://webhost1.ru/upload/geoip/geofeed.csv | | 2022-09-27 08:48:24.659096
763 | https://www.alwyzon.com/feeds/geoip.csv | | 2022-09-27 08:48:26.261211
257 | https://www.bnb.host/geo.csv | | 2022-09-27 08:48:23.028548
1240 | https://www.daryllswer.com/geofeed/ | | 2022-09-27 08:48:31.36456
769 | https://www.garrison.com/geolocation/KL-DC1.csv | 403 |
639 | https://www.garrison.com/geolocation/SG-DC1.csv | 403 |
634 | https://www.garrison.com/geolocation/UK-DC1.csv | 403 |
768 | https://www.garrison.com/geolocation/UK-DC2.csv | 403 |
637 | https://www.garrison.com/geolocation/US-DC1.csv | 403 |
638 | https://www.garrison.com/geolocation/US-DC2.csv | 403 |
47 | https://www.it-df.net/geofeed.csv | | 2022-09-27 08:48:18.644533
310 | https://www.iunxi.com/nl/ext/csv/geofeed.csv | | 2022-09-27 08:48:23.701211
131 | https://www.ncryptd.net/geo/feed/geofeed.csv | | 2022-09-27 08:48:20.243992
948 | https://www.onvi.nl/geofeed.txt | | 2022-09-27 08:48:28.245746
861 | https://www.paltel.ps/ip_geopaltel/ | |
100 | https://www.rkshosting.com/geofeed.csv | |
28 | https://www.siportal.it/csv/geofeed.csv | | 2022-09-27 08:48:17.86757
13 | https://www.teledata.de/as21263_geofeed.csv | | 2022-09-27 08:48:17.633081
882 | https://www.transmost.ru/files/geofeed.csv | | 2022-09-27 08:48:27.635654
34 | https://xiaoyu.net/BGP/geofeed.csv | |
19 | https://zappiehost.com/geofeeds.csv | | 2022-09-27 08:48:20.556104
(215 rows)
Hello Peter,

> On 28 Oct 2022, at 21:29, Peter Müller <peter.mueller@ipfire.org> wrote:
>
> Hello Michael,
>
> above all, thank you very much for the patchset and all the work behind it.
>
> Unfortunately, as briefly discussed via the phone already, I have some general
> concerns regarding geofeeds:
>
> (a) In contrast to RIRs, I do not see geofeed providers as a trustworthy source.
> While the former are not trustworthy in terms of the data they provide (since
> no vetting or QA of database changes is usually conducted, and it does not look
> to me like this is going to change soon), at least their infrastructure is:
> It seems reasonable to me to trust, for example, RIPE's FTP server to serve
> the same database files regardless of the client requesting it. For some of
> them, we could even verify that through file signature validation, assuming that
> it is too costly to do live GPG-signing at scale.
>
> Geofeed URLs, in contrast, can lead to anywhere, and I would not be surprised
> at all to see dubious ISPs serving different geofeeds to different clients.
> Given that our IP address ranges are public and static, and libloc reveals itself
> through the User-Agent HTTP header, it would be quite easy to serve us a geofeed
> that tampers with data, while playing innocent to other clients.
>
> In addition, many of the 215 geofeed URLs that are currently live (attached) point
> to services such as Google Docs or GitHub - both don't strike me as reliable sources
> in terms of persistence. Generally, we have the full problem of URL/domain rot again. :-(
>
> One could argue that these points (to a certain extent) hold true for RIRs as
> well. However, if we cannot trust them, it's curtains for libloc either way. :-)
> Some random ISPs trying to make us consume geolocation data from random URLs,
> on the other hand, pose a greater risk than benefit to the quality of the
> location database.

I see your point, but I disagree.

The RIR databases are self-assessment, too. People can put whatever they want in there and it is not being checked by anyone. The only thing that you might have in favour of your argument is that there is a better paper trail of any changes than the geo feeds. Those can be changed - even randomly generated.

But I believe that we have in both cases no chance to verify any data. Malicious players will fake their location even in the RIR databases.

What I would suggest as a minimum is to select at least a couple of “trusted” or very large sources that we maintain manually. There are a couple of cloud providers which use Geofeeds and we would quite likely improve the quality of the data for them.

> Which brings me directly to the next point...
>
> (b) Presuming we still agree on not being more precise than /24 or /48, all
> the information geofeeds provide could (should?) have been in the RIR databases
> as well.
>
> The only exception is ARIN, but since we do not get their raw database, we won't
> be able to consume any geofeed URLs in it. So, for the area where we lack accuracy
> of geolocation information most, geofeed won't help us. And for all the other RIRs
> (LACNIC included, for which we process an additional geolocation database feed
> already), the geofeeds ideally should not contain any new information to us.

Why should we not process anything smaller than those prefixes? It wouldn’t hurt us at all.

> Earlier today, I created a location database text dump on location02 with and without
> the geofeed patchset applied. The diff can be retrieved from https://people.ipfire.org/~pmueller/location-database-geofeed-diff.tar.gz,
> and is rather massive, partly because CIDRs smaller than /24 resp. /48 are yet to
> be ignored by the geofeed processing routines.
>
> I have yet to assess the diff closely, but for a superficial analysis, it appears
> like geofeed introduces a lot of changes that could have been in the respective RIR
> databases as well. The fact that they are not there does not inspire confidence.
>
> Apologies for this rather disappointing feedback, and best regards,
> Peter Müller
> <20221028_live_geofeeds.txt>

Well, I don’t think this is disappointing. Technically I suspect that you are happy with the code. We now just need to figure out where to use it and where to not use it.

Best,
-Michael
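The /24 resp. /48 cut-off debated above could be sketched roughly as follows. This is only an illustration, not code from the patchset: the helper `parse_geofeed` and the `MAX_PREFIXLEN` table are hypothetical names, and the input is assumed to follow the RFC 8805 CSV layout (prefix, country, region, city, postal code, with `#` starting a comment line).

```python
import csv
import ipaddress

# Assumed cut-offs taken from the discussion: nothing more specific than /24 (IPv4) or /48 (IPv6)
MAX_PREFIXLEN = {4: 24, 6: 48}

def parse_geofeed(lines):
    """Hypothetical helper: yield (network, country) pairs from RFC 8805-style CSV lines,
    skipping comments, unparsable prefixes and prefixes more specific than the cut-off."""
    data = (line for line in lines if line.strip() and not line.lstrip().startswith("#"))
    for row in csv.reader(data):
        try:
            network = ipaddress.ip_network(row[0].strip(), strict=False)
        except ValueError:
            # Not a valid prefix, ignore the row
            continue
        if network.prefixlen > MAX_PREFIXLEN[network.version]:
            # Too specific for the database, ignore the row
            continue
        country = row[1].strip().upper() if len(row) > 1 else ""
        yield str(network), country
```

Whether such a filter is wanted at all is exactly the open question in this thread; dropping the `prefixlen` check would process every prefix a feed announces.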
diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
index 9faf23b..5bd5da3 100644
--- a/src/scripts/location-importer.in
+++ b/src/scripts/location-importer.in
@@ -182,6 +182,11 @@ class CLI(object):
 				CREATE INDEX IF NOT EXISTS networks_family ON networks USING BTREE(family(network));
 				CREATE INDEX IF NOT EXISTS networks_search ON networks USING GIST(network inet_ops);
 
+				-- geofeeds
+				CREATE TABLE IF NOT EXISTS network_geofeeds(network inet, url text);
+				CREATE UNIQUE INDEX IF NOT EXISTS network_geofeeds_unique
+					ON network_geofeeds(network);
+
 				-- overrides
 				CREATE TABLE IF NOT EXISTS autnum_overrides(
 					number bigint NOT NULL,
@@ -799,6 +804,16 @@ class CLI(object):
 
 					inetnum[key].append(val)
 
+			# Parse the geofeed attribute
+			elif key == "geofeed":
+				inetnum["geofeed"] = val
+
+			# Parse geofeed when used as a remark
+			elif key == "remark":
+				m = re.match(r"^(?:geofeed|Geofeed)\s+(https://.*)", val)
+				if m:
+					inetnum["geofeed"] = m.group(1)
+
 		# Skip empty objects
 		if not inetnum or not "country" in inetnum:
 			return
@@ -810,7 +825,6 @@ class CLI(object):
 		# them into the database, if _check_parsed_network() succeeded
 		for single_network in inetnum.get("inet6num") or inetnum.get("inetnum"):
 			if self._check_parsed_network(single_network):
-
 				# Skip objects with unknown country codes if they are valid to avoid log spam...
 				if validcountries and invalidcountries:
 					log.warning("Skipping network with bogus countr(y|ies) %s (original countries: %s): %s" % \
@@ -823,6 +837,30 @@ class CLI(object):
 					"%s" % single_network, inetnum.get("country")[0], inetnum.get("country"), source_key,
 				)
 
+				# Update any geofeed information
+				geofeed = inetnum.get("geofeed", None)
+
+				# Store/update any geofeeds
+				if geofeed:
+					self.db.execute("""
+						INSERT INTO
+							network_geofeeds(
+								network,
+								url
+							)
+						VALUES(
+							%s, %s
+						)
+						ON CONFLICT (network) DO
+							UPDATE SET url = excluded.url""",
+						"%s" % single_network, geofeed,
+					)
+
+				# Delete any previous geofeeds
+				else:
+					self.db.execute("DELETE FROM network_geofeeds WHERE network = %s",
+						"%s" % single_network)
+
 	def _parse_org_block(self, block, source_key):
 		org = {}
 		for line in block:
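
For readers who want to try the logic of this patch in isolation, the following is a minimal sketch under stated assumptions, not the importer's actual code. The helpers `extract_geofeed` and `store_geofeed` are hypothetical names, and an in-memory SQLite database stands in for PostgreSQL, so the `network` column is plain text rather than `inet` and the placeholders are `?` instead of `%s`. The regular expression, table name, index name and upsert statement are copied from the diff.

```python
import re
import sqlite3

# Same pattern as in the patch: "geofeed" or "Geofeed" followed by an https:// URL
GEOFEED_REMARK = re.compile(r"^(?:geofeed|Geofeed)\s+(https://.*)")

def extract_geofeed(key, val):
    """Hypothetical helper: return the geofeed URL carried by one attribute, or None."""
    if key == "geofeed":
        return val
    if key == "remark":
        m = GEOFEED_REMARK.match(val)
        if m:
            return m.group(1)
    return None

def store_geofeed(db, network, url):
    """Hypothetical helper: upsert the URL for a network, or delete it if there is none."""
    if url:
        db.execute(
            "INSERT INTO network_geofeeds(network, url) VALUES(?, ?) "
            "ON CONFLICT (network) DO UPDATE SET url = excluded.url",
            (network, url),
        )
    else:
        db.execute("DELETE FROM network_geofeeds WHERE network = ?", (network,))

# SQLite stands in for PostgreSQL here (assumption), hence "text" instead of "inet"
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE IF NOT EXISTS network_geofeeds(network text, url text)")
db.execute(
    "CREATE UNIQUE INDEX IF NOT EXISTS network_geofeeds_unique ON network_geofeeds(network)"
)
```

The unique index on `network` is what lets `ON CONFLICT (network)` replace an existing URL instead of adding a second row; a network whose object no longer carries a geofeed loses its row through the `DELETE` branch.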