[01/10] importer: Store geofeed URLs from RIR data

Message ID 20220927164847.3409646-1-michael.tremer@ipfire.org
State Accepted
Commit 183b2f7477068a68e8b439754487565945899052

Commit Message

Michael Tremer Sept. 27, 2022, 4:48 p.m. UTC
  Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
---
 src/scripts/location-importer.in | 40 +++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)
  

Comments

Peter Müller Oct. 28, 2022, 8:29 p.m. UTC | #1
Hello Michael,

above all, thank you very much for the patchset and all the work behind it.

Unfortunately, as briefly discussed over the phone already, I have some general
concerns regarding geofeeds:

(a) In contrast to RIRs, I do not see geofeed providers as a trustworthy source.
While the former are not trustworthy in terms of the data they provide (since
no vetting or QA of database changes is usually conducted, and it does not look
to me like this is going to change soon), at least their infrastructure is:
It seems reasonable to me to trust, for example, RIPE's FTP server to serve
the same database files regardless of the client requesting it. For some of
them, we could even verify that through file signature validation, assuming that
it is too costly to do live GPG-signing at scale.

Geofeed URLs, in contrast, can lead to anywhere, and I would not be surprised
at all to see dubious ISPs serving different geofeeds to different clients.
Given that our IP address ranges are public and static, and libloc reveals itself
through the User-Agent HTTP header, it would be quite easy to serve us a geofeed
that tampers with data, while playing innocent to other clients.
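
To illustrate how such differential serving could be probed, here is a minimal
Python sketch. It is not part of the patchset; the function names and the
injected `fetch` callable are assumptions made for illustration. It only varies
the User-Agent header, so it cannot detect discrimination based on the source
IP address.

```python
import hashlib
import urllib.request


def fetch_http(url, user_agent, timeout=10):
    """Download url with the given User-Agent and return the raw body."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def digests_by_user_agent(url, user_agents, fetch=fetch_http):
    """Fetch url once per User-Agent and map each agent to a SHA-256 digest."""
    return {
        ua: hashlib.sha256(fetch(url, ua)).hexdigest()
        for ua in user_agents
    }


def serves_consistent_content(url, user_agents, fetch=fetch_http):
    """Return True if every User-Agent received byte-identical content."""
    digests = digests_by_user_agent(url, user_agents, fetch)
    return len(set(digests.values())) <= 1
```

A mismatch is only a hint: feeds generated on the fly may legitimately differ
between two requests, so a real check would have to repeat the comparison.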

In addition, many of the 215 geofeed URLs that are currently live (attached) point
to services such as Google Docs or GitHub - neither strikes me as a reliable source
in terms of persistence. Generally, we have the full problem of URL/domain rot again. :-(
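
For a rough idea of how many feeds depend on such services, the attached URLs
could be grouped by hostname. The following Python sketch is only an
illustration: the set of hosts treated as third-party services and the
function name are assumptions, and the sample URLs are taken from the attached
list.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: hosts considered third-party hosting services for this count.
THIRD_PARTY_HOSTS = {"docs.google.com", "github.com"}


def count_third_party(urls):
    """Count how many URLs point to each of the third-party hosts."""
    counts = Counter()
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if host in THIRD_PARTY_HOSTS:
            counts[host] += 1
    return counts


sample = [
    "https://128bit.ee/geofeed.csv",
    "https://docs.google.com/spreadsheets/d/1mPvsx7AzQx8yqFhyoYK6K9viEJNEpAZ3ICGKXG3Us5o/edit?usp=sharing/geofeed.csv",
    "https://github.com/aossia/geolocation/blob/main/geofeed.csv",
    "https://github.com/welltelecom/geofeeds/blob/main/geofeeds.csv",
]
print(count_third_party(sample))
```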

One could argue that these points (to a certain extent) hold true for RIRs as
well. However, if we cannot trust them, it's curtains for libloc either way. :-)
Some random ISPs trying to make us consume geolocation data from random URLs,
on the other hand, pose a greater risk than benefit to the quality of the
location database.

Which brings me directly to the next point...

(b) Assuming we still agree on not being more precise than /24 or /48, all
the information geofeeds provide could (should?) have been in the RIR databases
as well.

The only exception is ARIN, but since we do not get their raw database, we won't
be able to consume any geofeed URLs in it. So, for the area where we lack accuracy
of geolocation information most, geofeed won't help us. And for all the other RIRs
(LACNIC included, for which we process an additional geolocation database feed
already), the geofeeds ideally should not contain any new information to us.


Earlier today, I created a location database text dump on location02 with and without
the geofeed patchset applied. The diff can be retrieved from
https://people.ipfire.org/~pmueller/location-database-geofeed-diff.tar.gz
and is rather massive, partly because CIDRs smaller than /24 (IPv4) or /48 (IPv6) are
yet to be ignored by the geofeed processing routines.
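
The missing filter could look roughly like the Python sketch below. It is an
assumption about one possible approach, not the code in location-importer:
the names `MAX_PREFIXLEN` and `keep_network` are hypothetical, and overly
specific networks are simply dropped here rather than aggregated.

```python
import ipaddress

# Longest prefix still accepted, per IP version (/24 for IPv4, /48 for IPv6).
MAX_PREFIXLEN = {4: 24, 6: 48}


def keep_network(cidr):
    """Return True if cidr parses and is not more specific than the limit."""
    try:
        network = ipaddress.ip_network(cidr.strip(), strict=False)
    except ValueError:
        # Unparseable entries are rejected as well.
        return False
    return network.prefixlen <= MAX_PREFIXLEN[network.version]


entries = ["192.0.2.0/24", "192.0.2.128/25", "2001:db8::/48", "2001:db8::/64"]
print([e for e in entries if keep_network(e)])
```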

I have yet to assess the diff closely, but from a superficial analysis, it appears
that geofeed introduces a lot of changes that could have been in the respective RIR
databases as well. The fact that they are not there does not inspire confidence.

Apologies for this rather disappointing feedback, and best regards,
Peter Müller
  id  |                                                                               url                                                                                | status |         updated_at         
------+------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------+----------------------------
  803 | https://128bit.ee/geofeed.csv                                                                                                                                    |        | 2022-09-27 08:48:26.590283
  570 | https://1581710f-1ced-4a06-8390-7cc61076f103.selcdn.net/geofeed.csv                                                                                              |        | 2022-09-27 08:48:25.051701
  132 | https://207941.xyz/static/geo.csv                                                                                                                                |        | 2022-09-27 08:48:20.414238
  317 | https://39office.co.uk/geofeed/geofeed.csv                                                                                                                       |    406 | 
  555 | https://abramcevo.net/geofeeds.csv                                                                                                                               |        | 2022-09-27 08:48:24.871628
 1053 | https://akrn.net/geofeed/                                                                                                                                        |        | 2022-09-27 08:48:30.103967
  357 | https://altairline.net/ipgeofeed.csv                                                                                                                             |        | 2022-09-27 08:48:24.184789
  246 | https://api.ewpratten.com/network/geofeed.csv                                                                                                                    |        | 2022-09-27 08:48:22.676312
  278 | https://as203333.jonasbusch.de/geofeed.csv                                                                                                                       |        | 2022-09-27 08:48:23.171395
  256 | https://as204406.net/geofeed.csv                                                                                                                                 |        | 2022-09-27 08:48:23.042018
   52 | https://as205479.net/geofeed.csv                                                                                                                                 |    404 | 
  207 | https://as206016.ezdomain.ru/geofeed.csv                                                                                                                         |        | 2022-09-27 08:48:21.961518
  191 | https://as207941.net/static/geo.csv                                                                                                                              |        | 2022-09-27 08:48:29.101466
  128 | https://as209114.net/geofeed.csv                                                                                                                                 |        | 2022-09-27 08:48:20.541015
  142 | https://as211398.net/geofeed.csv                                                                                                                                 |        | 2022-09-27 08:48:20.695646
    8 | https://as47869.net/geofeed.csv                                                                                                                                  |        | 2022-09-27 08:48:16.969164
  459 | https://as56655.net/geofeed.csv                                                                                                                                  |        | 2022-09-27 08:48:24.426049
  842 | https://assets.ip-only.net/documents/geoipfeed/AS12552.txt                                                                                                       |        | 2022-09-27 08:48:47.301489
  206 | https://assigned.network/geofeed.csv                                                                                                                             |        | 2022-09-27 08:48:21.642988
  163 | https://august.tw/rfc8805.csv                                                                                                                                    |        | 2022-09-27 08:48:21.286217
  133 | https://b4rn.org.uk/geofeed.csv                                                                                                                                  |        | 2022-09-27 08:48:21.189017
  318 | https://bakergroup.uk/geofeed.csv                                                                                                                                |        | 2022-09-27 08:48:23.679303
  185 | https://bisv.ru/geofeed.csv                                                                                                                                      |        | 2022-09-27 08:48:21.167518
  493 | https://blade-public-static.blade-group.com/network/geofeed.csv                                                                                                  |        | 2022-09-27 08:48:25.0396
   91 | https://chesskuo.tw/geofeeds.csv                                                                                                                                 |        | 2022-09-27 08:48:20.384467
 1192 | https://cilix.co.uk/geofeed.csv                                                                                                                                  |        | 2022-09-27 08:48:31.21659
   76 | https://city-telekom.ru/geo/google.csv                                                                                                                           |        | 2022-09-27 08:48:19.094937
  408 | https://clouvider.com/geoipfeed.csv                                                                                                                              |        | 2022-09-27 08:48:24.893566
    4 | https://connectivia.it/wp-content/uploads/2022/08/geofeed.csv                                                                                                    |        | 2022-09-27 08:48:27.052104
  824 | https://convergenze.it/geofeeds.csv                                                                                                                              |        | 2022-09-27 08:48:26.868773
 1122 | https://cposo.ru/geofeed.csv                                                                                                                                     |        | 2022-09-27 08:48:31.070979
   24 | https://data-source.elastx.cloud/geofeed.csv                                                                                                                     |        | 2022-09-27 08:48:17.833537
  326 | https://deltatelesystems.ru/geofeed.csv                                                                                                                          |        | 2022-09-27 08:48:24.164182
  902 | https://docs.google.com/spreadsheets/d/1mPvsx7AzQx8yqFhyoYK6K9viEJNEpAZ3ICGKXG3Us5o/edit?usp=sharing/geofeed.csv                                                 |        | 2022-09-27 08:48:28.139785
  833 | https://docs.google.com/spreadsheets/d/e/2PACX-1vR2cb06ptOKwF8GLQwz0AAPJDjEi3L1Y0cWGlbB5pfyHFc_CFiW3tYTbrI5q3_Iuak2iiMB0htwLZPE/pub?output=csv                   |        | 2022-09-27 08:48:27.655495
 1015 | https://docs.google.com/spreadsheets/d/e/2PACX-1vSrczJhjW18HlapiISgmf6_52-Qc3fDA280OxzxjEEUAXnZxZgPEvh_qehOE_tvq-ZOVmICiYDr544P/pub?gid=0&single=true&output=csv |        | 2022-09-27 08:48:29.876872
  673 | https://docs.google.com/spreadsheets/d/e/2PACX-1vTfAIimGjXRLgfJd85VvVj5-fLa6a8kREezGUE8lLffEyW1i-mQCV8qPuqxIeyRzen1iYbaIDcEazaK/pub?output=csv                   |        | 2022-09-27 08:48:26.520855
  674 | https://docs.google.com/spreadsheets/d/e/2PACX-1vTfJX6jAMdyLkgqFPj9aoO5k9TEsFegn0KDWX4gfnK2XGbC1S3hZ1EHZOMMPJ0kchky7ElZr_Uh2vI6/pub?output=csv                   |        | 2022-09-27 08:48:26.805053
  197 | https://docs.google.com/spreadsheets/d/e/2PACX-1vTm5F9a_HLGR-FPR7LWut672RX4nihGTovrmtVd8asQq4N81_JRJEoVQzF9YWQdYBAccsc_yLp9Pi9K/pub?gid=0&single=true&output=csv |        | 2022-09-27 08:48:21.869312
  305 | https://etelecom.ru/files/geofeed.csv                                                                                                                            |        | 2022-09-27 08:48:23.672911
   53 | https://exoscale-prefixes.sos.exo.io/exoscale_geofeed                                                                                                            |        | 2022-09-27 08:48:19.077892
   43 | https://files.as209861.net/geofeed.csv                                                                                                                           |        | 2022-09-27 08:48:18.536718
  998 | https://files.delta.nl/geofeed-dfn.csv                                                                                                                           |        | 2022-09-27 08:48:28.699714
  636 | https://fj38gd6m2pd73464fjqw.maloco.ru/geofeed.csv                                                                                                               |        | 2022-09-27 08:48:25.445304
  245 | https://fr89.uk/geofeed.csv                                                                                                                                      |        | 2022-09-27 08:48:22.658833
  328 | https://gblnet.net/ipaddress/geofeed.csv                                                                                                                         |        | 2022-09-27 08:48:24.145082
 1069 | https://geo.blacknight.ie/ripe-geofeed.csv                                                                                                                       |        | 2022-09-27 08:48:30.140024
   39 | https://geo.daknob.net/as210312.csv                                                                                                                              |        | 2022-09-27 08:48:18.423616
  204 | https://geofeed.186526.net/geofeed.csv                                                                                                                           |        | 2022-09-27 08:48:21.427929
  106 | https://geofeed.as207111.net/geofeed.csv                                                                                                                         |        | 2022-09-27 08:48:20.243992
  101 | https://geofeed.as210546.net/subnet.csv                                                                                                                          |        | 2022-09-27 08:48:19.671305
  303 | https://geofeed.azedunet.com/geofeed.csv                                                                                                                         |        | 2022-09-27 08:48:23.662716
  236 | https://geofeed.b00b.eu/feed.txt                                                                                                                                 |        | 2022-09-27 08:48:22.48175
   74 | https://geofeed.bluevps.com/geofeed.csv                                                                                                                          |        | 2022-09-27 08:48:18.806875
  482 | https://geofeed.bsonetwork.net/geofeed.csv                                                                                                                       |        | 2022-09-27 08:48:25.421007
   15 | https://geofeed.constant.com/                                                                                                                                    |        | 2022-09-27 08:48:19.238399
  125 | https://geofeed.cynthia.re/geofeed/primary.csv                                                                                                                   |        | 2022-09-27 08:48:20.07647
  669 | https://geofeeddata.s3.eu-west-2.amazonaws.com/geofeed.csv                                                                                                       |        | 2022-09-27 08:48:25.590025
 1112 | https://geofeed.disney.com                                                                                                                                       |        | 2022-09-27 08:48:32.368274
 1085 | https://geofeed.disney.com/                                                                                                                                      |        | 2022-09-27 08:48:32.455889
   10 | https://geofeed.hostzealot.com/geofeed.csv                                                                                                                       |        | 2022-09-27 08:48:17.675103
   90 | https://geofeed.huize.asia/geofeed.csv                                                                                                                           |    523 | 
  354 | https://geofeed.ip-max.net/ip-max.csv                                                                                                                            |        | 2022-09-27 08:48:24.150327
  695 | https://geofeed.keepit.com/geofeed.csv                                                                                                                           |        | 2022-09-27 08:48:25.840645
   18 | https://geofeed.kviknet.dk/geofeed.csv                                                                                                                           |        | 2022-09-27 08:48:17.255633
    6 | https://geofeed.llnw.net/                                                                                                                                        |        | 2022-09-27 08:48:18.41984
  162 | https://geofeed.netfiretec.com/                                                                                                                                  |        | 2022-09-27 08:48:22.236797
  244 | https://geofeed.noc.vanvik.ax/geofeed.csv                                                                                                                        |        | 2022-09-27 08:48:22.385053
    1 | https://geofeed.rapidseedbox.com/geofeed.csv                                                                                                                     |        | 2022-09-27 08:48:17.381663
  211 | https://geofeed.rejecty.com/geofeed.csv                                                                                                                          |        | 2022-09-27 08:48:21.839197
  697 | https://geofeed.servercore.com/prefixes.csv                                                                                                                      |    404 | 
  796 | https://geofeed.servercore.com/subnets.csv                                                                                                                       |        | 2022-09-27 08:48:26.582391
  111 | https://geofeeds.ihcb-group.com/                                                                                                                                 |    526 | 
   23 | https://geofeed.snapserv.net/v1.csv                                                                                                                              |        | 2022-09-27 08:48:17.633081
  103 | https://geofeeds.speedypage.com/geofeed.csv                                                                                                                      |        | 2022-09-27 08:48:19.843224
  621 | https://geofeeds.surfshark.com/geofeed.csv                                                                                                                       |        | 2022-09-27 08:48:25.237156
  330 | https://geofeed.timeweb.net/geofeed.csv                                                                                                                          |        | 2022-09-27 08:48:24.032876
   78 | https://geofeed.transatel.com/geoloc.csv                                                                                                                         |        | 2022-09-27 08:48:19.238399
   20 | https://geofeed.tunenet.dk/geofeed.csv                                                                                                                           |        | 2022-09-27 08:48:17.656902
  706 | https://geofeed.wgtwo.com/geofeed.csv                                                                                                                            |        | 2022-09-27 08:48:26.116829
  117 | https://geofeed.zhiccc.net/2a0d-2587-geofeed.csv                                                                                                                 |        | 2022-09-27 08:48:20.163871
  127 | https://geofeed.zhiccc.net/2a0e-b107-geofeed.csv                                                                                                                 |        | 2022-09-27 08:48:20.147434
   44 | https://geo.imaster.ru/geofeeds.csv                                                                                                                              |        | 2022-09-27 08:48:18.936465
   80 | https://geoip.51178.ru/geofeed.csv                                                                                                                               |        | 2022-09-27 08:48:19.535356
   12 | https://geolocation.itm8.com                                                                                                                                     |        | 2022-09-27 08:48:17.612405
  658 | https://geolocation.misaka.io/as969/r4g.csv                                                                                                                      |        | 2022-09-27 08:48:25.610238
   31 | https://geolocation.misaka.io/as969/r6g.csv                                                                                                                      |        | 2022-09-27 08:48:18.054779
  295 | https://geo.telehouse-rechenzentrum.de/geofeed.txt                                                                                                               |        | 2022-09-27 08:48:23.186858
  198 | https://gf.hrvoje.org                                                                                                                                            |        | 2022-09-27 08:48:20.99458
  164 | https://gf.hrvoje.org/                                                                                                                                           |        | 2022-09-27 08:48:20.682033
  397 | https://github.com/aossia/geolocation/blob/main/geofeed.csv                                                                                                      |        | 2022-09-27 08:48:24.52641
    3 | https://github.com/datis-geoip/geofeed/blob/main/geofeed.csv                                                                                                     |        | 2022-09-27 08:48:17.461629
 1062 | https://github.com/geocheck/geofeed/blob/main/185.108.207.0-24_madrid.csv                                                                                        |        | 2022-09-27 08:48:29.918208
  996 | https://github.com/geocheck/geofeed/blob/main/185.167.234.0-24_zurich.csv                                                                                        |        | 2022-09-27 08:48:28.924989
  689 | https://github.com/geocheck/geofeed/blob/main/193.31.62.0-24_helsinki.csv                                                                                        |        | 2022-09-27 08:48:25.967151
 1077 | https://github.com/geocheck/geofeed/blob/main/193.37.196.0-24_kishinev.csv                                                                                       |        | 2022-09-27 08:48:30.292457
 1002 | https://github.com/geocheck/geofeed/blob/main/194.150.210.0-24_ankara.csv                                                                                        |        | 2022-09-27 08:48:28.970679
 1003 | https://github.com/geocheck/geofeed/blob/main/194.150.211.0-24_bratislava.csv                                                                                    |        | 2022-09-27 08:48:29.029244
 1065 | https://github.com/geocheck/geofeed/blob/main/194.233.8.0-22_kiev.csv                                                                                            |        | 2022-09-27 08:48:30.061457
  828 | https://github.com/geocheck/geofeed/blob/main/195.34.78.0-24_athens.csv                                                                                          |        | 2022-09-27 08:48:27.226006
 1049 | https://github.com/geocheck/geofeed/blob/main/212.69.18.0-24_luxembourg.csv                                                                                      |        | 2022-09-27 08:48:29.631977
 1114 | https://github.com/geocheck/geofeed/blob/main/213.139.64.0-22_paris.csv                                                                                          |        | 2022-09-27 08:48:30.552134
 1116 | https://github.com/geocheck/geofeed/blob/main/213.225.238.0-24_riga.csv                                                                                          |        | 2022-09-27 08:48:30.737163
 1125 | https://github.com/geocheck/geofeed/blob/main/45.140.195.0-24_oslo.csv                                                                                           |        | 2022-09-27 08:48:31.092659
 1118 | https://github.com/geocheck/geofeed/blob/main/45.153.125.0-24_warsaw.csv                                                                                         |        | 2022-09-27 08:48:30.836871
 1094 | https://github.com/geocheck/geofeed/blob/main/45.88.10.0-24_jerusalem.csv                                                                                        |        | 2022-09-27 08:48:30.320103
 1063 | https://github.com/geocheck/geofeed/blob/main/5.182.34.0-24_roma.csv                                                                                             |        | 2022-09-27 08:48:30.073939
 1044 | https://github.com/geocheck/geofeed/blob/main/5.249.188.0-22_amsterdam.csv                                                                                       |        | 2022-09-27 08:48:29.435688
 1117 | https://github.com/geocheck/geofeed/blob/main/62.72.179.0-24_stockholm.csv                                                                                       |        | 2022-09-27 08:48:30.650348
 1026 | https://github.com/geocheck/geofeed/blob/main/88.216.184.0-24_london.csv                                                                                         |        | 2022-09-27 08:48:29.425145
  990 | https://github.com/geocheck/geofeed/blob/main/89.116.172.0-23_vilnius.csv                                                                                        |        | 2022-09-27 08:48:28.647311
  823 | https://github.com/geocheck/geofeed/blob/main/89.116.200.0-24_vilnius.csv                                                                                        |        | 2022-09-27 08:48:27.109292
  991 | https://github.com/geocheck/geofeed/blob/main/89.117.36.0-23_vilnius.csv                                                                                         |        | 2022-09-27 08:48:28.668863
  932 | https://github.com/geocheck/geofeed/blob/main/91.208.73.0-24_dublin.csv                                                                                          |        | 2022-09-27 08:48:28.595083
 1078 | https://github.com/geocheck/geofeed/blob/main/91.228.168.0-24_tbilisi.csv                                                                                        |        | 2022-09-27 08:48:30.180117
   27 | https://github.com/Gnetwork-networkteam/GeoLocation_RIPE/blob/main/Geoloc.csv                                                                                    |        | 2022-09-27 08:48:17.995401
  911 | https://github.com/luakst/geofeed/blob/main/45.140.244.0-23_edirne.csv                                                                                           |        | 2022-09-27 08:48:27.925962
 1023 | https://github.com/luakst/geofeed/blob/main/91.186.194.0-23_edirne.csv                                                                                           |        | 2022-09-27 08:48:29.30541
  874 | https://github.com/luakst/geofeed/blob/main/91.186.212.0-22_Helsinki.csv                                                                                         |        | 2022-09-27 08:48:27.476827
  903 | https://github.com/tognetwork/Geofeed/blob/6b6f7b1a0dd05b779b84c3a6dec29c41d2581335/GNetwork_geofeed.csv                                                         |    404 | 
  339 | https://github.com/welltelecom/geofeeds/blob/main/geofeeds.csv                                                                                                   |        | 2022-09-27 08:48:24.051408
  853 | https://globalsecurelayer.com/google-data/GSL-geoip-feed.csv                                                                                                     |        | 2022-09-27 08:48:27.234621
  238 | https://hatsnet.work/geofeed/2a0d-2587-8800-geofeed.csv                                                                                                          |        | 2022-09-27 08:48:22.339861
   77 | https://hellomouse.net/api/geofeed                                                                                                                               |        | 2022-09-27 08:48:19.656768
    5 | https://info.net.deic.dk/deic-geofeed.csv                                                                                                                        |        | 2022-09-27 08:48:17.481572
  783 | https://ipbroker.info/geofeeds/ipbroker.csv                                                                                                                      |        | 2022-09-27 08:48:26.303243
  296 | https://ip-geolocation.fastly.com/                                                                                                                               |        | 2022-09-27 08:48:57.027704
  280 | https://ip-geolocation.xenode.app/                                                                                                                               |        | 
 1045 | https://ipocean.ru/geofeed.csv                                                                                                                                   |        | 2022-09-27 08:48:29.825817
  129 | https://itldc.com/ipgeo.csv                                                                                                                                      |        | 2022-09-27 08:48:20.404382
  322 | https://jcs.jo/geotag.csv                                                                                                                                        |        | 2022-09-27 08:48:23.948845
  220 | https://kagl.me/rfc8805.csv                                                                                                                                      |        | 2022-09-27 08:48:22.776661
    7 | https://keanu.bahnhof.net/geofeed.csv                                                                                                                            |        | 2022-09-27 08:48:17.234445
  194 | https://kitten.network/geofeed.csv                                                                                                                               |        | 2022-09-27 08:48:20.812368
  690 | https://kreationnext.com/location/geofeed.csv                                                                                                                    |        | 2022-09-27 08:48:25.805146
   35 | https://lds.online/geofeed.csv                                                                                                                                   |        | 2022-09-27 08:48:18.375832
  757 | https://massresponse.com/geofeed.csv                                                                                                                             |        | 2022-09-27 08:48:26.131257
  248 | https://minicdn.as211233.net/geofeed.csv                                                                                                                         |        | 2022-09-27 08:48:23.154443
 1140 | https://n.ceisn.co/rfc8805.csv                                                                                                                                   |        | 2022-09-27 08:48:31.158014
  662 | https://netspeed.com.tr/geofeed.csv                                                                                                                              |    406 | 
  102 | https://noc.livecomm.net/geofeed.csv                                                                                                                             |        | 2022-09-27 08:48:19.827172
  281 | https://northlayer.com/geofeed.csv                                                                                                                               |        | 2022-09-27 08:48:23.679303
  384 | https://openfactory.net/geofeed.csv                                                                                                                              |        | 2022-09-27 08:48:24.293772
   89 | https://opengeofeed.org/feed/as142289.csv                                                                                                                        |        | 2022-09-27 08:48:19.344074
  255 | https://opengeofeed.org/feed/as203145.csv                                                                                                                        |        | 2022-09-27 08:48:23.049411
  279 | https://opengeofeed.org/feed/as203199.csv                                                                                                                        |        | 2022-09-27 08:48:23.062945
   40 | https://opengeofeed.org/feed/as208175.csv                                                                                                                        |        | 2022-09-27 08:48:18.567422
   41 | https://opengeofeed.org/feed/as208187.csv                                                                                                                        |        | 2022-09-27 08:48:18.567422
   42 | https://opengeofeed.org/feed/as212710.csv                                                                                                                        |        | 2022-09-27 08:48:18.554637
    9 | https://opengeofeed.org/feed/as49605.csv                                                                                                                         |        | 2022-09-27 08:48:17.210073
  112 | https://open.okita.network/feed.csv                                                                                                                              |    521 | 
  917 | https://packetwall.org/geofeed.csv                                                                                                                               |        | 2022-09-27 08:48:28.620797
   95 | https://peering.pudu.be/geofeed.csv                                                                                                                              |        | 2022-09-27 08:48:19.71252
   30 | https://profitserver.ru/php/geofeed.php                                                                                                                          |        | 2022-09-27 08:48:18.375832
 1144 | https://quickhost.uk/assets/geoip/geofeed.csv                                                                                                                    |        | 2022-09-27 08:48:30.927672
 1086 | https://raw.githubusercontent.com/ahdpik1/geofeeds/main/geofeed.csv                                                                                              |        | 2022-09-27 08:48:30.298753
  214 | https://raw.githubusercontent.com/Alexander-Berry-Roe/geofeed/main/geofeed.csv                                                                                   |        | 2022-09-27 08:48:21.978083
  136 | https://raw.githubusercontent.com/ChrisMacNaughton/geofeed-as207420/main/geofeed.csv                                                                             |        | 2022-09-27 08:48:20.597619
   88 | https://raw.githubusercontent.com/c-nico/AS39792/main/geofeed.csv                                                                                                |        | 2022-09-27 08:48:19.524228
  811 | https://raw.githubusercontent.com/evoxt/geofeed/main/geofeed.csv                                                                                                 |        | 2022-09-27 08:48:26.781036
 1173 | https://raw.githubusercontent.com/HeyKuxo/geofeed/main/geofeed.csv                                                                                               |        | 2022-09-27 08:48:31.358659
 1005 | https://raw.githubusercontent.com/Hoasted/geofeed/master/geofeed.csv                                                                                             |        | 2022-09-27 08:48:28.768182
  337 | https://raw.githubusercontent.com/jppol-noc/geoip/main/geoip.txt                                                                                                 |        | 2022-09-27 08:48:23.720605
  205 | https://raw.githubusercontent.com/leo10ui/ripe/main/geofeeed.csv                                                                                                 |        | 2022-09-27 08:48:21.596527
  209 | https://raw.githubusercontent.com/MrMoreira/geofeed/main/geofeed.csv                                                                                             |        | 2022-09-27 08:48:21.699892
  431 | https://raw.githubusercontent.com/navarino/geofeed/main/geofeed.csv                                                                                              |        | 2022-09-27 08:48:24.338294
  202 | https://raw.githubusercontent.com/ngarafol/geofeed/main/geofeed.csv                                                                                              |        | 2022-09-27 08:48:21.353898
  786 | https://raw.githubusercontent.com/notyourcommy/veesp-geo/main/geofeed.csv                                                                                        |        | 2022-09-27 08:48:26.749152
  216 | https://raw.githubusercontent.com/null31/geofeed/master/geofeed.csv                                                                                              |        | 2022-09-27 08:48:22.118851
  235 | https://raw.githubusercontent.com/rapdodge/AS203868-Geofeeds/main/geofeeds.csv                                                                                   |        | 2022-09-27 08:48:22.264681
  430 | https://raw.githubusercontent.com/servinga/geofeed/main/geofeed.csv                                                                                              |        | 2022-09-27 08:48:24.411605
 1033 | https://raw.githubusercontent.com/Simonadascalu/Freedomtech-Geofeed/main/Freedomtech%20solutions%20-%20ALL?token=GHSAT0AAAAAABQKY2PBBELKSXZL6TVYIS7SYP3WQZA      |        | 2022-09-27 08:48:29.60595
 1149 | https://raw.githubusercontent.com/supplierstechpay/geofeed/main/geofeed.csv                                                                                      |        | 2022-09-27 08:48:31.226972
  224 | https://raw.githubusercontent.com/tomas347/geofeed/main/geofeed.csv                                                                                              |        | 2022-09-27 08:48:22.217225
 1035 | https://raw.githubusercontent.com/visnetwork/geofeed/main/geofeed.csv                                                                                            |        | 2022-09-27 08:48:29.131521
 1013 | https://raw.githubusercontent.com/vtainc/geofeeds/main/geofeeds.csv                                                                                              |        | 2022-09-27 08:48:28.970679
   87 | https://raw.githubusercontent.com/Web1-Oy/geofeed/main/geofeed.csv                                                                                               |        | 2022-09-27 08:48:19.494054
  114 | https://red-panda.be/geofeed.csv                                                                                                                                 |        | 2022-09-27 08:48:19.908639
  213 | https://ripe-ariutk.onrender.com/geofeed.csv                                                                                                                     |        | 2022-09-27 08:48:22.366702
  645 | https://ripe.unyc.io                                                                                                                                             |        | 2022-09-27 08:48:25.476598
  758 | https://rose.dsh-mirror.de/geofeed/geofeed.csv                                                                                                                   |        | 2022-09-27 08:48:26.241661
   25 | https://s3.wifirst.net/geofeed/AS52075_Geofeed.csv                                                                                                               |        | 2022-09-27 08:48:17.805775
  712 | https://secure.wireline.com.au/geo/feed.csv                                                                                                                      |        | 2022-09-27 08:48:28.120107
   26 | https://self.bbanda.it/site.cgi?action=download_document&docnum=92120                                                                                            |        | 2022-09-27 08:48:18.34809
   96 | https://server-factory.com/geofeed.csv                                                                                                                           |        | 2022-09-27 08:48:19.6206
  804 | https://service.wienenergie.at/media/files/geoip.csv                                                                                                             |        | 2022-09-27 08:48:26.856748
  190 | https://smishcraft.com/geofeed.csv                                                                                                                               |        | 2022-09-27 08:48:20.704988
  554 | https://static.cloud.konicaminolta.eu/geofeed/204839.csv                                                                                                         |        | 2022-09-27 08:48:24.75275
  532 | https://static.cloud.konicaminolta.eu/geofeed/205287.csv                                                                                                         |        | 2022-09-27 08:48:24.659096
  201 | https://storage.pwn.blue/assets/geofeed.csv                                                                                                                      |        | 2022-09-27 08:48:21.5807
  653 | https://telecu.net/geofeed.csv                                                                                                                                   |        | 2022-09-27 08:48:26.055104
 1080 | https://teploset.org/geofeed.csv                                                                                                                                 |        | 2022-09-27 08:48:30.760928
   45 | https://tktelecom.ru/geoloc.csv                                                                                                                                  |        | 2022-09-27 08:48:19.215348
  159 | https://v6only.host/geofeed.csv                                                                                                                                  |        | 2022-09-27 08:48:20.422941
  475 | https://webhost1.ru/upload/geoip/geofeed.csv                                                                                                                     |        | 2022-09-27 08:48:24.659096
  763 | https://www.alwyzon.com/feeds/geoip.csv                                                                                                                          |        | 2022-09-27 08:48:26.261211
  257 | https://www.bnb.host/geo.csv                                                                                                                                     |        | 2022-09-27 08:48:23.028548
 1240 | https://www.daryllswer.com/geofeed/                                                                                                                              |        | 2022-09-27 08:48:31.36456
  769 | https://www.garrison.com/geolocation/KL-DC1.csv                                                                                                                  |    403 | 
  639 | https://www.garrison.com/geolocation/SG-DC1.csv                                                                                                                  |    403 | 
  634 | https://www.garrison.com/geolocation/UK-DC1.csv                                                                                                                  |    403 | 
  768 | https://www.garrison.com/geolocation/UK-DC2.csv                                                                                                                  |    403 | 
  637 | https://www.garrison.com/geolocation/US-DC1.csv                                                                                                                  |    403 | 
  638 | https://www.garrison.com/geolocation/US-DC2.csv                                                                                                                  |    403 | 
   47 | https://www.it-df.net/geofeed.csv                                                                                                                                |        | 2022-09-27 08:48:18.644533
  310 | https://www.iunxi.com/nl/ext/csv/geofeed.csv                                                                                                                     |        | 2022-09-27 08:48:23.701211
  131 | https://www.ncryptd.net/geo/feed/geofeed.csv                                                                                                                     |        | 2022-09-27 08:48:20.243992
  948 | https://www.onvi.nl/geofeed.txt                                                                                                                                  |        | 2022-09-27 08:48:28.245746
  861 | https://www.paltel.ps/ip_geopaltel/                                                                                                                              |        | 
  100 | https://www.rkshosting.com/geofeed.csv                                                                                                                           |        | 
   28 | https://www.siportal.it/csv/geofeed.csv                                                                                                                          |        | 2022-09-27 08:48:17.86757
   13 | https://www.teledata.de/as21263_geofeed.csv                                                                                                                      |        | 2022-09-27 08:48:17.633081
  882 | https://www.transmost.ru/files/geofeed.csv                                                                                                                       |        | 2022-09-27 08:48:27.635654
   34 | https://xiaoyu.net/BGP/geofeed.csv                                                                                                                               |        | 
   19 | https://zappiehost.com/geofeeds.csv                                                                                                                              |        | 2022-09-27 08:48:20.556104
(215 rows)
  
Michael Tremer Oct. 29, 2022, 11:43 a.m. UTC | #2
Hello Peter,

> On 28 Oct 2022, at 21:29, Peter Müller <peter.mueller@ipfire.org> wrote:
> 
> Hello Michael,
> 
> above all, thank you very much for the patchset and all the work behind it.
> 
> Unfortunately, as briefly discussed via the phone already, I have some general
> concerns regarding geofeeds:
> 
> (a) In contrast to RIRs, I do not see geofeed providers as a trustworthy source.
> While the former are not trustworthy in terms of the data they provide (since
> no vetting or QA of database changes is usually conducted, and it does not look
> to me like this is going to change soon), at least their infrastructure is:
> It seems reasonable to me to trust, for example, RIPE's FTP server to serve
> the same database files regardless of the client requesting it. For some of
> them, we could even verify that through file signature validation, assuming that
> it is too costly to do live GPG-signing at scale.
> 
> Geofeed URLs, in contrast, can lead to anywhere, and I would not be surprised
> at all to see dubious ISPs serving different geofeeds to different clients.
> Given that our IP address ranges are public and static, and libloc reveals itself
> through the User-Agent HTTP header, it would be quite easy to serve us a geofeed
> that tampers with data, while playing innocent to other clients.
> 
> In addition, many of the 215 geofeed URLs that are currently live (attached) point
> to services such as Google Docs or GitHub - both don't strike me as reliable sources
> in terms of persistence. Generally, we have the full problem of URL/domain rot again. :-(
> 
> One could argue that these points (to a certain extent) hold true for RIRs as
> well. However, if we cannot trust them, it's curtains for libloc either way. :-)
> Some random ISPs trying to make us consume geolocation data from random URLs,
> on the other hand, poses a greater risk than benefit to the quality of the
> location database.

I see your point, but I disagree.

The RIR databases are self-reported, too. People can put whatever they want in there, and nobody checks it.

The only point in favour of your argument is that the RIR databases leave a better paper trail of changes than geofeeds do. Geofeeds can be changed at any time, or even randomly generated. But I believe that in both cases we have no way to verify any of the data.

Malicious players will fake their location even in the RIR databases.

As a minimum, I would suggest selecting at least a couple of “trusted” or very large sources and maintaining that list manually. A couple of cloud providers use geofeeds, and we would quite likely improve the quality of the data for them.
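
A manually maintained list of sources could be as simple as a host allowlist that is checked before a feed is fetched. The following is only a sketch: the names TRUSTED_GEOFEED_HOSTS and is_trusted_geofeed are hypothetical and not part of the importer, and the single host entry is taken from the attached list purely for illustration, not as a statement about who should be trusted.

```python
from urllib.parse import urlsplit

# Hypothetical, manually maintained set of hosts (illustrative entry only)
TRUSTED_GEOFEED_HOSTS = {
    "ip-geolocation.fastly.com",
}

def is_trusted_geofeed(url, trusted_hosts=TRUSTED_GEOFEED_HOSTS):
    """Return True if the URL uses HTTPS and its host is on the allowlist."""
    parts = urlsplit(url)

    # The patch only accepts https:// URLs in remarks, so do the same here
    if parts.scheme != "https":
        return False

    # hostname is None for malformed URLs; treat that as untrusted
    return (parts.hostname or "").lower() in trusted_hosts
```

With this in place, any geofeed URL found in the RIR data would still be stored, but only those passing is_trusted_geofeed() would be fetched and merged.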

> Which brings me directly to the next point...
> 
> (b) Presuming we still agree on not being more precise than /24 or /48, all
> the information geofeeds provide could (should?) have been in the RIR databases
> as well.
> 
> The only exception is ARIN, but since we do not get their raw database, we won't
> be able to consume any geofeed URLs in it. So, for the area where we lack accuracy
> of geolocation information most, geofeed won't help us. And for all the other RIRs
> (LACNIC included, for which we process an additional geolocation database feed
> already), the geofeeds ideally should not contain any new information to us.

Why should we not process anything smaller than those prefixes? It wouldn’t hurt us at all.
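
For reference, if such a limit were kept, the check itself is small. This is a minimal sketch using Python's standard ipaddress module; MAX_PREFIXLEN and is_too_specific are hypothetical names and not taken from the importer, and the limits are the /24 and /48 values mentioned above.

```python
import ipaddress

# Most specific prefix lengths mentioned in this thread (assumed limits)
MAX_PREFIXLEN = {4: 24, 6: 48}

def is_too_specific(network, limits=MAX_PREFIXLEN):
    """Return True if the network is more specific than the limit for its family."""
    # strict=False tolerates entries with host bits set, e.g. 192.0.2.1/24
    net = ipaddress.ip_network(network, strict=False)

    return net.prefixlen > limits[net.version]
```

Keeping the limits in a dictionary means that lifting the restriction for one address family is a one-line change.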

> Earlier today, I created a location database text dump on location02 with and without
> the geofeed patchset applied. The diff can be retrieved from https://people.ipfire.org/~pmueller/location-database-geofeed-diff.tar.gz,
> and is rather massive, partly because CIDRs smaller than /24 or /48, respectively, are yet to
> be ignored by the geofeed processing routines.
> 
> I have yet to assess the diff closely, but from a superficial analysis, it appears
> like geofeed introduces a lot of changes that could have been in the respective RIR
> databases as well. The fact that they are not there does not inspire confidence.
> 
> Apologies for this rather disappointing feedback, and best regards,
> Peter Müller
> <20221028_live_geofeeds.txt>

Well, I don’t think this is disappointing. Technically I suspect that you are happy with the code.

We now just need to figure out where to use it and where not to use it.

Best,
-Michael
  

Patch

diff --git a/src/scripts/location-importer.in b/src/scripts/location-importer.in
index 9faf23b..5bd5da3 100644
--- a/src/scripts/location-importer.in
+++ b/src/scripts/location-importer.in
@@ -182,6 +182,11 @@  class CLI(object):
 				CREATE INDEX IF NOT EXISTS networks_family ON networks USING BTREE(family(network));
 				CREATE INDEX IF NOT EXISTS networks_search ON networks USING GIST(network inet_ops);
 
+				-- geofeeds
+				CREATE TABLE IF NOT EXISTS network_geofeeds(network inet, url text);
+				CREATE UNIQUE INDEX IF NOT EXISTS network_geofeeds_unique
+					ON network_geofeeds(network);
+
 				-- overrides
 				CREATE TABLE IF NOT EXISTS autnum_overrides(
 					number bigint NOT NULL,
@@ -799,6 +804,16 @@  class CLI(object):
 
 				inetnum[key].append(val)
 
+			# Parse the geofeed attribute
+			elif key == "geofeed":
+				inetnum["geofeed"] = val
+
+			# Parse geofeed when used as a remark
+			elif key == "remark":
+				m = re.match(r"^(?:geofeed|Geofeed)\s+(https://.*)", val)
+				if m:
+					inetnum["geofeed"] = m.group(1)
+
 		# Skip empty objects
 		if not inetnum or not "country" in inetnum:
 			return
@@ -810,7 +825,6 @@  class CLI(object):
 		# them into the database, if _check_parsed_network() succeeded
 		for single_network in inetnum.get("inet6num") or inetnum.get("inetnum"):
 			if self._check_parsed_network(single_network):
-
 				# Skip objects with unknown country codes if they are valid to avoid log spam...
 				if validcountries and invalidcountries:
 					log.warning("Skipping network with bogus countr(y|ies) %s (original countries: %s): %s" % \
@@ -823,6 +837,30 @@  class CLI(object):
 					"%s" % single_network, inetnum.get("country")[0], inetnum.get("country"), source_key,
 				)
 
+				# Update any geofeed information
+				geofeed = inetnum.get("geofeed", None)
+
+				# Store/update any geofeeds
+				if geofeed:
+					self.db.execute("""
+						INSERT INTO
+							network_geofeeds(
+								network,
+								url
+							)
+						VALUES(
+							%s, %s
+						)
+						ON CONFLICT (network) DO
+							UPDATE SET url = excluded.url""",
+						"%s" % single_network, geofeed,
+					)
+
+				# Delete any previous geofeeds
+				else:
+					self.db.execute("DELETE FROM network_geofeeds WHERE network = %s",
+						"%s" % single_network)
+
 	def _parse_org_block(self, block, source_key):
 		org = {}
 		for line in block: