An appeal for more bandwidth to the Internet Archive
nuclearcat at nuclearcat.com
Wed May 13 08:43:20 UTC 2020
On 2020-05-13 11:00, Mark Delany wrote:
> On 13May20, Denys Fedoryshchenko allegedly wrote:
>> What about introducing some cache offloading, like CDN doing? (Google,
>> Facebook, Netflix, Akamai, etc)
>> Maybe some opensource communities can help as well
> Surely someone has already thought thru the idea of a community CDN?
> Perhaps along the lines of pool.ntp.org? What became of that
> Maybe a TOR network could be repurposed to cover the same ground.
I believe Tor is not efficient at all for this purpose. Privacy has
very high overhead.
Several schemes exist:
1) The ISP announces, in some way, the subnets it wants served from its
cache.
1.A) The Apple cache way: a plain HTTP(S) request is enough to point a
specific IP at the ISP cache. Not secure at all.
1.B) BGP + DNS, the most common way. The ISP peers with the CDN, and the
CDN returns the IPs of the ISP's cache nodes in response to DNS requests.
This means that, for example, content.archive.org will have the local
node's A/AAAA records (btw, where is IPv6 for archive?) for
customers of the ISP hosting this node, or for anybody who is peering
with it.
Huge drawback: archive.org would need to provide TLS certificates for
web.archive.org to each local node; this is bad and probably a no-go.
Yes, I know some schemes exist where the certificate is not present on
the local node and some "precalculated" result is used instead, but that
is too complex.
1.C) BGP + HTTP redirect. If the ISP has peering with archive.org, users
in all the announced subnets will get a 302 or some other HTTP redirect.
The next one is almost the same and much better, but it will require
small modifications to the content engine or the frontend balancers.
1.D) BGP + HTTP rewrite. If ISP <*same as before*> URL is rewritten
will appear as
In the second option the ISP can handle the SSL certificate by itself.
2) BGP announcement of archive.org subnets locally. Prone to leaks,
requires TLS certificates, etc.; no-go.
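
To make the 1.C decision concrete, here is a minimal sketch, assuming
the frontend already holds a table of prefixes learned over the BGP
session, each mapped to an ISP cache hostname. The table contents, the
hostnames and the names pick_node and redirect_for are all made up for
illustration; this is not how archive.org or any particular CDN does it.

```python
import ipaddress

# Hypothetical table: prefixes announced by ISPs over BGP, mapped to the
# cache node that should serve them. All values are illustrative only.
ANNOUNCED = {
    ipaddress.ip_network("198.51.100.0/24"): "cache.isp-a.example",
    ipaddress.ip_network("198.51.100.128/25"): "cache2.isp-a.example",
    ipaddress.ip_network("203.0.113.0/24"): "cache.isp-b.example",
    ipaddress.ip_network("2001:db8:a::/48"): "cache.isp-a.example",
}


def pick_node(client_ip):
    """Return the cache host for the longest matching prefix, or None."""
    addr = ipaddress.ip_address(client_ip)
    best = None
    for net, node in ANNOUNCED.items():
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, node)
    return best[1] if best else None


def redirect_for(client_ip, path):
    """Return (302, location) for clients behind a cache, else None."""
    node = pick_node(client_ip)
    if node is None:
        return None  # no local node announced: serve from the origin
    return (302, "https://" + node + path)
```

A client in 198.51.100.200 matches both the /24 and the /25 and is sent
to the node of the more specific /25; a client outside every announced
prefix gets None and is served as before.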
You can still modify some of these schemes and come up with other
options that no one has implemented yet.
because of way they work),
and, for example, the website generates content links dynamically; for
that, the client requests a /config.json file
(which is dynamically generated and cached for a while), so to IPs that
have a local node we give the URL of the local node, and for the rest -
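
A rough sketch of that /config.json idea, assuming the generator knows
which prefixes have a local node. The key name content_base, the table
LOCAL_NODES and the fallback DEFAULT_BASE are placeholders invented for
this example, not the real format.

```python
import ipaddress
import json

# Hypothetical mapping of client prefixes to the base URL of a local node.
LOCAL_NODES = {
    ipaddress.ip_network("198.51.100.0/24"): "https://cache.isp-a.example",
}
# Placeholder base URL for clients without a local node (an assumption).
DEFAULT_BASE = "https://origin.example"


def config_for(client_ip):
    """Build the JSON body of /config.json for one client address."""
    addr = ipaddress.ip_address(client_ip)
    base = DEFAULT_BASE
    for net, url in LOCAL_NODES.items():
        if addr.version == net.version and addr in net:
            base = url
            break
    return json.dumps({"content_base": base})
```

Since the file is cached for a while, any cache in front of it would
have to be keyed per client prefix, otherwise one ISP's node URL could
be handed to somebody else's customers.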