Over the last few weeks I have noticed slow loading times on websites like reddit and imgur, especially in the evening.
While I have no issues streaming HD videos from some sites, e.g. YouTube, some reddit images and gifs take multiple seconds to load.
After ruling out some other potential causes, I came to the conclusion that this must be connected to my ISP’s peering policies and situation.
I have found multiple posts by others who apparently have the same problems. My goal is to run some measurements, present my findings and discuss them with
the online community. This post explains the context, my assessment of the situation and the measurement setup in detail. The next
post will then go into the results and what conclusions I think I can draw from them.
Short disclaimer first: I am not an expert on how the backbone infrastructure of ISPs and all the carriers that make up the internet works.
I also don’t know what peering business deals look like. If I write something that is wrong, please let me know in the comments. My perspective in these
posts is the perspective of a consumer who noticed something wrong with the product and wants to do something about it. But first I need to learn what’s going on.
I live in Germany and my ISP is 1&1. They use Deutsche Telekom DSL infrastructure for part of the way. In this post I’m not going to blame anyone specifically,
because I don’t know where the problem actually originates. I started noticing these issues a few weeks ago, and they seem to have gotten worse since then.
At first I thought it might be the ad blocker, the DNS server, the WiFi setup, or slow servers hosting the websites. But as soon as I switched from
DSL to a mobile network, the same photos and gifs loaded instantly. I also asked some friends to load these files, and they had no issues loading them in
<100 milliseconds. It was also pretty obvious that not all websites were affected. YouTube, Netflix, etc. never showed any signs of slowing down, while
small pictures on reddit took several seconds to load. And it specifically happens in the evening.
Looking at the actual response times in the dev tools of the Chrome browser, a simple picture of a cute cat
(see above; Original URL), only 422 kB in size and hosted at preview.redd.it,
took 1.99 s to load (time to first byte: 0.5 s). That works out to roughly 1.7 Mbit/s. At the same time a speed test showed that I should be totally fine, because I get the 50 Mbit/s I am paying for. This is another
sign that the problem only affects certain sites and doesn’t seem to be in my local network or with the DSL access in general.
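The same numbers can also be collected outside the browser. As a minimal sketch, assuming curl is installed, the following function prints the time to first byte, the total time and the average download speed for a single URL. The function name time_download is made up for illustration, and the URL in the usage note is a placeholder.

```bash
#!/usr/bin/env bash

# Print time to first byte, total time (both in seconds) and the average
# download speed (bytes per second) for one URL.
# time_download is a hypothetical helper name, not part of any tool.
time_download() {
    local url="$1"
    # -s            no progress bar
    # -o /dev/null  discard the downloaded body
    # -w            print selected transfer statistics after the transfer
    curl -s -o /dev/null \
        -w 'ttfb=%{time_starttransfer}s total=%{time_total}s speed=%{speed_download}B/s\n' \
        "$url"
}
```

Calling it as time_download "https://example.com/some-image.jpg" once over DSL and once over the mobile network gives a quick side-by-side comparison.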
While researching, I found some posts mentioning the word peering. Wikipedia says peering “is a voluntary interconnection
of administratively separate Internet networks for the purpose of exchanging traffic between the users of each network”. This
post by the hosting provider Hetzner explains some issues they’ve had with Deutsche Telekom when it comes to peering.
I am not sure whether insufficient interconnections between different networks are causing the problems I am seeing, but it certainly looks like a possible explanation.
For me this meant taking a closer look. I set up a simple measurement scenario that can be replicated without much voodoo. The only requirements are bash and two
different servers (one at home, one somewhere else). The measurement results go into simple log files and will then be evaluated later on. The following diagram shows
what I came up with:
My server hosted at Hetzner (Germany) and my home server both download two sample files every 10 minutes and log the timestamp and download speed. The Hetzner server serves
as a reference, to rule out slow origin servers as the cause.
I am sure this setup, due to its simplicity, will have some flaws. What happens if some server in between starts caching one of these files because it is requested so often?
What happens if one of the files is removed from the server during my measurement phase? I’ll see how it goes, maybe I’ll have to come up with something more sophisticated.
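A minimal sketch of what such a measurement script can look like, assuming bash and curl are available. The function name measure, the LOGFILE variable and its default path are placeholders for illustration, and so are any URLs passed to it. Each run prints one line per file with the source label, a UTC timestamp in ISO 8601 format and the average download speed in bytes per second as reported by curl.

```bash
#!/usr/bin/env bash
# Sketch of a download-speed logger. All names and paths are placeholders.

# Hypothetical default log location; override it via the environment.
LOGFILE="${LOGFILE:-$HOME/download-speed.log}"

# Download one file, discard the body, and print one log line:
#   <label> <ISO 8601 timestamp> <bytes per second>
measure() {
    local label="$1" url="$2"
    local timestamp speed
    timestamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    # The key line: curl fetches the file and reports the average speed.
    speed=$(curl -s -o /dev/null -w '%{speed_download}' "$url")
    echo "$label $timestamp $speed"
}

# Arguments come in pairs: <label> <url> [<label> <url> ...]
while [ "$#" -ge 2 ]; do
    measure "$1" "$2" >> "$LOGFILE"
    shift 2
done
```

Saved under a hypothetical path such as /home/user/measure.sh, it could be scheduled with a crontab entry like the following, where both URLs are placeholders:

*/10 * * * * /home/user/measure.sh reddit https://example.com/reddit-sample.jpg ard https://example.com/ard-sample.jpg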
This script runs on both servers. The most important line is highlighted. It downloads the file using curl. The files were picked at random from the two websites. I never have issues with any
ARD content, so that file serves as another reference. The output is space-separated and includes the source (reddit or ard), a timestamp in ISO 8601 format and the download
speed in bytes per second. What follows is an excerpt of what I’ve measured so far on the home server:
I’m not going to interpret these results yet. The next post will have detailed results from a longer time interval, and I want to start a discussion about them. So if you happen to read this,
let me know what you think. Do you have similar problems? Do you know why?