Problems with my Internet Connection - Part 1
Over the last few weeks I have noticed slow loading times with websites like reddit and imgur, especially in the evening. While I have no issues streaming HD videos from some sites, e.g. youtube, some reddit images and gifs take multiple seconds to load. After excluding some other potential problem sources I came to the conclusion that this must be connected to the ISP's peering policies and situation. I have found multiple posts by others who apparently have the same problems. My goal is to run some measurements, present my findings and discuss them with the online community. This post explains the context, my assessment of the situation and the measurement setup in detail. The next post will then go into the results and what conclusions I think I can draw from them.
Short disclaimer first: I am not an expert on how the backbone infrastructure of ISPs and all the carriers that make up the internet works. I also don’t know what peering business deals look like. If I am writing something that is wrong, please let me know in the comments. My perspective in these posts is that of a consumer who noticed something wrong with the product and wants to do something about it. But first I need to learn what’s going on.
Context
I live in Germany and my ISP is 1&1. They use Deutsche Telekom DSL infrastructure for part of the way. In this post I’m not going to blame anyone specifically, because I don’t know where the problem actually originates. I started noticing these issues a few weeks ago and they have seemingly gotten worse since then. At first I thought it might be the ad blocker, the DNS server, the WiFi setup or slow servers where the websites are hosted. But as soon as I switched from DSL to the mobile network, the same photos and gifs loaded instantly. I also asked some friends to load these files and they had no issues loading them in <100 milliseconds. It was also pretty obvious that not all websites were affected. Youtube, netflix, etc. never showed any signs of slowing down, while small pictures on reddit took several seconds to load. And it specifically happens in the evening.
Looking at actual response times in the dev tools of the Chrome browser, a simple picture of a cute cat (see above) that was only 422 kB and hosted at preview.redd.it took 1.99 s to load (time to first byte: 0.5 s). At the same time a speed test showed me that I should be totally fine, because I get the 50 Mbit/s I am paying for. That is another sign that this only affects certain sites and doesn’t seem to be a problem in my local network or with the DSL access in general.
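To put those two numbers side by side, here is a small back-of-the-envelope sketch in bash. The helper names effective_mbit and ideal_ms are made up for this illustration, and it assumes that 1 kB means 1000 bytes and that the whole 1.99 s counts as transfer time.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not part of the measurement script.
# Assumption: 1 kB = 1000 bytes, 1 Mbit = 1,000,000 bits.

# effective_mbit <size_in_kB> <seconds>: observed throughput in Mbit/s
effective_mbit () {
    LC_ALL=C awk -v kb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", kb * 8 / 1000 / s }'
}

# ideal_ms <size_in_kB> <line_speed_in_Mbit/s>: transfer time in ms at full line speed
ideal_ms () {
    LC_ALL=C awk -v kb="$1" -v mbit="$2" 'BEGIN { printf "%.0f\n", kb * 8 / mbit }'
}

effective_mbit 422 1.99   # prints 1.70
ideal_ms 422 50           # prints 68
```

So the cat picture arrived at roughly 1.7 Mbit/s on a 50 Mbit/s line, while the pure transfer at full line speed would take about 68 ms (ignoring latency and connection setup).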
While researching I found some posts mentioning the word peering. Wikipedia says peering “is a voluntary interconnection of administratively separate Internet networks for the purpose of exchanging traffic between the users of each network”. A post by the hosting provider Hetzner explains some issues they’ve had with Deutsche Telekom when it comes to peering.
I am not sure if insufficient interconnections between different networks are causing the problems I am seeing, but it sure looks like a possible explanation.
Measurement
For me this meant taking a closer look. I set up a simple measurement scenario that can be replicated without much voodoo. The only requirements are bash and two different servers (one at home, one somewhere else). The measurement results go into simple log files and will be evaluated later on. The following diagram shows what I came up with:
My server hosted at Hetzner (Germany) and my home server both download two sample files every 10 minutes and log the timestamp and download speed. The Hetzner server serves as a reference to rule out that the speed of the hosting servers is the problem.
I am sure this setup, due to its simplicity, will have some flaws. What happens if some server in between starts caching one of these files because they are requested so often? What happens if one of the files is removed from the server during my measurement phase? I’ll see how it goes, maybe I’ll have to come up with something more sophisticated.
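One of these flaws, a sample file disappearing or being replaced, could be caught by also looking at the HTTP status code and the number of bytes received. The following is only a sketch: check_sample is a hypothetical helper that is not part of the measurement script, and the expected size has to be determined once by hand. It does not detect caching.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether one download is a usable sample.
# $1 = HTTP status code, $2 = bytes downloaded, $3 = expected size in bytes
check_sample () {
    local http_code=$1 size=$2 expected=$3
    # Drop a decimal part in case the size is printed as a fraction
    size=${size%%[.,]*}
    if [ "$http_code" != "200" ]; then
        echo "http_${http_code}"      # e.g. file removed (404)
    elif [ "$size" -ne "$expected" ]; then
        echo "size_mismatch"          # file was replaced or truncated
    else
        echo "ok"
    fi
}

# Possible usage with curl's write-out variables http_code and size_download
# (the URL and the expected size of 422000 bytes are placeholders):
#   read -r code size < <(curl -s -o /dev/null -w '%{http_code} %{size_download}' "$url")
#   check_sample "$code" "$size" 422000
```

Samples that are not "ok" could then be skipped or marked in the log instead of being counted as slow downloads.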
Code
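#!/usr/bin/env bash
# Log file that collects all samples
logfile='ispspeed.log'

# download_sample <label> <url>: prints "<label> <timestamp> <speed in bytes/s>"
download_sample () {
    local datetime result speed
    datetime=$(date --iso-8601=seconds)
    # Discard the body (-o /dev/null) and print only the average download speed
    result=$(curl "$2" -o /dev/null -w '%{speed_download}' -s)
    # Keep only the part before a decimal comma, if curl prints one
    speed="${result%%,*}"
    echo "$1 ${datetime} ${speed}"
}

# Take one sample per source every 10 minutes, forever
while :
do
    download_sample 'reddit' 'https://external-preview.redd.it/mp4/Y0CsmhfSFvu_HqysQA1Ed4oD73OtpsYorCIFJrSgvYw-source.mp4?s=c24d071636a35de8bfad866c018bed59672234ae' >> "$logfile"
    download_sample 'ard' 'https://dasersteuni-vh.akamaihd.net/i/de/2019/03/22/ee56c487-3f1c-489f-bb13-2c51fd3872ec/,320-1_379735,512-1_379735,960-1_379735,480-1_379735,640-1_379735,1280-1_379735,.mp4.csmil/segment3_5_av.ts?null=0' >> "$logfile"
    sleep 10m
done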
This script runs on both servers. The most important line is the curl call, which downloads the file and reports the download speed. The files were picked arbitrarily from the two websites. I never have issues with any ard content, so that one serves as another reference. The output is space separated and includes the source (reddit or ard), a timestamp in ISO 8601 format and the download speed in bytes per second. What follows is an excerpt of what I’ve measured so far on the home server:
~$ head -n 20 ispspeed.log
reddit 2019-03-14T22:15:14+01:00 434285
ard 2019-03-14T22:15:19+01:00 2919081
reddit 2019-03-14T22:25:21+01:00 63749
ard 2019-03-14T22:25:54+01:00 3059830
reddit 2019-03-14T22:35:56+01:00 74161
ard 2019-03-14T22:36:24+01:00 2644445
reddit 2019-03-14T22:46:26+01:00 2505074
ard 2019-03-14T22:46:27+01:00 2965251
reddit 2019-03-14T22:56:28+01:00 76799
ard 2019-03-14T22:56:56+01:00 2655826
reddit 2019-03-14T23:06:58+01:00 96715
ard 2019-03-14T23:07:20+01:00 3980217
reddit 2019-03-14T23:17:21+01:00 147954
ard 2019-03-14T23:17:36+01:00 2879907
reddit 2019-03-14T23:27:37+01:00 247349
ard 2019-03-14T23:27:46+01:00 3177631
reddit 2019-03-14T23:37:47+01:00 951750
ard 2019-03-14T23:37:49+01:00 3663645
reddit 2019-03-14T23:47:50+01:00 2672566
ard 2019-03-14T23:47:51+01:00 3207144
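Since bytes per second are hard to compare with the advertised line speed at a glance, a log in this format could be condensed per source with a few lines of awk. This is only a sketch: summarize_log is a hypothetical helper, and it assumes that every line has exactly the three space-separated fields described above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: per-source sample count, mean and minimum speed in Mbit/s.
# Expected input lines: "<source> <iso-timestamp> <bytes-per-second>"
summarize_log () {
    LC_ALL=C awk '
        {
            v = $3 + 0                       # force numeric comparison
            n[$1]++
            sum[$1] += v
            if (!($1 in min) || v < min[$1]) min[$1] = v
        }
        END {
            for (s in n) {
                # bytes/s * 8 / 1e6 = Mbit/s
                printf "%s samples=%d mean=%.2f Mbit/s min=%.2f Mbit/s\n",
                       s, n[s], sum[s] / n[s] * 8 / 1000000, min[s] * 8 / 1000000
            }
        }
    ' "$1" | LC_ALL=C sort
}

# Usage: summarize_log ispspeed.log
```

Running it on the log file prints one line per source, which makes it easier to put the samples next to the 50 Mbit/s of the DSL line.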
I’m not going to interpret these results yet. The next post will have detailed results over a longer time interval, and I want to start a discussion about them. So if you happen to read this, let me know what you think. Do you have similar problems? Do you know why?