Plenary, Tuesday 25th of October, 2016

At 10 a.m.:

SPEAKER: Please take your seats, we are about to start.

RICHARD HESSLER: Please move in towards the centre of each row.

SPEAKER: Welcome back. We are about to start, and I just want to remind you of the etiquette, or protocol, here in this hall. We are going to have several sessions, and at the end of each session there is going to be a Q&A, so please refrain from asking any questions until that Q&A portion. During the Q&A, in order to ask a question you need to press the mic button to speak, but before you do that, you need to raise your hand and have one of us point to you to ask that question, in order to organise the Q&A. With that, we'd like to start with Ricardo Schmidt from the University of Twente.

RICARDO SCHMIDT: May I start? Good morning everybody. So today I brought two presentations for you, and actually from the University of Twente we have three presentations: Wouter will speak just after me and I come back for a third presentation. I am sorry if you get tired of me already during the first one.

All the presentations are on the topic of Anycast, Anycast deployment and analysis. The first one is a study that we have done on the relationship between service latency and Anycast deployment, and before you start throwing tomatoes at me: I know that latency is not the only reason for doing Anycast, but this talk is about latency.

This is work that we have recently concluded; there is still some future work to do but this part is finished. We had a collaboration with some people from the University of Southern California. You can access all the details in a technical report that we have published: if you go to the slides on the RIPE meeting website, you can find the link there, and you can download the tech report. We are trying to get it published now.

So, I would like to start by giving a very brief overview of Anycast and why we do Anycast, just to level the room in case any of you is not familiar with the terms yet. Basically, operators distribute their services to increase performance, resilience and redundancy, and of course resilience against denial of service attacks, which today is a big thing. That is the third talk. And basically, this goes against everything we learn in our bachelor studies on networks, for example that IP addresses cannot be shared between machines in the Internet. Here it is completely the opposite: the same logical address will be used to configure two or more machines across the Internet. We trust BGP to map users to the correct sites of our deployment, so it is one of the means to distribute a service in the Internet. There are other ways, like DNS redirection, or even a combination of the two. Today the focus is on Anycast. If you have one single location for your service, users that are far away from your service, topologically or geographically (there is some relationship there, but that is for other studies), will see RTTs much larger than the ones closer to your service. So you distribute your service so that everybody is happy, everybody is accessing your service with the lowest RTT possible, and you trust BGP to create the catchment areas of your service. So BGP will direct users to the closest Anycast site of your deployment. But what we know is that BGP only approximates closeness, and that was the problem that we were looking at here.
So if BGP only approximates the closest, we were interested in knowing how good this approximation is, what an optimal RTT would be if we did not consider Internet routing to be the one that maps our Anycast catchment, and whether at the end of this study we could come up with a magical number and say: if you have this number of geographically distributed sites in your Anycast deployment, you will have good performance.

To do that we studied four letters of the root DNS system, namely the C, F, K and L letters. These are very different in size of deployment, in routing policies and in the location of their Anycast sites, and this variety is why they made a nice object of study. And one thing that is important here: it was not our goal to point fingers at one letter and say you are doing better than the other letters. We are just comparing and analysing latency; we are not here to judge whether one letter does better than another.

To do this analysis, we used the RIPE Atlas framework, so around 7900 probes geographically distributed as you can see on this map. I hope you can see it, but we have a strong bias towards Europe since, as many of you probably know, most of the RIPE Atlas probes are located in Europe. We do not compensate for this bias in this study; that is future work. Here we were using as many RIPE Atlas probes as possible to have an overview of latency, and I will show how this biases our results: it does not change our qualitative conclusions but it does change our quantitative results. The measurements were done in two steps. From each RIPE Atlas probe we sent a CHAOS query to the Anycast service, and one of the servers of that letter replies to us with a string that uniquely identifies the server that is responding. So in this case we sent a CHAOS query and the location in Sydney is the one that replies to us, from server number one, so then we know the site; the site here is Sydney. After we do this catchment mapping for all the 7900 probes, we use ping measurements to all the servers of the Anycast service to determine whether the catchment is optimal or not: is the catchment RTT equal to the lowest RTT that we observe? One big caveat is that we are comparing Anycast routing to Unicast routing. We are aware of that, but we were interested to know what the optimal would be, no matter whether it is Anycast or Unicast. So the first thing that I can tell you is the good news: all the letters do pretty well, so the RTT for the four services we analysed, C, F, K and L, is 32 milliseconds or lower. We consider that to be pretty good. I don't know, I don't operate a root server, but I think that is a very low RTT.
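
The two-step comparison described above can be sketched in a few lines of Python. This is only an illustration under assumed data: the probe records, site codes and RTT values are hypothetical and do not come from the study. Each probe carries the site named in its CHAOS reply plus the ping RTT to every site, and the sketch compares the catchment RTT with the lowest RTT observed.

```python
from statistics import median

# Hypothetical per-probe measurements (illustrative values, not study data).
# "catchment" is the site named in the CHAOS reply; "rtt_ms" maps every site
# of the letter to the ping RTT measured from that probe.
probes = [
    {"id": 1, "catchment": "SYD", "rtt_ms": {"SYD": 12.0, "AMS": 280.0, "LAX": 150.0}},
    {"id": 2, "catchment": "LAX", "rtt_ms": {"SYD": 300.0, "AMS": 9.0, "LAX": 140.0}},
    {"id": 3, "catchment": "AMS", "rtt_ms": {"SYD": 290.0, "AMS": 15.0, "LAX": 145.0}},
]


def catchment_report(probe):
    """Compare the RTT to the BGP-selected site with the lowest RTT observed."""
    actual = probe["rtt_ms"][probe["catchment"]]
    best_site = min(probe["rtt_ms"], key=probe["rtt_ms"].get)
    optimal = probe["rtt_ms"][best_site]
    return {
        "id": probe["id"],
        "actual_ms": actual,
        "optimal_ms": optimal,
        "best_site": best_site,
        "is_optimal": actual <= optimal,
        "penalty_ms": actual - optimal,
    }


reports = [catchment_report(p) for p in probes]
median_actual = median(r["actual_ms"] for r in reports)
median_optimal = median(r["optimal_ms"] for r in reports)
```

In this toy data, probe 2 is mapped to LAX although AMS would be much closer, which is the kind of gap between the actual and the optimal distribution that the talk goes on to discuss.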

But we were interested to know how much better it could be if we were able to bypass BGP routing. What we see here for C‑Root is that there are actually two RTT distribution lines, and they basically overlap each other, which means C‑Root is already working at its best performance, while L Root, which has 144 sites, has a lot of room for improvement. Again, it's not that L Root is doing badly, but it could be much better if they were able to, I don't know, polish the routing policies, open up the sites where they are located, or increase the connectivity to the sites. Later I will show the comparison between these two letters in more detail.

So C‑Root has only eight sites and cannot do better than that. L Root could do much better with its 144 sites; that is the conclusion.

The location of the Anycast sites matters a lot. Here is a very simple example where we start with one instance only, in LAX, so this basically simulates the B Root catchment, which is a single site located in Los Angeles. Here we see the bias: the distance between the C‑Root optimal RTT, which is the best it could do, and the other lines is basically due to the RIPE probes being located in Europe while all the sites combined here are located in the US. So a single location or a single region cannot provide a good overall RTT, and we have to go for more geographical distribution. Here we show a combination of sites from Europe and from the US, also drawn from the deployment of C‑Root, and we see that with only four sites we approximate the distribution for C‑Root really well. So location matters, but we still have a long distribution tail there, so what is going on? Do many sites actually help in the distribution tail?
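
The site-subset simulation described above can be illustrated with a small sketch. It assumes hypothetical unicast RTTs from a handful of vantage points to a few sites (LAX is named in the talk; the other site codes and all values are invented for illustration) and computes, for each candidate subset of sites, the median over probes of the lowest RTT to any site in that subset.

```python
from statistics import median

# Hypothetical unicast RTTs in milliseconds (illustrative only).
rtt_ms = {
    "probe-eu-1": {"LAX": 150.0, "JFK": 90.0, "FRA": 10.0, "MAD": 25.0},
    "probe-eu-2": {"LAX": 160.0, "JFK": 95.0, "FRA": 20.0, "MAD": 8.0},
    "probe-us-1": {"LAX": 12.0, "JFK": 70.0, "FRA": 150.0, "MAD": 140.0},
    "probe-us-2": {"LAX": 65.0, "JFK": 9.0, "FRA": 95.0, "MAD": 100.0},
}


def optimal_median_rtt(sites):
    """Median over probes of the lowest RTT to any site in `sites`."""
    best = [min(per_site[s] for s in sites) for per_site in rtt_ms.values()]
    return median(best)


single_site = optimal_median_rtt(["LAX"])                    # one site only
us_only = optimal_median_rtt(["LAX", "JFK"])                 # one region
us_and_eu = optimal_median_rtt(["LAX", "JFK", "FRA", "MAD"])  # two regions
```

With these made-up numbers, adding a second site in the same region helps the nearby probes only, while spreading four sites over two regions brings the median down for everyone, which mirrors the qualitative point made in the talk.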

So here I have the distribution of RTT per country, for all the countries where we were able to get RIPE Atlas measurements. Each dot represents a median, and the medians go from 300 milliseconds all the way down to very few milliseconds. The green ones are mostly European. What we see is that the countries on the left side of the plot are the tail of the RTT distribution, and they cannot do much better than what they are doing; that is shown by the lower quantile here. If I compare that with L Root, which has many more instances than C‑Root, this is the big difference, because we are talking about the distribution tail: the medians are still quite high for L Root, but there are some happy vantage points, RIPE Atlas probes in the fifth percentile, that somehow see a very low RTT. This means that a larger number of sites does help, but it depends a lot on where the sites are located and what the connectivity to those sites is, that is, whether the probes or users in that region are able to see those sites.
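
A per-country summary of the kind shown on that plot could be computed roughly as follows. This is a minimal sketch with invented samples; the country codes and RTT values are assumptions, and the fifth percentile is taken as the first of the 19 cut points returned by `statistics.quantiles` with `n=20`.

```python
from collections import defaultdict
from statistics import median, quantiles

# Hypothetical (country, RTT in ms) samples, one per probe.
samples = [
    ("NL", 4.0), ("NL", 6.0), ("NL", 9.0), ("NL", 12.0),
    ("BR", 20.0), ("BR", 180.0), ("BR", 210.0), ("BR", 240.0),
]

by_country = defaultdict(list)
for country, rtt in samples:
    by_country[country].append(rtt)

summary = {}
for country, rtts in by_country.items():
    # quantiles(..., n=20) returns 19 cut points; index 0 is the 5th percentile.
    p5 = quantiles(rtts, n=20, method="inclusive")[0]
    summary[country] = {"p5_ms": p5, "median_ms": median(rtts)}

# Countries ordered from highest to lowest median, like the plot's x-axis.
ranked = sorted(summary, key=lambda c: summary[c]["median_ms"], reverse=True)
```

In the invented data, the second country has a high median but a much lower fifth percentile, which corresponds to the few "happy" vantage points that reach a nearby site.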

So, already going to final considerations. The study is much larger than what I was able to present here, so I invite you to go to the technical report if you are interested in this subject. But basically, our conclusion is that, when we take performance into consideration, location matters more than the number of sites for an Anycast deployment. Even more, it is the combination of location and connectivity of the sites, and the connectivity may matter even more than the location. If we have to come up with a magical number (this is not really arbitrary), we believe that 12 well distributed and well connected sites for an Anycast deployment can provide a good RTT, but this observation considers the Internet as seen by the RIPE Atlas framework. For future work following the lines of this research, we plan to address other purposes of Anycast deployment, such as denial of service resilience and load balancing, and I do have a presentation on denial of service resilience which I will give in 20 minutes from now, depending on how long Wouter takes for his presentation. So that is it. I would like to thank you for the opportunity to present this piece of research here, thank you for your attention, and I will be glad to answer any questions that you have.


SPEAKER: Do we have any questions?

SPEAKER: Hi, from Google. I have a question: how easy would it be to reproduce these experiments with other root servers or any other infrastructure, DNS, websites, whatever, just to compare how distributed they are and what the latency is? I don't know much about RIPE Atlas, and I don't know how easy it is to select the probes, because depending on the probes you select the experiment is probably not the same. So how easy would it be for other people to reproduce this, for example?

RICARDO SCHMIDT: Okay. Thank you for the question. So I will do some propaganda for RIPE Atlas here. It's very easy to access and start doing your measurements. You can reuse the same set of probes from a previous measurement in another one, and all this information you can find on the RIPE Atlas web page. I do recommend doing that. There is the problem of bias in RIPE Atlas because many of the probes are located in Europe; you can try to compensate for that bias. The measurement itself is very simple, it's CHAOS queries and pings. The only thing is that CHAOS queries require the Anycast service to reply with this identifier, so if you query for the TXT record of hostname.bind, for example, it has to return something that identifies who is replying to you, so that you can really determine the catchment. But we have other ideas on how to measure the catchment as well, not specifically latency, and that is what Wouter will show you next.
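
For readers unfamiliar with CHAOS queries, the sketch below builds the wire-format DNS question for a TXT record in class CHAOS, using the conventional name `hostname.bind`. It only constructs the packet and sends nothing; the function name and the query ID are arbitrary choices made for this illustration, and a real measurement would more likely rely on an existing DNS library or on the RIPE Atlas built-in measurements.

```python
import struct

QTYPE_TXT = 16  # DNS record type TXT
QCLASS_CH = 3   # DNS class CHAOS


def build_chaos_txt_query(name="hostname.bind", query_id=0x1234):
    """Return the wire-format DNS query for a TXT record in class CHAOS."""
    # Header: ID, flags (0 = standard query, recursion not requested),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero-length label.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPE_TXT, QCLASS_CH)


query = build_chaos_txt_query()
```

The same question can typically be asked from a shell with `dig @k.root-servers.net hostname.bind CH TXT`; the string in the answer identifies the responding server and therefore the site.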

CHAIR: Any other questions? Just to explain myself, I am really cold down here, I have a higher operating temperature so that is why I am wearing the cap. Any questions?

RICHARD HESSLER: Thank you very much.

RICHARD HESSLER: And next we have Wouter de Vries, who is from the University of Twente, and he will be talking about the impact of ‑‑

WOUTER DE VRIES: Routing on Anycast. So, how to measure a controllable Anycast network. I am a PhD student at the University of Twente, currently in my second year, so the rest of the years I will hopefully spend working on Anycast. Let's do a quick recap, in case you were still asleep when my colleague was explaining it. If you have a non‑Anycast service, you have one server which is, for example, in Madrid, Spain, which announces a /24 prefix and is connected to the Internet. This can be anything, it can be a web server, and that would be called a non‑Anycasted service. Now, if you perform some magic and you place another server in Amsterdam in the Netherlands and you announce the same prefix there, then you have an Anycasted service; not necessarily a good Anycasted service, but you will have an Anycasted service.

And the point is that the routing policies, which can be literally anything (local preferences at ISPs, management decisions, the BGP daemon, it can be just about anything), decide which client reaches which server within an Anycasted service, and this is what we call the catchment. So what is the problem with that? These routing policies are very diverse, and as an operator you are not able to control the end‑to‑end paths, so what you end up with is a fairly chaotic catchment. It can be completely sub‑optimal, in the sense that you reach an instance in Sydney when you are located in Amsterdam and there is also a server located in Amsterdam. If you go back to the explanation of Anycast, it might be that all the clients are reaching the Madrid instance and not the one in Amsterdam. So what do we want to do? We want to measure this. There are two initial problems. Existing Anycast services are interesting but static, so we can't change the configuration; the root server operators don't really respond when we ask them to shut down so we can run our experiments. And we don't really have a good way to determine the catchment. Another important point of having an Anycast service that we are able to control ourselves is that we can look from the inside of the service, which allows for far more detailed study. So let's look at this first problem. We want an Anycast service that we can control, and there are two options. Either you can use the PEERING testbed, which is developed by some people in the US. They have a number of sites that you can use; you can apply to them with a project proposal and get access to their testbed for some duration. Or, which I think is more fun, you can set up your own network. When I was at RIPE 69 there was a presentation by Nat Morris called Anycast on a shoestring, and he showed it was very easy to set up your own Anycast service.
So I ended up doing a year of work to end up with an Anycast service; we ended up with Bucharest, and just last month the testbed has become somewhat operational. So how does it work? We try to convince people to host a node, site or instance (those terms are somewhat interchangeable), and in our case we ask for a VM, which we hope they provide for free, and then we set up the BGP session. Setting up a BGP session can also be intimidating sometimes, and even when your BGP session is running that might not mean that it actually does anything. So this is what our Anycast testbed looks like in theory. We have a researcher, which is me, on the lower right, and I control the Anycast testbed, which is nodes 1 through N, via the master node, so I have one point of contact. From there I control the Anycast network via an API, so I am able to programmatically change the configuration of the Anycast testbed.

So far, so good. The green lines are the BGP sessions or the traffic that flows over the ‑‑ our own prefixes towards our own prefixes.

So what we have so far: we have ten Anycast sites distributed across the world, in the US, Australia, France, Japan, Brazil and two in the Netherlands, so we too have that bias towards Europe. And this took a significant amount of time to set up. Depending on the person who is helping us, it can range anywhere from six months to get one node running to a little under 24 hours if you have someone who is good with that kind of stuff.

So now that we have our Anycast service, we can go on to try to determine the catchment of our service. We have various options. My colleague was talking about RIPE Atlas, and there are other things like PlanetLab and the NLNOG RING and other infrastructure that you can use, but since we have this testbed for ourselves we can also try to measure it from the inside. How do we do that? We send a ping to the Internet, something happens, and then we know the catchment. What happens when you ping someone? If they reply to ICMP echo requests, they send a reply back to the address that sent the request. So if we use our Anycast address to send an ICMP packet, the client who receives it will send the ICMP reply back to that address, and that reply will not necessarily end up at the Anycast instance that sent the request; in fact that is not guaranteed at all. Here on the left you see a set‑up with one instance, a non‑Anycast service, so the ICMP request goes to the user and the reply goes back to the same instance. On the right you see an Anycast service with three sites, and you see that the reply is diverted to Anycast site 3. This allows for an interesting measurement, because we can set up collectors at all of our Anycast sites, all ten (in this case we use seven), that capture ICMP echo replies. We run a pinger on one of the Anycast sites and start pinging everyone. Since that is a bit much, we ping one IP for every /24 prefix. There is a list provided by USC ISI, the Information Sciences Institute in California: they periodically run a census of the Internet where they scan the entire address space, and based on the past 16 censuses they select, for every /24, the one IP address that is most likely to respond. If you use this list you only have to ping one IP address in every /24.
For example, in the example here you have 172.16.16.0/23, and in this case two IP addresses, one for each /24, could be in the list.
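
The "one target per /24" idea can be illustrated with Python's standard `ipaddress` module. The sketch below splits a prefix into its /24 blocks and picks one address per block. Picking the first address after the network address is only a placeholder assumption; the hitlist mentioned in the talk instead selects the address judged most likely to respond.

```python
import ipaddress


def one_target_per_slash24(prefix):
    """Return one candidate address for each /24 contained in `prefix`."""
    network = ipaddress.ip_network(prefix)
    if network.version != 4 or network.prefixlen > 24:
        raise ValueError("expected an IPv4 prefix of length /24 or shorter")
    # Placeholder choice: the first address after the network address.
    # A real hitlist would supply the most responsive address instead.
    return [str(block[1]) for block in network.subnets(new_prefix=24)]


targets = one_target_per_slash24("172.16.16.0/23")
```

For the /23 from the example this yields two targets, one in 172.16.16.0/24 and one in 172.16.17.0/24.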

The good thing is that those lists are also available for IPv6. It would be completely infeasible to scan the entire IPv6 space, whereas it has recently become possible for IPv4, so for IPv6 you need this list because otherwise it will just not work. What is the coverage we achieved? We see replies coming from 90% of all ASes, which is slightly more than what Atlas is doing, running at just over 6%, and for 30% of the ASes we see five replies coming back, which allows for an even deeper insight into the catchment.

So this is a picture of the catchment as measured either using the ICMP method or using RIPE Atlas as the testbed, which measures using CHAOS queries; each site runs a PowerDNS daemon that responds to the CHAOS query with its name, and it can send ICMP packets. You can see here the bias towards Europe, in the sense that in the Atlas view almost all of the catchments are made up of probes from Europe. For example, the node here in France sees a lot of packets coming from Asia, while we don't see that in the Atlas infrastructure because they don't have that many probes there. So using this ICMP method we are able to get a bias‑free view of our catchment, which is incredibly useful for people running an Anycast service, because you don't need any clients to do something for you: you can poke them and they will tell you where they will go. I have put two graphs here which both have a logarithmic scale, which makes them fairly misleading, but what you see here is a look inside the ASes. Because we have so many packets coming back from one AS, we can see differences within an AS. So when you say that you have a coverage of 6% of all ASes, that might not mean that you have a complete view of that 6%; some might be divided into separate parts. What you can see here is that approximately 45 ASes see six different Anycast sites from within their AS: they have /24s that are responding, some going to France and others to Sydney, and this shows that you need far more than one probe within an AS to have full coverage.
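
The aggregation step behind such a catchment picture can be sketched as follows. The sketch assumes that each collector logs the source address of every ICMP echo reply it captures, and that a prefix-to-AS table is available. The site names, the addresses (taken from documentation ranges), the AS numbers and the table itself are all hypothetical; this is not the code of the tools mentioned in the talk.

```python
import ipaddress
from collections import defaultdict

# Hypothetical collector output: (site that captured the reply, source address).
replies = [
    ("ams", "192.0.2.10"),
    ("cdg", "198.51.100.7"),
    ("syd", "203.0.113.5"),
    ("ams", "192.0.2.10"),  # duplicate reply from the same target
]

# Hypothetical /24-to-AS mapping (documentation prefixes and AS numbers).
prefix_to_asn = {
    "192.0.2.0/24": 64496,
    "198.51.100.0/24": 64496,
    "203.0.113.0/24": 64497,
}


def slash24_of(address):
    """Return the /24 that contains `address`, as a string."""
    return str(ipaddress.ip_network(f"{address}/24", strict=False))


# Catchment: which site each responding /24 is routed to.
catchment = {}
for site, source in replies:
    catchment[slash24_of(source)] = site

# How many distinct sites are seen from within each AS.
sites_per_asn = defaultdict(set)
for prefix, site in catchment.items():
    sites_per_asn[prefix_to_asn[prefix]].add(site)
```

In this toy example one AS is split across two sites, so a single probe inside it would reveal only half of its catchment.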

The CDF here shows, first off, that if you see more sites from a certain AS then it is usually also an AS that is bigger, which makes sense, somehow. What you also see is that some ASes that have more than ten /24s in them still go to only one site and, on the other hand, you have ASes with fewer than 100 /24 prefixes that go to six different Anycast sites, so they go all across the world.

So this is, I think, an important message to people who are trying to measure things and are relying on the assumption that an AS is a single entity. They are not a single entity; people do whatever they want within an AS, it's just a number. So I will go on to my conclusions, which are that creating your own real‑world testbed for BGP is possible, but you will have to take your time. It's not that it's technically difficult; it's more that you will have to do a lot of waiting for people to respond to your e‑mails, and then not hearing from them, and they go on holidays and you go on holidays, and then in the end after six months you finally figure out there was some prefix filter at their upstream which meant your announcement was not going anywhere. Also, a ping can give you a lot of information in an Anycast environment. It only works if you are within your own service, so it cannot be used to analyse the root servers, but for operators this can be an interesting method to exhaustively determine the catchment across the Internet. And I believe that this also provides fertile ground for Anycast optimisation. For example, we have a node in Brazil which is connected to end path, which has a direct link to AMS‑IX that is kind of transparent, and if we enable that site, all the traffic from Europe is pulled towards Brazil even though we have two sites in the Netherlands. So that shows that if you add a new Anycast site to your service it might cause extreme degradation in performance.

So, the tools that I use: the tool that sends out ICMP messages is on GitHub, and the tool that collects them is also on GitHub. You can run this yourself. The data that these graphs are based on will be made available soon, and if you talk to me after the break I can probably give it to you right away; I have it sitting on my laptop.

So finally, it's time for questions and comments, and I also particularly welcome proposals for collaboration, so if you have any use for an Anycast testbed and you want to collaborate on that, feel free to contact me. Thank you.


CHAIR: Do we have any questions? In the back? There is a hand right there.

SPEAKER: Good morning, I have a question about the difficulty of a correct Anycast set‑up. As far as I understood from you, there is a big problem when running an Anycast set‑up at some particular IXP, and a server located in a country which is far, far from the location of the IXP might be announced to Anycast. My question is: how much time do you usually spend on catching all the things related to the catchment of the Anycast?

WOUTER DE VRIES: You mean how long it takes to run the experiment?

SPEAKER: How much time do people usually spend on the correction of the Anycast catchment?

WOUTER DE VRIES: How much time is spent on that? To be honest, I don't know. I think people in the room probably know, but I only have experience from my own testbed, and for ours it is not that bad that we have a varying catchment; it's interesting to see that it's not very well balanced, so for us it's kind of great. But I can imagine you can spend quite some time on it, because BGP is a very complex ‑‑ the protocol itself is not so complex, but the Internet is, and a lot of things happen, and it can be incredibly hard to troubleshoot why some traffic is ending up in Brazil instead of Amsterdam.

SPEAKER: This is my point about it, thank you.

CHAIR: Any other questions? All right, thank you.

All right, our next presentation is by Ricardo Schmidt, Anycast vs DDoS.

RICARDO SCHMIDT: Hello again. If you don't recognise me it's because I removed my jacket. Okay. Yet another talk about Anycast. But now let's talk about a topic that is growing and growing and catching a lot of attention nowadays, which is DDoS. Yeah, there have been some very recent ones, even some DDoS that caused some impact on this section, actually. The work that I am going to present now has recently been published at the ACM IMC conference, and another colleague will present it there in California during that conference. I am sorry if you are going to see that presentation twice, but I assure you that watching the presentation from him is much more fun.

Again, besides the paper published at IMC, we also have a technical report on this work, so you can access it and find all the details; I will be very brief in the presentation, as we covered a lot of ground in this study. As with the first presentation, this work has been done in cooperation with John Heidemann from the University of Southern California, and with SIDN Labs. A brief introduction: DDoS stands for distributed denial of service, and as the name says, the idea of this kind of attack is to deny service to legitimate users. How is it done? You have a service running on the Internet, you have users accessing it, and a malicious entity starts sending a huge load of non‑legitimate traffic to your service. Your service gets overwhelmed with that traffic and your users do not manage to get responses any more, so they have the service denied. And the distributed characteristic of this attack comes in when the malicious traffic comes from many different sources across the Internet, which makes this problem much harder to mitigate.

So, DDoS is getting a lot of attention because the problem is getting bigger and bigger; today, seeing attacks higher than 100 gigabits per second is something usual. We have seen the recent case of Brian Krebs's website. He is a security specialist who constantly reports on DDoS activities, and he was the target of a DDoS while his site was hosted at Akamai. At that time it was a new record; roughly a month ago, if I am not wrong, the 665 gigabits per second reported as the peak of the attack was already a record for a denial of service. Even Akamai thought that it was too much to handle, although they were hosting the website for free, so maybe that is why they kicked him out of their infrastructure. But the problem gets worse when the security specialists from Akamai come out in public and say there are botnets out there with capabilities we haven't seen before. And actually I apologise, but this slide is slightly outdated, because we know that there have been a couple of attacks that have crossed the barrier of one terabit per second recently, so I should have updated this last night instead of drinking beers.

Access to denial of service is getting easier and easier, and one example is the existence of websites called booters or stressers, or denial of service for hire, or denial of service as a service. There is the recent case of a booter called vDOS that became public because its database was apparently leaked, and two guys were arrested in Israel, suspected of owning the booter, launching 150,000 attacks in two years and profiting around 600,000 dollars from these attacks. So there is a huge market behind that and there is even ‑‑ the attacks that came from this kind of people, the people that own booters.

And attacks are getting more and more frequent. We see that also at the root DNS, which was again the case study of this work. They have suffered three recent big attacks, and the attacks that happened in November and December 2015 were the target of our study. There were lots of articles and blog posts coming out saying, oh my gosh, someone is trying to take down the Internet, someone is learning how to do it, someone is showing off booter capabilities. I have a hard time believing that someone will ever take down the Internet, as some people believe, but this is a topic for off‑line discussions over coffee or beer. What is important is that the attacks are happening, the scans are happening, or whatever you want to believe it was, and in the attacks of November 30th 2015 there was an estimated peak of 35 gigabits per second, which is much, much less than what we have seen recently.

Some root letters say they have seen a peak of 5 million queries per second reaching their servers, but overall the impact on the root DNS was moderate, and I will tell you why it was moderate.

So, I will not bore you again with a third introduction to Anycast, but I would like to talk a little bit about what the root DNS is and how it is distributed. The root DNS has at least two levels of distribution. The first one is by letters: the service is not run by one operator, it's run by 12 operators that operate 13 letters, so that is where the distribution of the service starts. And they run their letters the way that they want. They use the distribution strategies that they want and the software that they want; they are completely free. There is a second level of distribution that almost all of these letters do, which is distribution of the service using IP Anycast. That means that, to add redundancy to the service, they distribute the sites across the world using IP Anycast, and each one of the sites will run multiple servers. So there are several layers of distribution and several flavours of software there that add to the redundancy. And the fact that the letters are run independently adds more robustness to the system.

So, for this study we used data that is again made available by RIPE Atlas, but we didn't run the measurements ourselves. These are built‑in measurements that every single RIPE Atlas probe runs, which are CHAOS measurements towards the root DNS, so this data was just sitting there. We just grabbed it and analysed the November 30 denial of service on the root DNS, and we enriched our dataset using RSSAC data as well as some BGPmon data. So, to start with the impact of the denial of service: what was the impact across letters?

Well, there was quite some serious impact, especially on some of the letters like B, C, G and H, which suffered a lot from the denial of service attack in terms of reachability. B Root is the one here that is not Anycasted. Each one of these lines is one root letter, and when there is a drop it means that one or more of our vantage points were not able to get a response from the service, so the reachability was compromised. At certain moments during the attacks, B Root was not able to reply to even one of our requests, while C‑Root, G and H also suffered a lot but, somehow, the service was still running and some probes were still able to get replies. Some other letters suffered just a little bit, like E, F, I, J and K‑root. They had reachability compromised at some of their sites, but all the other sites were still fully operating and even taking some of the load of the sites that were not reachable any more. And it was interesting that letters like D, L and M did not see attack traffic. That was quite curious, because it's not that they were not attacked, they were, but the attackers used outdated information like old IP addresses, so the attack traffic got lost somehow.
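
The reachability curves described here boil down to the fraction of vantage points that got an answer per letter and time bin. A minimal sketch of that computation, using invented observations rather than the RIPE Atlas data from the study, could look like this:

```python
from collections import defaultdict

# Hypothetical observations: (root letter, time bin, probe got a reply?).
observations = [
    ("B", 0, True), ("B", 0, True), ("B", 1, False), ("B", 1, False),
    ("K", 0, True), ("K", 0, True), ("K", 1, True), ("K", 1, False),
]

# (letter, time bin) -> [answered, total]
counts = defaultdict(lambda: [0, 0])
for letter, time_bin, answered in observations:
    counts[(letter, time_bin)][0] += int(answered)
    counts[(letter, time_bin)][1] += 1

# Fraction of probes that received a reply, per letter and time bin.
reachability = {key: ok / total for key, (ok, total) in counts.items()}
```

In the made-up data, the single-site letter drops to zero in the second time bin while the Anycasted letter keeps answering half of the probes, which is the shape of the contrast described in the talk.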

We also saw impact on the performance of the letters. For G‑Root we saw that reachability was severely compromised, but some RIPE Atlas probes were still able to get a response from G‑Root, and this response was coming with an average delay six times higher than the normal delay of this letter. We see this problem across letters, so we see K‑root also having a peak on delay, on RTT, although the reachability was not that compromised, so that was only for some specific sites of K‑root.

Then talking about specific sites, we can step up the layer of distribution and look at the specific sites for each one of the letters; the data that we used provides that kind of information. So here I have a very nice plot which was actually created by Wouter, the previous presenter here; he was also involved in this research, I forgot to mention that. It's simple to understand, I hope you can see something of it. Every pixel of this picture is one measurement by one RIPE Atlas probe. So in the vertical here we have 300 RIPE Atlas probes, and on the horizontal line we have 48 hours of measurement. So every pixel means one chaos query and one response we got, and the two black areas here are the moments that the attacks occurred. These measurements are for K‑root, for three instances of K‑root specifically: the probes that were initially being mapped by BGP to a site in Frankfurt, the probes that were being mapped to London, and the probes that were being mapped to Amsterdam or started being mapped to Amsterdam after the attack happened.

So what we see here is that the black areas mean that no response was received, so the attack did have some serious impact on these specific sites. But it's nice to see that we saw a lot of site flips, and RIPE Atlas probes do not behave as recursive resolvers that will jump from letter to letter or from site to site trying to find the best RTT possible; they are static. So this means there was some manipulation of the routing, or BGP announcements were lost or removed, so rerouting was forced to overcome the attack. We see that probes that were going to London were having difficulties getting served by London, so they started reaching Amsterdam. Amsterdam took over part of the load from London, and many probes actually went back to London after the situation was normalised; some other probes took a longer time to go back to London. We don't really know the reason there, but that is what we can see from the measurements.
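
The site flips described above can be counted from the time-ordered sequence of site identifiers that one probe gets back from its chaos queries. The following is a minimal Python sketch under stated assumptions, not the code used in the study: the function name `count_site_flips`, the site labels and the sample timeline are hypothetical, and a query without a reply is assumed to be recorded as `None`.

```python
from typing import Optional, Sequence


def count_site_flips(sites: Sequence[Optional[str]]) -> int:
    """Count changes of answering site in one probe's time-ordered replies.

    `None` marks a query with no response. Unanswered queries are skipped,
    so a flip is counted only when two consecutive answered queries name
    different sites.
    """
    flips = 0
    last = None
    for site in sites:
        if site is None:
            continue  # no reply (for example during the attack)
        if last is not None and site != last:
            flips += 1
        last = site
    return flips


# Hypothetical probe: served by London, silent during the attack,
# then served by Amsterdam, then back to London.
timeline = ["LHR", "LHR", None, None, "AMS", "AMS", "LHR"]
print(count_site_flips(timeline))
```

Because the probes are static, a non-zero count for a probe points at a routing change rather than at resolver behaviour.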

So, going one step further in the hierarchy of the distribution, we can also see information about individual sites. Remember, from the previous presentation: when we send a chaos query to a service, the site that replies to us replies with an identifier of the site and an identifier of the server at that site. So we could see how load balancing, queuing or whatever was going on at those sites was handling the huge load of traffic that they were receiving at the moment of the attacks. And this can vary, even within letters. So, for example, I have two examples of K‑root: the lower plot is the K‑root site in Japan and the upper plot is the K‑root site in Frankfurt. They both have three servers, and in Frankfurt only one server was taking over all the load of the traffic that was reaching the site at that moment. And it was interesting to see that in the first attack, on November 30th, it was server number 2 and on December 1st it was server number 3. So is it load balancing, is it backup? We don't really know; we would have to talk with the K‑root operators, if they are even willing to share that information for publication, as these are specifics of how they operate the servers. But we see that this can be different within a single letter, because in Japan the behaviour was completely different: all three servers were down in terms of reachability, they were not replying to all the requests that we were sending, and they were trying to balance among each other, not like in Frankfurt where only one server took over.

Okay. Not only inside the root DNS eco systems we saw ‑‑ sorry, still about the root system. Not only in the attacked letters did we see problems, but also in the letters that did not receive attack traffic at all. If you remember from my first slide of results, I mentioned that D‑Root was one of the letters that did not receive attack traffic directly, but at a site like Frankfurt, where many other root letters have a site running, they are too close to each other and they are perhaps even sharing data centre resources, and D‑Root suffered on reachability. So it was the resources of the data centre or transport links, whatever, that were being overloaded. So redundancy here is important for the root DNS, but we have to be more careful with the redundancy so we don't co‑locate sites of different letters in the same location.

And now, yes, not only the root ecosystem suffered, but we also see it for SIDN, which runs the authoritative service for .nl. They have sites in Frankfurt and Amsterdam, where many of the root DNS letters have sites, and they saw a drop to zero in the number of queries reaching their services, which means the data centre or links at those locations were overloaded because of that attack. So true diversity is important when we are distributing a service.

So ‑‑ they handled the situation pretty well; at no moment was it completely off‑line. Some sites were overloaded, but some other sites were taking over the load of these overloaded sites, and some other letters did not suffer at all; there was very little increase on RTT, for example.

But it's not by accident that this happened; the root DNS has a very nice fault tolerant design, which is careful engineering: several layers of distribution, several flavours of software running there, which adds to this diversity. And yeah, okay, we can now learn from the root DNS experience, of course, but we have to keep in mind that 35 gigabits per second seems something way in the past; now we have terabit attacks coming to services, such as the example of Dyn last Friday, and also OVH, if I am not mistaken, in France. So it doesn't matter what is behind the attacks, if it's someone scanning or trying to show off or whatever; this can cause severe damage. In the case of the root DNS this is critical Internet infrastructure, and we have important CDNs like Dyn suffering as well, so keep in mind true diversity, distribution and Anycast. And I will be happy if any of you interested in Anycast wants to collaborate in this kind of research. Thank you for your attention again and I am open to questions.


CHAIR: Do we have any questions?

RICHARD HESSLER: Go ahead, and state your name and affiliation.

SPEAKER: Philip, and I have just a comment about your research: some days ago we saw a very, very big attack on Dyn and there was huge impact on the resolution, even inside my country it was impossible to resolve feature and huge impact but there was only affected by attack on a particular DNS host, but large ‑‑ DDoS comes into life thanks to ‑ infrastructure. Anyway, thank you for your analysis on the root ‑‑ the most important infrastructure we have, root service, and it seems like we have something nearly to be done in ‑‑ to secure more our root infrastructure service. Thank you.

RICARDO SCHMIDT: Thank you very much for the comment. I would like to add to that, indeed, the study that we have done, I just want to highlight, as I did in the beginning of the presentation, the estimated peak of the attack was 35 gigabits per second, which is way smaller than what we have seen happening to Dyn, what we have seen happening in France two or three weeks ago as well, so it seems that big things are going to happen soon.

SPEAKER: Martin. When mitigating one of those attacks I used a very interesting ‑‑ I named ‑‑ I do Anycast node announce only to a specific Internet exchange. It helps to mitigate because at one Internet Exchange it is not possible for attackers to gain a large amount of traffic for the attack. Do you use something like this?

RICARDO SCHMIDT: So, thank you for your comment, it's very interesting, actually. Just answering your question first: we don't do that because we don't run the root DNS, of course; we can provide some advice from what we learn. John Heidemann is one of those responsible for running B Root, which is single site, so it's not Anycast, so it was the one that suffered most. So he learned a lot from that. But your comment on isolating nodes or announcing nodes to attract the malicious traffic somewhere, this is actually part of Wouter's research for his PhD and we would like very much to hear your opinion on that later if we can meet off‑line. But we can play around with BGP routing and see how we can isolate, either by changing routing policies of individual sites from local to global, or global to local announcements, or by using prepending to send traffic elsewhere, or something like that.

CHAIR: Any other questions? Thank you, Ricardo.


RICHARD HESSLER: We have Annie Edmundson, talking about studying transnational routing detours through surveillance states.

ANNIE EDMUNDSON: I am a fourth year PhD student at Princeton in the US and I will be talking about some work I have done with a few other people at my university. And so we studied transnational routing detours through surveillance states.

So, as we know, when Internet traffic enters a country it becomes subject to that country's laws and policies, including surveillance laws, so as more countries pass more mass surveillance laws, clients and end users have more need to control and also determine where their Internet traffic is going.

So the work that we did has two different components: The first one is characterising these detours, so we wanted to see where current Internet paths are going, are they going through surveillance states and, then we also wanted to look at local traffic to see if it's staying local or going to foreign countries or surveillance states.

The second component is on avoiding these detours through surveillance states. We want to see if end users can avoid certain countries being on the path to popular destinations, and so I have a small example showing how this might happen. Say we have a client in Brazil who is accessing content in the US. This is just an example, but maybe their traffic goes through Mexico, so this country level path is Brazil, Mexico, US. But if this client wants to avoid having their traffic go through Mexico, perhaps they can route around the country and route their traffic through a different country, still accessing the content that they want to access.

And so we want to see if this is possible and how effective it is, and then the last question we want to answer is: can end users keep local traffic local? So, say there is a path that looks like this: a client in Brazil is accessing content also in Brazil, but the path goes through a foreign country, in this case the US. We call this a tromboning path, and we want to know whether end users can keep local traffic local.

So as I mentioned before, more countries are passing mass surveillance laws. The countries shown in red are just a few we have seen conduct surveillance at different levels of intensity; some could be collecting metadata and others forcing ISPs to install black boxes. One other thing to note is that some countries form agreements and share surveillance data, so there is kind of a famous one called Five Eyes between Canada, the US, UK, Australia and New Zealand, and they share their surveillance data, so if one of them has some data it's likely they will share it with the others.

And so then, as you can imagine, there are some countries that react to this. These are just a couple that have made statements or are taking measures to avoid other nations' surveillance of their citizens' data. One example here is Brazil: after the Snowden revelations, Brazil has taken some extreme measures to avoid their Internet traffic going through the US. One thing they are doing is building a 3,500 mile cable from Brazil to Portugal, and they are not using any American vendors. They also have switched their government e‑mail system from Microsoft Outlook to a state‑made system called Expresso, and they are pressuring companies to store Brazilian citizens' data within Brazil, and these are just a few of the things they are doing.

And after looking at all of these countries, we picked a few countries we wanted to study in more detail, to see where the Internet paths coming from those countries are going, and these countries are shown in green here. So we picked the US, Brazil, Kenya, India and the Netherlands, and we picked each one for slightly different reasons. Brazil we picked for the reasons I just described: we wanted to see if all these actions they have taken are paying off. We picked Kenya because we wanted to pick a country in Africa; there is a lot of research going on in the interconnectivity in Africa.
Kenya has one of the largest IXPs in Africa and quite a few cable landing points. India also has many cable landing points and has the second highest number of Internet users in the world. And then we picked the Netherlands because they have an extremely large IXP and because they are also becoming a popular location to build CDNs, and we picked the US because they are a known surveillance state and also have some of the largest Internet companies in the world.

So, we picked these five countries to study, and going back to the questions we wanted to answer, the first one is which countries Internet paths to popular destinations are currently traversing. We found that the most common destination and transit country is the US; it doesn't matter if you are a client in Brazil, India, Kenya or the Netherlands, a lot of your traffic is going through the US. I will tell you how we came to this conclusion.

We ran a measurement study, and we wanted to see where the paths to these most popular destinations are going, so we started with the Alexa top 100 domains, and these differ per country: the list of popular domains for Brazilian clients is very different from that for Kenyan clients. Each of those domains also has third party domains that are automatically accessed when a page is visited, so we wanted to include these in our study as well. To get these, we connected to VPN end points in the five countries we studied and cURLed the top 100 domains; this gave us the body of the web page and from there we extracted the third party domains. This gave us a larger set of domains that includes first and third party domains.

From there, we ran local DNS queries on RIPE Atlas probes, and we used a number of these in each of the five countries we studied, and then from the responses we created a mapping of domains to IP addresses, and from there we used the same RIPE Atlas probes to traceroute to all these IP addresses. This gave us a set of traceroutes that we mapped to the country level using a geolocation database, and this gave us a large set of country level paths from clients in these five countries to popular destinations, and from there we could analyse them and see where these paths are going.
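
The last step of that pipeline, mapping a traceroute to a country level path, can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the study's code: the function name `country_level_path` is hypothetical, the geolocation database is stood in for by a plain dictionary, and the addresses are documentation-range placeholders. Hops that cannot be geolocated are dropped, and consecutive hops in the same country are collapsed into one entry.

```python
from typing import Dict, List


def country_level_path(hops: List[str], geo: Dict[str, str]) -> List[str]:
    """Map traceroute hop IPs to a country level path.

    `geo` is a stand-in for a geolocation database (IP -> country code).
    Hops missing from `geo` are dropped; consecutive hops in the same
    country are collapsed into a single entry.
    """
    path: List[str] = []
    for ip in hops:
        country = geo.get(ip)
        if country is None:
            continue  # hop could not be geolocated
        if not path or path[-1] != country:
            path.append(country)
    return path


# Hypothetical lookup table and traceroute (placeholder addresses).
geo_db = {
    "198.51.100.1": "BR",
    "198.51.100.2": "BR",
    "203.0.113.7": "MX",
    "192.0.2.9": "US",
}
hops = ["198.51.100.1", "198.51.100.2", "10.0.0.1", "203.0.113.7", "192.0.2.9"]
print(country_level_path(hops, geo_db))  # ['BR', 'MX', 'US']
```

Dropping hops that cannot be geolocated means a path computed this way can only under-count the countries actually traversed.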

So the first thing we looked at are the end points. So where did these paths end, where are popular domains hosted? This table here shows the countries we studied on the top; this is the starting point of all the paths. The countries in the left‑hand column are the destination countries, these are the end points, so the fractions in the table represent the fraction of paths that start in a country on the top and end in the country on the left. So, for example, 77% of paths that start in Brazil end in the US. Most of these fractions are relatively small, with the main exception being the US as a destination: a significant fraction of paths from any of the countries we studied actually end in the US.

And so then, after looking at the destinations, we looked at the whole path: what other countries are on this path? This table has a similar format, with the countries we studied at the top, those are the starting points, and the countries on the left are countries that are on the path at any point. And so the fractions represent the fraction of paths that start in the country on the top and at some point transit the country on the left. For example, 84% of paths that start in Brazil actually go through the US at some point. Again, the US is the outlier here, where over 50% of the paths from any of the countries we studied actually go through the US. One other thing to note here is that there is a significant fraction of paths starting in the Netherlands, India and Kenya that go through Great Britain, and for Kenyan clients we see that about a third of the paths go through either Mauritius or South Africa, and I will come back to that later in this talk.

So our second question on characterising these detours was about local traffic: we wanted to see if it was leaving the country and where it was going, and we found that, despite having large IXPs, both Brazil's and the Netherlands' paths often tromboned to other foreign countries. So we can look closer at the Netherlands here; this bar‑chart shows, out of the tromboning paths, what fraction went to a specific country. So, for example, about 40% of tromboning paths went to the US. The US is the most common one here; what is interesting is that all the other countries in this bar‑chart are actually geographically close to the Netherlands, which makes a little bit more sense than paths going from the Netherlands to the US and back to the Netherlands.

We can also look at Brazil's tromboning traffic. The US is a little more significant here, about 80% go to the US, which is interesting because Brazil has put in a lot of effort to make this exact thing not happen. And then lastly, we can look at Kenya, and Kenya looks a little bit different here. There is one outlier, which is Mauritius. Mauritius is a small island off Madagascar; we can see it in this submarine cable map. It's circled in red there, and Kenya has direct fibres to Mauritius and then on to South Africa and also the UAE, which can explain what is happening in this bar‑chart here.
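
The tromboning definition used in the talk (a path that starts and ends in the same country but transits a foreign one) can be expressed as a small check on a country level path. This is a hedged Python sketch; the names `is_tromboning` and `detour_countries` and the sample paths are hypothetical and only illustrate the definition.

```python
from typing import List, Sequence


def is_tromboning(path: Sequence[str]) -> bool:
    """True if the path starts and ends in the same country and
    passes through at least one other country in between."""
    if len(path) < 2 or path[0] != path[-1]:
        return False
    return any(country != path[0] for country in path[1:-1])


def detour_countries(path: Sequence[str]) -> List[str]:
    """Foreign countries visited by a tromboning path, in order of first visit."""
    if not is_tromboning(path):
        return []
    seen: List[str] = []
    for country in path[1:-1]:
        if country != path[0] and country not in seen:
            seen.append(country)
    return seen


# Hypothetical country level paths.
print(is_tromboning(["BR", "US", "BR"]))     # True: local traffic leaves Brazil
print(is_tromboning(["BR"]))                 # False: stays local
print(is_tromboning(["BR", "MX", "US"]))     # False: destination is abroad
print(detour_countries(["NL", "DE", "GB", "NL"]))  # ['DE', 'GB']
```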

So, to summarise this characterisation and what we found: we saw that routing detours often do transit surveillance states, especially the US, and we saw that local traffic doesn't always stay local and also often transits the US.

Is it possible to avoid certain countries by tunnelling traffic through relays around the world? Analysing this will help us answer the question: can end users avoid certain countries that are on the path to popular destinations? And what we found is that, yes, it is possible to avoid certain countries, but it's much more difficult to avoid the US than it is to avoid any other country, and sometimes it's impossible. So, I will talk about how we got to this conclusion here.

We needed a way to quantify how avoidable a country is, and so we defined this country avoidance metric as the fraction of paths that do not pass through country X, where X is the country we want to avoid. So I have a small example of how to calculate this here. Say we are a client in Brazil again and accessing content; we have three paths that we are studying here and we want to determine the country avoidance metric for this. There are two paths that go through the US and one that doesn't, and this client wants to avoid the US, so only one out of three paths doesn't pass through the US, and the country avoidance is one‑third. Now, say we set up some relays, maybe we have one in Europe and one in Australia, and we want to tunnel our traffic through these to try to avoid the US, and doing this, we end up with these paths here. Now only one goes through the US and two do not, which raises the country avoidance metric to two‑thirds. A value closer to one means that the country can be avoided more often, whereas a country avoidance closer to zero means the client can't avoid the country as often. Now that we have this metric, we ran a second measurement study to see how avoidable different countries are, and to do so we needed two different sets of paths, in this case.
So we needed the client to relay paths and we also needed the paths from relays to popular destinations, to see how avoidable countries are. So first we have these client to relay paths. To get these we connected to our VPN end points in the five countries we studied and we ran traceroutes to the relays' IP addresses; we set up 12 relays across ten different countries in North America, South America, Europe, Asia and Australia. This gave us a set of traceroutes that we mapped to the country level, and now we have these country level paths from clients to relays. The next paths we needed were the relay to popular destination paths, and to get these, we SSHed into our relays and ran traceroutes to the IP addresses of the popular destinations. This gave us a set of traceroutes that we also mapped to the country level, and using these two sets we could use that country avoidance metric to measure avoidability.

We found that most countries are actually completely avoidable. Here is a table of our results: the countries that we studied are at the top, those are the starting points of the paths, and for each we measured the country avoidance without relays and also with relays. The countries in the left‑hand column are the countries we want to avoid. So, for example, for paths starting in Brazil, without using relays about 15% of them can avoid the US, and we can see this is fairly similar among all non‑US countries we studied; this fraction is very small, which means that the client can't usually avoid the US. When we use relays and tunnel our traffic through them to try to avoid the US, this fraction goes up, sometimes by a significant amount, but one thing to note is that these numbers, even in the improved case, are much lower than when clients try to avoid any other country.
So, one other thing I want to point out here is that about two‑thirds of Kenyan clients' traffic usually can avoid Mauritius and South Africa, and about a third does not, and when we use relays we actually see that the country avoidance remains the same for avoiding South Africa. These relays don't provide any additional avoidance, and that is because South Africa is on every one of the paths between a Kenyan client and the 12 relays we have, so there is no additional avoidance in that case. On the other hand, we see that the relays significantly help avoid Mauritius: about 99% of paths from Kenya can avoid Mauritius when we tunnel through relays.
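
The country avoidance metric from the worked example (one of three paths avoiding the US without relays, two of three with relays) can be computed as below. This is a minimal Python sketch with made-up paths; the function names `country_avoidance` and `via_relay` are hypothetical, and treating a relayed path as the concatenation of its client to relay and relay to destination legs is an assumption based on the description in the talk.

```python
from typing import List, Sequence


def country_avoidance(paths: Sequence[Sequence[str]], avoid: str) -> float:
    """Fraction of country level paths that never pass through `avoid`."""
    if not paths:
        raise ValueError("need at least one path")
    avoided = sum(1 for path in paths if avoid not in path)
    return avoided / len(paths)


def via_relay(client_to_relay: Sequence[str], relay_to_dest: Sequence[str]) -> List[str]:
    """Countries seen end to end when tunnelling through a relay (both legs)."""
    return list(client_to_relay) + list(relay_to_dest)


# Hypothetical paths for a client in Brazil: two of three transit the US.
direct = [["BR", "US"], ["BR", "US", "CA"], ["BR", "AR"]]

# Hypothetical paths through relays: only one of three still transits the US.
relayed = [
    via_relay(["BR", "PT", "DE"], ["DE"]),
    via_relay(["BR", "AU"], ["AU", "US"]),
    via_relay(["BR", "AR"], ["AR"]),
]

print(country_avoidance(direct, "US"))   # one of three paths avoids the US
print(country_avoidance(relayed, "US"))  # two of three paths avoid the US
```

A relayed path only counts as avoiding a country when both legs avoid it, which is why a country that sits on every client to relay path, like South Africa for the Kenyan clients, gains nothing from the relays.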

And then the last question we wanted to answer is: can end users keep more local traffic local? So this is when a relay is in the client's own country; can they tunnel traffic through it to prevent those tromboning paths? And we actually found that it does help some amount, so tromboning Brazilian paths decreased when we tunnelled this traffic through relays within the country. And so these relays help clients both avoid other countries and also keep local traffic local.

So, the next thing: what do we do with this information? We know it's possible to avoid certain countries with these relays, and so what we did was build a system for end users, and we wanted this system to provide country avoidance and to be usable and scalable, so we set up this overlay network that does just this. We have relays in the nine locations on this map, and the system has a few different components. First of all, we have these relays; they act as web proxies. But the system is also measurement‑driven, so it conducts the same measurements that I just talked about in our measurement studies, and so we need the client to relay paths again and we also need the relay to popular destination paths. So, these relays conduct measurements to popular destinations by running traceroutes and mapping them to the country level. And then we also have this oracle, which will trigger RIPE Atlas probes to run measurements. Since we don't have access to our clients' computers, we represent client locations with RIPE Atlas probes, and so these run traceroutes to the relays and we map them to country level paths. And so then, based on these two sets of country level paths that we have, we can tell clients which relay to use when they want to avoid a certain country and access a certain domain, and the way to implement this is using a proxy auto‑configuration (PAC) file.
And so this is a small example of a PAC file here. It essentially allows you to say, I want these domains to go through this proxy, and you can specify different proxies, so domains 1, 2 and 3 should go through this proxy, domains 4, 5 and 6 through another proxy, and so on. And so what we do is, based on these country level paths, we generate this PAC file for a client in a given location that wants to avoid a specific country. And then all the client has to do is point their browser at this configuration file and they will be using our system. So, as of now, this system is in its prototyping phase. We are going to continue working on it.

Some other future work we have is studying the connectivity within a country; we have noticed that, depending on where you are in the country, sometimes your traffic goes to other foreign countries more often than when you are located in a different part of that same country. We also want to study the relationship between IXPs and these routing detours. We have noticed interesting things, particularly with Brazil, where these tromboning domestic paths actually go through IXPs in Brazil, but ISPs don't peer there because they think one or the other will steal their customers, so the traffic actually goes through the IXP up to Miami in the US, where they peer, and then back down to Brazil. So we want to study this a little bit more, and then we also want to measure country avoidance on IPv6, as this whole study was on IPv4. But this can also be extended to other types of networks; we could measure the difference between education networks and home networks and so on.
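
To make the PAC file idea concrete, here is a hedged Python sketch that generates such a file from a mapping of relays to domains. It is not the system's actual generator: the function name `build_pac`, the relay host names and the domains are placeholders. `FindProxyForURL` and `dnsDomainIs` are the standard PAC entry point and helper, and `PROXY host:port` and `DIRECT` are standard PAC return values.

```python
from typing import Dict, List


def build_pac(proxy_to_domains: Dict[str, List[str]]) -> str:
    """Build the text of a PAC file that sends each listed domain through
    its assigned proxy and everything else directly."""
    lines = ["function FindProxyForURL(url, host) {"]
    for proxy, domains in proxy_to_domains.items():
        for domain in domains:
            lines.append(
                f'  if (dnsDomainIs(host, "{domain}")) return "PROXY {proxy}";'
            )
    lines.append('  return "DIRECT";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical relay assignment for one client location and one country to avoid.
assignment = {
    "relay-eu.example.net:3128": ["example.com", "example.org"],
    "relay-au.example.net:3128": ["example.edu"],
}
print(build_pac(assignment))
```

In this sketch the relay assigned to each domain would come from the country level paths, choosing a relay whose two legs both avoid the unwanted country.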

So, I just want to conclude with a couple of our findings. We found that paths commonly traverse known surveillance states; in particular, 84% of paths from Brazil traverse the US. We also found that relays can help prevent routing detours, but some of the more prominent surveillance states are some of the least avoidable, specifically the US, and we found that tromboning paths do decrease when you can use relays within that same country. So we built a system that can do just this. We have more data and more information on our system at our website. At this point I would be happy to take any questions.

SPEAKER: Wolfgang Tremmel, Internet citizen and doing routing for quite some years. Are you aware that traceroutes only give you half of the truth?

ANNIE EDMUNDSON: Are you talking about reverse paths also?


ANNIE EDMUNDSON: We measured asymmetry at the country level, and we found that paths are not symmetric at the country level, so what I have been reporting here is really kind of a lower bound on the number of countries that could see the traffic, as the reverse path could have more countries that we haven't accounted for. But yeah, it is something we thought about, but we didn't have a way to measure the reverse path from popular destinations to relays or to clients.

SPEAKER: Second question: I have done routing now for many, many years, I have never heard of a national routing policy or national routing centre.

ANNIE EDMUNDSON: Right, right. And I don't think that we have really talked about that, either.

SPEAKER: You have it on one of the slides. National state routing somewhere.

ANNIE EDMUNDSON: Oh, yes, this is just routing around nation states, that is what we were going for.

SPEAKER: But you are aware that Internet is made up of autonomous systems which are made of ISPs which may be located in one or many states and which have nothing to do with national borders.

ANNIE EDMUNDSON: Yes, yes, and that is ‑‑ that is exactly why this problem is hard, is because ‑‑ because we don't know exactly. There is no borders on the Internet, and so we wanted to see if there is a way to route around countries, despite all these kind of challenges with the Internet.

SPEAKER: Greg Russell, Google. I was curious if you had any thoughts about how effective country avoidance would have to be before it actually really matters? It seems to me that if 1% of the traffic from Brazil is going through US, that is probably all the US needs to figure out everything they want to know.

ANNIE EDMUNDSON: Good point, good point. And I think, right now, a lot of traffic is still going through the US so country avoidance can't really even get just down to 1% from Brazil going into the US. But we haven't really thought about what kind of that threshold should be that would make it worth it. It's definitely a good question.

SPEAKER: I would guess it's important from a public relations point of view at much lower levels, but actual effectiveness, who knows?

SHANE KERR: Shane Kerr from the Beijing Internet Institute. Very good research and topic. It strikes me that in this talk, and also in the previous talks about Anycasting, we really seem to have reached the limits of what the routing system can do, and the fact that you are building an overlay network to let users actually get the behaviour that they want, it strikes me that maybe there is a wider problem, and I was wondering if you or your group has given any thought to what the properties of a system with more control for users, or one that optimises for properties other than just hop count, would look like?

ANNIE EDMUNDSON: So we haven't really, we have mostly just been coming up with this overlay network. We have also looked at other techniques to get country avoidance, like using open DNS resolvers in other locations to access different replicas. But we haven't studied something at kind of a lower level that would make the Internet more conducive to country avoidance. It's also on our list of future work, though. Thank you.

CHAIR: We have a few more questions.

SPEAKER: Sweden. What geolocation based service did you use?

ANNIE EDMUNDSON: We used MaxMind and we know it's fairly inaccurate and incomplete, so we ‑‑ but the way we did our measurements, we would like to compare our results if we used different geolocation databases. There aren't many out there that are very complete and accurate and we didn't want to take that on as a research problem, it wasn't ‑‑ because it's its own research area, but we use MaxMind knowing that we are geolocating at the country level which is more of a ‑‑ and we also drop any responses that MaxMind can't answer, so this could also ‑‑ this result is also a lower bound in that sense. There might be more countries on this path that we aren't including because MaxMind is incomplete. But it is one of the limitations of this work.

SPEAKER: I have some activity in the ‑‑ I can tell you that Kenya is not going through Mauritius, I suspect that is a geolocation issue, as was said, because there is a cable provider called SEACOM there whose company is registered in Mauritius and all its IP addresses are registered in Mauritius, but they have a cable going from South Africa to London and from Mombasa in Kenya; that is why you think traffic is going through Mauritius. To carry on what he was saying here, it's very important and very difficult to have the geolocation data accurate and you can't rely at all on MaxMind. You will have to make your own database based on latency.

ANNIE EDMUNDSON: Right. Thank you.
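
One way to read the suggestion of a latency-based database is as a speed-of-light sanity check on each claimed location. The sketch below is an assumption-laden illustration, not something presented in the session: it assumes signals travel at roughly 200 km per millisecond in fibre (about two thirds of the speed of light), uses approximate coordinates for Nairobi and Port Louis, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0
FIBRE_KM_PER_MS = 200.0  # assumed one-way propagation speed in fibre


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def max_distance_km(rtt_ms: float) -> float:
    """Farthest a hop can be from the probe, given the round-trip time."""
    return (rtt_ms / 2) * FIBRE_KM_PER_MS


def location_is_plausible(rtt_ms: float, probe: tuple, claimed: tuple) -> bool:
    """False if the claimed location is too far to be reached within the RTT."""
    return haversine_km(*probe, *claimed) <= max_distance_km(rtt_ms)


# Approximate coordinates (assumptions for illustration).
nairobi = (-1.286, 36.817)
port_louis = (-20.16, 57.50)

# A hop answering in 10 ms from Nairobi cannot be in Mauritius.
print(location_is_plausible(10.0, nairobi, port_louis))
# A hop answering in 80 ms could be, so the check alone does not confirm it.
print(location_is_plausible(80.0, nairobi, port_louis))
```

The check can only rule locations out; a passing result means the database entry is not contradicted by the measurement, not that it is correct.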

CHAIR: On the right here.

SPEAKER: Hi, so my initial impression was that websites rely heavily on CDNs, so they are hosted by CDNs and also cached in the service provider networks. When I go to a website I would most likely hit a cache in the ISP's network and not go to the US. So, picking up on the previous points, have you done any validation in terms of latency impact? Also, I was wondering why you used VPN instances and not Atlas probes to begin with, because you can measure from residential probes and then you will hit the caches of these websites?

ANNIE EDMUNDSON: For your first question, on the latency: we want to measure the latency using this prototype, but we haven't gotten there yet. It's one of the caveats of using a system like this; there is probably going to be a much higher latency because you are purposely routing your traffic around the world. On the second question, we used VPN end points for the second measurement study mostly because of limited RIPE Atlas credits. We are repeating that study fairly regularly, so we didn't quite have the resources to repeat it with Atlas.

CHAIR: We have a lot of questions, so be patient with us, we are going to queue them up together. So there is one in the centre here, I don't know where it went. It used to be here. It's gone. Gentleman over there, go ahead.

SPEAKER: This is Andrzej, CZ.NIC. I think the geolocation is not the important thing, so I wouldn't focus on MaxMind or whatever. The important thing is the jurisdiction the company operates in. It doesn't matter if it's Kenya or Mauritius; it's important that it's registered in Mauritius and operates within the Mauritius jurisdiction, so the actual geolocation doesn't matter that much.

CHAIR: All right. So we had ‑‑ was it ‑‑ right over there.

SPEAKER: Thank you for the presentation. I just wanted to point out that you are right: with MaxMind or some other providers you might not get accurate data for the infrastructure, which is what traceroutes mostly point to. We have a project called OpenIPMap, so I suggest you have a look at that; we are making it a final product soon. The goal of the project is to geolocate infrastructure specifically, so it's not for end users but for geolocating infrastructure.

ANNIE EDMUNDSON: Great. Thank you.

CHAIR: There was one person got up leaving ‑‑

SPEAKER: My question is regarding the VPN end points that you are using. Do those VPN providers, basically Cloud VPN providers, use machines hosted in the Cloud, or are they actually going through residential networks? I believe this can affect the paths.

ANNIE EDMUNDSON: Yes, they are hosted in the Cloud. We haven't done more analysis with different end points, but it's a good thing to look into. Thank you.

SPEAKER: Wolfgang again. Are you aware that you are only measuring the IP layer? There are several layers below that for which you do not have any data. There can be an MPLS layer below you which you do not see, and the fibre layer below that you never see, and we all know what can be done at the fibre layer at a beach somewhere.

ANNIE EDMUNDSON: Yes, and it would be great if we could measure it at a lower level. We just don't have a way to do that as of now, and traceroutes were a much more accessible way to measure. But thank you.

CHAIR: Any other questions? Thank you, Annie.

That brings us to the close. I would like to remind you to please rate these sessions. We are going to be back here at 12:00. Thank you very much.