DNS Working Group
27 October 2016
At 3 p.m.:
JIM REID: Ladies and gentlemen, boys and girls, this is the DNS Working Group. If you are not interested in DNS you should be in the other room. Before we get started on the agenda for today, a few announcements and practicalities to run through. First of all, if you have got any little devices that happen to ring or buzz or bleep, please put them into silent mode. If you are coming to the microphones to make any comments, please state your name and affiliation very clearly, because this session is being broadcast and the people following would like to know who is speaking, what they are actually saying and who they are representing; it also helps with taking the minutes. We have two gentlemen here from the NCC: one monitoring the chat room, and Matt taking the scribe's notes and recording the minutes of the meeting itself.
Now, first item on the agenda is the minutes from the last meeting, they were circulated recently, I can't remember exactly when, but can we take it as read that those minutes are acceptable and have been formally approved by the Working Group? I would assume that silence implies consent, unless someone speaks up we can consider them approved and they are duly done. We have no open action items at the moment so nothing to review there.
The next item on the agenda is the appointment of a new Working Group co-chair. This is the second time we have been through this exercise, which is now an annual one. Two candidates were nominated, and I have to say I am a little disappointed that we got so few expressions of support; we had only about 10 or 12 people in total expressing an opinion, which is a little bit disappointing, so hopefully next time around more of you will come forward and express your feelings about who you would like to see in this role. But from the expressions of interest and support that were posted to the list, it's very clear that we have a clear and definitive choice, and I am happy to announce that Shane Kerr is going to be the next DNS Working Group co-chair, so congratulations to Shane.
(Applause)
And we are on to the next item, which is the RIPE NCC report on its DNS infrastructure, and that is Anand.
ANAND BUDDHDEV: Good afternoon, I am Anand Buddhdev from the RIPE NCC, and I am going to give you an update on the RIPE NCC's DNS services over the last few months.
First up, I would like to introduce to you some of the members of the RIPE NCC who participate in DNS service provision. You may know some of us already, but we have some new colleagues and I thought it would be a good time to introduce them. On this slide you see the six people who are mainly involved: myself; Colin, who is over here; Ingo, who is present somewhere in this meeting; Paul; Florian; and our manager, Romeo. The six of us work together to provide all the DNS services of the RIPE NCC. So feel free to come and talk to us about any concerns, or anything you might want to discuss about our services.
Next up I'd like to talk about K-root. The RIPE NCC operates one of the 13 root name servers of the Internet; the one we operate is called K, AS 25152. Since the end of last year we embarked on an expansion plan to add more K-root sites using BGP Anycast, and we have been doing this quite successfully over the last several months. What we have now is 44 active sites around the Internet. Five of these are what we call our core sites; these are multi-server, high-capacity sites that can take on a bigger query load. Then we have 39 sites which we call hosted sites; these are hosted in cooperation with members of our community: the hosts purchase the hardware and provide the network, and we run the service remotely. So we have now managed to expand K-root into all sorts of areas where we previously had little coverage.
We have a diverse range of routing implementations: BIRD, Cisco, ExaBGP and Juniper. This gives us some diversity and protects us from exploits in the various routing implementations. As I mentioned previously, we use BGP Anycast, and the purpose is to lower latency and to increase availability.
We have also been running quite successfully a diverse mix of DNS servers: we operate BIND 9, Knot version 1.6 and NSD 4, and these have been serving us quite well. Knot version 2 has matured into a very nice product as well, and we are now looking at it; once we are satisfied with its performance and feature set, we plan to upgrade from Knot 1.6 to Knot version 2. We have more applications in the queue from different hosts who want to host a K-root server, so these are continuing and we hope to add a few more nodes before the end of this year.
If anyone is interested in hosting a K-root node, please come and talk to us, or send us e-mail, or find any one of us here, and we would be happy to talk to you about requirements and provisioning.
Next up, I would like to talk about the authoritative DNS services of the RIPE NCC. Besides K-root, we also operate another DNS cluster, from AS 197000, and this cluster is meant for the other DNS services of the RIPE NCC. It is also BGP Anycast, from three different sites: London, Stockholm and Amsterdam. In this authoritative DNS cluster we host ripe.net and various related zones of the RIPE NCC. We also provide secondary DNS service for the other RIRs, for their forward and reverse zones, because we have a reciprocal agreement with them: they provide secondary for our reverse zones and we do for theirs. We provide secondary DNS for several ccTLDs, and I will be talking more about this in a slide coming up. And for RIPE NCC members, LIRs who have large Reverse-DNS zones, we also provide secondary service on this cluster; we have something close to 4,500 such zones. Similar to K-root, we have diversity here with BGP implementations as well as with DNS: we also have BIND, Knot and NSD, and Cisco and Juniper for routing.
The ccTLD secondary DNS that we provide came up for, let's say, a review last year; sorry, not last year but the year before that. The community worked with us to come up with a set of guidelines which were published in a RIPE document, RIPE 663, and this document basically outlines the criteria under which ccTLDs qualify for secondary DNS services from the RIPE NCC. This is to make it clear, both for the community and for ourselves, who should get secondary DNS service from the RIPE NCC, and it also makes it transparent and clear for everyone involved why we provide service to one ccTLD but not to another, for example. At the last RIPE meeting in Copenhagen we said we would start evaluating all the ccTLDs, which we have been doing. We have applied the first criterion, which is zone size, and determined that 28 ccTLDs do not qualify under it. We have contacted all the operators and asked them to do a graceful migration of their services away from our DNS cloud. There are some other ccTLDs that do not qualify based on their name server set-up; we are evaluating all of these and contacting them, and we will soon be sending e-mails to those who don't qualify to also move their service away. I don't have the numbers for that now, though.
This process of evaluation and contacting the ccTLD operators will continue until mid-2017, by which time we expect to have moved service away for those ccTLDs that do not qualify. Those that have qualified and remain with us will be required to sign an agreement with the RIPE NCC for the provision of this service, and this agreement will be reviewed periodically, so that if there are any changes in a ccTLD's status we can re-evaluate whether it qualifies for service or not.
On this authoritative DNS cloud we also operate other zones that we provide secondary service for. The forward and Reverse-DNS zones of the other RIRs will remain. We also host a number of Internet infrastructure domains, such as root-servers.org and as112.net, and we will continue to provide secondary DNS for these infrastructure domains. But any other secondary zones we have had, especially those of commercial operators, will also be asked to move their service away.
Something else that we started looking at towards the end of last year was resiliency for our domain names, ripe.net and related ones, because DDoS attacks are becoming bigger and heavier, and we would like to ensure that in the face of DDoS attacks against our name servers, service for ripe.net continues, so that all the services that depend on the ripe.net domain keep working and our members are not affected.
So, in July we issued a request for proposals, and between July and September this process ran and we received some applications. We received three proposals in total, and we selected the one that best satisfied all our requirements. As a consequence of this, Verisign Managed DNS is now providing secondary DNS for ripe.net and related zones. I would like to point out that we are not using Verisign alone for this: we are still maintaining DNS service within our own infrastructure, and we have other secondaries with the RIRs, but we also have Verisign in addition to this. We consider this important because it makes no sense to bundle all your DNS with one provider, for example.
The contract that we have with Verisign will be reviewed annually. Next up, I would like to talk about another RIPE NCC service, DNSMON. How many of you here are familiar with DNSMON? So, that is a good number. It's essentially a distributed DNS monitoring system that the RIPE NCC has been running for many, many years now. Originally it was based on the TTM (Test Traffic Measurement) network, but when TTM was retired, DNSMON moved to using RIPE Atlas probes for measurement and visualisation. We use more than 50 RIPE Atlas anchors to monitor various domains from different vantage points on the Internet; these probes perform SOA queries and hostname.bind and version queries with the NSID flag enabled, and they do these measurements over both UDP and TCP. All the results are collected and processed back at the RIPE NCC, and there is a very nice visualisation at the URL on the slide, atlas.ripe.net/dnsmon, so you can take a look at it there. Here is an example of a DNSMON visualisation: on the left-hand side, the text in red shows the name servers for a given domain, so in this case you can see the name servers of Spain, dot ES. The green patch on the right-hand side shows you a summary of the results from all the various probes that are measuring these name servers. Green means good: the name servers are answering and the latency is not too high. Red or orange patches indicate possible loss or latency or things like that. This visualisation is actually quite advanced: you can zoom in and out, click on individual cells for more detailed data, request traceroutes from adjacent cells, and all kinds of things like that, to help you debug issues that you might be experiencing or seeing in DNSMON. And because this is measured from a distributed network, it helps show problems that you might not be able to see otherwise.
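For reference, the checks the probes make can be approximated by hand with dig; a rough sketch using K-root as a stand-in target (the actual query mix is defined by RIPE Atlas, not by this):

    # SOA query over UDP, then over TCP
    dig +norecurse SOA . @k.root-servers.net
    dig +tcp +norecurse SOA . @k.root-servers.net
    # server identity: hostname.bind in the CHAOS class, with the NSID option set
    dig +nsid CH TXT hostname.bind @k.root-servers.net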
So, the domains that are present in DNSMON are mostly ccTLDs from within the RIPE region, but previously there were no clear criteria about which could be in DNSMON and which not. There is another RIPE document, RIPE 661, which has guidelines about what can go into DNSMON. The main criterion is that any ccTLD within the RIPE NCC service region qualifies to be added to DNSMON; and if a ccTLD is not in the RIPE NCC service region, but either its administrative or technical contacts are members of the RIPE NCC, then they can request that the ccTLD be added to DNSMON. So if you are a ccTLD operator who qualifies under these guidelines and you are not present in DNSMON, please contact us and talk to us, and we can have your ccTLD added to DNSMON.
If you are not a ccTLD, then we have something else for you, which is called DomainMON. This is a relatively new service; it is essentially DNSMON's little brother, and you can use it to measure any domain that you own or operate. The only caveat is that you need credits in the RIPE Atlas system. If you have enough credits then you can activate DomainMON for your domain, and it will do measurements similar to DNSMON and visualise them in the same way for you. So go check it out.
DomainMON has one other difference from DNSMON, in that it doesn't measure from RIPE Atlas anchors but from regular RIPE Atlas probes; other than that small difference it's almost the same as DNSMON.
We have also been doing some DNS server benchmarking this year. The motivation is, again, these pesky DDoS attacks: they are becoming bigger and more frequent, and we need to be able to scale our services to defend against them. We also wanted a more flexible upgrade path with our current equipment; for example, we wanted to upgrade some sites from 1 gigabit ports to 10 gigabit ports. One of the challenges there is that we need to understand the limitations of our software, operating systems and hardware, because while 1G is fairly easy to do on even modest hardware, 10G is a whole different thing: there are many more packets coming in, at higher bandwidth, and you need to know whether your OS and DNS server can cope with this. So we have a test setup with a switch with 10G ports and three Dell servers with 10G ports; one runs the DNS software and one acts as a sink. We tested BIND 9.10, Knot DNS 1.6, Knot DNS 2, NSD 4 and PowerDNS 4. The query source server uses tcpreplay, which can take a PCAP file and replay it, and I compiled it with a Linux kernel module called quick_tx, which allows tcpreplay to place packets on the wire at very high speed by bypassing the higher layers of the networking stack. I replayed a five-minute PCAP from one of our live K-root servers and sent close to 7 million queries towards each target server. About 20 of these were empty packets that did not contain anything useful, so we did not expect responses for them, but all the others should have had responses. All the DNS servers were configured with the root zone, the .arpa zone and the root-servers.net zone. Routing was set up to send all the responses towards a sink server, and we used iptables with the PREROUTING and OUTPUT chains of the raw table to count queries and responses and to avoid connection tracking.
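A minimal sketch of that counting setup; the interface and file names here are hypothetical, a rule in the raw table with no target simply counts packets, and NOTRACK avoids connection tracking:

    # on the server under test: skip conntrack and count incoming queries
    iptables -t raw -A PREROUTING -p udp --dport 53 -j NOTRACK
    iptables -t raw -A PREROUTING -p udp --dport 53
    # count the responses leaving the server
    iptables -t raw -A OUTPUT -p udp --sport 53 -j NOTRACK
    iptables -t raw -A OUTPUT -p udp --sport 53
    # on the query source: replay the capture at a fixed packet rate
    tcpreplay --intf1=eth4 --pps=100000 kroot-5min.pcap
    # read the packet counters afterwards
    iptables -t raw -L -n -v -x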
The sink server was similarly configured with iptables and PREROUTING to count responses. Then I did test runs: I used tcpreplay to play back these queries at various rates, starting at 100,000 queries per second and stepping up by 100,000 each time, and I did more than one run because I wanted to pick the best result and eliminate any strange anomalies. At the end I would replay packets at the maximum rate tcpreplay could manage, to try to stress the DNS server. I am not going to go into the details of all this testing; I'd like to present on that separately some other time, but I would like to summarise some things that we found. We discovered very quickly that CentOS 6 just doesn't cope: the network interface reported that it lost about 85% of the packets coming in. So we quickly updated to CentOS 7 for this testing, and that can easily keep up with packets at a high rate; it did not lose anything. The other thing we discovered was that of all the name servers we tested, NSD 4 was the best performer, as long as you tune two different things: you have to increase the server count, because by default it only spawns one worker process, and if you enable one of its newer features, SO_REUSEPORT, then it really outperforms the rest. However, even with this setup we found that we were not able to saturate the 10 gig link, and that is because the CPUs were maxing out at about 8 gigabits per second; adding faster CPUs or more CPUs to a server would enable us to saturate a 10 gigabit link.
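The two NSD tunables mentioned are plain nsd.conf options; a sketch, assuming an eight-core machine:

    server:
        # one worker process per core instead of the single default
        server-count: 8
        # let the kernel spread load across workers via SO_REUSEPORT (NSD 4.1+)
        reuseport: yes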
We have Reverse-DNS as another service of the RIPE NCC. All the address space that the RIPE NCC manages needs to have reverse delegation, and the LIRs who receive address space from us can request Reverse-DNS delegation from us. When they do this, we run pre-delegation tests on their domains. We have been using software called DNSCheck, which was written by IIS; unfortunately, DNSCheck has been abandoned, there is no more development happening on it, and it is now showing its age. So we have decided to switch to Zonemaster, which is a newer product, developed jointly by IIS, the Swedish registry, and AFNIC, the French registry. Zonemaster has better tests, more clearly defined; it handles all the newer DNSSEC algorithms, and it is being actively developed. We are going to do this migration to Zonemaster in the next several weeks after this meeting. Users should not experience any significant changes: if there are errors they will see error reports, and these might look a bit different than they did previously, but everything else should continue working as it did before.
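Zonemaster also ships a command-line front end, so a zone can be checked in roughly the same way before a delegation request is submitted; a sketch, using one of the RIPE NCC's reverse zones as the example:

    zonemaster-cli 193.in-addr.arpa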
Of late, there have also been many transfers of address space between RIRs, and this affects Reverse-DNS provisioning, of course. We accept Reverse-DNS provisioning via domain objects in the RIPE Database, and we do automatic provisioning in the parent zones. The software that we have developed and are running adjusts itself automatically when address space is transferred; it publishes delegation information in zonelets for the other RIRs to pick up, and it also picks up zonelets from the other RIRs to stitch back into our parent zones. I would like to note that some delegations can take up to 24 hours to be published, because this is a slow-moving system, but it hasn't caused any major problems so far. Finally, before I finish my presentation, I would like to mention a hackathon that is going to take place next year. It is a RIPE Atlas hackathon focused on DNS, and it will take place in Amsterdam around April, I am told. We don't have firm dates yet, but please watch the mailing lists and announcements for this, and if you have any questions please speak to me or to Vesna, and she can provide more information. And that is the end. Thank you for listening.
(Applause)
JIM REID: Thank you, Anand. Are there any questions?
GEOFF HUSTON: APNIC. I am kind of curious. You mentioned you spend a lot of time, money and effort provisioning the infrastructure around these root name servers so that you can answer every query. How many of those queries generate an NXDOMAIN response?
ANAND BUDDHDEV: We publish stats about this, and from what I remember, about 60% of the queries generate NXDOMAIN at the moment.
GEOFF HUSTON: Are there smarter ways of saying no? If you are spending all this time, money and effort to say no, are there easier ways to do it than you are currently doing it?
ANAND BUDDHDEV: I don't know of any off the top of my head but I would like to hear ‑‑
GEOFF HUSTON: Of saying no.
ANAND BUDDHDEV: Yes. Another thing is that we need to find out where these queries are coming from. My colleague Colin discovered that the likes of Chrome were generating lots of queries to determine whether they were behind firewalls or not, and things like that, and we feel that is an abuse of the DNS; maybe we should be speaking to them about trying to do things in a smarter way.
GEOFF HUSTON: Could I make one other observation about doing things in a smarter way. Around 15,000 IP addresses accommodate around 96% of eyeballs. If you put a router that had a FIB of those 15,000, and treated those addresses as very important, what would the query rate be from just those?
ANAND BUDDHDEV: That is a very interesting experiment that somebody should run.
GEOFF HUSTON: Thank you.
JIM REID: Very nice answer, Anand.
(Applause)
Next up is Ondrej Sury, who is going to be talking about the filtering tools that have been added to the next version of the Knot Resolver.
ONDREJ SURY: Hello. I have a topic on DNS-based filtering that is a little bit smarter than usual. Just a quick recap of what I am going to talk about: what the Knot Resolver is, and the new features we have that are related to the DNS firewall, as we call it. There is a demo running at demo.knot.dns.rocks, and it restarts every two minutes; so if you break it (the web interface is not really supposed to be open to the Internet) it will be restarted within two minutes. You can look at it if you are able to watch me at the same time, and you can see some of the stuff I will be talking about for real; I didn't dare to do a live demo.
So, just a quick review: what is the Knot Resolver? We like to call it a platform for building recursive DNS services, because it's not only a DNS resolver: since it's built on top of C and Lua and has a Lua configuration, it's really extensible and really flexible, as you are not tied to just the configuration options; you can write a script inside the configuration. It has no internal threading or synchronisation; since we have SO_REUSEPORT, already mentioned by Anand, it scales by firing up the same daemon on the same IP and port, and since the Knot Resolver has a shared cache this works automatically.
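As a sketch of that scaling model, assuming the 1.x command-line flags -a (listen address) and -f (number of forked processes), several identical processes can share one address and port:

    # four kresd processes on the same address and port via SO_REUSEPORT,
    # sharing the cache kept in the given run directory
    kresd -f 4 -a 0.0.0.0#53 /var/lib/kresd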
A little bit of history. We released the first version, which we tagged somewhere in May 2016, and we released the next version around August 2016; and if you have a router also made by CZ.NIC, it is running resolver version 1.1.1.
And the new features in this release: DNS over TLS; socket activation, so it can run with really low privileges, because the sockets are bound by systemd and passed as descriptors to the daemon; DNS cookies; an HTTP interface; and the DNS firewall. The HTTP interface, what is it really about? It's a web interface that can be used by a real person, but it also provides a RESTful interface for various daemons to use; it can provide simple statistics or export metrics you can integrate with your monitoring. It's built on top of Lua, and it looks really nice.
This is an example of the metrics; you can add more or fewer metrics in the command-line interface, and you can try that with the daemon I told you about. It also has a really nice map (this would be animated if it were not a PDF) which shows dots for where the queries are going to, and on top of that the DNS firewall can use this interface as well. So, what is the difference between a DNS firewall and an access list? The typical access lists in current DNS servers are basically based on IP address, usually the source IP address, and they can do some actions: show different sub zones, forward the queries somewhere else, deny, pass, and stuff like that. In the DNS application firewall, as we call it, we have more rules that can look into the packet and the DNS query and apply some logic based on, for example, the QNAME and IP address, or matching patterns in the QNAMEs. There is some logic built in, and you can easily implement more matching rules if there is a need, and also more actions: you can truncate based on the source IP address, forward the query, mirror the query to a different server (for example, some DNS monitoring software running on a different port or IP address), or mangle the responses.
And this is really useful for malware mitigation; for example, if there is some malware running on some website or IP address, you can just block it using this. The functions are exported to the Lua config file, and there is a control socket, so you can change the rules online: you can list the rules, and then you can add, delete, disable and enable individual rules. I will show a few examples of that.
A rule is built like this: a selector, then an and/or operator and another selector (you can use more of those), and then an action. You can match on the QNAME; you can match on a QNAME pattern; then there is the source address, which can be an IP block, and the destination address. The actions are: pass, which is quite easy, you just give the answer; deny, which returns NXDOMAIN with stuff in the additional section so you know what is happening; and drop, which returns ServFail. Then there is the reroute command, which rewrites all IP addresses that match the pattern to some different IP address; as I told you, if there is some malware you can block all those and return some other IP address. The other command is rewrite, which is used to rewrite the answer: based on the QNAME, you rewrite the A or AAAA record just for that specific QNAME. Mirror will send a copy of the query to some IP address; forward is straightforward; and truncate will return an answer with the TC bit set.
So, here are a few examples. For example, you want to block all queries for ripe.net; this is straightforward, because you don't like RIPE. Or you can drop all queries for <random string>.knot.cz, so you construct a Lua pattern for that and drop all those queries. If you want to rewrite the IP address, then you can say: I am protecting just these clients, so that is the source address, and the reroute command will rewrite the addresses in the subnet on the left to the subnet on the right. Or, with the rewrite command, you can say: rewrite just the A records for ripe73.ripe.net to localhost, for localhost clients.
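A minimal sketch of what those examples might look like in the Lua configuration; the names and subnets are the talk's illustrations, and the exact rule grammar should be checked against the Knot Resolver documentation:

    -- load the firewall module (the http module serves its web UI)
    modules = { 'http', 'daf' }
    -- block all queries for ripe.net
    daf.add 'qname = ripe.net deny'
    -- drop queries for <random string>.knot.cz, using a Lua pattern
    daf.add 'qname ~ %w+.knot.cz drop'
    -- for clients in one subnet, reroute addresses in answers to another range
    daf.add 'src = 192.168.1.0/24 reroute 192.168.2.0-127.0.0.0'
    -- rewrite just the A record for ripe73.ripe.net, for localhost clients
    daf.add 'src = 127.0.0.0/8 rewrite ripe73.ripe.net A 127.0.0.1'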
Query mirroring is again straightforward: there is a condition on the left, then mirror and where to send it. And forwarding: again you say, for this kind of client, forward to some other IP. You can, for example, forward a sub zone to a different resolver: if you have a hidden DNS server for your office, and you have a resolver on your local machine that you want to keep using even when you come to the office, you can add a rule to forward all names under that sub zone to a different name server. The good thing about this is that it can also all be modified using the web interface, and there is a Lua library to write the rules, delete the rules, disable them, enable them; again, it's all running on the demo. And there is a RESTful interface for that, so you can have a system that modifies the rules on the fly: you don't have to restart the server or reconfigure it, you just feed the rules in using the RESTful interface. This is an example of how to get the rules: it's a GET on the IP address it's running on, at /daf, and it returns results in JSON. And here are more examples of how to modify things on the fly: you can add a new rule using an HTTP POST, you can get a rule, PATCH it to change some of its parameters, or delete a rule on the fly. So, that is it for my presentation. There is really nice documentation at knot-resolver.cz; if you have any questions about the features and the firewall, fire ahead.
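A sketch of those RESTful calls, assuming the HTTP module's default port of 8053 and the /daf endpoint described above:

    # list the current rules (returned as JSON)
    curl http://localhost:8053/daf
    # add a new rule, passed as the request body
    curl -X POST -d 'qname = ripe.net deny' http://localhost:8053/daf
    # disable rule 1 by patching one of its properties
    curl -X PATCH http://localhost:8053/daf/1/active/false
    # delete rule 1
    curl -X DELETE http://localhost:8053/daf/1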
JIM REID: Are there any questions? One from the chat room.
SPEAKER: I am from the RIPE NCC. I have a question from a remote participant; he is called Leo, and it's more of a comment, I guess. He says: "I have been using the Knot Resolver for a while; setups that were virtually impossible to create before are now possible, so my compliments and thanks for making this." Oh, and he has a question: when will the next release be? Will it have dnstap already? There is another question, but I will wait.
ONDREJ SURY: The next release: there are a few things we need to finish, and features that are in the queue, so I believe we might be able to release before the end of this year; I hope so. As for dnstap, I don't think we have it in the plan, but if Leo could write to us we will fit it somewhere into the plan and add dnstap support. If he hears me; I think he does.
SPEAKER: He has another question: "dnsdist from PowerDNS is also a nice Lua tool to filter specific abuse, and a nice way to configure failover. Personally I am more in favour of replying instantly and doing load balancing with eBGP or iBGP; however, can we expect similar load balancing functionality?"
ONDREJ SURY: Yes; well, we certainly want to expand the feature set, but dnsdist is a different kind of tool: it doesn't answer by itself, basically, as far as I remember it, but asks the back ends for the information, so we are not in the same situation. But it depends. Again, if you have a specific feature request or issue, we are quite open in our plans; everything is on the wiki, and you can file an issue with a feature request and talk to us, and we will certainly consider implementing features that will be useful for everybody.
JIM REID: Actually, I have a question for you, Ondrej. In relation to the filtering capabilities you were describing, would you consider adding the ability to filter based on things like the DNS packet headers and also things like EDNS options? And on the action side of the filtering rules, you have an explicit drop where you are returning NXDOMAIN; could the drop actually be: do not acknowledge the query in any shape or form, and have separate actions to return NXDOMAIN or ServFail?
ONDREJ SURY: When I was writing this presentation I was thinking about whether we need to extend the set of selectors, and the answer is yes; if you have something in mind, please drop me an e-mail.
JIM REID: I was thinking in the context of the DNSKEY amplification attacks that were prevalent a few years ago; it would be nice to hack some code so that if a DNSKEY query comes in with a large EDNS buffer size, you throw it away.
ONDREJ SURY: I understand. It should be fairly easy to extend because, as I said, this whole thing is written in Lua and compiled as JIT code, and the Lua modules have access to all the library functions, so it should be quite easy to write.
JIM REID: Great stuff, thank you very much, Ondrej.
(Applause)
Our final talk for this session is from my colleague Jaap Akkerhuis, who has had the misfortune to be involved in the ICANN new gTLD work and its impact on the root, and we are going to get an update on his recent work.
JAAP AKKERHUIS: This study was commissioned by ICANN, and it has this nice title: "Continuous data-driven analysis of root stability". The numbers I present here are actually somewhat dated; they are not definitive numbers, so, disclaimer, the details can change, especially the dates and the numbers. What is it? It is to analyse the technical impact of the introduction of the new gTLDs on the stability and security of the root server system; that is what is in the RFP. Note that it's technical, so the social aspects, the economic aspects and all that stuff are out of scope; it's just about how the root behaves. The approach we have chosen is data driven, using a fair amount of publicly available data, things like Atlas data and other archives, and also a lot of interaction with the broader technical community, trying to find out what kind of questions people actually have, and we will see where we can give some emphasis to that. We also tried to build on previous studies which did more or less the same, so we might have comparable data from the past, and maybe also for the future, hopefully. Anyway.
Breaking news: the final report, actually the draft report, went to ICANN either last night or this morning, so it will be in time for ICANN to start the public comment period at ICANN 57. So that report will differ in details from what I am going to tell you here. Anyway, let's go back to our scheduled performance.
So, what is it all about? Well, it's all about the hockey-stick growth of TLDs. Since the beginning there was hardly any growth in TLDs, and after the ccTLDs got established, a very small trickle of new TLDs came in, and that was absorbed by the system very nicely. Now, in the last two years, it suddenly went from about 500 to about 1,800 in a year's time, so the question is: is this going to be stable, or will there be a collapse like with the other hockey sticks we have seen around? Well, to answer that question, you probably first want to answer the question: what is the stability of the root server system? Here is a rough picture of how you can look at it: when one single node is not available, you probably don't see any degradation of the service worldwide; when the whole root service is unavailable, things are really in the red; and everything else is, call it, in between. What happens when one root server operator drops out: will the service be degraded, or still be okay? You can draw lines where the influence of these things comes in, and this is more or less what happened in the November/December attacks on the root service: some of the root server operators had degraded performance, but if you look at the DNS system as a whole, a lot of people didn't really notice until they read about it in the papers. So it depends on how you look at it; stability is kind of a moving target, and you will notice that after we made these pictures, the root server operators brought out their own statement about what root stability is, and if you read that, it actually looks like this picture as well. So, okay.
Next: the possible impact of the new gTLDs on the whole system. Well, we built a kind of tree for it: what can be indicators of stability, as seen from the outside? Note that we are not using any of the internal data of the root server operators. On the left side (I always have to look: or is that the right side?) there is data correctness: whether or not the data the system produces is correct, and where it comes from; this could be, for example, DNSSEC getting broken more often, and whether or not the announcement of the new data is correct; that is the data part. On the right side is operational stability: how far is that influenced by the new gTLDs? The things you can think about are whether the response time stability changes, whether the responses are more mixed, and the query rate stability: are there a lot of changes there, and in the various factors, whether you see more valid queries or more invalid ones. These are the things we looked at. There is some caution needed about the data we used: as I said, we only looked at publicly available data, and we cannot really guarantee that this data is completely trustworthy, so we just have to work with whatever is there. Different collection methods can lead to different results, and we have seen examples of that as we went: what is going on here? A factor of four in the amount of queries for certain root servers is kind of weird to come across. Not all the collections are complete, especially over time; for the detailed analysis, some root servers' data collections worked and some didn't, so the history of some of the data is also relatively short, and it's sometimes difficult to say a lot about the long term.
Also, over the years a lot of the data changed in the way it is formatted, so we had to be very careful not to pick out wrong data, and sometimes to massage it into some uniform format to prevent errors. That is about the data.
Furthermore, we also did some paper analysis, and that was on request, via interactions with the community; there was a list of questions: I really want to know X, or I want to know Y. You are not supposed to be able to read this; this is one of the pages we made for that exercise, checking whether we could find any relevant data, because some people can ask more questions than you can answer, or whatever the saying is. Anyway, what we looked at is: the relation between the number of domains and the query rates to the root; the impact of the initial delegations at start-up, do we see anything change; DNSSEC validation, did it change, did more break compared with before; and whether the query type distribution changes. One of the objectives of the new gTLD programme was that there would be new ways of running DNS, so you might see that in the query types; yes, I have to laugh sometimes as well. And is there a geographic affinity for the geographic new gTLDs, things like dot Madrid or dot Berlin? People are wondering whether we see localised query streams.
Let's first have a look at the new gTLD queries to the root. Here you see, from the data from 2012 up to the most recent data, how the query load developed. What you see: the blue stuff on the bottom is absolute garbage, and you notice that over the years the garbage is only rising; the red stuff is the part for the delegated TLDs (I cannot read my own notes here). So the top part is actually the valid queries and the bottom part is the invalid queries. Invalid queries are queries for gTLDs which have not yet been delegated, although already approved. If you get your microscope out, you can find somewhere a little stripe which shows you the officially delegated gTLDs.
What we noticed is that the invalid queries really are rising over time; we don't really know why that is. Anyway, looking at the traffic generated by the new gTLDs, there is actually not a lot to say about it, so we tried to find a kind of rule of thumb: the statisticians among us looked at whether the size of a zone is an indicator for the amount of queries you can expect. They came up with a rule of thumb that shows the relationship between the size of the zone and the amount of valid queries for that zone. I cannot find any protocol explanation for why this should be so; it is a purely statistical result, but if it is valid it might help us say something about the future when more gTLDs pop up. This is all based on the root dataset; for the new gTLDs this ratio is lower than for all the other TLDs, but that might be a matter of time. So here is some more of the statistical stuff; maybe it is an indicator for future expansion of the TLDs, but time will tell.
What happens when the initial delegation happens for a TLD? We were lucky that we had data from H-root from when it changed its IP addresses, long-term data in which quite some extra delegations were going on, so we could actually see some of these effects. The numbers are astonishingly small, but here are some patterns we see. For gTLD A, we first see a lot of queries to it (well, a lot: a couple a day), and when it gets delegated it suddenly drops down very low; it looks like the queries end up being answered from caches of this gTLD, which is what you would expect. The same, more or less, for another gTLD, although with a slightly different pattern; you see a similar peak. More interesting is a third one: you see queries randomly going up and down during the day, and when it gets delegated the load suddenly goes up and seems to be kind of stable; but 80 queries a day is not really a lot, so this is looking through an electron microscope at something very small. Most interesting is gTLD D: you don't see anything, then suddenly it gets delegated, there is an enormous load, and then it goes down to less than 10 again. I guess they get delegated, everybody is happy, the salespeople see that it works, and then they continue to do whatever they were doing before. Anyway. The next question people ask: does the RTT, the response time, change? You might wonder, when you ask for one TLD or another, does it really matter how long it takes? But anyway, we looked at it as well, and here are the graphs, and you see not a blip; we see some variations, but no more than the standard variations, so nothing to see here. Then data correctness: I have had a long-running process since, I think, 2011 or so, checking twice a day whether DNSSEC-signed TLDs keep working or not. I did it out of curiosity, and it came in handy for this, and I gave a talk about it at ICANN. The result is that you don't really see it being worse after than before. What I did see was that when the first batch of new TLDs popped up, there were suddenly a lot of failures, but that was because people apparently forgot things; mostly it was expired signatures, and people learned to monitor their domains quickly so as not to have things expire. But it didn't get worse; actually, if you look, you mostly see more ccTLDs having a broken chain than these new ones. Okay.
Query type distribution, that is also an interesting one. Here are some query type distributions, by transport: over the various years the pattern looks the same; most of it is UDP, and I forgot what the details are, to be honest. But in the end there is nothing really significant; you see some changes, but that is what it is. If you look at the query type distribution by RRtype, the picture is slightly more varied, but if you look closely through all the details you don't really see anything specific happening there; nothing I can find, at least; nothing to see, boys and girls. The geographic affinity of the new gTLDs: this is kind of a pain, because not all of them stand out, so we picked a couple of them, a couple of cities; the bottom one is for Moscow, and we looked at the distribution over all the Anycast instances around the world to see whether a pattern pops up. Well, not a lot. The only time we saw a big difference between one Anycast instance and another was for dot Tokyo, where the Osaka instance had way more queries, although we realised it was still a microscopic amount of queries. So that is not really significant; actually, what you can see is that Anycast spreads the queries around, it does work, and apparently the users of those domains are also worldwide, if the domains are being queried at all. Some more observations about the data; I think I already mentioned some of this, but some details here. What you see is that the world is changing much more quickly than we currently sample the data: the Anycast system changes way more quickly than it gets measured, and there is more of that kind of thing. The other thing is that data gets lost: we found some interesting changes in the data, and, as I told you about the initial delegations, some of that data doesn't exist any more because the disc got cleaned out. This is kind of problematic; it is so much data. History also gets lost: finding out where all those Anycast instances were placed is a nightmare, because people change their schemes of where these things are on a regular basis; some purposely don't document where things are, so you cannot even properly guess, and others do. So actually, some of the observations made in the SAC reports still hold: there should be a much better and less ad hoc way of finding all these things. One of the things said at that time was that without sufficient information about the instability of services, you cannot make proper models to predict the future, and similar remarks were made by kc and Vixie. What are the next steps? There are some suggestions, but it is partly a challenge for the whole DNS measurement community: let's try to standardise at least the types of measurement people do, instead of all the ad hoc stuff and formats that keep popping up. Another thing is that the cost of sanity-checking the data is pretty high when it is not properly documented what it actually means; if you find a factor of 10 difference in some of the data and you cannot explain it, you wonder what you are actually looking at. Also, some of the data you really want to have is not measured: you really want to see per-domain, per-gTLD data, which would make things much easier, instead of trying to second-guess stuff.
And yes, in some of the data we found that in 2014 the B-root server was apparently out of sync with the rest of the group for 14 days, which is kind of odd; given the sample frequency it doesn't really matter, but these things do make you somewhat uncomfortable about trusting the data anyway.
The summary:
The data quality varies, and so on, but if you look closely enough for the impact on the root: nothing to see here. There is not really anything we can find showing that the new gTLDs did anything to the root system; that is basically what it is. Especially if you consider that all the new gTLDs together are about 1% of the traffic that hits the root, compared with things like dot local which are not even delegated, it is kind of weird to be looking at this noise. In the microscope view you see some interesting, kind of weird things, but whether this has any effect on the broader scope of the whole system: I don't think so. And that is what I have to say; the work was done by a consortium, of which SIDN, NLnet Labs and TNO were part. Any questions?
JIM REID: Are there any questions for Jaap? Nobody. Going, going, done. Okay. Thank you, Jaap.
(Applause)
And with that, we have a famous first: we have actually finished ahead of time. So I would like to say thank you all for coming to the Working Group's first session; thanks to the NCC staff backing up the audiovisual stuff, the guys doing the scribing, and the very nice stenographer lady, as always, and I will see you after the coffee break.
LIVE CAPTIONING BY AOIFE DOWNES RPR
DOYLE COURT REPORTERS LTD, DUBLIN IRELAND.
WWW.DCR.IE.