Open Source Working Group
27 October 2016
12 noon
CHAIR: I think we can start. So welcome everybody. Thank you for joining us. You made a very good decision joining the best Working Group here at RIPE, the Open Source Working Group. My name is Ondřej Filip and this is Martin Winter, and we are the co-chairs of this Working Group. Before we start we need to do a little bit of administrative business.
First of all, microphone etiquette: we are in a very unusual room, and we are not used to having microphones at each seat. So when there is time for questions, please wait until we call on you, then press the button and speak. Only one person can speak at a time, so please don't press the buttons unless you are asked to do so. That's very simple. And I think we can start with the Working Group.
First of all, we have a very good agenda. As usual there are three bigger presentations, and we have some time for two lightning talks. At this point I want to ask: do you have any additions to the agenda? Do you want to amend anything? I expect not, but just in case... and I don't hear anybody trying to speak. Perfect.
Also, we need to approve the minutes from the last meeting. The minutes were published a few days ago, actually, and I am sure all of you read them.
MARTIN WINTER: If you go on the meeting archive on the RIPE website you can see all the links there.
ONDREJ FILIP: Again, if you have any comments, it's either now or you can still send us an e-mail, but I don't think there is anything controversial in there. Before I ask Martin to introduce the speaker, let me introduce our Jabber monitor, Robert, who is sitting over there — thank you, Robert, for being so kind as to help us — and Anna, who is taking minutes. Those people will be responsible for the minutes, so please read them next time.
And I think that's all from my side and I will ask Martin to introduce the first speaker.
MARTIN WINTER: One more thing I want to say: if you look at our charter, we have a chair selection every few meetings. We had a call-out two months before this meeting; nobody spoke up, and we are happy to continue, so that was it. If you want to become a chair of this Working Group, the next chance is before the fall meeting next year; as usual, the call for candidates — where you announce yourself if you want to become a chair — goes out about two months before the meeting.
Okay. Let's talk about the different talks we have today. We have three main talks and then a few short lightning updates. We start out in a moment with Christoph with the PCAP BGP parser. Then we have Sander Steffann on DHCPKit. Then we have Andrei from cz.nic about continuous integration and automated management. Then we have two quick lightning updates: one is Gert Döring talking about some new OpenVPN features, and then a hack-a-thon update which Nick Hilliard gives on behalf of all the people who were hard at work there in the past week, talking about Bird's Eye, the API, and other things.
Okay. With that I want to introduce our first speaker, Christoph Dietzel. Welcome.
CHRISTOPH DIETZEL: Thank you for having me here. My name is Chris and I am here on behalf of the DE-CIX R&D team, and we want to present a tool that we developed for internal use but also made publicly available on GitHub just recently. Credits go to my colleague Tobias, who did the main chunk of the work implementing this cool tool.
So, let's start off with IXP route servers. By now I guess most people here know what they are: basically a sort of route reflector at IXPs, and those route servers process a significant amount of data. This data is crucial for IXPs, because all routing information goes through the route servers and is redistributed. So if something is going wrong, or going well, or whatever, and you want to know what's going on, you need to look into the data of these route servers.
Reasons for doing so: for instance, customers call and want debugging assistance because the prefixes they announce to the route server don't appear in the Looking Glass, so one needs to look into the route server and determine whether it's a configuration error or something else is wrong. Or the IXP itself, especially the R&D team, wants to know how changes work and why things happen. For instance, there is a new traffic peak at the IXP and we wonder whether the reason is that someone announced more prefixes, changed the configuration, announced a different set of prefixes, or whether customers and members simply send more data towards the same prefixes.
Then there are the not-so-nice things: the Internet also has its downsides, and there are incidents, such as route hijacks or route leaks, and to investigate these you also need to really look into the BGP data.
So, we are using BIRD, and the issue we have with BIRD is that we don't find the built-in tools that we need for analysis. For instance, there is limited long-term export of BGP information. What I would like is to just start an export that continues over time and sends the data to another system, where I can filter, process or analyse it. This is not supported by BIRD. Continuous export into MRT files doesn't work either, and there is no simple filtering method before export — sometimes it's enough for me to just get the prefixes sent by a specific member router.
And there are no insights into incoming BGP advertisements, even if there were a powerful export feature in BIRD, because of best path selection. If I want to look, say two weeks after it happened, at why a path was the best path, why it was chosen or why it was not chosen, I really need the messages coming into the route server, not the ones sent out to the other members.
So, first I started off with a very simple solution. I just ran a TCP dump at the route server's interface, capturing all the relevant BGP packets, and processed those packets with tshark on the command line. However, this is quite a complex task, because you end up with a long chain of commands and filters, and when you want to change something it's a real hassle. It worked for some basic analysis, but when my boss came over and continuously started asking me for more — can you please look into this and that — it didn't really scale for me any more. It was too much effort, and at the end of the day I ended up writing scripts that write the commands, to have it sort of automated.
Also, the output is hard to process in an automated fashion, and I wanted a bit more flexibility. And tshark's support for BGP is not that good: there are fields within BGP messages that I simply can't filter on, but that I would actually like to filter on.
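(For illustration, the kind of tshark chain being described might look like the sketch below — the display-filter field names come from the Wireshark BGP dissector as I recall them, so verify them against your tshark version.)

```
# Pull timestamp, source IP and BGP message type for all
# BGP UPDATE messages (bgp.type == 2) out of a capture file.
tshark -r routeserver.pcap \
       -Y 'bgp.type == 2' \
       -T fields -e frame.time -e ip.src -e bgp.type
```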
So, what we did, we ended up writing our own tool, because this was the practical solution for us.
Here is the name; it's very simple: PCAP BGP parser. It's written in Python and runs with version 2.7 and 3.x. It's Open Source, it's on GitHub — I published it the day before yesterday — and it's under the Apache 2.0 licence.
This tool is made for processing PCAP data — dumped or streamed — and comes with a rich set of features. I'm going to walk you through the most important ones now.
It reads PCAP either from files or, if you wish, as a stream on standard input, so you can process continuous data. It is also built and designed to read live data from a network interface; that's not fully implemented yet, so give us another one or two weeks and we hope that this is going to work as well. And it's written very modularly — we really took care that, Python being a modular, modern programming language, it will be easy for you, if you like this tool and want to use it, to extend it in whatever way you need.
The filtering inside this tool happens at two stages. When the raw PCAP data comes in, filters which are based on the packet itself are applied first. Then the data is parsed into the internal data structure, and the other filters, the BGP-specific ones, are applied. We designed it that way because it makes it very fast: if you just want the packets from a certain router's IP, this works very quickly, because the actual parsing does not need to happen at that point; it only needs to happen for filters on BGP-specific fields.
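(As an illustration of that two-stage design — this is a sketch of the idea, not the tool's actual code; parse_bgp here stands in for whatever parser you have.)

```python
def process(packets, packet_filters, bgp_filters, parse_bgp):
    """Two-stage filtering: cheap packet-level filters first,
    expensive BGP parsing only for packets that survive them."""
    for pkt in packets:
        # Stage 1: filters on raw packet fields (source IP, MAC, ...);
        # no parsing is needed at this point, so this stays fast.
        if not all(f(pkt) for f in packet_filters):
            continue
        # Stage 2: parse into the internal data structure, then apply
        # the BGP-specific filters (prefix, AS path, communities, ...).
        msg = parse_bgp(pkt)
        if all(f(msg) for f in bgp_filters):
            yield msg
```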
Here is a list of the filters you can apply. When you start the tool with PCAP as input, you can filter on certain BGP-specific things like the message type, the prefix in a message, withdrawn routes, next hop, AS path and communities, plus the typical things: source IP, destination IP and MAC addresses. So basically everything within the BGP messages is also available to filter on.
The filtering works as positive matching: if you want a specific message to be shown, it has to match. And it's very easy to combine filters, as you can see in the example — I just added two prefixes. If you give one filter a set of values, a message matches on either of them, the first prefix or the second; and the different filters, like the BGP filters, are AND-connected at that point.
We can output in numerous ways. There is the human-readable output, which comes in a nice format if you really want to play around with it, get the results and understand what's going on on a very small data set; it's easy to read and you would use it for exploring the data. For automated analysis there is the JSON output: the results are written as JSON, which is easy to deal with in any scripting language and as input for other systems.
And there is also the line-based format, which is the typical Unix/Linux way to get all the information requested and filtered into line-based output. At this point you can specify which fields you want in the output — maybe you are just interested in the prefix and the next hop IP, and then you get just those two fields.
However, there is a default, which gives you the time, the BGP message type, and the prefix.
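(To illustrate why the JSON output helps automated analysis, here are a few lines of Python consuming one JSON object per line — the field names "prefix" and "next_hop" are illustrative, not necessarily the tool's exact keys.)

```python
import json
import sys
from collections import Counter

# Count how often each (next hop, prefix) pair appears, reading
# one JSON-encoded BGP message per line from standard input.
counts = Counter()
for line in sys.stdin:
    msg = json.loads(line)
    counts[(msg["next_hop"], msg["prefix"])] += 1

for (next_hop, prefix), n in counts.most_common(10):
    print("{0} {1} {2}".format(next_hop, prefix, n))
```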
So, what did we do to evaluate correctness? I had to get out my tshark tool chain again and run it once more — hopefully for the last time — and we verified the results on very simple metrics: we filtered on specific things such as prefix or origin AS, and then simply counted the number of messages in the output our tool produced and the output tshark produced, making sure it's really always the same output, the time stamps are the same, and so on. We did this with several hours of PCAP captures from the route server at DE-CIX.
At this point, it seems that it works correctly. However, we keep checking, since we only just finished the first version of the tool and put it online.
There are also some limitations. There is Kafka export support, so you can directly export the results into Kafka — if you have that sort of tool chain you can easily integrate it. However, we experienced some trouble with Python 2.7; with 3.x it works fine, but there seems to be an issue with 2.7 for now.
There is also a packet re-ordering issue, since we didn't rebuild the entire TCP stack. When we get messages, we sometimes need to re-order them, because by the nature of TCP they don't necessarily arrive in the right order. However, we only do that in a limited way: a complete re-ordering of the packets would mean waiting until some end point, which would limit the live export capability, because you would need to wait, or build a time-out system, and so on. So we don't do that, for the sake of performance — otherwise you would usually need to run through the entire file once first.
And not all features are implemented yet — there are some minor things we didn't implement — but it works well, I have used it for some analyses already, and I'm super happy compared to my previous tool chain.
Here is a very short evaluation of the performance, shown in this graph. The first case comes with a heavy set of the first-stage filters — the IP filters — and you see from the duration in seconds that it is very fast, because the IP filters don't require parsing. The other two curves show that even heavy usage of the BGP-specific filters, which are the expensive ones — you see with the yellow line — takes just a few seconds longer than without the BGP filters. The blue line is the baseline, which applies only the default values and basically does nothing except change the output format.
So, at this point I am going to conclude. PCAP BGP parser is an Open Source tool under the Apache licence that will make your life easier if you run BIRD, or if at some point you want to do analysis on BGP, have PCAP files, and want to filter or look for certain things; and it's nice to integrate into a tool chain such as a shell/bash/Python environment. It's suitable for live parsing, at least for the data we have at DE-CIX: we are about 20 times faster than we need to be for live parsing.
So, thank you, and now I'm happy to take your questions.
(Applause)
CHAIR: I see one hand.
AUDIENCE SPEAKER: Gert Döring, standing in for Job Snijders. Does it do large communities yet?
CHRISTOPH DIETZEL: No, not yet.
GERT DÖRING: Why not? The codepoint was assigned 15 hours ago.
CHRISTOPH DIETZEL: You're right. We failed miserably; we should actually have implemented it!
AUDIENCE SPEAKER: Peter Hessler. This looks really interesting and I have started working on a port for it, but I have noticed that there is not really a framework for it to be installed as a system-wide utility, which would help a lot of external packagers who are working on this. It's a request we would like for the future.
CHRISTOPH DIETZEL: Okay. Thank you.
AUDIENCE SPEAKER: Philip, NETASSIST. I am happy to know that somebody actually builds such utilities. I used something similar in my last job: when I was asked to filter out some interesting packets, those were UDP and not TCP streams, so I didn't have as much trouble with reassembling the TCP stream at all.
But about Python: it's a good language, but not so good when you are making concurrent or even parallel applications; multi-processing in Python is big trouble, a long shot, I think. The better way to implement it is to divide it into several stages, but I'm very happy you found a good way to process the data. It applies in any condition and any environment, no matter what routers you have — even if you have some closed-source Cisco routers and so on, you can just easily capture data and analyse it; the vendor doesn't matter. What I recommend is that you try wrapping C libpcap; maybe you will get some more performance, but I'm not sure. Anyway, I can help you if you reach me in the room — I'll be happy to find a better solution for you — but I am happy that you already use it. Thank you so much.
CHAIR: Any other questions? I have one short comment: there was a remark in the presentation that some feature is missing in BIRD, so I would just like to say that we developers don't bite. If you need some features, you can approach us; we are very happy to cooperate. Just a small remark on that.
Any questions?
CHRISTOPH DIETZEL: No offence, but we needed it within weeks... a feature request probably takes longer than that.
ONDREJ FILIP: Anyway, thank you.
(Applause)
MARTIN WINTER: Okay. Next up is Sander Steffann.
SANDER STEFFANN: Hi. Good morning. I am giving a short update on DHCPKit. The last time I presented this was at the last RIPE meeting, but I presented it in the IPv6 Working Group, and obviously the Open Source Working Group was very offended that I didn't present it here. So... here I am.
So, what is DHCPKit? It's a DHCPv6 library and server framework, written for Python 3.4 and up, GPL licensed, and the whole idea behind it is that you can just customise it to fit whatever your needs are.
The reason I started this project was a fibre-to-the-home project I was doing for a Dutch ISP, where we wanted something very simple. We just wanted to say: okay, we look at the remote-id we get from the relay — from the switch — we see which customer it is, and we give him the right prefix. Nothing dynamic, nothing magic; just look at who it is, give him the right prefix, done. And I was really surprised at how much trouble I had configuring this on existing implementations. This was a year and a half ago.
The existing DHCP servers are all really good at dynamic things, and they implement the standard way people expect a normal DHCP server to behave. But if you get into an ISP situation where you want to do something slightly different — you want to use your own provisioning, stuff like that — it suddenly becomes very hard.
So, obviously, I wrote my own DHCP server, like anybody would do, and I focused on flexibility. There are really good DHCP servers out there for IPv6, but none were flexible enough. So I focused on the flexibility bit, where you can just plug in anything you want, connect it to your own provisioning systems, stuff like that.
I have to thank the SIDN fund; I received some funding from them this year to continue development on this project. It was really useful, so thank you to the SIDN fund.
Okay. A bit more detail. The basic structure of what I wrote.
Basically, there is a main process that starts up, reads the configuration, sets up logging — all the boring stuff — and opens the right sockets. All the main process then does is receive a message and delegate it to a worker process; the worker process handles the request and sends back the response, and the main process has a callback that sends the reply to the client or to the relay. Now, the interesting bit is what happens inside the worker process, because there you can plug in filters and handlers; the process itself doesn't do much by default. You can put in a filter saying: okay, I want this bit of the configuration to only apply to clients coming from this subnet, and if they come from that subnet I want to advertise this DNS server to them. So you can build a very flexible configuration, and those handlers are actually pretty easy to write.
The configuration is Apache style, using ZConfig, and it matches the same structure. You see the subnet bit: that's a filter that says, okay, only apply this bit to clients coming from that subnet. If they are coming from that subnet, I just want to get their assignments from a CSV file and give them these addresses as name servers. And outside of the filter I added an example, SOL_MAX_RT, which is an option that tells a client how often it should retry a solicit message. That one is meant for clients that request something, don't get it, and retry too often, which causes a lot of noise; it's an advisory option to slow the clients down a bit.
Just a small example of how you could configure a server.
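(The slide isn't reproduced here, but a configuration along the lines he describes might look roughly like this — a sketch only; the exact section and option names come from DHCPKit's documentation, so treat the ones below as illustrative.)

```
# Clients relayed from this subnet get static assignments from
# a CSV file and these recursive name servers (names illustrative).
<subnet 2001:db8:ffff::/48>
    <csv-based-assignment>
        assignments-file /etc/dhcpkit/assignments.csv
    </csv-based-assignment>
    <recursive-name-servers>
        address 2001:db8::53
    </recursive-name-servers>
</subnet>

# Outside the filter: advise noisy clients to retry less often.
<sol-max-rt>
    limit 3600
</sol-max-rt>
```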
Now, I am going through these slides quickly because there are quite a lot of them. This is an overview of everything that I have already written for you to use.
So, what's already available? The basic things. If you get a request from a client, it contains a client ID option, and in every response you have to send it back to the client; there is a simple handler that does just that. There is a handler that checks the server ID, because a client might address a message to a particular DHCP server while still using multicast; if the message is for a different server, this handler just says, oh, this is not for me, and ignores it.
And there are all kinds of options: an option for the interface ID for relay messages; the standard preference option; options for telling a client that sends a unicast request instead of a multicast request that that's not permitted, and sending it away. There is a handler that implements the rapid commit protocol, and handlers that set the right status codes. Most of the normal stuff is included.
Then the filters. There are two interesting filters: one is the subnet one that you have just seen; the other one is for the elapsed time. Every client sends an option that says: okay, this is how long I have actually been trying to get a response. So on the first solicit they set the time to 0, but with every retry they increase the time. What you can do, for example, is configure elapsed time on one server and say: I will only reply to messages if this elapsed time is more than 50 seconds. The other server will reply to everything, so the backup server will only reply to clients that complain they have been trying for too long.
Then we have a bunch of informational options. You can set the DNS options: recursive name servers, the domain search list. SIP servers — most of the address and domain options have been implemented. There are both an NTP and an SNTP option, which are completely different; I have no idea why there are two options in there, so they are both implemented. We have an option for DS-Lite, where you can set the AFTR option for a DS-Lite deployment, and I implemented the MAP options for MAP-E and MAP-T — I couldn't find any other implementations of these. And when I was debugging them I found out that Wireshark also didn't understand what they were, so I'm also working on a Wireshark extension to actually parse those; it makes it a lot easier to debug your own code.
Also, options provided by relays. You have the standard interface ID option, but there's also a remote ID option that indicates which client it is, and a subscriber ID option that can indicate which client it is. There is a link-layer option that includes the MAC address of the client as seen from the relay. All of these options have been implemented, and, for example, the CSV handler that I showed you before can use any of these options to identify who the client is and then assign addresses and prefixes to them.
We have some informational options, and we have SOL_MAX_RT and INF_MAX_RT to slow clients down.
We also have a rate limiting handler. This is an interesting one, because some DHCP clients are really, really annoying: they do a request, you send them back 'sorry, you don't get an address from us', and they immediately retry. So you get dozens of DHCP transactions per second from a single client, because it just keeps retrying and retrying. The only way to actually make them shut up is to stop responding to them. So I built a little rate limiting handler where you can say: okay, I identify the clients using the remote ID from the relay, and if a client tries more than this number of times in that number of seconds, just ignore it. And once you start ignoring such clients, they actually slow down and wait for the normal time-outs.
So, it's ugly, but this can really help: for example, the ISP where I started doing this had a big problem where about 10 clients were sending 99% of the DHCP messages.
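(The idea behind that rate limiter, as a minimal Python sketch — not DHCPKit's actual code.)

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Decide to ignore a client (keyed by e.g. its remote-id) once it
    has sent more than max_requests within a window of per_seconds."""

    def __init__(self, max_requests=5, per_seconds=30):
        self.max_requests = max_requests
        self.per_seconds = per_seconds
        self.history = defaultdict(deque)

    def should_ignore(self, client_key):
        now = time.monotonic()
        window = self.history[client_key]
        # Forget requests that have fallen out of the time window.
        while window and now - window[0] > self.per_seconds:
            window.popleft()
        window.append(now)
        # Over the threshold: stop responding until the client backs off.
        return len(window) > self.max_requests
```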
Also, I have an option that is nice to combine with the elapsed time filter that I talked about before: it's just an ignore option. You can say, okay, in the filter, if the elapsed time is less than 30 seconds, ignore all solicit requests; that way, like I said, you can build a nice failover setup.
Next option: static assignments. Like I said, CSV file based, really simple — identifier, address, prefix, just one line per client — and that way you can provision all your customers really easily.
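(So a provisioning file can be as simple as this — hypothetical values; the identifier column is whatever option you match clients on, e.g. the remote-id from the relay.)

```
id,address,prefix
customer-0001,2001:db8:0:1::1,2001:db8:100::/56
customer-0002,2001:db8:0:1::2,2001:db8:101::/56
```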
Now, the downside of the CSV implementation was that if you changed the CSV file, you had to reload the server to make it load the new CSV. So I also added a version that uses SQLite, and the advantage of that is that you can incrementally update the database file and the server doesn't need to be reloaded; it just queries the latest entries from the database.
This is what I'm actually using in production now, in a couple of DHCP servers at an ISP, and it works great. There are even some sanity checks in there: if an update from a new version of the CSV would result in deleting 90% of the contents of your database, you can configure it to say: okay, if you are dropping more than 90% of the entries, please don't do that, please leave them in there — so you can manually intervene if necessary.
That way we can just script everything, without a big risk that sending a corrupt CSV file will delete all the clients.
There is also a simple option to adjust the timers — the T1 and T2 timers in DHCPv6 that tell the client when it should renew or rebind its addresses. With these options you can configure how often the client comes back to check whether everything is still valid. Sometimes you need that, for example, to keep the relays and the forwarding state in the routers up to date.
Then, of course, this is the Open Source Working Group, so I expect you all to participate and start writing code for this — writing your own extensions. This is a very simple example: say we have an XYZ extension, we have some extra DHCP options that we want to implement, and we actually want to have a configurable handler that we can use. This is what's needed for that.
This is the layout I personally use — you don't have to use it. I have the package for the extension, the options are implemented there, we have a setup.py for installing it, and then we have a server extension, which is the bit with the handler in it. So we have a handler, we have some objects representing the configuration as a Python object, and we have a bit of XML that actually defines the syntax for the configuration file.
So, what do you do in the setup? Well, you can define entry points. You can just say: okay, we are now introducing a new option type, or a new message type, or a new DUID type; we are offering extra configuration for the server, extra handlers, those kinds of things — and you can specify that in the setup.
Really easy: you just say, okay, we have DHCPKit IPv6 options, and option number 999 is implemented in this file, in this class. That's it.
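(In setup.py terms, that registration might look something like the sketch below — the entry-point group names are quoted from memory of DHCPKit's docs, so check them before relying on this.)

```python
from setuptools import setup, find_packages

setup(
    name='dhcpkit-xyz-extension',
    version='0.1',
    packages=find_packages(),
    entry_points={
        # Map DHCPv6 option code 999 to the class implementing it
        # (group name, module path and class name are illustrative).
        'dhcpkit.ipv6.options': [
            '999 = xyz_extension.options:XYZOption',
        ],
        # Point the server extension at the package that contains
        # component.xml and the handler factory.
        'dhcpkit.ipv6.server.extensions': [
            'xyz = xyz_extension.server',
        ],
    },
)
```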
For the server extensions, you don't point to a class, you point to the right folder. It then automatically picks up the component.xml, sees from that what the configuration options are, and in that file there is also a link to the class that implements all the bits.
So, this is a very simple example. You define the standard XML stuff: we have a section type called XYZ, and we have a handler factory that will create the handler — this is the bit that represents the configuration as a Python object. You can add a description, and an example of how you are going to use it. The same goes for the options under it: the XYZ section has an option 'address' with an address behind it, and this is the definition of what the address is — the Python data type is provided, plus the description and the example. Those descriptions and examples are automatically extracted from the configuration definition and used to build the documentation. So if you do it like this, you also automatically generate the documentation from the definition.
How does it work? The configuration is read, but one of the things in the configuration is the user ID the server should run as, so while reading the configuration we're still running as root. We therefore try to do as little as possible at that point: just read the configuration and do some sanity checks. After that, that bit of the configuration generates a factory object; then we drop privileges, and then we execute the factory to get the real implementation object, and that is done with lower privileges — it separates the bits that need root from the bits that don't need root.
When that's done, all the workers are started and the handler is copied there. But there are some things that you don't want to share; for example, maybe you want each worker to have its own database connection. So we do a third initialisation step in the worker, where each worker process can do its own local setup if necessary.
Of course, I have already done a couple of extension projects. The first one is for Technicolor CPEs: they implement the SOL_MAX_RT option, but they are still using a private ID for it. So they are implementing the right option, but with the wrong ID.
So this was a really simple thing to implement, and this is one of the examples of how you could write a simple extension.
Another one I have already done is Kafka integration: you can configure the server to send all the transactions that happen — both the request and the response — to Kafka, a message bus system, and that way you can keep track of everything the server is doing. And the next extension project is a Looking Glass that uses this. It's a Django application; you can run it on a separate server, it listens for and collects all the messages from the different DHCP servers, and you get one nice overview of what is happening at the DHCP level.
It includes all the messages; you can search for client IDs, for remote IDs, things like that. This is useful, for example, for a support desk: if a customer calls and says, oh, I'm not getting any addresses, you can actually see what the client is asking for and which response it got back.
Now, over to the current status. I'm currently working on leasequery and bulk leasequery. These are protocol extensions to DHCP. If a relay does, for example, prefix delegation, then when it reboots it doesn't know about the delegated prefixes, so it doesn't know which routes to insert into its routing table. With bulk leasequery it can make a TCP connection — sending DHCP over TCP — and say: okay, I am this relay, can you please remind me what I should be doing; then the server can send all of the relevant DHCP transactions to the relay, so it can rebuild its routing table.
This is the project I'm working on, and it's a bit complex because, like I said, in DHCPv6 everything is UDP, except the bulk leasequery bit, where you use TCP. So I'm still adjusting the framework a bit to allow for that.
And when that's done — because it's still a big change and I want to do it before calling it version 1.0 — I'm actually planning to release version 1.0 in a couple of weeks and call it official.
Now, like I said, of course you are welcome to participate. If you are doing something with ISPs or enterprises, if you need a really flexible DHCP server where you can just build whatever you want with little effort, please come and join.
One of the other use cases I have seen, which was really funny, was students. I got some messages from students who were studying IPv6 and actually used my code to understand and experiment with how DHCPv6 works. So it can even be a nice learning tool.
If you are interested in doing anything, please send me an e‑mail. Thanks.
(Applause)
ONDREJ FILIP: Thank you very much. Are there any questions or comments?
AUDIENCE SPEAKER: Hello. I would like to ask whether it's possible to make this DHCP server actually insert routes into the routing table when it assigns a prefix, because when it's used in some small systems where there is no relay agent, this is somewhat complicated with the current DHCP servers.
SANDER STEFFANN: The only thing there is that you can only insert those routes if you are running as root, so it would be a bit tricky to arrange the privileges in the correct way — you would probably need to start a separate worker process that handles all the routing table updates — but apart from that it should be no problem at all.
AUDIENCE SPEAKER: Peter Hessler. Thank you very much for especially considering the privilege separation and the dropping of capabilities in this. You are listening on the wire, and you can't trust what's being received, so thank you very much for thinking about that.
AUDIENCE SPEAKER: Philip, NETASSIST. Thanks for making such a good project, a flexible DHCP server. At NETASSIST we made our own DHCP server, and we had a big problem integrating it with our switching capabilities — integrating options like remote ID, circuit ID, link-layer address, and so on. It was big trouble for us, but we made it in C code. Anyway, I think this would be a good replacement, much better than what we have. Thank you very much.
SANDER STEFFANN: Thank you. And if I can help you out, just let me know.
ONDREJ FILIP: Any other questions or comments?
(Applause)
MARTIN WINTER: Okay. Next up we have Andrei; he is going to talk about easier ways of putting a CI system into your Open Source projects.
ANDREI: Hi. I am one of the many Andreis at cz.nic, and I want to talk about the software we use to build our software at cz.nic, and how to do it in an easier way.
The premise I always work from when running our infrastructure is that we should own our own data. So we try to avoid using all the funky GitHubs and stuff like that; this is about how to run it yourself, on your own infrastructure, controlling where the data goes.
So, as for the contents — well, I don't know if you know Jenkins. Basically, let me ask: who knows what continuous integration is? Okay, a lot of hands, cool. And who knows Jenkins, or runs Jenkins? Also quite a lot of hands.
So, the first part is about Jenkins and how to make working with Jenkins a little bit easier. And then we have two other contestants in the continuous integration field.
Let me just quickly recap the basic CI concepts. You have source code, which is usually kept in Git nowadays — some still use Subversion, but most software is in Git now — with commits, branches, authors, and so on. Then you have workers, runners or slaves: these are basically the machines that will build or do the stuff for you. Then you have jobs, or builders, that hold the job definition: how to build your software, or, for example, how to deploy it. Then you have schedulers and triggers, which determine when a build will run — on a commit, or at midnight, or something like that. And the last thing is called publishers, or artifacts: the final part, when the build is finished and has produced some tarball or binaries — these are called artifacts — and then you can do whatever you want with them: deploy them on a server, publish them as a new nightly tarball, or something like that.
So the basic continuous integration workflow is like this: you receive a trigger, which may be time based; then, on each build node, you check out the source code and prepare the environment, run the script to build the thing, archive the artifacts, and at the end clean up the environment.
It can also trigger the next job, so you can have a chain of jobs.
So, Jenkins Job Builder. If you know Jenkins, you know it has a huge web interface, and if you need to configure something or add a new job, it's quite a complex task. The Jenkins Job Builder is a thing that originated in OpenStack — I guess because they had a lot of Jenkins jobs — and with it you can basically write templates for jobs. They are written in YAML or JSON, so no more XML, which is what Jenkins itself uses. And by using the templates, you can have the same job multiple times with small differences. It makes things much easier if you have, I don't know, hundreds of jobs, which can easily happen.
So, how to install the Jenkins Job Builder: it's written in Python, so you create a virtualenv and then use the pip command inside the virtual environment — it's very easy, just three commands and you are done. Then you do the basic configuration; this is the real URL I'm using to build Debian packages, and you can see how many jobs it can have. This is the very simple configuration for the Jenkins Job Builder itself, and then you have the configuration for the jobs.

This is simplified, but it's an example of how to build a package for Knot Resolver. You have a name; then this is just a tag I use because I have multiple repositories there, so this is the repository for Knot Resolver. Then you have distributions, architectures, and branches — you can also build from different branches. Then you have the Git URL and the jobs, and this is where templates come to life. Each package has three jobs: one builds the source, one builds the binaries, and the last one runs tests — a sort of testing of the Debian packages: whether they clean up after themselves when you install and uninstall them, stuff like that.

This is an example of one of the job templates. This name maps to the name I mentioned in the job definition. And here you can use the placeholders, which are also defined on the slide: the Git URL is defined here and then used in the job template there. So this is just a generic template you can use for multiple packages. Then you have the shell script that builds the packages, then it archives the result, and as the last thing it triggers the building of the binary packages.
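(Condensed, the kind of Jenkins Job Builder input being described looks something like this — names, URLs and build commands are illustrative, not the actual cz.nic configuration.)

```yaml
- project:
    name: knot-resolver
    git-url: 'https://gitlab.example.org/packaging/knot-resolver.git'
    jobs:
      - '{name}-source'

- job-template:
    name: '{name}-source'
    scm:
      - git:
          url: '{git-url}'
          branches:
            - master
    builders:
      - shell: |
          # Build the Debian source package.
          dpkg-buildpackage -S -us -uc
    publishers:
      - archive:
          artifacts: '*.dsc, *.tar.*'
      - trigger:
          project: '{name}-binaries'
```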
So, this is how it looks in the end. For Knot Resolver it's not that much, but I had to make a few views for the BIRD packages, because that's something like 161 jobs in Jenkins, and it would be just crazy to configure all that by hand — you would have to copy the configuration for each, and I would spend ages doing so. With the Jenkins Job Builder it was a breeze: I have the definition for the packaging, and then I just have the URLs and the names of the packages, and basically that's it; it does almost everything automatically.
So, this was the list of the packages, and this is one of the job examples. You can see the ARM platform is failing for me — I haven't had the time to fix that yet — and it's actually not the Knot Resolver but the libuv job, which is there as well. It makes it very easy to build multiple packages that are similar, or unique environments for the same package.
Buildbot has a different concept; they say it's more of a job scheduling system. It's written in Python — I was relieved to see so many new Python applications today — and the configuration is also a Python script, which gives you flexibility. Again, it has a repository, a build master and build slaves, and at the end there are some notifiers that publish the status of the builds to IRC, or via e-mails, stuff like that.
Again, the installation is very easy; again we use virtualenv and pip. Then you just configure the master and run it. The same goes for the worker — the slave, as they call it: you just install it and then configure the same user name and password you have in the master configuration. And then this is, again, a simplified configuration: you import some Python modules and build up the configuration with workers; you get the source; you can have different change sources — these are basically the triggers, and this one fires on a Git commit. Then you have a build factory where you add steps: here you check out the code, here you run configure, here make all, and the last step runs the tests. You add this build factory to the builders, and that's basically it. There are more options, but this is the basic configuration, and it's again a very easy way to set it up — if you know Python, this might be exactly the thing for you.
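(As a sketch, a minimal master.cfg along the lines just described — assuming Buildbot's 0.9 plugin API; URLs, names and passwords are illustrative.)

```python
from buildbot.plugins import schedulers, steps, util, worker

c = BuildmasterConfig = {}

# Workers: name and password must match the worker-side config.
c['workers'] = [worker.Worker('worker1', 'secret')]
c['protocols'] = {'pb': {'port': 9989}}

# Trigger: build the master branch on every Git commit.
c['schedulers'] = [schedulers.SingleBranchScheduler(
    name='on-commit',
    change_filter=util.ChangeFilter(branch='master'),
    builderNames=['build'])]

# Build factory: check out, configure, build, test.
f = util.BuildFactory()
f.addStep(steps.Git(repourl='https://git.example.org/project.git'))
f.addStep(steps.ShellCommand(command=['./configure']))
f.addStep(steps.ShellCommand(command=['make', 'all']))
f.addStep(steps.ShellCommand(command=['make', 'test']))

c['builders'] = [util.BuilderConfig(
    name='build', workernames=['worker1'], factory=f)]
```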
It has a simple web interface where you can, for example, force a build. This is how it looks in the end, this is an example of a build, and here are all the build steps.
And the last thing I have tested for CI is GitLab continuous integration, which got integrated into GitLab a few releases ago; if you run GitLab, this again might be the thing for you.
Well, GitLab itself is written in Ruby, which was the cool thing like two years ago; the GitLab runner is written in Go, so it's even cooler and newer as languages go. It uses a .gitlab-ci.yml configuration file inside the repository. And again, the installation — I am so sad when I see 'curl https... | sudo bash' in installation instructions, when it's actually just two commands to add a GPG key and another repository, because they provide packages for it.
Then you configure the runner with a token you are given in the GitLab interface. The configuration for a project — if you know Travis, it's sort of similar; it uses similar names. There is something you run before the main script, like installing the dependencies for the build, and then you have multiple job definitions, so you can have more of them, sort them into stages, stuff like that. This is again simplified, just for this presentation; my advice, for all these systems, is to read the documentation before you start fiddling with them. But still, it's very easy to start with any of these systems if you are not afraid to read the documentation.
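(For comparison, a minimal .gitlab-ci.yml in the Travis-like style he mentions — simplified and illustrative.)

```yaml
# Run before every job's script, e.g. to install build dependencies.
before_script:
  - apt-get update && apt-get -y install build-essential

stages:
  - build
  - test

build:
  stage: build
  script:
    - ./configure
    - make all

test:
  stage: test
  script:
    - make test
```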
So, this concludes my talk. I wanted to show you the different continuous integration systems we either use or have considered using at cz.nic to build our software. If you have any comments, questions, tomatoes — please, now.
AUDIENCE SPEAKER: Hi. Leslie Carr, Clover Health. Great presentation — I love using CI tools. I also wanted to mention that Jenkins can be a little tough if you set up the server yourself, but if you are using GitHub you can also use Circle CI and Travis CI: they are hosted, very easy, and they have free options.
ONDREJ FILIP: Any other comment or question? If not, then I think we can thank Andrei. Thank you.
(Applause)
MARTIN WINTER: We are coming to our lightning talks. First off is Gert Döring, talking about OpenVPN.
GERT DÖRING: Hello everybody. This was planned to be given in five minutes and to be very quick, but I have just been told I have 15, because the speakers before me were quick.
Let's see, more time for questions I think.
You know me as the Address Policy Working Group chair, but one of the things I got sucked into in recent years is OpenVPN maintenance. OpenVPN was developed by totally different people, who effectively abandoned the public project; a new group of people took over, and, well, one of them is me.
A few words on the versions we are dealing with. In Git there is a release/2.3 branch, where all the stable stuff comes from. Only bug fixes ever go there, or stuff that is needed for long-term compatibility: if something changes in the wire protocol, and we expect 2.3 to be out there for five more years, and this is really important as a compatibility fix, it goes in there. Everything intrusive, which might introduce really interesting new bugs, won't.
Then there is Git master, which gets all the development and, of course, the bug fixes, and at some point in time this will be released as 2.4, which might happen at the end of this year if we get enough testing and confidence.
We actually tagged what we have in master two weeks ago as 2.4 alpha 2, which is really close to a 2.4 release, so I thought I'd come here and brag about it.
There is also a 3.x branch, which calls itself, looks like, and smells like OpenVPN, but it's a completely different thing. It's a reimplementation in C++, and it's client only. It was done basically because the original OpenVPN author got bored and decided to try a new programming language, and also because he wanted an iOS client for the iPad.
The stuff that we maintain is GPL only, and you cannot put GPL code into the App Store unless you have a written agreement from all the developers to dual-license it under a closed licence as well, which we don't have for the old code base. The new code base is also GPL, but if you want to contribute to it there is paperwork to be signed, because it has to be dual-licensable to be usable in the App Store — not our decision. Anyway...
The URL down there lists the new features in 2.4, and I'm giving a quick run through them now.
One of the biggest things is actually an improvement in the Windows support. One of the major issues with 2.3 on Windows is that you have to run all of it with system privileges, because otherwise it cannot install routes. This sucks. 2.4 has a Windows service that runs with system privileges, while the GUI and the OpenVPN process run with user privileges — your privileges — and talk to the background service to get routes installed. So there is your privilege separation.
We don't currently have something like that on all the Unixes. The mechanics used for this could be extended, but the Unix ecosystem is complicated: you have systemd, NetworkManager, RC scripts, users running OpenVPN, systems running OpenVPN — so actually agreeing on a privilege separation model there is more complicated than you would expect.
Crypto. OpenVPN 2.4 supports the AES-GCM crypto modes, with encryption and authentication in a single pass. This was actually complicated, because you need to talk to different APIs of the crypto libraries — you cannot just tell the crypto library 'please do that'; you have to talk to different functions. And to make this really useful, you have to do it in a compatible way. OpenVPN uses TLS for the control channel, so that part is easy: you just tell the crypto library 'yes, this is okay', and the TLS library will do it for you. But the data channel in 2.3 uses a pre-configured cipher: you tell the client and the server 'use cipher Blowfish', and that's what you have. When you run a big server with 100 clients out there and you want to upgrade to AES, you have to upgrade everything in one go — not very good. So with 2.4, the server has learned to talk different ciphers to different clients at the same time, and when it recognises an incoming 2.4 client, it can auto-upgrade both sides to AES: even if you have everything set up for Blowfish, if 2.4 is talking to 2.4, they will do AES-GCM.
The wire protocol is still fully compatible with everything down to 2.1. So, you can really upgrade as you go.
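(If I recall the 2.4 option name correctly, this negotiation is steered with the ncp-ciphers directive — something like the following on the server side, but check the 2.4 manual page before copying it.)

```
# Legacy default for old clients; 2.4-capable clients negotiate
# up to one of the listed AEAD ciphers automatically.
cipher BF-CBC
ncp-ciphers "AES-256-GCM:AES-128-GCM"
```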
One of the changes we made that actually broke people's configurations was tightening up the TLS ciphers in OpenSSL — removing stuff that really shouldn't be used anyway. That blew up people's MikroTik VPNs. They acknowledged this was a bug on their side and fixed it, so if you run into problems, just upgrade the MikroTik firmware.
IPv6 — that's something dear to my heart. 2.3 has sort-of IPv6: it gets the job done, but it's not pretty. On the connect-to-remote-server side it's not really dual stacked: it can do IPv4 and IPv6, but it will not be smart about it. 2.4 does it the right way: call getaddrinfo(), figure out whether the server has v4 or v6, and then just try both in the sequence the operating system recommends. That means that with 2.4 you can reasonably run it on a NAT64 network — unless you tell it to do only v4; it doesn't know that your v4-only server is now reachable over v6 by the magic of NAT.
The other very important bit, which looks simple but was quite a bit of work, is making overlapping v6-in-v6 work. Overlapping means you talk to a server that tells you 'please route this /32 to me', and the server address is part of that /32 — or the default route. So you need to figure out which gateway the current default route points to, set a host route to the VPN server via that gateway, and clean it up afterwards. It sounds simple; across all the platforms, it's quite a bit of work.
Roaming.
2.4 has learned to support roaming clients. A client is assigned a peer ID, and if the client moves to a new IP address for whatever reason — roaming between Wi-Fi and 3G or whatever — the server will recognise that the IP address is unknown but the peer ID is known, and if the packet validates against that client's key, it will be accepted and the client seamlessly moves over to the new IP. It's really totally seamless — a single packet. Obviously this only works if you do UDP.
Minor features: we dropped Windows XP support, because, well, it is officially no longer supported, and the new IPv6 stuff is just not possible with the limited APIs Windows XP has. We intend to support XP on 2.3 for a longer time, so we are not taking away security software from the users still stuck there, but they will not get the new features. Period. To support more legacy stuff, we added IBM AIX support. And there are a few more minor things that are useful.
Code quality: we actually do continuous integration, using buildbot. Thanks to the previous speaker, I don't have to explain what buildbot is.
These are the platforms we consider officially supported, on different architectures and different word sizes; there is a whole zoo of VMs that builds every single commit we push and actually runs tests: starting up a VPN, pushing data through it, shutting down the VPN, and checking that everything has been cleaned up. So even if it's called alpha 2, we think the code quality is quite good. Also, the Android client gets built from Git master, so the client side of this has been tested by millions of people — really, really well tested. The server side not so much.
So, we still think the code quality is good. But OpenVPN has too many options, and I have been told that people use OpenVPN specifically because it has all these knobs to twist. That means there are option combinations we have never heard of, which we might have broken in 2.4 accidentally. If you use OpenVPN, please test this. And if it's not working, report back. That's it.
I think I can take one or two questions. There is a list of URLs to find information, and thanks for your attention.
(Applause)
ONDREJ FILIP: Okay. We have like five minutes for questions.
AUDIENCE SPEAKER: Hi. Ben from Sweden. Thank you for this; I use OpenVPN and I am happy with it. I'm a little bit intrigued by the branch 3. It's not really a branch, is it?
GERT DÖRING: No, it's actually a separate project — a complete reimplementation. It shares the wire protocol, so the iOS client can fully interoperate with a 2.2, 2.3 or 2.4 server coming from the other world, but it is a new implementation. We talk to each other. It was made by the original OpenVPN 2 author, so, sort of, we still like each other, but he went off doing commercial stuff based on OpenVPN, got bored, and decided to learn C++. So... it's not a competition about who has the nicest thing, but the goals are slightly different, and the licensing.
AUDIENCE SPEAKER: I have got two small questions. On the cipher negotiation: have you considered how it interacts with things like downgrade attacks?
GERT DÖRING: I can only touch on this. There are actually two different aspects to cipher negotiation. One is the TLS that's used for the control channel, and inside the control channel the client and server negotiate which cipher to use for the data channel. Right now, the negotiation is actually quite limited as far as the data channel goes: the client can signal 'I can do AEAD', and if it signals that to the server, the server will just tell it to use AES-256-GCM. So there is no real downgrade attack there, because everything that can be negotiated is stronger than the default Blowfish. On the TLS side, that's one of the reasons why we changed the TLS ciphers to be more strict, which of course gave compatibility issues. I'm not the crypto geek — I am the network and packet geek — but we have very good crypto geeks who tell us that what we do is sane.
AUDIENCE SPEAKER: There was one thing: in theory, if you could manipulate the channel, you could tell it to disable the AEAD support; that would be a downgrade attack back to the default cipher.
GERT DÖRING: If you can do that, you have already broken the TLS handshake. That negotiation bit is fully encrypted and authenticated, so if you break that, we are toast anyway — then you can inject routes and everything.
AUDIENCE SPEAKER: Just to elaborate on that: it would only be a downgrade attack if you could manipulate the control channel; just because there is an upgrade path doesn't mean there is a downgrade attack back.
GERT DÖRING: In that case the client will stick to what is in its config file. If that contains 'cipher none', we're toast.
AUDIENCE SPEAKER: Another question, about the pull filter: is it whitelist based, can it be whitelist based, or is it purely a blacklist?
GERT DÖRING: It actually has a syntax: you can say 'pull-filter accept' expression X, or 'pull-filter ignore' something, and they are evaluated in the order you configure them. So people use this to say: the server is sending me 10 routes and I only want one of them, so reject all the other routes. It's quite flexible.
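(From the client config, that looks roughly like this — check the manual for the exact option spelling.)

```
# Accept the one pushed route we want and ignore every other
# route the server tries to push.
pull-filter accept "route 10.1.2.0 255.255.255.0"
pull-filter ignore "route "
```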
AUDIENCE SPEAKER: A question from Jabber, from Dan, who is just a customer of an LIR: will we be able to push IPv6 DNS servers, like the DHCP options with IPv4, in OpenVPN?
GERT DÖRING: The short answer is that there is currently no support for that. The long answer is that it is complex. Pushing DNS servers is magic. In Windows land, there is a virtual DHCP server running to tell Windows about DNS servers, and that's a v4 DHCP service, so you cannot push v6 DNS servers over v4 DHCP. In Linux land, this information is exported as environment variables that then go to a script which, depending on which operating system you are on, might or might not update resolv.conf. And all this is v4-only today due to, well, no time. The Windows side is tricky because of the DHCPv4 approach; the Linux side in theory is just enhancing a script, but with all the distribution variations, this is a lot of work.
MARTIN WINTER: Any other questions?
AUDIENCE SPEAKER: Philip again; I will try to make it as short as possible. I see you have support for AES-GCM finally — is there AES-NI support?
GERT DÖRING: We use the SSL library that you compile against, so if you use OpenSSL, with all its drawbacks, you have AES-NI and it works fast. I'm not exactly sure about mbed TLS and hardware support right now, but we're not doing the crypto ourselves; we just link against the crypto library that is there.
ONDREJ FILIP: Thank you very much.
(Applause)
MARTIN WINTER: Next up we have Nick Hilliard. He is standing in for the hack-a-thon team who worked on the Bird's Eye API.
NICK HILLIARD: Good afternoon everybody. As Martin said, I am standing in for the Bird's Eye team, who were at the IXP tools hack-a-thon on Saturday and Sunday last. The proposer of this project was Barry O'Donovan from INEX, and Barry set out to build a new method for querying information from the BIRD BGP daemon. So, this is what they ended up with.
So, the people involved: Barry did all of the API back end stuff, Daniel Karrenberg wrote a command line consumer, and Matthias produced a front end for this. The Netnod guys had a look at implementing a native JSON API in BIRD, but it didn't work out, unfortunately.
Why did we do this? Well, there are a lot of reasons, actually. The main reason is that we needed a secure and reliable Looking Glass. There are Looking Glasses out there for BIRD, but they are a little bit heavyweight, and this is code that is actually going to end up running on live production route servers — and production route servers are not the sort of places where you really want huge code bases. We want really bare-bones systems for our route servers, so that there are fewer attack surfaces. But in fact INEX had a bunch of other reasons for doing this, one of which was monitoring: we didn't have enough visibility into what was going on on the production service side. And also analysis and problem solving: figuring out what went wrong, having audit trails, that sort of stuff.
We think it's the sort of stuff that a whole bunch of IXPs are likely to use as well.
It would have been possible to adapt existing code, but in many cases it's actually easier to look at what's out there and reimplement. Frameworks give you the ability to do that.
The interface mechanism in particular is designed with security in mind; security was a top priority. And there's a whole bunch of features here which were taken into account. The first thing is that it's rate limited, so if somebody comes along on the Internet and just starts polling too hard, it's not actually going to affect the production route server, which means there is no way to spank the route server using this, which is very handy. There are actually multiple layers of rate limiting in there.
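(Illustration only, not the project's actual code: a crude per‑client token bucket is one way to build one such rate‑limiting layer. All names and limits below are made up.)

    import time
    from collections import defaultdict

    class TokenBucket:
        """Crude token bucket: 'rate' tokens per second, burst up to 'capacity'."""
        def __init__(self, rate=1.0, capacity=5):
            self.rate, self.capacity = rate, capacity
            self.tokens, self.last = capacity, time.monotonic()

        def allow(self):
            now = time.monotonic()
            # Refill proportionally to elapsed time, capped at capacity.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    buckets = defaultdict(TokenBucket)   # one bucket per client IP

    def run_query(query):
        # Stand-in for the real back end; just echoes for demonstration.
        return "result for %r" % (query,)

    def handle_request(client_ip, query):
        if not buckets[client_ip].allow():
            return "429 Too Many Requests"   # refuse before touching the route server
        return run_query(query)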
There is an enormous amount of parameter checking. The server API is written to take in requests over a web API only, which means you can narrow down what you are getting from consumers and do strict parameter validation on all of your input data.
On the back end, then, there are wrapper scripts, and birdc is configured to use restricted mode only, which means you can only execute show commands. You can't do server reloads or anything like that.
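(Again a hedged sketch rather than the project's real wrapper: birdc's -r flag puts it into its restricted, read‑only mode, and the wrapper whitelists its input before shelling out.)

    #!/usr/bin/env python3
    # Illustrative wrapper: whitelist-validate the argument, then run a
    # single read-only birdc command. Not the Birds Eye project's code.
    import re
    import subprocess
    import sys

    def show_protocol(name):
        # Strict whitelist: protocol names may only contain these characters.
        if not re.match(r"^[A-Za-z0-9_]{1,64}$", name):
            raise ValueError("invalid protocol name")
        # -r starts birdc in restricted mode: only 'show'-style commands work.
        out = subprocess.run(
            ["birdc", "-r", "show", "protocols", "all", name],
            capture_output=True, text=True, timeout=10, check=True,
        )
        return out.stdout

    if __name__ == "__main__":
        print(show_protocol(sys.argv[1]))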
This provides BGP access only. There is no support for OSPF or RIP or anything else like that. We'd like to think that the back end of this project is temporary, and it would be really lovely to see native API support built into Bird at some stage. GoBGP already has this, and it looks really, really interesting, and really nice.
That's a hint, Ondrej.
Here is what some of the JSON looks like. It's pretty simple stuff; it gives very basic information. There is no example here of how the routing information is structured, but it's also pretty simple.
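(Since the slide isn't reproduced here, a hedged sketch of consuming such a JSON endpoint; the host and the /api/status path are assumptions, not necessarily Birds Eye's actual URL scheme.)

    import json
    import urllib.request

    # Hypothetical Birds Eye endpoint; adjust host and path to a real deployment.
    URL = "https://birdseye.example.net/api/status"

    with urllib.request.urlopen(URL, timeout=5) as resp:
        status = json.load(resp)

    # The API is plain JSON, so consumers just pick out the fields they need.
    print(json.dumps(status, indent=2))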
So, Daniel Karrenberg wrote a CLI consumer in Python. It is intended mostly for networking people, because we all kind of like command line tools; you know, pointing and clicking on the web is okay, but the real work gets done on the command line.
It's intended mostly for human use, but it can also be used for limited scripting. It's written in Python. It's not fully packaged yet, but I guess that will happen in time.
So, here is some test output. The main command here is beye. This is the status command; you can specify which fields you want output using the command line parameters. And if you want something that's a little bit more machine parsable, there is a parameter for that, with limited support for further processing. This is a more complicated example, showing the BGP protocols. This is actually from a live system in INEX Cork, pulled a few days ago. We use very complicated protocol names; there are reasons for doing that, but it gives us a unique handle on exactly what the route server is doing.
And you can see that this is all the information that's pulled up, and again you can specify any of these parameters on the command line and run commands like that for checking established connections, that sort of thing.
Here is a routing dump from one of the routing sessions. And, again, you can see it presents the sort of information that you want to see in clearly readable form. So, pretty useful stuff.
And you can ‑‑ sorry, I should also say that this is multi‑RIB aware. This is actually looking at one of the RIBs, and if you look at the master RIB, it will give you different information, because route servers have multiple RIBs, so it's pretty important to have this in from the start.
Okay. I'm not going to do a demo because that's not really going to work from here, but if you want to click on this link, you can go to Matthias's web consumer. That's live, and it has a couple of route servers and route collectors hanging off it from live IXPs.
This is some sample output. So, again, it's the same sort of thing that's presented on the command line, except that these are all clickable links; you click in there and you get routing information about what's going on.
Okay. So, post hack‑a‑thon work. Barry spent a couple of days during the week building a PHP based Looking Glass, and this particular link here, an rc…‑ipv4.inex.ie/lg address, is live; you can click on that. There is a whole pile more documentation updated in the GitHub repository.
Monitoring. We now have monitoring of all our Bird sessions, and this is really good. Throughout INEX, even though we are a relatively small IXP, we still have 24 Bird daemons running. That's quite a lot. So having decent quality monitoring feeding into our Nagios warning system is really handy. We're handling approximately 350 route server sessions, so we need full visibility into what's going on in there, and this gives it to us. Obviously that's fully built into IXP Manager.
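(As an illustration of the kind of check this enables: a Nagios‑style plugin polling a Birds Eye‑like JSON API. The endpoint path, field names and thresholds are assumptions; the exit‑code convention is standard Nagios.)

    #!/usr/bin/env python3
    # Hypothetical Nagios-style check: count established BGP sessions via a
    # Birds Eye-like JSON API and exit 0/1/2/3 per the plugin convention.
    import json
    import sys
    import urllib.request

    URL = "https://birdseye.example.net/api/protocols/bgp"  # assumed path
    WARN, CRIT = 1, 3  # thresholds for sessions that are down (made up)

    try:
        with urllib.request.urlopen(URL, timeout=10) as resp:
            protocols = json.load(resp).get("protocols", {})  # assumed field
    except Exception as exc:
        print("UNKNOWN: API unreachable: %s" % exc)
        sys.exit(3)

    down = [n for n, p in protocols.items() if p.get("bgp_state") != "Established"]

    if len(down) >= CRIT:
        print("CRITICAL: %d sessions down: %s" % (len(down), ", ".join(down)))
        sys.exit(2)
    elif len(down) >= WARN:
        print("WARNING: %d sessions down: %s" % (len(down), ", ".join(down)))
        sys.exit(1)
    print("OK: all %d sessions established" % len(protocols))
    sys.exit(0)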
Some references. The live API is available at that URL. The live consumer is there. All the code is up on GitHub, and we would encourage anybody who has an interest in this sort of thing to download it and take a look at it. If you like it, run it, and if you have any suggestions for improvements, either submit ideas or, even better, pull requests.
So, thank you very much.
Any questions?
GERT DÖRING: Just a very quick question. How long is the default cache time? Because I usually end up in a situation where I am on something that's not working, I fiddle with my config, look at the Looking Glass again, and if that has then cached the stuff for 15 minutes, it would get in my way.
NICK HILLIARD: You are absolutely right. It's configurable in code; I think it's something like a minute or two, but it's fully configurable, and in fact the API gives the age of the data. So, when you are consuming the API data, you can actually tell how accurate that data is. Any good quality consumer should print something to say: look, this data is 30 seconds old, or five minutes, or whatever it is.
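(A hedged sketch of that consumer‑side behaviour; the cached_at field name is an assumption about the payload, not the documented Birds Eye schema.)

    from datetime import datetime, timezone

    def describe_age(payload):
        # Assumed field: an ISO 8601 timestamp saying when the data was cached.
        cached_at = datetime.fromisoformat(payload["api"]["cached_at"])
        age = (datetime.now(timezone.utc) - cached_at).total_seconds()
        return "this data is %d seconds old" % age

    # e.g. print a freshness note alongside every result a consumer shows:
    print(describe_age({"api": {"cached_at": "2016-10-27T12:00:00+00:00"}}))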
GERT DÖRING: Tremendous.
ONDREJ FILIP: Thank you. Any other questions? If not, I will just mention that I understood the hint you gave me, and we will take it seriously.
(Applause)
MARTIN WINTER: That gets us to the end of the Working Group. Remember, please rate the presentations: go to the RIPE website and rate the presentations. And also, you should start thinking about whether you want to present something at the next RIPE. I would appreciate it if you would tell us earlier than usual, so feel free to contact us as soon as you think you have something. Or if you want to hear something but you think you are not the person to present it, and we should try to reach out to a specific Open Source project, let us know too, and maybe we can convince them to attend RIPE and present.
ONDREJ FILIP: Again, thank you Robert for monitoring the Jabber and Anna for taking the minutes and also the stenographer.
(Lunch break)
LIVE CAPTIONING BY
MARY McKEON, RMR, CRR, CBC
DUBLIN, IRELAND.