/299$35c3-talk-9532

================================================

Ladies and gentlemen, on this very last day of this conference, I have the honor and pleasure to bring to you two guys here, Jeff Dileo and Andy Olsen, who are going to take care of the lecture "Kernel Tracing with eBPF". For the ones who don't understand this: extended Berkeley Packet Filter. What they are going to do is introduce you to the functionality of it in the Linux kernel. They will cover some practical uses of the technology beyond mere code profiling. And for me the most interesting part is that they are going to play God a little bit here on stage, only a little mad with power. I don't know what kind of god, maybe the Spaghetti Monster. The stage is yours. Give them a warm welcome.

Thank you. I'm Jeff. This is Andy. We're security consultants at NCC Group. So we hack things for a living and find bugs and tell people how to fix them and stuff, and one of the things that we've been playing around with recently is this eBPF stuff.

So eBPF is extended BPF. But what is BPF? BPF, for those of you who don't know (maybe everyone does know), is Berkeley Packet Filter, which is a bytecode VM designed for being run in a kernel to process network packets really, really fast and do filtering. eBPF is this sort of completely unrelated language that was designed in the Linux kernel as its own thing and merely kind of took the name. It's designed to be JITed, with sort of direct mapping to x86-64 and other modern architectures. It's designed around transpiling a restricted C-like language subset into this bytecode so it can run very fast, and it's being applied to anything and everything in the kernel that might benefit from having programmatic logic applied to it. You interact with it through this bpf syscall, and basically you send it some code, you send it some other metadata about what you want it to be doing, and if it likes it, it gives you a file descriptor; if it doesn't, it gives you an error.
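Editor's illustration, not from the talk: to make "you send it some code" concrete, here is a minimal sketch of what that code looks like as raw bytes. It assumes the 8-byte `struct bpf_insn` layout and opcode values from the Linux UAPI headers and a little-endian host; `encode_insn` and `trivial_program` are hypothetical helper names. It only builds the bytecode for "r0 = 0; exit" and does not call the bpf syscall, which would need privileges.

```python
import struct

# Opcode pieces as defined in the Linux UAPI headers (bpf_common.h / bpf.h).
BPF_ALU64 = 0x07  # instruction class: 64-bit ALU
BPF_JMP = 0x05    # instruction class: jumps (also used for exit)
BPF_MOV = 0xB0    # ALU operation: move
BPF_EXIT = 0x90   # JMP operation: return from program
BPF_K = 0x00      # source operand is the 32-bit immediate


def encode_insn(code, dst=0, src=0, off=0, imm=0):
    """Pack one eBPF instruction: u8 opcode, two 4-bit registers, s16 offset, s32 immediate."""
    regs = (src << 4) | dst  # dst register in the low nibble (little-endian bitfield order)
    return struct.pack("<BBhi", code, regs, off, imm)


def trivial_program():
    """Hypothetical helper: bytecode for 'r0 = 0; exit'."""
    return b"".join([
        encode_insn(BPF_ALU64 | BPF_MOV | BPF_K, dst=0, imm=0),  # r0 = 0
        encode_insn(BPF_JMP | BPF_EXIT),                         # return r0
    ])


if __name__ == "__main__":
    prog = trivial_program()
    print(len(prog) // 8, "instructions:", prog.hex())
```

A buffer like this, plus a program type and a license string, is roughly what gets handed to the kernel when a program is loaded; in practice a framework builds it for you from C.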
So basically the real power of this stuff is what you can apply it to. So what does it look like? Well, this is one form of it. This is a horrific form of macro hell in C that basically no one should ever be writing.

But what is eBPF? eBPF is basically a BPF implementation that just threw all the rules aside, added a bunch of registers, and added a bunch of functionality to call arbitrary kernel functions that are registered into it, and it's just applied to all sorts of things. It has a bytecode verifier that we're going to henceforth refer to as the validator, because it's not really good at verifying, nor validating for that matter, but it does. And the main functionality of eBPF really is the helper functions that are exposed to it depending on the context of what you're doing. So, you know, your socket filtering eBPF isn't going to be able to read arbitrary kernel memory, but your kprobe eBPF is.

So why? Why? High-performance in-kernel packet processing. Safe, high-performance in-kernel packet processing with network tunneling and custom iptables rules. Safe, high-performance in-kernel programmatic operations with syscall filtering, to reduce the need for buggy kernel modules, or all kernel modules, with a firewall subset. Okay. Wait, what? So as eBPF has gained all of these features, the "why" keeps changing and people find new reasons to like it.

So why, really? It depends on what you're doing. We like to hook all of the things, especially the Linux kernel, so this gives us the ability to do that without risking crashing the kernel, which happens when you start writing kernel modules in C. The talk is about kernel tracing, after all. So eBPF is interesting to us because it has the potential to give DTrace, which is sort of the dynamic instrumentation framework for Solaris, BSD and macOS, a good run for its money. Maybe not in the power of the things that it provides you, like metadata for sources of events (Linux is probably not going to get to that level of unified design anytime soon), but it's more programmatic and focused on lower-level operations: taking C code, compiling it to this, and then using the fact that you've basically written C code to operate on C code in the kernel. So DTrace is focused on one-off, human-driven command-line types of operations, but eBPF enables you to do all sorts of crazy, wacky things.

So let's talk a little about tracing. Tracing is basically very fancy logging of program execution. That doesn't really mean much for us. We don't really care about the logging so much; we care about getting things out and hooking things and making things do things and observing things. So we care about dynamic tracing. So what is dynamic tracing? There are two main kinds. There's the kind where you enable and disable existing logging functionality, which we don't really care about. And then there is the kind where you add arbitrary functionality that wasn't there before. We care about the latter, but we also still don't care about the word "logging". We care about dynamic instrumentation. There are two main kinds of dynamic instrumentation, depending on your perspective: things like function hooking and things like instruction instrumentation. And depending on what you're doing and what you're targeting, maybe your function hooking is actually implemented with instruction instrumentation, and eBPF is basically like that.

So the alternate title of this talk is "Instrumenting Linux with eBPF for Fun and Profit", a title that would definitely not get accepted in 2018, had we submitted with it.

So let's go into the history of Linux just a little bit, on its tracing technologies. The most important thing is kprobes, which are over a decade old, but they've gotten faster recently. Basically, nowadays they allow you to hook basically any instruction in the kernel. But if you try to hook a function at its entry, it'll have extra logic around seeing it to completion on a return. So it'll try to analyze the function to determine whether or not it can do the fast path, and if for whatever reason it can't do the fast path of finding all the exits, it will essentially just use a breakpoint and single-step, which is slow, and you don't want to do that.

ftrace is basically a file system API for adding things like kprobes into the kernel and doing all sorts of other stuff. It's not super important from our perspective, but it is used as part of eBPF. perf_events is a whole bunch of crazy profiling stuff, but one of its key features is a very fast ring buffer for copying data from the kernel to userspace very quickly. Tracepoints are essentially that former kind of dynamic tracing, where the functionality was already there, and uprobes are essentially kprobes for userspace memory. And eBPF is this fancy combining robot that plugs all the things together.

So eBPF basically integrates with all of these different kernel technologies, but the core concept when using it for tracing is that you have your eBPF program and you plug it into some sort of data source, using one of two APIs currently. Then you use something like the perf_events ring buffer or an eBPF map as your output to userspace. The latter is actually a bidirectional mapping between userspace and kernel space, so userspace can update it and send input to the kernel. The sources that we're working on generally are kprobes, uprobes, tracepoints, and raw tracepoints, which are basically tracepoints.

The way this works is that you make a whole bunch of crazy syscalls, chained together. You start with your bpf syscall to actually make your program. Then you use the ftrace API in sysfs to basically make a kprobe and then get an ID for it. Then you get a perf event by calling perf_event_open, and you pass in the value that you got out, the ID of the kprobe. After you've made that, you attach the program to it, and then you enable it. That's the old format. In more recent kernels there's a slightly different form where you can skip the ftrace part entirely, but in practice it still ends up getting used, because this magic number of 6 on the slide here is generally gotten through it, but you don't need that. Everything else basically follows the same process, except for the raw tracepoints, which skip the perf_event_open entirely.

So how not to use eBPF is to do it how systemd is apparently using it, which is by assembling it by hand. It's really hard. It's basically impossible to do anything complicated or fancy, and it's also highly error-prone. I'd be very surprised if that code was correct.

So what should you do? You should use this thing called bcc. It's basically a framework for compiling C to BPF or eBPF and then hooking it up to all of those fancy sources. This talk, I'd like to clarify, is not really about bcc, but we basically end up using it for everything because it's essentially the only consumer of the kernel API. That is also the case because the people developing bcc are also the people writing the kernel code for eBPF. So essentially they're writing their own APIs for themselves, which are otherwise essentially completely undocumented, and where they are documented, the documentation is completely lacking. None of that multi-syscall stuff is really documented anywhere. I got that from reverse engineering how this stuff works. So for eBPF, bcc is the only real option for doing this hooking.

So how do you write a tracer with bcc? Well, generally you're going to use the Python API that it provides, and you're unfortunately going to have to write it as a single Python file, so the bigger this code gets, the more complicated and harder it is to write. Your C code you're generally going to store in a Python string. This is sort of the overall structure, but the really important thing here is: you have your string, you tell bcc to compile it to eBPF, and you tell it to register it to events like kprobes, along with your maps for I/O and your event handler callback functions, and everything just works. The single Python file is actually because of a weird limitation in bcc. That will probably get fixed at some point, but right now it's sort of annoying to get around.

So let's write some code. This is what a very simple eBPF kprobe looks like when using bcc. bcc has a lot of code generation going on behind the scenes, so this fancy kprobe__ syntax basically lets it know that there's a kprobe that you want to use, a probe on sys_open, which is the kernel side of the open syscall. The name of this changes between kernel versions; at least, recently there was a change. It has a slightly different syntax now, but it works the same way. So we hook this thing, and then we just call bpf_trace_printk, which is sort of similar to your generic printk, if you've ever done kernel module development. And then we run it and we get nothing.

Why do we get nothing? Oh yeah. glibc, piece of crap. So glibc has for the past while basically made all of your attempts in C code to call open actually go to the openat syscall instead. So we need to actually hook openat, and once we do that, we start seeing all of the events all over the system.

But let's generalize the code a bit, because why should we have two kprobes when we really just want one piece of functionality? As it turns out, both open and openat go to the same place, do_sys_open, which is the underlying implementation in the kernel for them, and so we can now hook them at once. In the printk we now put a %s and try to print out what the path name is, and we start seeing all these random procfs things, and systemd-journald is doing all sorts of wacky stuff, and systemd too. It's really scary stuff.

But printk, the bpf_trace_printk, is actually considered harmful, and the reason for this is that it's a lot like ftrace in that there is one log
buffer shared across the whole system, which means that messages from different active tracers are kind of running into each other, and no one knows what belongs to who.

Your eBPF programs also have a race condition in them with this. The way that eBPF is safe is that it's anchored to the executing process that has created the object or maintains control of the file descriptor, and so essentially when your last remaining process with a handle on the FD dies, the eBPF program gets unloaded, and then it unloads from the probes that it may be attached to. So there's a weird race condition, because you terminate, but if your kprobe is still registered and it hits, then it runs the code and tries to log, and your kprobe only gets detached afterward. But there was also no client to receive the message; the messages just stick around until someone decides to read them. So the next process to try to trace stuff using printk is immediately going to see messages from other people. So that's a problem.

So instead, what you have to do is rewrite the whole thing to use your own sort of custom data passing mechanism. The way that we do this is we use the perf output, the perf_events ring buffer, and so we use this fancy macro that bcc provides to our C code. Then we also set up a scratch space, because the eBPF stack is 512 bytes and we're operating on paths that are being opened, and paths have a max length of 4096 on Linux, so that's not going to fit. So we need to have some off-stack scratch space to put it in, and we use this per-CPU array, which is entirely safe as long as we stay within the one thread of execution. So we do this, and then we just have this sort of annotation for how we access the field. We only have one element in it, so it's always going to be there. This is never going to fail, but we have to put the check in to appease the eBPF validator. After this we just copy all of the data in from the syscall into
the scratch space, and then we perf_submit it, which copies it to the fast buffer for sharing. This is a lot of copies, unfortunately. There are ways to optimize this, but we're not getting into them right now.

So on the Python side, if you're familiar with Python ctypes, it's an API that essentially allows you to interface with native code. We define a structure that basically mimics our C structure, and then we register a handler function that will get called every time the submit happens, and we register it to the table named "output", which is the table that we defined in the C code. Then inside the handler we do some casting and we print the data. At the end we set up this kprobe_poll call on the BPF object that bcc provides us. The naming of this has actually changed in one of the more recent versions of bcc, but it's still there, and it basically polls for all the perf events for us.

So how does this all actually work? Well, let's go into a little bit of cleaned-up strace output. The main thing to focus on here is this BPF_PROG_LOAD up at the top with the type kprobe, because we want to make a kprobe. We get the file descriptor 5 for that. Then all of the /sys/kernel/debug/tracing stuff is the API for ftrace, and the way that you create a kprobe through it is you write something to it, and then you read back the ID of the probe that you set up to get the actual ID value for it. Then you open a perf event onto that ID, passing it as the config. It's a very flexible call, so config may be one of anything depending on what type of thing you're actually doing; it's very much like an ioctl. Then we take that and we call an ioctl, actually, on that file descriptor, and we use that to attach the eBPF program to it. Then we enable it. That way our program is actually associated with the kprobe through the perf event.

So what does this BPF_PERF_OUTPUT thing actually do? Fancy macro. It actually doesn't do anything. If you look at it, it creates a dynamic struct type and it creates an instance of it, but it never actually fills in the fields. That's a bit weird, because basically it doesn't do anything. It's fake. The whole thing is just completely fake. It's mock code that's going to be replaced with code generation anyway, so it just needs to pass the compiler's quick check, and then it gets replaced. This is actually a pretty common way to do codegen-based APIs.

So what exactly is going on when we call this perf_submit over here? Well, it turns out that this gets replaced with a perf event output helper function call. The important thing about this, other than that it passes all the arguments in, is that it passes in a flag called the current CPU identifier. What this does when it's passed is attempt to pull a perf event object on the kernel side out of a perf event array, using the current CPU as the index. The reason that there is a perf event array there is that back in that strace output we had a couple of these map update elem bpf syscalls, which set index 0 and 1 with some perf event file descriptors. So at the beginning of this whole thing, the first thing we actually do is create that perf event array, and we get a file descriptor of 3 back for it. Then towards the end we start opening up these perf events, and in green are 0 and 1. These are actually the CPU indexes, and we get a file descriptor back for each. Then we call this update elem, and we use the file descriptor from the event array, and we pass in this key thing and this value thing. These weird hex numbers are actually pointers, so key is actually the pointer to the zero value, and value is actually a pointer to the 8, and then likewise at the bottom, where key is 1 and the value is 9. Then we just start polling on these file descriptors and everything's great.

So let's switch gears a little bit. Let's talk about the eBPF validator, or hell. It's a very, very painful place to be in. eBPF is doing all sorts of wacky things in the kernel. It's actually real code, it calls functions that can read and write memory, it just takes values and things, so it needs to be safe. So the kernel attempts to validate all of the code before actually loading it. You have your simple kinds of checks that everyone's aware of: it is not allowed to loop or jump backwards, to prevent infinite loops, because this stuff runs straight inline in the kernel. It doesn't get preempted or broken out of. If this thing manages to get into an infinite loop, the whole kernel hangs, at least on that one thread.

But even if your code doesn't have any loops, the validator may reject it for a whole host of reasons. You could have calls that aren't static inline, and then you're technically either jumping forward to the function and then jumping back when it returns, or maybe the function is defined before you, and then you're actually jumping back to call it, and it doesn't like that. So then we start unrolling all of our loops to get around that, but then you have compiler optimizations that kick in. Sometimes your code was actually wrong to begin with, but bcc uses an optimizing compiler pass through Clang/LLVM, and so sometimes with your wrong code, if it uses a small enough identifier as a max iteration count, it'll just inline the whole thing and unroll it. But depending on what you do elsewhere in your code, as you start adding features slowly but surely, maybe something about the optimization changes and it's no longer able to do that. And then you get an error that doesn't make any sense, because the part of the code that you changed has nothing to do with where the error is coming from.

The validator also tries to ensure that all of your calls to those helper functions are passed safe arguments. But the logic for this is kind of bad. There are all sorts of cases where you clearly have bounds that you're not breaking out of and everything is fine; the validator just doesn't realize it, doesn't pick up on it. The reason for this is usually that the optimizer has actually cut the checks out, and the compiler is smarter than the validator is. This leads to a lot of problems. The more code you try to add to make your code obviously safe with bounds checks, the more aggressively the compiler optimizes them out, and then the validator is left with nothing. So you need to play really crazy tricks to prevent the optimizer from leaving out the checks, so the validator can know that you're doing them.

Additionally, depending on updates to either bcc or the Linux kernel, you may have valid code that becomes invalid, or invalid code that becomes valid. Anything can happen. Sometimes what's valid is really creepy. We've had code that was rejected or accepted based on whether a function returned a bool or a size_t that was either 0 or 1; it was being stored in a uint8_t. And depending on other parts of the code, if we would change or comment out one line somewhere else, either the bool was accepted or the size_t was accepted. One of them accepted, the other rejected. It made no sense at all.

So at a certain point I got really mad and I wrote this kernel module that really hackily hooks into the validator to bypass a lot of the checks, so I could just write code when I need to. The damn validator doesn't know what bounds are, so I will tell it what the bounds are. They're all safe. It turns out the validator is actually super tightly coupled to the interpreter. In fact, you can't just turn it off, because as it processes through the code it sets all of these fields on parts of the code that are necessary for it to actually load and run. So you have to put really careful hooks into it to skip certain checks and then set up fake bounds on other parts after they've been set with bad values.

So we have this hacky kernel module PoC called YOLO-eBPF. Of course it implements its own custom function hooking implementation, because it's useful in weird cases where we have eBPF but we don't have the ftrace API available because someone modified something in their kernel config. Caveats: it's x86-64 only, very specifically. I have just shellcode in there, in strings and arrays, and it probably doesn't work with current kernel versions. Almost certainly not 4.20, which dropped a couple of days ago. And if you have unsafe eBPF code, it could very well crash the kernel. We are going to make this code available more to prove a point than anything else. No one should actually use this. We have a link to our repo at the end of the slides; we'll be making it public right after the talk.

So if you don't have the luxury of such a hacky kernel module, what can you do? Well, there are a couple of things that you need to do to appease the validator. One is to initialize your stack memory. Now, this is actually a generic sort of vulnerability in networking, kernel services, anything that crosses a trust boundary: if you have a struct on your stack or anywhere else, and you haven't initialized the memory, and it has padding spaces between the fields, and you try to do something like memcpy it or just write it over to somewhere else, you're going to copy what's in those padding fields, which is maybe not what you meant to do. eBPF really gets mad at this for no real reason, because you're super privileged when you're doing this and you're already intentionally leaking kernel memory, so it doesn't really make sense that it gets mad about it. But if you're copying something from the stack, you basically need to do everything you can to not have those padding fields there. There are many things you could do.

Next, you want to implement your loops with a lot of pragma unroll, through the Clang/LLVM API, to statically unroll any statically bounded loops, and then everything is basically static inline function calls. Newer kernels support not having to do that, but bcc doesn't support that on those new kernels yet, so it doesn't matter.

Then, a lot of kernel code, even if it is static inline functions in headers that are aimed towards helping you parse various kernel data structures, doesn't really work that well in bcc sometimes. Every time bcc sees a pointer dereference to memory that's not in the eBPF memory region, it will actually instrument that: it will recode it to be a call to a helper function called bpf_probe_read that will actually pull the data out. It's not great at converting chained calls or anything within weird scoping contexts in C, so some of that will just fail to convert and it just won't work. So there are a lot of kernel functions where you're dealing with things like the task_struct, where for anything that you want to get out of that, you're going to have to call bpf_probe_read manually and pull it out piece by piece, which is really annoying.

Then, depending on what you're doing, if you're trying to go for super high performance, you may need to implement your own ring buffer to conservatively send data down to userspace with as little copying as possible. If you try to do this, you can't loop. So you need to have a switch that gets called every time, that attempts to update a synchronized counter variable, and then at the end it needs to roll back over. This is actually very carefully written code that appeases the validator. In this particular case we're just doing 0 and 1; if 2 happens, we're supposed to roll back over to 0. If you try to actually put a case 2 in here, the eBPF validator will just not accept it, for whatever reason. So you need to have the default case flip it over, and you can't have another case flip it over, for whatever reason.

Then, a lot of the kernel data structures are actually very dynamic. Not in the sense that they're just pointers to pointers to pointers, but in the sense that they're crammed structs that are just byte-copied into each other, all ham-fisted. If you want to pull things out of them, you need to essentially walk dynamic structures of pointers to things and pull them out and do stuff, and eBPF doesn't like loops, so you need to do that statically with both unrolled loops and static inline functions. So you need to be very careful if you want to do it. It's very tricky, but it'll work, usually. The most important thing when doing this, though, is to know when you don't have enough unrolled loops to fully walk all the data, and you need to bail gracefully when that happens, and know that you're truncating the data somehow.

And then, not quite lastly, but one of the more important things: when you're doing things in the kernel with eBPF, the reason you do it is because you want it to be fast. Otherwise you'd just use auditd for most things. So you want to actually be doing a little bit of preprocessing in the kernel to do filtering, but to do that filtering you need to do things like comparisons. And if you need to determine that a PID is within a set, how are you going to do that without recursion or loops? Well, what you can do is have a balanced binary search tree in userspace, walk through it, and codegen out what the comparisons would be, statically, for the eBPF code. This works actually quite well.

And then lastly, one of the things that the eBPF validator really hates is dynamic-length byte copies, where the length comes from something that you read from somewhere else. Mostly because the compiler will optimize things and the validator will be left in the lurch without enough information to know what's going on. The way that we got around this was to basically use a weird static inline function that would kind of break up the logic, and then finally it started passing. This stuff is crazy.

And then one of the most helpful things that you can do, for bcc at least, is enable the full debug output. That will dump out all of the eBPF instructions, but it will also dump out the C code, sort of preprocessed already, that is being implemented with those instructions. So when you get an error, you can look back at that and figure out where in the assembly something went wrong. Good luck. So it's easy, right? That's it.

So, since we're security consultants, we had to ask ourselves: can eBPF be used for defensive measures? We said, why not? I mean, eBPF is really fast, and if we're successful we can improve the state of auditing. So yeah, let's do it.

So what does security monitoring software do? Basically it just watches everything: program executions, file access, network traffic, administration operations. And hey, kprobes can do all this. So what would make this a good fit? Well, tracing programs can see everything, they can hook into any internal kernel function, and we can observe all kernel and userspace memory.

So to start off, let's implement some example monitoring tasks. We're going to begin with hooking the execve syscall and just keep it nice and simple, a nice refresher. We've got the kprobe__ and then sys_execve, and just a nice simple bpf_trace_printk. All these examples are going to keep it that way, so as not to be confusing.

So now to do something more interesting. We can compare the file path with known standard directories where binaries would be, for example /bin or /usr/local/bin, and we could do this just by checking the filename parameter of the syscall. So we have it set there. We have our directory prefix set; for this one we're just going to check /bin. It's too complicated for eBPF to check all the other directories. And we have our unrolled loop there. We're just going to check byte by byte and see if it's in /bin or not, and if not, we're going to let you know.

Another thought that came out of this was: we have a web app. Would we be able to know if it's executing stuff it's not supposed to be? We can imagine that we have a web app that is just a front end to ping. It takes an IP address as user input, you know, the usual stuff, and we want to know if it's executing anything other than ping. So right here we just have the C, and same thing, hooking sys_execve. In the ifdef, the PID is assigned via Python, via a define, using argparse, and if the current PID is not equal to the PID of our web application, then we just don't continue checking anything. Then we're copying the file path to a temporary buffer so we can get the length of it, and we're doing a quick exit if the lengths are not the same. If it is the same length, we check whether it is actually ping.

So now we're successfully monitoring file executions. Next, we can watch for file opens from certain directories. Again, nice and simple, straightforward: you can hook do_sys_open. But let's make this somewhat useful. We're going to see whether something is accessed under the /root directory, and same thing, you have to pass it through an unrolled loop and then just check byte by byte whether it's actually in the directory.

But we have to let you know something. All of these examples are insecure. Dangerously insecure. Just because eBPF makes sure that it's not going to affect the kernel doesn't mean that the stuff it's running is safe. In fact, the limitations make it extremely difficult to write secure code. One of the things we noticed was a time-of-check to time-of-use issue, and we noticed this when you kprobe a syscall. The user-supplied data that you get from the hooked syscall can change from the time when you copy it to when the kernel copies it
and actually performs the syscall. This is relatively easy to test for. We'll just start with a threaded program where the first thread copies two different file paths into one character array, and then the second one just opens it. You see here some nice output we got, where everything is all mixed up. How do we avoid this problem? Well, our recommendation is to hook internal kernel functions instead of syscalls, preferably where the values are already copied and are in kernel memory.

So there are some things that we have to take into account and make sure that we think about when we're creating these things, such as how symlinks work. And what if it's not accessed by the absolute path, but from inside the directory, via a relative path? Can we fix those? Well, we could attempt to, by comparing the value or attempting to canonicalize it ourselves, but Linux internal structs are just complicated to parse, and it's way too complicated for what eBPF was designed for. Or we can try to find an internal function later on that has it, but for this example that didn't help: it just gets the exact same value as the syscall gets.

We noticed that BCC has example code for network monitoring, and we also noticed that it didn't properly calculate IP header offsets. Specifically, it did not account for the fact that TCP options are variable length, so it was possible to spoof the TCP header. So we sent them a PoC and, of course, a patch. And here we have the patch, where we're now checking the IP header length.

In general, make sure you're performing strict validation, because eBPF doesn't have a copy_from_user, so you could basically be tricked into reading kernel memory through a pointer that was supplied by userspace. So just manually verify the pointers.

So we ask the question again: can it be used for defense? Not directly. The limitations make it hard to use it in this fashion. Instead, it's much more useful for observing the data that flows through the system. So we have unixdump, which is basically tcpdump for Unix domain sockets. It demonstrates our successful fight against the validator, and for that we'll look at the insides. This basically just hooks unix_stream_sendmsg and unix_dgram_sendmsg and retrieves the message header contents from them. And we use Python, like Jeff mentioned before, to dynamically generate C code. This is used to tweak the eBPF program at "runtime", in quotes, and it also helps us get around the loop restrictions. So here is the binary search tree that Jeff mentioned before, and the per-CPU array which keeps track of the ring buffer slot. Because we can't loop, we have to generate the switch that Jeff also mentioned before, and this is the code for that. We have to make sure we're checking the values in kernel space and user space, because there's a race condition where the C might overwrite what Python has, because it doesn't know that Python is finished with that value. So we make sure that the C knows not to use the value if it's still in use.

So eBPF isn't that good at defense. What else can we use it for? Let's talk about offense. Let's assume someone bad gets some privileges on a modern Linux system, maybe like CAP_SYS_ADMIN in a container. Containers are safe, right? Really, what could they do with eBPF? A lot, actually. All that tracing stuff can not only see everything, it can also write userspace memory. I repeat: it can also write userspace memory. bpf_probe_write_user is the magic call that we can make from all these tracers. It allows us to write only to writable userspace memory, so unfortunately we can't write to the text section. We can write to basically everything else, though. So is there anything useful in these regions? Well, buffers for reading and writing data come in through the syscalls. Maybe there's some sensitive stuff being read from a sensitive file descriptor. A script, or maybe code is being executed from it. Maybe there is a privileged process doing that? I wonder. Oh yeah: cron. Cron.

So I wrote this thing called conjob, which spoofs cron jobs. It is the cron autopwner. It hooks all of the stat family of syscalls. Basically, whenever cron is looking to find out whether or not it needs to reload /etc/crontab, it stats it and looks at the last-modified date. So we just update that to make it very recent, so it always loads the file. Then we hook openat and close, so any time that file is opened we save the FD that's going to be used for it and store it for later, and if it's closed we stop keeping track of it. Then we hook read for that FD, and if we see a read in there, then in the return probe we actually stomp over what the contents of the file were actually supposed to be with our own little root command. As one does.

So, a brief demo. This is actually in a container. It's slightly tweaked, because Docker happened to update their AppArmor profile a little bit after we made this, and it killed one of the vectors for how this ran. We can get into that a little bit more, but for now we've basically just updated this to do the attack. As we see in here, this root command is injected right at the top of the crontab. So that's cool.

What about this? We use some maps to have all of the stuff communicate safely with each other. Then we use some hash maps to pass things between the pairs of probes, and we also have this nifty function that allows us to keep incrementing the time forward without having to go back to userspace.

So what else can we do with eBPF? We're going to go for broke. If you recall, we can write to the stack. The stack has return addresses. We can also read the stack and all of userspace memory, so we can scan for the text section and shared libraries. I think people see where this is going. So I wrote this thing called glibcpwn, which is the systemd autopwner. Basically, it scans PID 1's memory for libc, it backs up the stack contents, then it injects a ROP chain into the process right where it's about to return. The ROP chain basically runs dlopen on a path that we provide. We load our malicious library into PID 1, and then we clean up after ourselves, all nice and clean.

So, to talk about what's going on. Not this one. We have this piece of it called find glibc. This guy basically dumps out for us a return address and the base of libc. Then we plug those into the pwn glibc function, which basically goes and does the ROP chain, which I assembled by hand based on glibc, which has lots of fancy gadgets, all very useful. So we run this, and then it eventually returns and runs. It loads our shared library, which, if you see the C code, is actually what was there. It's going to do a syslog, and then it's going to run id and output it to /tmp/evil. /tmp/evil is now filled with root, and we've also seen a log coming from PID 1, systemd: "Hello 35C3".

Let me take the time to briefly go over how this whole thing works. Essentially, we hook timerfd_settime, because systemd basically calls this thing once a minute, guaranteed, and it's very reliable. We then scan forward from the stack-based struct that it passes in, so that we can find the return address. We basically scan for return addresses back into the syscall stub, because we know what's in there: it's going to set eax to the right syscall ID number and then do a syscall instruction. So we keep parsing forward until we find it, and once we find where this syscall stub is, we use the offset that we know from the beginning of libc, and we return both the return address pointer and this. When we made this, we didn't realize that there was a hack to actually get all of the registers of the process, because eBPF and BCC don't normally provide them to you; you have to scan memory to get them. But this is still useful for other things, though.

Then we hook timerfd_settime again, and on the return for this we go and back up the stack for safekeeping. Then we write the ROP chain in. Then we return to userspace. timerfd_settime then returns to right after where the syscall instruction was. Then it returns into our ROP chain. The ROP chain sets up all the variables, all the arguments, and then calls dlopen. After it's done with that, we don't want our module to have to clean up after this whole thing, so what we actually do is have the ROP chain clean up after itself. We call close with a magic negative value that otherwise is not going to do anything in the kernel; the kernel is not going to care. But in our hook on close we use that as a signaling value. When that is hit, we write back most of the original stack, except for the last remaining gadget, and then we write a new ROP chain past the end of where the stack originally was. The reason we do it in two parts is that dlopen itself could potentially use up a large amount of stack space, and it would have clobbered anything that we put there. Then we return to userspace, into the last part of our old ROP chain. The last gadget shifts RSP forward to the new ROP chain. The new ROP chain then just writes back over the last remaining parts of the old ROP chain with the original values, and then it clears RAX, so that the timerfd_settime call that we finally return back to seems to have succeeded.

So glibc is actually super stable. All of these gadgets work across multiple versions of glibc, across different Linux distros. It's all great, super portable. You'll have to regenerate the specific offsets, but they're all basically the same gadgets.

So what else can we do with eBPF? Use it as intended. Once you're setting that kprobe, you shouldn't let anyone stop you, right? You can prevent processes from interacting with the kernel. For example, you can stop processes from listing eBPF programs and kprobes. You can stop them from creating kprobes. You can stop them from loading kernel modules, and you can stop them from phoning home. This is important because with bpf_probe_write_user, when the program is compiled and loaded, there is a bit of logic in the kernel that sends out a message warning that this is in use and that it may corrupt things. So that's a pretty good warning sign. But you have a weird race condition, because anyone reading the message has to read it first, and then if they attempt to use any syscalls to phone home, you can stop those. They basically need to already have something like DPDK to do non-syscall, direct memory-mapped packet I/O to a network device so that you can't catch them. But even then there are all sorts of wacky things you could do to stop them from phoning home. And there's also this magic bpf_override_return helper, which we had problems getting to work, but it's supposed to be able to just stop the syscall from happening.

The one downside of all this is that we need to keep our eBPF process alive, right? Because once it dies, the whole thing goes with it. What if we could make our evil kprobes basically immortal? We've already taken over PID 1, so why not just run all of our other probes from PID 1 itself? Once we're inside, this means we stay alive until the system shuts down, or vice versa: if PID 1 goes, the system goes down with it. Which is great for us, because PID 1 is now systemd for the most part, and everyone knows that thing is just falling apart all over the place. They're not even going to think they're being attacked by space-age rootkits.

So eBPF is useful for everyone except people trying to build IDSes on top of it, which makes no sense. It needs to get a lot better at supporting that use case, to be honest, and right now it just isn't there. So, a plea to kernel developers who might be writing these things: you really need a copy_from_user, so that not everyone has to check the pointers manually to make sure that someone didn't try to open a path that is actually a kernel address. And maybe you also want to provide helpers for these trickier file structures and things like that, so people can actually get full paths out of them. Maybe give us some direct memory comparison operations, so we don't have to do these weird things. And also memset. Please, memset. I don't like to have to waste my limited amount of instruction bytes performing a memset.

So, just some greets out to people. The BCC developers have been super helpful. They're building really cool tooling. All this stuff is just getting better and better every day. Julia Evans has great blog posts on how kprobes and all this stuff work. Same with Brendan Gregg. And Jessie Frazelle has been building this thing called bpfd, which is almost certainly going to be the thing that I would want to inject into PID 1 as my rootkit manager, definitely, because it manages all the things. You can't hide from the future. Are there any questions?

So, we went into Flying Spaghetti Monster mode. Questions, ladies and gentlemen? Number two.

Great talk. Can you just turn off compiler optimization in eBPF and save yourself a lot of trouble with the validator?

If you want to go modify the piece of the code that does it. That whole part of the code is in C++, because it's all LLVM-based, and so you'd have to rebuild BCC to do it. I think there is the ability to set other flags on it, but I'm not sure, where it does the injection, whether it's before or after, and whether you could then override the optimization.

Number three.

Which kernel version have you tried this with? Because there have been a lot of improvements.

So, a lot of this we've been basing on primarily two kernels, I think: 4.15 from Ubuntu 18.04, 4.18 from 18.10, and I want to say 4.16 or 4.17 from Clear Linux.

Number two.

Seeing that it's easy to inject stuff [...] in the first place, and I need it anyway: I see that I need eBPF in the kernel for things like firewall stuff and all, but how do I stop, as a user, scripts I'm installing from injecting more stuff into my kernel?

Sorry, can you say that once more?

How do I make it harder for unknown scripts to inject more eBPF into my kernel?

You need CAP_SYS_ADMIN to do most of this stuff, and at least on containers there are a lot of restrictions in, like, AppArmor profiles and SELinux stopping people from interacting with the sysfs that is used for a lot of this. The problem is that in the newer kernels they've implemented another way around this, which just involves using either the direct bpf syscall or perf_event_open without going through sysfs. So that's going to completely bypass that protection. Probably not, sir.

Do you know if any of the cloud providers which offer containers as a service are vulnerable?

I haven't looked. If they're giving you CAP_SYS_ADMIN for real, one that isn't user-namespace-based, they've got other problems. Going from CAP_SYS_ADMIN to full root is more of a neat party trick. It's not really a full privilege escalation. But if you happen to do it in something where someone is trying to apply all these other restrictions on your container, and you happen to find a way out, that would be bad. But also, they probably shouldn't have given you CAP_SYS_ADMIN in the first place.

Great. Is there someone else out there? Anyone else? No. Well, Jeff and Andy, Andy Olsen and Jeff Dileo. We're going to shut it down here. Give them a warm applause.
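
Editor's illustration: the talk describes using Python to generate C for the eBPF program because the program itself could not loop, so a path-prefix check has to be written as one comparison per byte. The following is only a minimal sketch of that idea and not the speakers' code; the function name gen_unrolled_prefix_check, the buffer name path and the "return 0 on mismatch" convention are assumptions made for the example.

```python
def gen_unrolled_prefix_check(prefix: str, buf: str = "path") -> str:
    """Emit C statements that compare `buf` against `prefix` byte by byte.

    Hypothetical helper: no loop appears in the generated C, only one
    `if` per byte, which is the shape a loop-free eBPF program needs.
    """
    data = prefix.encode("ascii")  # raises for non-ASCII prefixes
    lines = [f"/* unrolled prefix check, {len(data)} bytes */"]
    for i, byte in enumerate(data):
        # Numeric byte values avoid any C character-literal escaping issues.
        lines.append(f"if ({buf}[{i}] != {byte}) return 0;")
    return "\n".join(lines)


if __name__ == "__main__":
    # Generate the check for files under /bin/ and print the C text.
    print(gen_unrolled_prefix_check("/bin/"))
```

The generated text would then be pasted into the C source string that Python hands to the compiler, so changing the watched directory only means regenerating the string, not editing the C by hand.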