Hey you! Prior to transcribing, please look at our style guide: https://wiki.c3subtitles.de/en:styleguide. If you have any questions, you can either ask us personally or write to us at https://webirc.hackint.org/#irc://hackint.org/#subtitles or https://rocket.events.ccc.de/channel/subtitles . Please don't forget to mark your progress in the progress bar on the talk's website. Thank you very much for your commitment! ====================================================================== Thank you, guys. This Congress has been a real experience; this was my first one, so it's great to be here as a presenter for the very first time, too. We are both researchers from Munich, and this is the topic that we have been looking at for the last three years. So, briefly, what this talk is going to cover: we'll tell you what our motivation is, why we care about this technology, and what this technology actually is. We will talk about that and we will show you demos. But really, the three concepts that we will try to focus on in this talk are isolation, interpretation and interposition. After we've covered the basics, we'll look at cloud security and how this technology applies to it, and discuss open problems. This will lead the discussion into kernel code and code integrity, and that's what Tom is going to talk to you about. And then we will have some conclusions.
So really, our motivation has been malware collection and malware analysis, using virtualization and virtual machines to do that. We have also been looking at intrusion detection and intrusion prevention. I was actually part of a project last year that looked at that for seven months, and we created a prototype that we will actually show today. But the technology is also applicable to debugging and, more importantly, stealthy debugging using virtual machines. And of course, cloud security is a big part of that, as the entire cloud is based on virtual machines. The upcoming things are mobile security and using the ARM processors that are in your cell phones to apply this technology to provide some sort of protection against malware. Now, our motivation is not to do DRM and espionage and stealthy rootkits, but this technology is also very applicable to that. We are really interested in the defense side. So you have heard a lot about hacking; this talk is not about hacking, this is about protection. So when you run your virtual machines and you need to tell what's happening in a virtual machine, or you need to control it, the common approach that you have today is that you install something in it, right? You have VirtualBox, you install the VirtualBox guest tools; you have VMware, you install the VMware tools. You install those tools, which is very easy to implement and convenient. They can sometimes use shared memory, or just use the network to communicate with your server outside. You can also use network monitors, Snort or whatever IDS systems you have, which is better than in-guest agents, which really have no isolation, since they are running in the virtual machine. Network monitors have the isolation of being outside of the machine, so they are more protected against attacks from within the VM, but at the same time, you lose context.
You look at the network and you see very limited information about what is actually happening within that virtual machine, especially if the traffic is encrypted. There are some steps towards using live forensics on virtual machines, or even on physical machines, where you just scan the memory, which has isolation and context, but you really just have a passive view into the system. So all of these are valid approaches, but they all have their limitations. And this is where VMI really comes into play. The basic idea behind VMI is that you look at the virtual hardware of the system, and just by looking at that, you try to understand and reconstruct what is happening within. In that sense, it's very similar to deep packet inspection, where you're looking at packets and you try to reconstruct what's happening within the operating system. And you need to have some understanding of what operating system is running within that virtual machine, just as when you're doing deep packet inspection you need to know which operating system segmented the packets to be able to reconstruct them. And for VMI, really, the three points that we want are isolation, interpretation and interposition. By isolation, I mean that you have some sort of increased resiliency against attacks. You also have a complete view of the system: you have access to everything that that system is doing. And since you are at a more privileged level of the system, you can even interpose yourself into the execution of that machine. So we're going to look at these three things. First, isolation. The reason we are moving things out of the virtual machine and not having in-guest agents is that if you run the code outside, you can avoid in-guest hooks, which are the most common attack vector on anything that you run within that machine. That makes tampering harder, because you have isolation provided by the hypervisor.
Of course, that depends on how good your hypervisor is and how hard it is to break out from the machine. But provided the hypervisor is secure and your hardware is good, you have increased trust that the code is going to do what you want, because it's in a protected region. By doing this, you also gain some performance, because instead of having to deploy antivirus software in every virtual machine that you want to protect, you can just have one antivirus that protects all of your virtual machines. In that sense, you can avoid things like the antivirus storm, where all of your antivirus scans kick in at the same time on your virtual machines and just kill the hardware. Interpretation is a very critical thing, because you really want to understand what's happening within that machine, and we have a very heavy focus on memory, because memory is really the common point of all the hardware that the system is using. We, of course, also have access to the CPU registers, the disk and the network, so those can come in handy as well, but we're really going to be focusing on memory. The reconstruction of the state from memory is really hard, though, and there are many problems with it, mostly because of complexity. You look into a virtual machine and it probably has Linux or Windows running, and that's a large piece of software. Trying to understand what that large piece of software is doing just by looking at the virtual hardware is hard. And in the case of Windows, you don't really have access to the source code, so you have to reverse engineer a lot of things. And while your code is running outside of the virtual machine, the data that code is interacting with is still potentially tampered with. So you have a dilemma of what data can be trusted in the machine.
But let's start at the beginning. What you see with virtual machine introspection is physical memory, but of course, that's not what the operating system is using. It's using paging and virtual memory, which has been around for a long time. The basic idea is that the operating system sets up the page tables, and the hardware walks these tables when it needs to translate to a physical address. So it's a nice interface between hardware and software: software provides the data and the hardware actually walks it. The problem with this very basic thing already is that we have a bunch of different paging modes: we have 32-bit paging and the PAE extension to that, and then we have 64-bit paging. So now you have to reconstruct all of these paging mechanisms and emulate what the hardware would do. That part is fine; it is defined in the Intel manual, so it should be straightforward, except there are three bits that the Intel manual says are up to the operating system to decide how to use. And Windows does actually use these bits, at least two of them, which means that how these tables look and what we can get from memory differs depending on which operating system is actually running. So, for example, Volatility has this line when it does a translation that looks at these software-defined bits in the page table entries, saying that if the 11th bit is set but the 10th isn't, then that page is present in memory. And it does that for every translation, even if the guest is actually Linux, which is, of course, incorrect. Now, this is not a big problem, because, OK, we need to understand that we are looking at Linux and not do that, but it already shows that accurate reconstruction is complex and you can easily miss things. And this is still the case with Volatility.
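The software page walk just described — emulating in code what the MMU does in hardware — can be sketched as follows. This is a toy 32-bit, non-PAE walk over a synthetic physical-memory snapshot, purely for illustration; it is not code from Volatility or any real tool, and a real OS-aware tool would additionally interpret the software-defined bits (e.g. Windows' transition pages) where this sketch simply gives up.

```python
import struct

PAGE_PRESENT = 1 << 0      # bit 0: entry is valid
# Bits 9-11 are ignored by the MMU and free for OS use; Windows uses
# some of them, Linux does not -- a generic walker cannot assume either.

def read_u32(mem, paddr):
    return struct.unpack_from("<I", mem, paddr)[0]

def translate(mem, cr3, vaddr):
    """Walk PDE -> PTE like the MMU would; return physical address or None."""
    pde = read_u32(mem, (cr3 & ~0xFFF) + ((vaddr >> 22) & 0x3FF) * 4)
    if not pde & PAGE_PRESENT:
        return None
    pte = read_u32(mem, (pde & ~0xFFF) + ((vaddr >> 12) & 0x3FF) * 4)
    if not pte & PAGE_PRESENT:
        return None                 # an OS-aware tool would check the
                                    # software-defined bits here instead
    return (pte & ~0xFFF) | (vaddr & 0xFFF)

# Tiny fake physical memory: page directory at 0x1000, one page table
# at 0x2000, mapping virtual 0x00400000+x to physical 0x5000+x.
mem = bytearray(0x10000)
cr3 = 0x1000
struct.pack_into("<I", mem, cr3 + (0x00400000 >> 22) * 4, 0x2000 | PAGE_PRESENT)
struct.pack_into("<I", mem, 0x2000 + 0 * 4, 0x5000 | PAGE_PRESENT)

print(hex(translate(mem, cr3, 0x00400123)))  # -> 0x5123
```

The same structure repeats for PAE and 64-bit paging, just with more levels and wider entries, which is why an introspection library has to implement every mode the guest might be using.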
There are, of course, other problems with memory. If memory is paged out, then you don't have access to it directly, but we have access to the disk. So now we have to look for the file and reconstruct the swap file, which is, again, totally dependent on the operating system and on how it's implemented, which just adds more complexity on top of that. Fortunately, we can avoid some of that complexity now: for example, we can inject page faults from the hypervisor to have the guest operating system bring those memory pages back from disk, so we don't have to understand the file system and the swap file. Of course, that takes time, because the machine has to execute to bring that back, and it's only going to be available in the next release of Xen. But it's progress. Another problem with paging is that you need to find the page tables, and the way forensic tools find the page tables in memory is by really just having a signature that they scan for. So in this code segment here, we have, again, Volatility, which defines the four bytes there that are the signature for scanning for page tables. But that's actually just the signature for a process; in fact, it's just part of a header that comes before a process in memory. And of course, that's not very robust. With VMI, we can do a little better: we have access to the registers, so we can just read out the CR3 register from the virtual machine and use that for translation. So VMI has advantages over just raw memory forensics. After we are able to translate virtual to physical addresses, of course, we need to understand the kernel and reconstruct what the kernel sees and what the kernel does, which really requires debug information. Microsoft gives the debug information away for free, but of course, the format is proprietary. It's the PDB format, which fortunately over the years has been slowly reverse engineered, but it's a pain to work with.
But recently, Rekall, which is a fork of Volatility by Google, added really nice support for this, where you take this debug information from Microsoft and just dump it into JSON files. So with Windows, this is actually quite nice; this is very workable. With Linux, this is, of course, a bit more problematic, because we have a ton of different kernels. Even if you're not taking into account custom compiled kernels, even if you just look at stock Ubuntu kernels, there is really no central cross-distribution repository that you can get this debug information from; every distribution has its own. Maybe they have it for your distribution, maybe it's already gone. So getting that information is not as easy as on Windows, where you just have a very nice central repository. There is some effort in that direction to have a central place where you can grab the debug information, and that's the Fedora Darkserver, but it's not really used. I mean, how many people have even heard of it? I don't see any hands. Right. But there are at least some initiatives in that direction. Going back to scanning: even if you have the debug information, you need to find the data structures that you care about, right? You want to find the data that you are interested in. So you need to find first the kernel, and then the processes and files and whatever else you are looking for. On Windows, at least, this is done by looking for pool headers, which are essentially just a debugging header attached to the memory allocations that Windows does, for example when it allocates a file object. Very similarly to the scan that we saw before, these are usually four-byte signatures that you can scan for. But the problem with that is that you get a lot of hits and you don't know which one is valid and which one isn't, because you have partial structures in memory and old structures in memory.
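The pool-tag scanning just described can be sketched in a few lines. The memory image and offsets below are synthetic, and the tag value is only illustrative (Windows file objects use a "File" pool tag, modulo the protected-pool bit); the point is that the scanner alone cannot distinguish a live allocation from a freed one or from the same four bytes appearing as plain data:

```python
# Naive pool-tag scan over a raw memory image, the way memory-forensic
# tools locate kernel allocations.
TAG = b"File"

def scan_pool_tag(image, tag):
    """Yield every offset where the 4-byte tag occurs."""
    off = image.find(tag)
    while off != -1:
        yield off
        off = image.find(tag, off + 1)

image = bytearray(4096)
image[0x100:0x104] = TAG      # a "live" allocation
image[0x900:0x904] = TAG      # a freed structure still sitting in memory
image[0x300:0x304] = TAG      # could equally be the ASCII string "File"

hits = list(scan_pool_tag(bytes(image), TAG))
print(hits)   # three hits -- the scan alone cannot tell which is valid
```

Every hit then needs further validation logic against the surrounding structure, which is exactly the added complexity and fragility discussed next.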
And then you just have false positives, because you are scanning for four characters in memory and you're going to find a lot of them. And really, memory doesn't get reset: if you free a structure, it doesn't magically disappear from memory; the operating system just says it's not there anymore, but it's really still in memory. So you need more logic to validate those hits, and now you have even more complexity in trying to interpret which data is valid and which isn't. And that just makes everything more fragile. And of course, people have discovered that this is fragile. In 2012, there was a paper about a one-byte modification that just broke finding the page tables, and Volatility just gives up because it can't translate to physical memory. And earlier this year, there was another talk about just adding a ton of fake signatures, and then Volatility throws up as well, because it can't tell which one is real, and then you have to go through them by hand. And that's really not a feasible approach. That really highlights that we have fundamental problems with trusting the data, and we don't have a good way to evaluate which of the data structures that we find through scanning are valid. And that's where interposition becomes critical, because if you know the execution flow in the system and you can trap at specific locations, you know what state the system is in, and then you can avoid scanning. So, for example, instead of having to search for the KDBG structure, which forensic tools use to map out some basic structures, we can just use the vCPU registers and automatically find the kernel. And since we use the debug data we generated, we don't have to use the in-memory debug data. And then we have a complete map of the kernel.
Furthermore, the heap allocations that forensic tools used to find files and whatnot can be trapped automatically when the kernel actually allocates something on the heap. And what's great is that there is actually native support for this built into Xen, so you don't have to custom patch some weird hypervisor; you can just use what's already provided. This is great because it was actually designed for debugging, but it's really unknown and there is not much documentation on it. If you just try to read the API, it's not going to tell you much, so you really need to look at some sample code that is hidden within the Xen source, but it's there. So let's look at what I'm actually talking about. This is not a live demo, I prerecorded it, but for all intents and purposes it might as well be. I'm running Xen 4.4 and I have two VMs running: dom0 and Windows 7, 32-bit. And I can list all the processes that are running in the system. This is very similar to how forensics does it, so that works, and I can see more processes running than what Task Manager tells us. There are the kernel modules, which show up, and the NT kernel that you see, so you can tell what's loaded in memory for that machine. But these are all using the kernel's internal data structures, which you might not necessarily trust. So now what we see here is actually the live execution of the Windows kernel, where I trapped all internal system functions that result from system calls. So these are not the system calls themselves that I'm catching, but the functions that are being called from the system calls. And as you can see, there are a lot of things happening within Windows, even if I'm not doing anything, right? It's impossible to follow what's happening, so I just dump it into a screen log so I can grep through the live output. So these are actually the functions that are being called, and you can catch some of them.
But I mean, I'm not doing anything, and there are a ton of them being called all the time. This gives you an idea that you can actually jump into the execution of the system and see what's going on. And if you have a deeper interest in any of these functions, then you can actually look at them. For example, one of the functions I'm trapping is the heap allocation function, where I can check what structures Windows is actually requesting to be allocated, and heap allocations, of course, are happening all the time in the system. So if I just move the mouse around, you see a ton of structures being constantly allocated on the system, and that's just by moving the mouse. Now, if we have a deeper interest in some of those structures that are being allocated, we can look at that as well. So, for example, if I want to check what files are being accessed by Windows, I can do that just by watching what file objects are allocated on the heap. And I just clicked on some random application, and these are all the files that Windows accesses in the background. I'm not even doing anything; this is just still loading, right? It's still going. All right. So if you want to debug an operating system, this is really great, because you have a full understanding of what it's doing. And just to show you that this is actually what's happening, I just create a document here on the desktop. And as you can see, there are still a lot of things happening, so I don't even necessarily see the file that I created. So I'm actually just going to grep through that screen log to see if the file that I created was actually in there, and yes, it's there, and it's actually there with the .txt extension, because Windows automatically adds the extension for me. I did not know that. So the way this works is actually using four types of events. Unfortunately, this is for Intel only, but that's good enough. The first type is move-to-CR register events.
There are three control registers the operating system uses to define various features. CR3, of course, holds the page table address that is used for the currently executing process, and the other control registers hold various options and can be used to flush the TLB. We can trap every time those registers change, so we know when a new process gets scheduled in the operating system. So that's great. The second one, the most important one that I use here, is debugging breakpoints. These are just really the breakpoints that you would use if you run GDB or OllyDbg, the 0xCC instruction opcode, and you can just write that anywhere in the kernel code pages and configure the machine to trap into the hypervisor when such a breakpoint is hit. So that's pretty great. The third event type is EPT violations, via the Extended Page Tables, where we actually have a whole other set of page tables, maintained by the hypervisor, that map what memory is allocated for that virtual machine. Before, this was done via shadow page tables, but now with EPT this is all managed by the hardware. The good thing is that you can set different permissions in the EPT tables than what the operating system sets within the virtual machine, so you can actually trap various accesses: reads, writes or instruction fetches from memory. And the fourth one, which is also critical, is the monitor trap flag, an invisible single-stepping feature built into your Intel processors now, where you can actually single-step a virtual machine without anything within that machine knowing that it is being single-stepped. There are a bunch of other features that Intel now allows you to trap on, but these are basically the four that Xen currently supports.
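To make the event model concrete, here is a toy consumer of such a hypervisor event stream. Real code would receive these events through Xen's vm_event interface (e.g. via LibVMI callbacks); the event stream and CR3 values below are synthetic stand-ins, just to show the kind of bookkeeping the monitor does — e.g. treating a never-before-seen CR3 value as a newly scheduled process:

```python
from collections import Counter

# The four event types discussed in the talk, as simple constants.
CR_WRITE, INT3, EPT_VIOLATION, SINGLESTEP = range(4)

def consume(events):
    """Count context switches (CR3 writes) and breakpoint hits."""
    stats = Counter()
    seen_cr3 = set()
    for kind, value in events:
        if kind == CR_WRITE:
            stats["switches"] += 1
            if value not in seen_cr3:
                seen_cr3.add(value)           # first time this CR3 is seen:
                stats["new_processes"] += 1   # a new process got scheduled
        elif kind == INT3:
            stats["breakpoints"] += 1
    return stats

stream = [(CR_WRITE, 0x1AA000), (INT3, 0xFFFFF80000001000),
          (CR_WRITE, 0x2BB000), (CR_WRITE, 0x1AA000)]
print(consume(stream))
```

The same dispatch shape extends naturally to EPT violations and single-step events; the hard part in practice is deciding, per event, what guest state to read before resuming the vCPU.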
But as you can see, this is already pretty cool; you can do a lot with these four types. As I said, using the Xen API directly is really not very nice, and it's kind of hard to wrap your head around how things go together, but fortunately, you don't have to. The way I implemented this system is using LibVMI, which is a hypervisor-agnostic library that we have been working with and actually extending heavily. It is a wrapper API around Xen and KVM, and even if you just have raw memory dump files, you can do introspection on those, and it supports all the paging modes that are out there. Plus, I recently added ARM support to it, so you can actually do introspection on Android devices, for example, but for now it's basically Windows and Linux. It has a Python interface. And really, the idea here is that you write code once using LibVMI, and then if you switch the hypervisor underneath, it's not going to matter, as long as the drivers are set up properly beneath LibVMI. You write code once and it's good to go. You can use it to read and write into memory, and it has a wrapper around Xen events, so it's actually intuitive how to do things like single stepping and setting traps up. And it's open source, it's LGPL, so you are free to use it with any project that you want to implement. Now, a little bit more detail about how the actual tracing that we just saw happened: I injected a breakpoint into ExAllocatePoolWithTag, which is the heap allocation function. But of course, when that function is called, the memory is still not allocated; we need to catch when that function finishes. So we need to extract the return address from the stack and trap that as well. When the return address is hit, then we can actually extract the address where the memory got allocated. The tricky part is that there can actually be a bunch of different threads calling this function.
And when threads are interleaved in this function, it can get complex, so you need to keep track of all the callers and which ones are actually active, so that when it returns, you know that, OK, this was the structure that I was actually waiting for. And then you know where that structure is allocated. But of course, at that point it's just a memory address; the structure is not initialized or anything. So now you have to really watch that memory region as it's first being zeroed out and then slowly updated, as the operating system fills in all the headers and the information that we really care about. So, for example, here we would care about the access to the file and the file name. So we just set up the EPT traps to monitor that page as it is being updated. But of course, that traps the entire page that the structure is on, so you're going to have unrelated write events as well. So there is even more logic in there, and that adds overhead. But as you could see, it was quite responsive; you could move the mouse around and interact with the machine. So it's not too bad. What's really cool about heap tracing, though, is that some basic kernel rootkit mechanisms can be really sidestepped. For example, direct kernel object manipulation, which has been around for 10 years, is the idea that you can break the integrity of kernel data structures without actually affecting the execution of the system. You would have, for example, a process, in this case in the middle, that doesn't want to be found by Task Manager or by the user, and what it would do is unhook itself from the process list that I actually showed you in the beginning, where I listed the running processes. That's just a linked list, so if you just switch the pointers out, the structure will still be in memory, but you won't be able to find it through the linked list. But with heap tracing, you know exactly where every structure was allocated, without having to walk the linked list.
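The per-thread bookkeeping described above can be sketched as follows. The "events" here are synthetic stand-ins for the hypervisor traps (breakpoint on allocator entry, breakpoint on the extracted return address), and the field names are illustrative, not a real trap record format:

```python
# Sketch of tracking a trapped kernel allocator: on function entry the
# allocation does not exist yet, so the requested tag is recorded per
# thread; when the corresponding return address is hit, the allocated
# address is available in the return-value register.

def trace_allocations(events):
    pending = {}      # thread id -> tag requested at function entry
    allocations = []  # (tag, allocated address)
    for ev in events:
        if ev["type"] == "entry":
            # Several threads can be inside the allocator at once,
            # so callers must be tracked per thread.
            pending[ev["tid"]] = ev["tag"]
        elif ev["type"] == "return" and ev["tid"] in pending:
            allocations.append((pending.pop(ev["tid"]), ev["rax"]))
    return allocations

events = [
    {"type": "entry",  "tid": 4, "tag": "File"},
    {"type": "entry",  "tid": 8, "tag": "Proc"},   # interleaved thread
    {"type": "return", "tid": 8, "rax": 0x8234A000},
    {"type": "return", "tid": 4, "rax": 0x8234B000},
]
print(trace_allocations(events))
```

Note how the interleaved thread returns first and is still matched to the right request; without the per-thread map, the two allocations would be attributed to the wrong callers.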
So who cares if it's unlinked from there? I know exactly where in memory that process structure was allocated, and that increases the trust in the data. Furthermore, I can do some type of cross-validation: OK, I know this structure got allocated at this address, but it's not showing up in the built-in lists. Well, that's probably some rootkit. So let's look at another demo. Again we list the running processes, and of course, there is a lot running, so I cut down the output so we can actually see what's happening here. What I'm trying to show you here is that you can really catch any event that you want. In this case, I wanted to catch when a file gets deleted, but before it actually got deleted, and what I did is I actually fired up Volatility and dumped that file from memory before the operating system was actually able to erase it. So now it's in the temp folder, and there you go: it's actually extracted into the control domain. So you really have full access to that virtual machine, and you can reconstruct everything that's happening within, and even extract files that were, in this example, closed and deleted, but that I was able to extract from memory because Windows doesn't actually delete them right away. And this is very handy when you're dealing with malware, because a lot of the time you have write caching enabled on the disk. So when you actually say "save this file", Windows doesn't actually save it to the disk right away; it just queues it up and waits for a while until it has enough writes queued up, and then writes to disk. So even if I look at the disk after I saved the file, I might not even find it, because it's still in memory. And with malware, this is actually usually the case, because you have temporary files that the malware dropper extracts and then loads into memory and then cleans up after itself. And you really need to catch the delete events, because otherwise that memory region can get recycled.
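The cross-validation idea above — comparing the OS's own linked list against the ground truth from heap tracing — reduces to a set difference. A minimal sketch with synthetic addresses:

```python
# Cross-view detection sketch: a DKOM rootkit can unlink its process
# structure from the OS-maintained list, but it cannot hide from the
# allocation trace gathered by trapping the heap allocator.

def find_hidden(list_walk, alloc_trace):
    """Return allocations present in the trace but missing from the list."""
    return sorted(set(alloc_trace) - set(list_walk))

list_walk   = {0x8100A000, 0x8100B000}              # what the linked list shows
alloc_trace = {0x8100A000, 0x8100B000, 0x8100C000}  # what heap tracing recorded

print([hex(a) for a in find_hidden(list_walk, alloc_trace)])  # ['0x8100c000']
```

Anything that shows up only in the trace is exactly the "allocated but not listed" anomaly the talk flags as a likely rootkit.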
For example, what I was doing at first is I was just running some malware samples and pausing the VM. Then I saw some files that were interesting, and I tried to go there with Volatility and dump the file, but of course, the memory had already been recycled. So there is really a very short time frame where you can actually catch these files. Interposition is really critical if you want to do malware analysis. Let's look at another demo; that was a fun one. This time I have Windows 7, 64-bit. There is some information about it, but it's 64-bit Windows 7. And there are all the processes running, and as you can see, there is a Task Manager running. What I'm going to show in this demo is how you can actually take full control of that virtual machine, not just extract files and passively monitor what's happening. So, for example, what I'm going to do here is actually hijack the Task Manager to start a calculator for me within the virtual machine, which is quite great, because I didn't have to install any custom software within the virtual machine to take control of it. I just need any process that is executing within that virtual machine, and I can do whatever I want. No passwords asked, no username. Really, you have full control, right? That's what being at a more privileged level of the system actually means. And yes, you can fire up cmd.exe and pass whatever arguments you want to it. And as you can see, I actually get the return value as well, so I know what the PID of the process that got created is. So if you want to execute some function within that virtual machine, you can actually extract the output of it, and then you can just pipe this together, so you effectively have an external shell for that virtual machine. So now what I'm going to do is actually just fire up Internet Explorer and send it to some website of my choice.
And there we go: the virtual machine just happily does that, and it's pretty much instantaneous, so I really can control what the virtual machine is doing. If you're a sysadmin, this is great: you can install software within the virtual machine, close processes, really whatever you want. So all the demos that I showed you are actually part of a malware analysis system that I built for my PhD, which is, as I said, built on LibVMI, Volatility and Rekall, and it's released for free. So all of these demos and tools are now yours to play with. And again, the important thing here is that for malware analysis, you really don't want to have anything identifiable within the virtual machine that you are running the malware in, because that's what malware usually looks for. So if you have no in-guest agents, you don't have any in-guest artifacts that malware can look for, and then you have a stealthier environment to do your analysis in. And, of course, you can extract all the temporary files that you would otherwise miss. And OpenOffice just crashed; it doesn't really like these embedded demos. One other tool that I wanted to bring to your attention, which is really just fresh out of the oven, is debugger integration. As I said, all of these features built into Xen were really designed for debugging, so why not use some of your favorite debuggers, in this case GDB? This should be online by now; I haven't actually had a chance to test it, but it has just been released. It allows for stealthy debugging using the hypervisor, and you can really debug your operating system. So if you are developing a kernel driver or whatever, this is really handy. And more debugger integration is coming next. So go check this one out too. Let's go back to isolation real quick. So what we actually did is we moved a lot of the security stack out of the virtual machine, and we can do a lot with that.
But what that really achieves is just moving the target: exploitation is getting harder, because you have hypervisor-based isolation, but of course, now you're running in a more privileged part of the system. So if you actually manage to break out, potentially through a vulnerability in the security tool itself, then you have a bigger reward. And it's not like it doesn't happen: just a couple of weeks ago, a month ago, there was a local privilege escalation in OSSEC, which is a host-based intrusion detection system. So it's kind of naive to think that our system wouldn't have any vulnerabilities. So why not separate the security stack into deprivileged virtual machines as well? Xen has some features to achieve that, and that's the Xen Security Modules. What that allows is to really disaggregate the trusted computing base that you have with Xen, so you don't have to put the security stack within dom0, as I did in the demos here. You can actually create a virtual machine that has control over a set of domains that you want to protect, or maybe just a single domain, without affecting anything else on the system. The way it works is that it creates a wrapper around hypercalls and has a policy to define how the interaction between domains can happen. This is actually a piece of software that has been contributed to Xen and maintained by the NSA, and they have a bad reputation, but in this sense, they actually do some positive work as well, because without their work, it would not be possible to do this. As I said, it is a wrapper on hypercalls, and then you have the FLASK policy engine, where you define which virtual machine can issue which hypercall and what the target of that call is. In that sense, it's very similar to SELinux; you can use the same tools to define your policy and to compile and check the policy. It's disabled by default, but you can recompile Xen and then have this.
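To give a flavor of what such a policy looks like, here is a hypothetical FLASK-style fragment in the SELinux-like syntax XSM uses. The type labels and permission names below are illustrative placeholders, not excerpts from the shipped Xen policy; consult the policy sources shipped with Xen for the real class and permission vocabulary:

```
# Hypothetical XSM/FLASK fragment: allow a dedicated security VM
# labeled monitor_t to introspect only the domains labeled target_t.
# Labels and permission names are placeholders for illustration.
allow monitor_t target_t:domain { getdomaininfo pause unpause };
allow monitor_t target_t:hvm { getparam setparam };
```

The point of the design is that the monitoring VM gets exactly the hypercalls it needs against exactly the domains it protects, and nothing else — not even dom0-level power over the rest of the system.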
But really, it's only usable from Xen 4.3 and Linux 3.8 on. That was actually my first patch to Linux, in 3.8, when we were testing an early version of the system. XSM was not in the original design yet, and we discovered that the Linux kernel actually did some access control checks itself, to see if it is dom0 or not. And if it wasn't dom0, it would deny issuing the hypercalls that we needed to do this type of security. Of course, if you have XSM defining what is allowed and what isn't, you don't need the kernel to tell you that. And furthermore, that's really an arbitrary check: if you have the rights to insert kernel modules into your kernel, then having a security check there is really not going to get you much. So my patch was really just removing that superfluous check. And now with these tools, we can actually start thinking about cloud security. So we have a mechanism to have different security policies for different users of the cloud. And the idea would be that we start monitoring your machine before it goes live, so we have some sort of baseline of integrity: OK, this is my base state before the machine has been exposed to the net. So I start monitoring, and we can see if some critical data structures get hijacked, or maybe even the kernel code, if there are any inline hooks being injected; we can detect that. And we can really just limit what we are trusting in the guest to data structures that are defined by the hardware, because if malware changes those structures, the most that it can achieve is DoSing the system. So it probably won't touch them if it has some better use for that machine. But we are back at the question: what data can we really trust in the system? So, for example, with the events that I showed you, with EPT violations, there are already some limitations in what the hardware tells us, and there are corner cases.
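The baseline idea above can be sketched in a few lines of Python. The region names and byte strings here are made up for illustration; a real monitor would hash the guest's kernel text and critical structures read out via introspection.

```python
import hashlib

def baseline(regions):
    """Hash each (name, bytes) region of a pristine guest before it goes live."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in regions}

def check(regions, known_good):
    """Return the names of regions whose current hash deviates from the baseline."""
    current = baseline(regions)
    return [name for name, h in current.items() if known_good.get(name) != h]

# Illustrative regions of a clean guest (contents invented).
clean = [("kernel_text", b"\x55\x48\x89\xe5"), ("syscall_table", b"\x00" * 32)]
good = baseline(clean)

# An inline hook overwriting the first bytes of a function shows up immediately.
hooked = [("kernel_text", b"\xe9\xde\xad\xbe"), ("syscall_table", b"\x00" * 32)]
assert check(hooked, good) == ["kernel_text"]
```

As the talk goes on to explain, the catch is that a plain hash baseline like this forbids all changes, including the legitimate runtime patching the kernel itself performs.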
So, for example, read-modify-write instructions, which involve a read from memory and then a write back to memory, which are usually used for mutexes and concurrency stuff. The Intel manual says that it's really implementation specific whether these set the read bit or the write bit, and on some processors, well, we don't know what happens. So that's not really cool. We actually patched that in Xen 4.5, which will be released next month. But there are ambiguities, and there are a lot of ambiguities like that in the Intel manual, so we actually wrote a paper collecting all of these. And these are just really some of what the limitations are. A bigger limitation is really the tagged translation lookaside buffer (TLB), which was introduced in 2008 both by Intel and AMD, and which essentially caches the translations of virtual addresses to physical addresses in a cache that you cannot query; it's really just for the hardware, to speed the translation up. And now if you have a tagged TLB, that means that the page tables in memory don't necessarily represent the translations the guest actually uses. So what we do with VMI is we look at the page tables in memory and we emulate what the hardware does, but we don't have access to this cache, which is a problem. Because with the tagged TLB, the entries survive VM exit and VM entry, so these are actually persistent TLB entries, and that means that a rootkit potentially can modify the page tables in the guest without actually crashing the guest. And we would have no idea what it's doing, because it can set up the page tables to point into some benign code region, and when we try to see what code is actually running, we see that, oh, it's calculator, no problem. But what the machine is actually executing is something totally different. Of course, there are some limitations to that, depending on what hypervisor and guest operating system you're running.
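What "we emulate what the hardware does" means can be sketched as a toy page-table walk. This is a deliberately simplified 4-level, 4-KiB-page, x86-64-style walk in pure Python: permission bits, large pages and of course the unobservable TLB are left out, and that last omission is exactly the blind spot being described, because a stale tagged-TLB entry can keep using a translation that no longer matches what this walk returns.

```python
# Toy model of the software page-table walk a VMI tool performs.
# `mem` maps a table name (or frame base) to a 512-entry list; all
# addresses and table contents below are invented for illustration.
PAGE_MASK = 0xFFF

def walk(mem, root, vaddr):
    """Translate vaddr by walking four levels of page tables stored in mem."""
    table = mem[root]
    for shift in (39, 30, 21, 12):      # PML4, PDPT, PD, PT index fields
        entry = table[(vaddr >> shift) & 0x1FF]
        if entry is None:
            raise KeyError("page not present")
        if shift == 12:
            return entry + (vaddr & PAGE_MASK)   # frame base + page offset
        table = mem[entry]

# Build one mapping: an arbitrary example address -> physical frame 0x9000.
mem = {n: [None] * 512 for n in ("pml4", "pdpt", "pd", "pt")}
vaddr = 0x7F0012345678
mem["pml4"][(vaddr >> 39) & 0x1FF] = "pdpt"
mem["pdpt"][(vaddr >> 30) & 0x1FF] = "pd"
mem["pd"][(vaddr >> 21) & 0x1FF] = "pt"
mem["pt"][(vaddr >> 12) & 0x1FF] = 0x9000

assert walk(mem, "pml4", vaddr) == 0x9000 + (vaddr & PAGE_MASK)
```

A rootkit that edits `mem["pt"]` after the hardware has cached the old translation would make this emulated walk and the guest's real execution disagree, with no way for the monitor to notice.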
So, for example, Xen always assigns a new tag whenever a new process is scheduled. So you would have to redo the malicious modifications to the page tables essentially every time a new process is scheduled, and we might be able to detect that. And Windows 7, surprisingly, is actually pretty good against this, because it flushes the global pages very regularly. But if you're running Linux on KVM, well, this is a more realistic problem. Now, if you think about cloud security for a moment here, there is really no need to move everything out of the virtual machine. With malware analysis there was a reason: you don't want to have any artifacts within the guest that the malware can detect and then shut down really quickly. But with cloud security, we want the malware to stop executing as fast as possible, right? We don't want it to stay alive. So maybe it's enough to have some sort of security agent that we protect from the hypervisor level. That would have better performance and better visibility into the system, because it's running in the same context as the machine, so the tagged-TLB problem doesn't exist for in-guest agents. So we can potentially do some sort of hybrid approach. And this is actually where the hardware is heading: in the upcoming Intel CPUs there is going to be this extension called Intel VE, virtualization exceptions, where you can actually handle the EPT violations within the guest, so you don't have to trap out into the hypervisor all the time. So you will have really better performance, and then you can do some sort of protection of the code that's running within the system. And as I showed you, you can really control what that system is doing, so you can not just control the code, you can also control the data. So you really can achieve secure in-guest agents. What our approach would be for cloud security is to really just reduce the size of the guest operating system.
There's really no reason you need a full Linux stack in an operating system that just serves Apache, right? So there are some trends in that direction. Just at this Congress we saw MirageOS, but there are also rump kernels and OSv, which really just try to achieve that: they reduce a virtual machine into a process and then just use the hypervisor as your scheduler. It's kind of ironic if you think about it, because processes, back in the day when they were introduced, were called virtual machines, and now we have virtual machines becoming processes again. So we are kind of going in a loop here. But also, we could try to secure the in-guest kernel, because what we have actually been discussing so far is the blacklist approach: we look at what malicious changes happen to the system and we try to deny them. But that, of course, places the burden on the defender to enumerate all the possible things that could go wrong. Well, good luck trying to do that. So now I'm going to hand over to Tom to talk about the whitelist approach, which might be a better alternative. Yeah. So, if you want to verify the integrity of the system, which we have to in order to run our in-guest agents, we have to see what the kernel in the system is and whether it has been changed by malware or not. So what we propose is a whitelist approach that would allow verified changes to the kernel within the system. For that, we need to validate and see all the changes in the system that we want to allow. So, code integrity is usually assumed to be an easy thing: you have the code from your binary, from your kernel, you hash that and you compare the hashes. If it matches, then the integrity of the kernel is there. But Linux actually employs runtime patching, runtime self-code-patching, for performance optimizations within the system. So if you run a Linux kernel, you now have to differentiate between legitimate and malicious changes to your software.
So there are two kinds of changes that are done to the Linux kernel. On the one hand, there are the easy load-time patching things. The easiest are relocations, and then there are alternative instructions, which are architecture specific. So depending on the CPU the system is running on, or depending on the hypervisor the system is running on, different instructions are patched into the code of the system. Depending on the hypervisor, for example, some function can be implemented one way or another. But you can say, yeah, load-time patching can be handled by loading the kernel in a secure environment, on the same architecture maybe, and creating a hash, and we still have no problem. On the other hand, we also have runtime patching employed by the kernel, which is, for example, used for hardware hotplugging. And this has to be validated and verified continuously while the system is running. So I want to show you two examples where this is actually applied in the current Linux kernel. For example, SMP locks are one of those mechanisms. Currently, if you have virtualization on a system and you need scalability, the number of virtual CPUs that are allocated to the virtual machine may change during runtime. This poses a problem, because if you have a single-threaded operating system and you have locks, then you don't need the atomicity of those locks, because you have only one CPU anyway. This changes if you have multiple CPUs, because then you really need atomic operations. And for performance reasons, if only one CPU is present in the system, the Linux kernel does not use those atomic operations. Instead, what it does when another CPU is added to the system is automatically patch all those locations in its code to atomic operations, which are slower but required in that case. And this mechanism can also be used to replace entire functions within the Linux kernel.
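The SMP-locks mechanism just described can be sketched like this: at build time the kernel records the offsets of every lock-prefix site, and at runtime it flips those bytes depending on the CPU count. The buffer and offsets below are invented; only the two byte values mirror real x86 encodings (0xF0 is the LOCK prefix, 0x90 a one-byte NOP).

```python
# Sketch of SMP alternatives: patch the LOCK prefix in or out of known sites.
LOCK, NOP = 0xF0, 0x90

def apply_smp_alternatives(code: bytearray, lock_sites, num_cpus: int):
    """On one CPU the atomic prefix is unnecessary; restore it on hotplug."""
    byte = LOCK if num_cpus > 1 else NOP
    for off in lock_sites:
        code[off] = byte

# Invented code buffer with one lock site at offset 3 (e.g. "lock incl").
code = bytearray([0x90, 0x90, 0x90, LOCK, 0xFF, 0x00])
sites = [3]

apply_smp_alternatives(code, sites, num_cpus=1)
assert code[3] == NOP      # uniprocessor: atomic prefix patched out

apply_smp_alternatives(code, sites, num_cpus=2)
assert code[3] == LOCK     # second CPU appears: prefix patched back in
```

For an integrity monitor, the consequence is that the very same kernel text legitimately exists in two different byte states, so a single hash can never cover both.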
It's currently not used for that, but the mechanism is basically there. Another place where runtime patching occurs in the Linux kernel is a mechanism called jump labels, which exists equally for performance reasons. So you have some checks in the Linux kernel that will rarely be taken. Rather than constantly checking, you know, is it the unlikely case or not, you just patch the jump to that code snippet out of there. But once those functionalities are enabled, for example by a user or by some hardware mechanism or whatever, the Linux kernel just patches the jumps into the code, pointing to the function that should be executed. And with that, we have the additional problem that we don't have only two possibilities, the jump is there or the jump is not there; the location where the jump points to also has to be a location which is consistent with the entire system state. And here you have the perfect thing for something like return-oriented programming, where you just need to have an arbitrary jump within your code. So these are mechanisms that we have to defend, and to verify them we have the simple approaches we already heard about, like locking down the kernel. The easiest thing, with a hash-based approach, as I said, is to deny all changes to the kernel code at runtime. But there we have the problem that we completely disable all of these legitimate patching approaches. On the other hand, you can also say, yeah, well, most of the code is static and just a couple of locations might change. But there you have an equal problem: the number of hashes that you have to maintain, for every location that might change, is very large. So you have to maintain very many hashes. And the Linux kernel in its current form also has the problem that on some pages, both code and data reside on the same executable page.
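The jump-label concern can be made concrete with a small sketch: from the binary we know, for each patch site, the only two legitimate states, either a 5-byte NOP or a jump to one specific target. Any other bytes written there, for instance a jump to a ROP-friendly location, should be flagged. The site addresses are invented; the NOP and `jmp rel32` encodings follow real x86.

```python
# Sketch of validating a jump-label patch against a per-site whitelist.
NOP5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])   # canonical 5-byte x86 NOP

def jmp_rel32(site: int, target: int) -> bytes:
    """Encode 'jmp target' (E9 + rel32) as placed at address `site`."""
    rel = (target - (site + 5)) & 0xFFFFFFFF
    return bytes([0xE9]) + rel.to_bytes(4, "little")

# Derived from the binary: each site has exactly one allowed jump target.
WHITELIST = {0x1000: 0x1200}                   # invented addresses

def validate_write(site: int, new_bytes: bytes) -> bool:
    if site not in WHITELIST:
        return False
    allowed = {NOP5, jmp_rel32(site, WHITELIST[site])}
    return new_bytes in allowed

assert validate_write(0x1000, NOP5)                          # branch disabled
assert validate_write(0x1000, jmp_rel32(0x1000, 0x1200))     # legitimate enable
assert not validate_write(0x1000, jmp_rel32(0x1000, 0x6666)) # rogue target
```

This is exactly the whitelist shape argued for above: the set of good states per location is tiny and known ahead of time, so anything else is suspicious by construction.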
So the kernel uses large pages for its code, which are basically two-megabyte pages. But on the last page there is still some spare memory that would either be wasted, or the Linux kernel gives it to userspace applications if they allocate some memory. So here we also have the problem that we don't really know: is this code, is this data, what is there to verify, what should be on that page? So this is also a problem for hashing the kernel. What we propose here is a trap-and-validate approach using VMI. We know patching only happens at predefined locations. From the binary we can derive which patching mechanisms are there, at which offset in the binary the code will be patched, and also with which values and under which circumstances. With that, we can now trace those patching mechanisms, understand what they really mean, and also see that the system state is consistent. And this also fixes another problem, because code patching is not an atomic operation. For example, if you patch entire functions, that cannot be done at once, so between the two good states there is always a bad state in between, which is still acceptable because it leads from the one to the other. So you have to have a system which is aware of all those intermediate states and can handle them appropriately. So we propose to trap write events to kernel code, and when the kernel code is changed, you can validate that the change is not malicious, and with that provide the integrity of the running Linux kernel. So this basically was about the integrity of the code. With that, I want to conclude, and as a summary say: VMI supports a wide spectrum of applications, from malware detection to cloud environments, and VMI gives us the three properties of isolation, interpretation and interposition. But depending on what you want to do with VMI, it matters which of those three features you need most.
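The trap-and-validate loop, including its tolerance for non-atomic intermediate states, can be sketched as follows. All byte values and states here are invented for illustration: the monitor sees each trapped write, accepts the initial and final whitelisted states as valid, accepts in-between states that are consistent with moving from the initial state toward one allowed final state, and flags everything else.

```python
# Sketch of trap-and-validate: classify the patched bytes after each write.
def status(buf, initial, allowed):
    """'valid', 'transient' (mid-patch toward an allowed state), or 'malicious'."""
    b = bytes(buf)
    if b == initial or b in allowed:
        return "valid"
    for goal in allowed:
        # Transient if every byte is either still the old value or the new one.
        if all(x in (i, g) for x, i, g in zip(b, initial, goal)):
            return "transient"
    return "malicious"

initial = bytes([0x90, 0x90, 0x90])      # invented: NOPs, branch disabled
allowed = {bytes([0xE9, 0x10, 0x00])}    # invented: the one legitimate patch

buf = bytearray(initial)
buf[0] = 0xE9                                    # first trapped write...
assert status(buf, initial, allowed) == "transient"
buf[1], buf[2] = 0x10, 0x00                      # ...patch completes
assert status(buf, initial, allowed) == "valid"

buf[1] = 0xCC                                    # byte outside any good state
assert status(buf, initial, allowed) == "malicious"
```

The real system derives `allowed` from the kernel binary's patching metadata rather than hard-coding it, but the state machine is the point: intermediate states are expected and only writes that cannot lead to a whitelisted state raise an alarm.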
So the purest form, as in full isolation, is not required for all of those use cases. You have to see whether you want to have an agent which you can secure, or whether you want very intrusive mechanisms which may stall the execution of the virtual machine. So you can have the more in-depth view at the cost of performance; there is a trade-off between those. But as we said, the hardware support for all of these mechanisms is continuously improving. The tools that we showed you today are open source; you can look at the websites, libvmi.com and drakvuf.com, and you can find us on freenode and at our contacts respectively. With that, I just want to say thank you to all of the names here on this slide; without them it wouldn't be possible. And that concludes our talk. We are ready for your questions. If you have questions, please line up at the microphones, and if you absolutely must leave the room, do it quietly. Microphone one. I have a question on the runtime patching: does your code work with the function tracer, as in dynamic ftrace, and also with the upcoming kpatch or kGraft live-patching features? It works currently with the function tracer; with the function replacement it does not currently work, I have not implemented that yet. And also, the tools are open source; that last part is still to be published, but this will happen in the near future. OK, thanks. Number three for me? OK, it's me. One question: you showed this tool for tracing of the memory states, and you mentioned that it's only supported on x86 right now. Are there plans to support ARM as well at some point? Yes, I have actually been working on that. I was hoping to get it into 4.5, but unfortunately the freeze window closed just too early, so it's going to be available in 4.6. OK, because there is this octa-core ARM platform coming up and it would be really handy to have it on ARM as well. Yes.
So in the next release I can expect it to have this feature? Yes, we are working on it. Wonderful. A question from the internet? Yes, we have one question on IRC: what is the easiest way of installing a hypervisor, Xen in particular, and what are the minimum hardware requirements for embedded systems? With current systems, I have been compiling it from source myself, but I guess distributions ship packages to that effect. KVM is, of course, built into the kernel, so that's usually a module load away if your hardware supports it. So usually your distribution has some built-in support for that. Microphone number four. So, first of all, I was trying to get into the same area as the first guy: what's the purpose of those tools? I mean, they are really cool if you want to analyze a system. You have a contained environment, you know that it was compromised, and then you want to figure out what actually went on, right? You can analyze it dynamically and trace things; that's pretty nice. For an actual live system, with everything that's coming up with live patches, like kpatch, like kGraft, like the others, you cannot do a fingerprint for a security fix that's going to happen in a year, right? So you simply cannot do a whitelist, because you are in time before the fingerprint would get created. But that's where the whitelist approach sort of comes into play: you know what the good things are that can happen to the kernel, you only allow the good things that you know about, and everything else you can flag as potentially malicious. That means you need to update your IDS before the kernel can be updated, so you have a delay, right? So if you deploy a very new kernel that the protection tool doesn't know about yet, then yes, that has to happen.
OK, so imagine there's a zero-day coming up, and you basically get a specially crafted kGraft kind of module that you load to fix that zero-day. First off, it's probably tailored to you, because you might actually have another security fix in your system that is just only for you. So you actually might be running a kernel that's not fingerprinted by you or by anybody else, or by your host provider, your cloud provider. And the cloud provider might not even know about that fix yet, because, well, they might be in the loop after you. Yeah, but that's a general thing. Take the kernel integrity part that I showed: that's very kind of specific. You need to know exactly which kernel is running on the system, with everything; you have to have the binaries. And currently we extract all of the information for our verification, or validation, out of the binaries. So if you as the cloud provider don't know what's in that system, you can't do that anyway. But then you're also not supposed to, because if you were supposed to do that, you would know about the patch that is in the system. And you can also take, like, this is the kernel code that we are running and these are the patches that we applied; from that we can build the ground truth and match it against what's in the system. So my main point is, I don't think cloud is the right term in this case. This really is an awesome thing for embedded projects where you control the whole stack. As soon as you have different tenants doing different things, things fall apart, because not everybody has knowledge of everything left and right. And also, this is not only for malware detection. From the beginning, you have a system you think is clean, and while you work with the system, you have the different tools with which you can look at the system.
And once one of the tools reports any funny things about the system, you definitely have to investigate further. So you don't have to run all your systems with the most detailed view of the system. And it's also not for preventing any malicious things, or not entirely for preventing them. So it goes in the right direction, I think, but it's not a tool that solves everything. I'm just having a hard time figuring out where. The second thing I had was on the TLB flushing stuff. So the first generation of VMX-capable CPUs didn't have ASIDs or anything of the like, and the performance difference between having ASIDs and not is, what, five percent, two percent, something in that ballpark? You can just flush on every context switch and call it a day; there was actually just a patch submitted to KVM to enable always flushing it. Always, right. I mean, that's always a possibility, that you just disable tagging, and then your problem goes away. But then you sacrifice performance, and you always lose performance if you trap on everything, right? Yeah, so performance really is not an argument; if you lose two percent, it's definitely a lot less than any of the trapping. Yeah, it depends on what your security application is doing. So you might not need to hook all of those functions to really protect stuff; it really depends. As I said, you probably don't want to trap on everything, because if you have an in-guest agent, then you don't need to trap, you just have something within the guest that you protect from the outside. And for a cloud application, that's probably the way you want to go. But for malware analysis, doing something like this, where you really can't trust any kernel data, that's really essential, OK? Microphone one. Are any of the large cloud providers offering malware detection services for their clients? Not yet, but we are expecting that to happen soon-ish, I guess.
Number three. You mentioned something about Android; how far are we today in using this VMI system on ARM devices? So the thing with ARM is that we have the two-stage paging, and I have the code to have the trapping mechanism working with that, but unfortunately that's only one part of the picture. For all of the things that I showed here today, you really require all four types of trapping, and for now it's only the memory part that's functional. So there is more research needed on how to do single-stepping and efficient trapping. For example, there is no breakpoint trapping on ARM, so some alternative needs to be found yet; it's still very early in the research phase. Interestingly, the people who are looking into that the most are Samsung, so I expect them to have some sort of security product that they want to sell soon, so it might be improving as well. And does Xen generally support virtualization on ARM? Yes, yes. Thank you. So, if there are no other questions, we will conclude the talk. Thank Tamas and Thomas for this very deep look at the.