OK, so as announced, I'm talking about genocide: bug class genocide.
I've been looking into security vulnerabilities for quite some time now.
I started exploiting them, and then, in the late 90s, it got kind of boring.
So I started to look into how to prevent people from exploiting your computer.
And that turned out to be actually very hard in practice.
We still have a situation where everybody hates everybody.
The NSA hates the Chinese, the Chinese the NSA; the NSA hacks the Israelis, and the Israelis the Russians.
And of course, some of our people hack other people, too.
But nobody really understands the art of defense, not letting people into your computers.
And I'm making a living on security problems and on consulting on how to prevent security issues.
And that's a common pattern that you see there: kinds of bugs that are very, very hard to get right. You have to work through your source code and look at every single line, and every single line has to fulfill a certain property for your software to be good and defensible against certain classes of attacks. Interestingly, you'll find that for certain classes of attacks, for certain classes of vulnerabilities, you can find generic solutions. For instance, for SQL injections you use prepared statements. When I do a code audit, I grep for all the statements; if I see everything is a prepared statement, I don't have to look for any SQL injections anymore. If somebody is assembling an SQL statement by hand, that's a problem. The same can be said for quite a number of other classes of vulnerabilities. And the one I want to talk about today is buffer overflows, which I assume you to be quite familiar with, so I'll make short work of the introduction.

That's essentially the problem you see here: you have an array called sum, you write into that array, and the input data you get happens to be longer than the 16 characters you see there. And boom, you're overwriting some memory. If you've been paying attention for the last 20 or so Congresses and other hacker events, you know that a lot of effort goes into how to actually exploit these things, because as soon as you write outside that buffer, which is called sum in that case, you tend to overwrite some other data. And from there you can get into the computer by, for instance, overwriting the return address of a function, the very classical stack overflow, like in the classic Aleph One paper. Or you could overwrite some structure on the heap, like the linked list that's responsible for memory allocation, where the allocator reads a pointer and then writes a pointer back. You could write into the structured exception handler on Windows. You can write into quite a number of other data structures, setjmp/longjmp buffers, and so on. A lot of the things that are under the hood of the C implementation are potentially dangerous, potentially breakable, just by writing past your buffer.

So, let's get into more detail. That kind of problem you see here in the C code: who understands what the problem is here? Raise your hand, please. Good. I'm having a test on that later to see how many of you paid attention. So that's something we call a buffer overflow or overrun, and we have certain kinds of names for exploits. But it turns out, and that's what I did, that science, actually computer science, you know, the people at university, tend to look at the same problems that the hackers do. They sometimes have their own terminology, but they sometimes also come up with interesting solutions, and one of those interesting solutions I'm going to show you today. So, to get used to the lingo: what we're seeing here is a so-called spatial memory safety problem. Spatial, because you're in the wrong part of the address space; you write somewhere you're not supposed to write to. That's spatial memory safety.
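To make the bug concrete, here is a minimal C sketch of the pattern just described, assuming an unchecked copy into a 16-byte buffer; the name sum comes from the talk, the rest is illustrative:

    #include <string.h>

    void vulnerable(const char *input)
    {
        char sum[16];       /* fixed-size buffer on the stack         */
        strcpy(sum, input); /* no length check: input longer than 16  */
                            /* bytes overwrites adjacent memory, e.g. */
                            /* the saved return address               */
    }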
And there's also temporal memory safety. Essentially, you access memory that has already been freed, or you free it again, et cetera; or you look at another object that now lives there and write into that. Essentially, what happens is that whatever you write to is no longer what you expect it to be. Usually, in the case of an allocator, if you use memory after freeing it, then instead of the object you've got the set of pointers that link the free memory buffers; freeing it again, or writing something there, makes the allocator chase those pointers and start writing things into your address space. And that usually leads to all sorts of unpleasant situations: overwriting the instruction pointer and giving you code execution, one way or another. So that's called temporal memory safety in the lingo of the scientists. And the reason I'm holding this talk is that scientists have now found a solution to the problem that looks like it's viable to use for real-world code.

So, as I said, I've worried about that problem for quite a long time, and there are a number of existing approaches to it. One that I looked very much into (remember my earlier talks, if you know what I'm doing) is the first point: use a safe language. Don't write in C. Unless you have a very good excuse, C is probably not the right language, and unless you're one of, you know, three or four gurus in the room, I don't trust you to write C code perfectly correctly. Sorry about that, no offense intended, but that's just the way it is: most people don't get it right.

And then there's the whole topic of mitigations, also something that has been discussed at conferences for quite a while: use address space layout randomization, use data execution prevention, use stack canaries. And of course there are ways around those as well. In the case of address space layout randomization, the idea is that you might control the instruction pointer through your buffer overflow, but you no longer know where to jump to execute code. Modern exploits use techniques to circumvent that, for instance by finding another vulnerability. In the old days, you had to find an overflow or a read overrun in memory space; these days, you usually have to find a vulnerability that gives you read access into the address space of the program you want to attack, because as soon as you can start reading out parts of memory, you can figure out where things are. So you can circumvent ASLR by reading out the right address. Or you can do what many people do: heap spraying is very popular. If you attack a browser, you just have a piece of JavaScript that generates a gazillion copies of your exploit code all over the address space, and essentially you no longer care where you jump to, because there's a very big likelihood you hit your exploit.

Then data execution prevention. In the old days, if you wrote an exploit, you would have an executable section of memory on the stack, and you would just put the code to execute right into the very same buffer that you overflow. Then, controlling the EIP, the instruction pointer, through the stack frame, you would point it back to the beginning of the buffer; in that buffer are your instructions, and you just execute them. So some smart people thought: well, we could make the stack non-executable, that will certainly help. Until, at this very conference, people started playing around with what later became known as return-oriented programming. Instead of jumping to your own buffer of code, what you have in your buffer is just a lot of stack frames that jump to pre-existing code in the executable. Instead of jumping to your code, you're just looking for a couple of bytes in the program you are attacking that are executable already; you jump there, it executes code, and you're done. You chain that by chaining stack frames in the buffer you write, so data execution prevention can be worked around.

Stack canaries: people thought, you know, I have that buffer, and next to the buffer is my stack frame with the return address that the attacker wants to control, so we put a magic value in between the buffer and the stack frame. And if somebody overruns the buffer, we check whether that magic value is still in the right place, and if it isn't, we say: we have been attacked.
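A minimal sketch of the canary idea, with the check written out by hand; real compilers insert this automatically (e.g. with -fstack-protector), and they, not the C source, decide where the canary actually sits in the frame:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static unsigned long canary_ref = 0xdeadbeefUL;  /* illustrative value */

    void copy_input(const char *input)
    {
        char buf[16];
        unsigned long canary = canary_ref; /* the compiler places this    */
                                           /* between buf and the saved   */
                                           /* return address              */
        strcpy(buf, input);                /* an overrun clobbers it      */
        if (canary != canary_ref) {        /* checked before returning    */
            fprintf(stderr, "stack smashing detected\n");
            abort();
        }
    }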
Unfortunately, that's also not 100 percent secure; there are two ways to get around it. The first one I already mentioned: if you get read access into the address space, you can find out what the value of the stack canary is and use it. There was a very nice example of that in a bug on Cisco boxes, where you would send an ICMP echo request to a box which was 20 bytes long, but in the header it says: I'm one thousand five hundred bytes long. And the Cisco takes the packet, receives it, sees one thousand five hundred bytes in the header, and says: I'm going to send back 1500 bytes of my memory. Boom, there you go. So stack canaries are nice and everything, but you can also do what Linux did and mess it up: for something like the last six or seven years, until about half a year ago, if you statically linked an executable, Linux would choose the magic value of zero for the canary. Lots of things that can go wrong there. Well, Fefe complains that it's just glibc on Linux that does it wrong; dietlibc gets it right.

So, a lot of effort has been put into at least giving you some decent debugging tools to detect memory overruns before production. Most of you have probably heard of Valgrind; it's one of the more popular tools. Essentially, you take a debug build of your program, run it under Valgrind, and then run all of your tests, and Valgrind does a lot of things behind the scenes to find out whether you write to memory you're not supposed to write to. There have been others: the most recent versions of GCC and LLVM ship with something called a memory sanitizer that tries to do the same thing; there's the SAFECode project; et cetera, et cetera. Some of them are, you know, after-the-fact debugging tools; some of them actually hook into your compiler to change things in order to detect memory flaws.
There are two principal ideas for how you detect a buffer overflow, how you detect an invalid memory access. The first one is the so-called object-based approach. Just for reference, the notation I'm using here: black is the code that the user actually wrote, and red is the code that was inserted by the tool you're looking at to find your buffer overflow problems. So red is tool-injected, black is what the programmer wrote. The general idea of the object-based approach is that for every object in memory, you know whether it's a valid object or not, which translates into: for every address in your address space, you know whether there's a valid object there or not. And all you do is look up whether the address is good or not.

If you look, for instance, at Valgrind, the memory checker: what it does is keep a so-called shadow memory. For every word in memory, there's a data structure somewhere that stores a couple of bits: this is a correctly allocated address, and it hasn't been freed yet. So they track whether that's memory that is user-visible, as opposed to memory that shouldn't be user-visible, like your stack frames or your longjmp buffers or whatever. They check that on every access and return whether it's correct or not. If the access is broken, because it's not user-allocated memory you're looking at, that's bad, and an exception is raised. That's all nice and everything, until you start looking at examples like this one. Because imagine I allocate a structure like that: my shadow memory would say, yeah, perfectly fine, everything's allocated memory, I'm pointing inside the structure here, and I'm not accessing memory that has been freed in between. But still, if I overflow the id field, I run into the account balance field here. And that's something that, in general, object-based tools do not detect.
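A sketch of that example; the field names follow the description in the talk, the exact layout is illustrative:

    #include <string.h>

    struct account {
        char   id[16];   /* overflowing this field...              */
        double balance;  /* ...silently corrupts the balance       */
    };

    void set_id(struct account *a, const char *id)
    {
        strcpy(a->id, id);  /* an id longer than 15 chars + NUL     */
                            /* spills into a->balance, all inside   */
                            /* one validly allocated object, so an  */
                            /* object-based tool sees nothing wrong */
    }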
There are other, comparable flaws that it wouldn't detect. Like, for instance: you free a piece of memory, you allocate a different structure, it gets the same address, and suddenly that's valid memory again, and you start writing into memory at an address that's considered valid, even though the actual type of the structure at that place has changed. Valgrind has a flag for that, so it will not reuse addresses. But in general, that kind of problem makes 100 percent detection of buffer overflows using the object-based approach impossible.

So there's an alternative approach, which stuff like CCured does, for instance. And that is: instead of representing a pointer just as a word, you represent the pointer as a word plus the base plus the bound. You remember the base address of the allocated object and the bound of the allocated object. Why you have to do it that way has to do with the C standard: you can do pointer arithmetic in C. You can take a pointer to the beginning of an array, add 20, and suddenly you have a pointer inside that character array, and you can keep on calculating with it. It's even legal to add one hundred, pointing outside the object, then subtract one hundred again, and then access your memory; you may have pointer values that point outside the valid range in intermediate computations and still end up with a valid pointer. Real C programs, unfortunately, actually do that.
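Schematically, such a fat pointer bundles the metadata with the pointer value itself; the names here are illustrative, not from any particular implementation:

    /* a fat pointer: the value the program computes with, plus the
       extent of the object it was derived from.  note that it is
       three words instead of one, which is exactly what breaks
       structure layout and calling conventions                     */
    typedef struct {
        char *ptr;    /* the ordinary pointer value          */
        char *base;   /* start of the underlying object      */
        char *bound;  /* one past the end of the object      */
    } fat_ptr;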
That fat pointer approach usually involves changing the compiler, and it has the very negative effect of changing the size of your pointer. Your pointer suddenly carries all the information it needs to decide whether a memory access is spatially valid, if you use the base and bound (I'm getting to temporal later; just looking at spatial for the moment). Unfortunately, it also means that the layout of your structures changes, so it breaks compatibility with existing C programs. If you look at the existing solutions that use fat pointers, they usually suffer from a number of problems. One is incompatibility, by breaking your structure offsets and everything: memory layout changes, calling conventions change, so you cannot link against an uninstrumented library. Or they cannot do separate compilation, which usually involves things like a whole-program analysis pass, where instead of compiling a list of object files, you take all your source code, all your millions of lines of C, and compile it all at the same time. You can do interesting things then, but it turns out to be not very practical if you actually try to compile real-world software with it. So both approaches have their drawbacks and advantages.

So, enter SoftBound+CETS. One noticeable thing about academia: they're even worse at naming software than hackers are. It's called SoftBound because they had a HardBound project before, where they researched hardware support for checking every pointer access for bounds violations, and SoftBound is the implementation of the same algorithm in software. And CETS stands for compiler-enforced temporal safety. So that's where the abbreviation comes from; it's research that has been done at the University of Pennsylvania. The SoftBound part does the spatial safety, whether we overrun some piece of memory in the spatial domain, wrong address; and CETS is the part that does the temporal safety, meaning accessing memory that has been freed, or not yet allocated, essentially.

And the thing that makes it interesting here is that they managed to get something out of it that's reasonable for actual use. If you have ever used Valgrind: it slows down your program by a factor of 20. That's fine for testing; you can do that during testing and debugging, but you wouldn't want to ship something running under Valgrind to the user. And it uses disjoint fat pointers. Remember object-based versus pointer-based?
What they do to remove many of the incompatibility problems of the fat-pointer-based approach: they have a shadow memory, as you would in an object-based approach, but they only use it to store the additional pointer information. So the memory structures stay the same, the stack frames stay the same, but the extra information you need is propagated through a separate channel; I'm getting into the details of how they do that later. And they have a proof of correctness of what they're doing, and that's very interesting: they have a formal representation of the semantics of a subset of C, and they prove that, with that transformation, no invalid access to memory can be performed without being detected. That proof has a certain problem, and I'm coming back to that later, but it's already very remarkable.

And it's implemented, and that's another important point: it generates very efficient code. It's implemented as an LLVM optimizer pass, so it operates on LLVM intermediate representation, which is in so-called static single assignment form. I have no idea how many of you have ever looked into compiler intermediate representations; it's sort of above assembler level, but below C level. You get there essentially by breaking down your code and taking all the big expressions apart, introducing a temporary for every subexpression: if you write d = a + b + c, you say, my temporary t1 is a + b, and my temporary t2 is t1 + c. You break compound expressions into simple expressions. And you also do not reuse variables: you assign to a variable only once and then never change it.
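In C-level terms, the decomposition looks roughly like this (the real IR is LLVM's, but the idea is the same):

    int sum3(int a, int b, int c)
    {
        /* source expression: return a + b + c;
           broken into single static assignments: each
           temporary is assigned exactly once and never
           modified afterwards                           */
        int t1 = a + b;
        int t2 = t1 + c;
        return t2;
    }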
And the intermediate representation in LLVM is typed. So unlike, for instance, Valgrind, which just looks at memory loads and stores at the assembler level, here we compile a program, and when we look at an intermediate stage we still know which of the variables we access are pointers and which are integers. So we can drop a lot of the load and store checks and only insert those checks which are actually relevant to the problem.

So that sounded, on paper, like a very promising project, and I started looking into it. And yes, the advantages you get: source compatibility; with some of the competing projects you had to start modifying your C code in order to get it to compile with the tool. It gets complete coverage: it actually catches 100 percent of your spatial memory safety problems. It supports separate compilation, so you can compile a library and link it against your main executable like you're used to, instead of having to compile all your source at once. And it has acceptable overhead: at the end, something like a hundred percent, so it makes your code about half as fast as it used to be. That may sound like much: if somebody comes along saying, I have an optimization that makes your program three percent faster, and you answer, well, I'll make it half as fast... But then you remember that people actually use Ruby to serve their web pages. So I can imagine quite a lot of applications where a 100 percent performance overhead is really, really cheap compared to getting pwned by the NSA or somebody. So I thought it was a worthwhile project.

So what are they doing? Essentially, again: black is what the user writes, red is what the tool inserts, and you're looking at C code here, which you have to think of as a high-level representation of the intermediate representation it's actually working on. What you see there is a load through a pointer; the check for a store would be totally equivalent. All the tool does is go through the intermediate representation of your code, and for every pointer access it inserts a check whether the pointer lies within the base and bound range. The check involves the size of the access type, because it might be the case that your pointer is pointing close to the end of the object, still inside, but if you read a whole word, then a few bytes will spill over. And yes, one byte of buffer overflow can be sufficient.
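In C-level pseudocode, the inserted spatial check looks something like this: a schematic after the published SoftBound description, with illustrative names (and note that, as discussed right below, this naive form has a subtle problem of its own):

    #include <stdlib.h>

    /* black: the load the user wrote (the return statement);
       red:   the inserted bounds check before it             */
    int checked_load(int *p, char *p_base, char *p_bound)
    {
        if ((char *)p < p_base ||
            (char *)p + sizeof(*p) > p_bound) /* access size matters: */
            abort();                          /* the last byte of the */
                                              /* load must fit, too   */
        return *p;
    }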
I've seen a presentation, also at a CCC Congress, where an exploit was written for a one-byte overflow, because that one byte overflowed right into the saved base pointer, so one could create a copy of the stack frame a few bytes inside the buffer and then pivot into that. So the size of things still plays a role. And the implementation of that check is pretty straightforward. OK, Fefe has been laughing. Do I hear more laughs? Who's laughing? If you're not laughing, I'm not trusting your C code; if you don't see it, you're not supposed to write security-critical C code. Because, you know, they have that proof of correctness of their algorithm and everything, and they insert all the right checks to find 100 percent of the buffer overflows, but then they get the check itself wrong. What's wrong with that check? The addition of pointer plus size might overflow. So you have to check for that. I wrote to the authors, and they said: yeah, yeah, you know, it's academic research we're doing here, and it's nice that somebody looks at actually using it. But other than that, it's a fine piece of software; the research they're doing is quite good, and I think we should do more projects where hackers look into the research out there and try to liberate it into real-world, existing open source software.

Now, the base and the bound values have to come from somewhere; that's the point I talked about. We need to calculate them on memory allocation. In the case of malloc, the returned pointer address is the base, and the bound is the pointer address plus the size that was requested from malloc. And, you know, if malloc returns null, then of course the bound is also null: a null pointer gets a zero size in the internal representation, so the check always fails, because any memory access will overrun a zero-size buffer. Very easy.
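A sketch of the metadata creation at allocation time, written as a wrapper around malloc; the names are illustrative:

    #include <stdlib.h>

    /* returns the pointer and fills in the base/bound metadata.
       on failure base == bound == NULL, so every later bounds
       check on this pointer fails, which is exactly what we want */
    void *checked_malloc(size_t size, char **base, char **bound)
    {
        char *p = malloc(size);
        *base  = p;
        *bound = p ? p + size : NULL;
        return p;
    }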
And stack allocation works pretty much the same: the base is the address of the area, and the bound is the base plus the size of that area. Very, very straightforward. This here is the tricky part, and this is why it's hard to come up with a good memory safety solution for C: you have pointer arithmetic. Essentially, what you do is: if your new pointer is a pointer plus an index, then the new pointer copies the base address of the original pointer, and it copies the bound address of the original pointer. And you will notice that, in this representation, the pointer might temporarily point outside the object. But every time you do pointer arithmetic, the metadata is carried along, so after the next addition it might be back inside the area, and everything's fine.

Here is a special case, the case I talked about: you take the address of a member of a structure and access that. Essentially, what you do is narrow your bounds to that member of the structure. And you might end up with a couple of false positives there, because somebody might think it's smart to, say: I'm writing a program that deals with 3D graphics, I have a struct that has x and y and z as members, and then the whole structure is passed to some other function that interprets that very same structure as an array of three floats instead of a struct with three float members, because the memory representation is the same. That's something we cannot tell apart. So this will prevent overruns inside a structure; it will also flag certain kinds of dirty tricks in your C program if you decide to do them. Something you have to live with. A variant of narrowing is when you access an array inside the structure instead of just an int, but it boils down to the same principle.
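The propagation rule for pointer arithmetic is just a copy of the metadata. Schematically:

    /* black: *q = p + i;  red: the two metadata copies.
       no check happens here; q may legally point out of
       bounds until it is actually dereferenced           */
    void ptr_plus_index(int *p, char *p_base, char *p_bound, int i,
                        int **q, char **q_base, char **q_bound)
    {
        *q       = p + i;
        *q_base  = p_base;   /* same underlying object...  */
        *q_bound = p_bound;  /* ...so same base and bound  */
    }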
This is where it gets interesting. We know that our pointer now has three values: the pointer value, the base, and the bound. And we also know that we do not want to change the representation of structures in memory. So if we have an object in memory and we load a pointer from some address, we have to get our base and our bound from somewhere. As opposed to the other fat pointer implementations, SoftBound+CETS uses a shadow space: a data structure that keeps the copies around. Essentially, what that is, is a table lookup based on the pointer, returning the base and the bound address. There's something tricky here that you might not see immediately: we do not use the pointer value for the lookup of the base and the bound, we use the address of the pointer in memory. Note the two stars up there. We use the address of the pointer in memory to get an index into our table, to get the additional information. So instead of coupling the metadata to the pointer value, we couple the additional information to the location of the pointer variable, and that's what we use to keep the base and bound around.

Of course, inside a function, the metadata is just passed in additional registers; there are a lot of extra variables generated during the translation of a function, for the base and the bound, wherever there's a pointer. The table lookup only applies when actually loading a pointer from memory, as opposed to using a value that was an intermediate result of a computation in the function, or something that was passed as a parameter. Storing metadata works the same: whenever we write to a memory location something that is a pointer, and we know it's a pointer because we're working on the typed intermediate representation in the compiler, we have to do that additional store.

There are certain alternatives for the implementation of that store, the thing abstracted behind the table lookup. We could implement it as a hash table; as you can imagine, that's a very expensive operation, looking into a hash table, or hashing items for a hash table, every time you write or load a pointer to or from memory. Or you could use an actual shadow space: you have a heap of, say, size 16 megabytes, you allocate another 16 megabytes for your base addresses, another 16 megabytes for your bound addresses, and then it's a very simple operation: from an arbitrary pointer address you directly derive the addresses of the base and the bound values.
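Schematically, with a flat shadow space, the metadata load on a pointer load looks like this; shadow_base and shadow_bound are assumed global tables, and the slot arithmetic is illustrative:

    #include <stddef.h>
    #include <stdint.h>

    extern char *shadow_base[];   /* hypothetical global tables, */
    extern char *shadow_bound[];  /* one entry per 8-byte word   */

    /* black: int *p = *pp;  red: everything else.  the lookup is
       keyed by the ADDRESS of the pointer (pp), not by its value */
    int *load_ptr(int **pp, char **p_base, char **p_bound)
    {
        size_t slot = (uintptr_t)pp >> 3;
        *p_base  = shadow_base[slot];
        *p_bound = shadow_bound[slot];
        return *pp;
    }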
That's very fast, but it needs a lot of memory. And it turned out, after quite a lot of experimentation, that the optimal data structure for this is a trie, written t-r-i-e; some people pronounce it "tree", some people call it a radix tree. At the end of the day, it works like a page table system: you take a certain number of bits from the pointer address, and that points you to a secondary table, which is just, you know, a page of values. That turns out to be sufficiently memory-efficient to not be a problem, but it's also fast, because all you have to do is shift things around, do a lookup, and get at the table value.
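A two-level trie lookup might look like this; the table sizes and bit split points are illustrative, not the project's actual constants:

    #include <stdint.h>

    typedef struct { char *base; char *bound; } meta_t;

    /* hypothetical two-level trie: high bits select a secondary
       table (a "page" of metadata), low bits select the entry,
       the same shape as a page-table walk                       */
    extern meta_t *trie_root[1 << 20];

    meta_t *meta_lookup(void *pp)
    {
        uintptr_t a  = (uintptr_t)pp;
        meta_t *page = trie_root[(a >> 25) & ((1 << 20) - 1)];
        return &page[(a >> 3) & ((1 << 22) - 1)];
    }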
What it does to the performance, however, are these additional loads and stores. From a performance point of view, you might think that the actual computation for checking the bounds is what makes the program slow, and that's absolutely, totally not the case. If you think that, you haven't looked into modern CPU architectures: in a modern computer, the trip to memory dominates everything you do on the CPU. It's not unusual to have a 180-cycle round trip just for loading a word from RAM if it's not in cache, all the while you twiddle your thumbs. Also, you have superscalar execution on modern CPUs, and the kind of check that's introduced for the bounds check is very, very easily parallelizable. It increases the instruction-level parallelism of your software: it's more instructions that can run at the same time, so the actual check can be distributed onto different execution units inside your CPU and tends to take almost no time at all. The actual overhead you're paying is for the extra loads and stores of metadata that go to memory; that's where the overhead comes from.

So, in order not to load and store from that table, whatever it is (hash table, radix tree, shadow memory), all of the time, an additional modification that SoftBound makes is to the calling convention. This here is an illustration; think of it as a C view of what happens, it doesn't work exactly like that internally. What it internally does is keep a so-called shadow stack for the additional base and bound values: you pass in a pointer, like the s in the function there, and in addition s_base and s_bound are passed into the function body. Those actually reside on a different stack from the main stack, in order not to mess up the stack layout and everything that's involved there, and to improve compatibility. So we augment the functions: the base and bound information is passed around like additional function parameters, kept in registers inside the functions, and loading and storing metadata from and to memory only happens when you actually touch pointers stored in objects.
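As a C-level illustration of the augmented convention (in reality the extra values travel on the separate shadow stack, and the names here are illustrative):

    #include <stddef.h>

    /* black: size_t my_strlen(const char *s);
       red:   the two extra parameters the transformation adds */
    size_t my_strlen(const char *s, char *s_base, char *s_bound);

    /* a call site is rewritten accordingly:
           n = my_strlen(p, p_base, p_bound);                  */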
And there are a couple of loose ends, things that need to be treated here. Global variables: for globals, there's a pass that hooks into program load, using the constructor section in the ELF binary, and that generates the base information for all the globals, so they can be accessed. Separate compilation is supported by defining an API for creating that metadata. memcpy is interesting, because if you copy memory area A to memory area B, inside that memory there might be pointers, and the fat pointer information, the base and bound, also needs to be copied. So you need a special memcpy implementation that makes sure the additional base information is also propagated. Function pointers: function pointers just get a bound of zero, so you cannot write through them, and then they're safe.

Casting. Casting is a problem: if you have an integer and you create a pointer from it, at that point the compiler has completely lost track of the original type of the data that went in there, so essentially there's nothing sensible you can do there to prevent buffer overflows. What SoftBound does is: if you cast an integer to a pointer, it gets a bound of zero, so you cannot write to that memory. That completely prevents you from using integers as pointers, and it kills a couple of hacks. Some of you might know the trick of XORing two pointers onto each other to build a doubly linked list with only one pointer field; that doesn't work if you have that limitation. But such code is rare, fortunately, and you can patch the three lines that do it. Casts and unions just need to be treated right: at the moment you access the memory, the conversion has to be applied, and that works. The one thing that's still a bit unsolved here is varargs; so I'm not telling the whole truth, they're not reaching 100 percent yet. Varargs are not treated yet, because they're special: you would need to add the base and bound information to the area containing your varargs, and that's not there; it needs to be implemented. So printf and its relatives are not covered yet.

OK, on to temporal memory safety. That's a bit quicker now, because you already know the principle of passing fat pointers around. What we do for temporal is that we have two additional fields in our fat pointer: one is a key and one is a lock. The lock is an address in memory, and the key is a value that we increment every time you call malloc; so we know, this is memory allocation number twenty-three. And we have an association between the pointer and the lock: at the moment free is called, the memory that holds the lock is reset to zero. I hope that becomes obvious on the next slide. So this is the check that's introduced: I get the key and the lock address as part of my fat pointer metadata from the shadow space, I load from the lock address, and if what's stored there matches my key value, then free has not been called on that pointer yet.
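A sketch of the lock-and-key check in C; the type and field names are illustrative:

    #include <stdint.h>
    #include <stdlib.h>

    /* red: temporal metadata carried with each pointer */
    typedef struct {
        uint64_t  key;   /* allocation number, e.g. 23       */
        uint64_t *lock;  /* address whose contents must      */
    } temporal_meta;     /* still equal key                  */

    /* inserted before a dereference: if free() has run, it
       has invalidated the lock, and the values differ       */
    void temporal_check(const temporal_meta *m)
    {
        if (*m->lock != m->key)
            abort();     /* use-after-free detected */
    }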
And what happens at the moment free is called? All free does, in addition, is go to the lock address and write an invalid lock value there. So yes: as long as the key matches the lock, my pointer is valid, and for every allocation I remember the correct lock value. The propagation of the metadata pretty much works like the spatial case you've already seen, at all loads and stores. Globals cannot be freed, so we just introduce a global key and a global lock address that always match each other; when a global is accessed, we have a key and lock that work.

And that would all be nice and fine if we didn't have threads. Unfortunately, we do. So that's, again, where we don't reach 100 percent yet. If you have state shared between your threads, it might happen that one thread calls free while the other one checks whether the lock is still valid, and you have a thread switch in between: one thread sees, yeah, the lock is still valid; then the other one gets scheduled, does the free, kills the lock; and then you're getting back to the first thread, which then accesses freed memory. And we do not want to have that.

So, on to my contribution. I'm not just telling you what people out there did; I actually did my own hacking on that. At the moment, the form of SoftBound+CETS you're getting is a collection of patches to LLVM with their own top-level executable, and all that does is process LLVM intermediate representation. It also has a number of other nasty hacks: in the compiler module, it keeps a list of libc functions, and for every libc function it comes with its own wrapper function that needs to be called when you want to call into libc. And taking that in as it stands turned out to be intractable. You discover interesting things when you do that, like, for instance, that the Linux headers call different functions depending on whether you enable the optimizer or not.
They actually have an ifdef, did the user enable optimization or not, and depending on that, one function or the other gets used; you end up with a ton of libc-internal functions that you need to wrap, and then recompile the compiler and rework your code. That all led nowhere. So I found out that FreeBSD actually compiles its whole system using LLVM, and chose that as the target for further hacking. And what I did, with a lot of support from Hannes Mehnert, who worked on that with me (he was the FreeBSD domain expert, and he's also a coffee guru; check out his coffee if you get the chance): FreeBSD can compile the whole userland using LLVM from scratch, every single line of code, and that's a nice spot to actually start hacking on this stuff.

So what I did was introduce function attributes that you can specify to turn SoftBound+CETS processing on or off for a function. One is for saying: that's a native function, don't do anything, it's just a C function. The other one is like: that's a string-copy-style function, and you may give me the base and bound for the strings you pass, but I want to write the check myself, in order not to check every single byte in the loop, for performance. That's the other attribute.
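As an illustration only; I don't know the exact attribute spellings in the patches, so these names are hypothetical:

    #include <stddef.h>

    /* hypothetical attribute names; the real ones in the
       FreeBSD/LLVM patches may be spelled differently      */
    void *my_memset(void *s, int c, size_t n)
        __attribute__((softbound_native));    /* don't instrument   */

    char *my_strcpy(char *dst, const char *src)
        __attribute__((softbound_wrapper));   /* pass base/bound;   */
                                              /* the check is hand- */
                                              /* written, not per   */
                                              /* byte in the loop   */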
Porting the SoftBound+CETS module to FreeBSD essentially involved a lot of fiddling with the build system in order to get the additional modules built. And then we had to walk through the libc start-up: everything that happens between _start, which is the entry symbol of the executable binary, and main, which is your entry point as a C programmer. A lot of things happen there: all the constructors are called, thread-local storage is initialized, malloc is initialized, and that's all low-level code. As you have seen, malloc and free are essentially primitives here, and you have wrappers around them that provide the correct base, bound, lock, and key values. So we had to go through that, find every single function that's involved in program start-up, annotate it with the right attribute, come up with a malloc that does the right thing, wrap the malloc to give back the fat pointer values; and then we could delete about 2000 lines of code in SoftBound+CETS, because they were no longer needed.

And we're actually at the point now where a hello-world executable can be built and executed: it correctly loads and comes up to main; I have a little test function that overruns a buffer, and it correctly detects the overflow when the overflow happens, and then the executable shuts down correctly again. So I would call that a proof of concept. It's a little bit better than the academic code from a usability perspective, not much, and far from an industrial-strength perspective, but I think it's a very promising approach. So, next steps: making that clean, making sure all the functions are instrumented and work, and then starting to do tests; fixing some things like inlining, for instance. The buffer overflow check you've seen is not inlined, due to the way it works, and I think we can get a lot of performance from there by inlining stuff. And the goal of the operation is to have a complete FreeBSD world where every single line of code, except for a little bit of trusted code, which is essentially malloc and the libc start-up, is correctly spatially and temporally bounds-checked. Because it would be really nice, for a change, to have a computer that's not vulnerable to buffer overflows; we haven't had one for the last twenty years, and you can go through all the vendors, nobody gets it right. So: here's the link to the original source and to the GitHub repository with the modifications. I would like to thank Hannes once again for hacking on the code with me, and I would like to thank the original SoftBound authors for being very generous with information.
One tip they gave me, concerning the spatial memory checking: they have talked to Intel, and Intel is working on a new instruction set extension, so later we will get native instructions that do the bounds checking for us.
It's called MPX; watch out for that. And thanks to them generally for coming up with some awesome research. Yeah. So, thanks to you for listening, and I'm open for questions now.

Question: Hello. Did you know that fat pointers are not the invention of the people of the SoftBound+CETS project? They go back to the dark times of programming on the 8086, when you had the segmented memory model, with segment registers you had to load and segments of 64K. That is essentially the same model, the one that made it such a pain to get plain Unix code to compile there.

Answer: Sorry if I created the impression that SoftBound invented that. Of course fat pointers are old; they're even older than the 8086. Hardware in the 60s already had bounds-checked pointer accesses. So it's nothing new. It's just that, you know, this particular hack allows us to run real-world, existing C code on real-world hardware these days, and I think that's what makes it unique.