Welcome to the 6:30 talk of today. One of the many things I love Congress for is the programme: normally that is art and culture, states and societies. So I'm very excited that there's also a security talk happening today, and that's up next: OK, let's break modern binary code obfuscation. It's going to be translated into German; if you want to listen to that, go to streaming.c3lingo.org, which should also appear on the screen now. Wonderful. From Bochum, I'm going to welcome now Tim Blazytko and Moritz Contag, and they are going to explain to us how you can deobfuscate code without looking at the code itself, but only at its behavior. A warm welcome, and have a blast.

Thank you. Well, hi, and welcome to our talk. We are Moritz and I; we are PhD students in Bochum, Germany, and in our daily work we do reverse engineering and things like that. As part of one of our academic projects we started applying program synthesis to deobfuscation, and this is basically the more technical version of the academic talk that we presented at USENIX Security this year. First, Moritz will talk about obfuscation techniques and deobfuscation techniques, and then I will continue with program synthesis and how to apply it.

All right, thank you for the introduction. OK, first things first: why do we want to obfuscate code in the first place? To set the scene, it's important to note that we cannot really prevent reverse engineering; rather, we seek to complicate it, at least to some degree. One reason why we would want to do so is intellectual property and the protection thereof. If you've got some super secret algorithm which gives you a competitive advantage over some competitor, you can just protect it and keep a head start over this other corporation.
Another reason why we might want to obfuscate is malware. We want to make it harder to detect, so that analysts are not easily able to create signatures for it and the malware lives longer without being picked up by an AV. And the third, I think most interesting, case is digital rights management, where software, especially AAA games, is obfuscated in order to prevent cracking attempts and illegal distribution of the game itself. Especially in the context of digital rights management there's a very fitting quote which sets the scene as to what we can expect from code obfuscation. It is from Martin Slater of 2K Australia and was issued regarding the release of BioShock back in 2007: he told us that they achieved their goals, because the game stayed uncracked for 13 whole days. So apparently we cannot really prevent cracking; we can just make the software withstand cracking attempts for at least 13 days, which is about the time in which the game distributor makes most of its revenue.

OK, how do we protect software? There are different approaches, some better, some worse. For example, we can take a look at the software that's being used to analyze programs, like debuggers and disassemblers, and one idea that comes to mind is to abuse shortcomings of these tools. If we already know about crashes, say the code analysis part choking on a specific FPU instruction sequence, we can just emit that sequence and make the debugger crash. Similarly, if we've got some process dumper which relies on fields of the PE header in memory, we can easily confuse it and render such tools unusable. Another approach that's very popular is to detect the environment the program runs in once, and check whether we are running in an environment that hints at the presence of a debugger. For example, there's some capability provided by
the operating system: the BeingDebugged bit in the PEB, which is set if a debugger is attached. Similarly, there are more low-level tricks we can abuse, operating system internals which allow us to somehow detect the presence of a debugger, and then we just abort execution to fend off any reverse engineering attempt. However, both techniques have the drawback that, once you know how they work, they're easily fixed: you can just patch the tool, or you can circumvent the anti-debugging tricks by supplying the correct values. Also, if you go to Google and search for "game doesn't start debugger detected", you get over six million results of people complaining that they cannot run the latest AAA game, because an anti-debugging technique which might have worked reliably on Windows XP does not hold up on Windows 7 and issues a false positive. So benign customers cannot actually play the game, because the debugger detection was faulty.
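To make that concrete, here is a minimal sketch of such an environment check, assuming Windows: IsDebuggerPresent is the documented kernel32 wrapper that reads exactly this BeingDebugged bit. As just discussed, an attacker defeats it easily by patching the flag or the call.

```python
# Minimal sketch of debugger detection on Windows: IsDebuggerPresent()
# is a thin wrapper that reads the BeingDebugged bit in the PEB.
import ctypes
import sys

def debugger_attached() -> bool:
    if sys.platform != "win32":
        return False  # the check only makes sense on Windows
    return bool(ctypes.windll.kernel32.IsDebuggerPresent())

if debugger_attached():
    sys.exit(1)  # abort execution to hinder reverse engineering
```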
So let's set up some requirements for code obfuscation. An important one is that we want the obfuscation to be semantics-preserving: the mere fact of protecting the application shouldn't change its observable behavior; we do not want the game to break only because we want to protect it. Second, we want to avoid external dependencies, at least in the context of our discussion here. There are various attempts to outsource data, to an Internet server, to a USB dongle or other separate media, but we are mostly interested in techniques that protect the code only. So we've got a white-box attack scenario, where the attacker has everything he or she needs to attack the application at hand. And finally, the most important point probably is that we want to employ techniques that are way easier to deploy for us than they are to attack for the reverse engineer. The anti-debugging tricks we've seen, and also the abuse of tool shortcomings, take more or less no effort to bypass once one knows how they work; we want techniques that are easy to deploy for us, but very hard to attack for the other party.

OK, on to code obfuscation techniques. One technique that's been used in commercial protection engines is what is known as opaque predicates. Consider the rather easy CFG on the left side: we've got some application with a linear control flow graph. If we now insert what is known as opaque predicates, the program looks vastly more complex: there are more branches and more control flow we have to track, and we have no clue that the underlying program is in fact very simple. Let's zoom in on one of those predicates. We've got a bit of code, a true branch and a false branch, each going to a different basic block. Just looking at this, we might conclude that we cannot know which one is going to be taken. However, it turns out that this is what we call an opaquely true predicate: the true branch is always taken, regardless of the behavior inside the predicate. What's happening here is that the block on top constructs a predicate based on an opaque value, namely the return value of the API call GetCurrentProcess. If you've ever worked with the Windows API before, you might know that GetCurrentProcess always returns a constant value, namely -1. So by issuing this predicate, we make sure we always take the left branch. The right branch might point to dead code, or to other code placed there to confuse the reverse engineer; at runtime we always take the left branch. Similarly, there are opaquely false predicates, which just invert the condition and always follow the other branch. And there's a third flavor, called random opaque predicates, which branch upon a random value. This means that at runtime we can potentially follow both branches. The challenge here is that we have to make sure that both paths that follow are semantically equivalent, because the program breaks otherwise; at the same time it is a challenge to ensure that the attacker cannot easily detect that both branches are in fact semantically equivalent.
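As an illustration, here is a hedged sketch of an opaquely true predicate. Instead of the GetCurrentProcess trick from the slide it uses a number-theoretic identity (7x² − 1 is never a perfect square over the integers), which keeps the example self-contained:

```python
# Hedged sketch of an opaquely true predicate: the condition always
# holds at runtime, but a static analyzer without number-theoretic
# knowledge has to consider both branches.
def opaquely_true(x: int, y: int) -> bool:
    # 7*x*x - 1 == 6 (mod 7), and 6 is not a quadratic residue mod 7,
    # so 7*x*x - 1 can never equal a perfect square y*y.
    return 7 * x * x - 1 != y * y

def protected(a: int, b: int) -> int:
    if opaquely_true(a, b):
        return a + b   # the real code path, always taken
    else:
        return a - b   # decoy / dead code, never executed
```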
Recapping on opaque predicates: obviously they increase the complexity of the application, and they can be built on hard problems. But what is most important is that they force the analyst to encode additional knowledge, knowledge that either the analyst himself does not have, or that he has to teach his tool: about the Windows API, about individual instructions, about arithmetic identities or other hard problems, just so the tool provides reasonable results. So they are hard to solve statically. However, if you opt to only look at concrete execution traces — we just let the application run and look at concrete values — then opaque predicates are essentially solved for free, because only one branch is ever taken. That is good to keep in mind.

Another very interesting technique, which I guess has been deployed in every major copy protection out there, is what we call virtual machines. Take, for example, this native code on the left: we've got some loop in x86 code, and let's assume the loop is our precious intellectual property we want to protect. Obviously, anyone can use the common tools — a disassembler like IDA, a debugger like OllyDbg — to look at this code, reason about it, and learn what it does. So the idea is to replace the code on the left with something our common tools cannot understand. What we are doing is getting creative and just making up an entire instruction set, as you see on the slide —
with instructions that are not known in this shape to any other architecture, with new registers, new encodings and so on. Both the native code and the new instruction encoding are semantically equivalent. Then we replace the native code by a call to what we call a virtual machine, also known as an interpreter, which is basically a CPU in software that lets you run the imaginary architecture we've made up. If we now take our tools like IDA or OllyDbg to analyze this code, that's not really viable anymore, because they don't know about our made-up architecture. But we still have to make sure that the transition from native code to virtual instructions goes seamlessly, and to this end we have to look at the components. There are three core components to a virtual machine. First, the VM entry and VM exit: they perform the context switch from the native to the virtual context and back. The entry copies the native context — say, the registers and flags of the native architecture — into the virtual context, and the exit copies it back to the native context. Usually the mapping of native to virtual registers is one-to-one, which makes it a bit easier. Then we've got the traditional fetch-decode-execute loop, like in a classic CPU: the dispatcher fetches and decodes one virtual instruction, advances the virtual instruction pointer accordingly, then looks up the handler which implements the virtual instruction in what we call the handler table, here on the right, and invokes it. The handler table is just a table of function pointers indexed by opcode, and we have at least one handler per virtual instruction; each handler decodes its operands, operates on them and updates the VM context accordingly.
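To make these components tangible, here is a minimal sketch of such an interpreter for a hypothetical byte code; the encoding, the stack-machine design and the handler names are our own illustration, not those of any commercial protector:

```python
MASK64 = (1 << 64) - 1

def h_push_imm(ctx):
    # decode operand: an 8-byte little-endian immediate after the opcode
    imm = int.from_bytes(ctx["code"][ctx["vip"]:ctx["vip"] + 8], "little")
    ctx["vip"] += 8
    ctx["stack"].append(imm)

def h_add(ctx):
    b, a = ctx["stack"].pop(), ctx["stack"].pop()
    ctx["stack"].append((a + b) & MASK64)

def h_exit(ctx):
    ctx["running"] = False

# handler table: one handler per virtual opcode, indexed by a single byte
HANDLER_TABLE = {0x00: h_push_imm, 0x01: h_add, 0xFF: h_exit}

def run_vm(code, native_args):
    # "VM entry": context switch from the native to the virtual context
    ctx = {"code": code, "vip": 0, "stack": list(native_args), "running": True}
    while ctx["running"]:              # central dispatch loop
        opcode = code[ctx["vip"]]      # fetch one opcode byte
        ctx["vip"] += 1                # advance the virtual instruction pointer
        HANDLER_TABLE[opcode](ctx)     # look up the handler and invoke it
    return ctx["stack"][-1]            # "VM exit": hand the result back

# usage: push 2, push 40, add, exit  ->  42
bytecode = (bytes([0x00]) + (2).to_bytes(8, "little")
            + bytes([0x00]) + (40).to_bytes(8, "little")
            + bytes([0x01, 0xFF]))
assert run_vm(bytecode, []) == 42
```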
OK, this is an example of an unobfuscated version of a popular VM-based protection, and we might be able to make out these components here. On top we see the VM entry: this is where we come from the native code, call into the interpreter, switch to and initialize the virtual context. Then we see the dispatch loop, which looks up the current virtual instruction in the handler table, which in turn points to the individual handlers you see below. The handlers update the virtual context and branch back to the VM dispatcher. Eventually we end up at the VM exit handler, which performs the context switch from the virtual context back to the native one.

Taking a closer look: on one side we have the virtual instruction pointer, and we've got the VM context. You see that all the dispatcher does is fetch one byte from memory, increase the virtual instruction pointer, look up the corresponding handler and jump to it. Then we have a handler, as seen here: it just reads from the virtual context, performs its semantics, writes values back into the context, and finally jumps back to the dispatcher to execute the next virtual instruction.

OK, so far this is rather simple and easily understood. How do we harden this whole concept? For one, the handlers as shown are only a few instructions and easily analyzed; what we can do is apply traditional code obfuscation transformations to make them harder, like substituting operations, inserting opaque predicates, inserting dead code and so on. On the right you see a slightly more complex assembly listing than before. Another technique: imagine you've only got a few dozen individual virtual instructions and you want to create more work for the reverse engineer. What we do is duplicate individual handlers to increase the workload for the attacker. Because the handler table is usually indexed using one single byte, we get up to 256 entries we can populate.
So we duplicate existing handlers to populate the full table, and then again use traditional code obfuscation techniques to hinder the attacker from easily finding out that two handlers are in fact the same. Again, we increase the workload for the attacker, who now has to analyze 256 differently obfuscated versions of the handlers.

Another technique you might recognize: we've got the central dispatcher and multiple handlers, which all branch back to the dispatcher, which then executes the next instruction. What we can do is omit this single point of failure and inline the dispatcher at the end of each handler. Then we don't have a central dispatcher anymore; every handler branches directly to the next one. In summary, we inline the dispatch into each handler, because a central dispatcher allows one to more easily observe what happens inside the VM, like recording every handler that gets executed.

And as a third technique, we can also get rid of the handler table, because the explicit table relatively easily reveals all the handlers we've seen earlier: it points to the start of every single handler. Instead of querying an explicit handler table lying around in memory, we encode the next handler's address in the virtual instruction itself. This might look a bit like this: you've got your opcode, and somewhere in the encoding the next handler's address. The effect is that it essentially hides the starting location of each handler that has not been executed yet. With the handler table we could easily see where every handler begins and analyze it; with this indirect encoding we can only observe those handlers that have occurred in the concrete execution trace.
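A hedged sketch of those last two hardening steps for the same hypothetical byte code: the dispatch is inlined into each handler, and the successor's location is decoded from the instruction stream instead of a global table, so handlers that never run stay hidden:

```python
# Sketch: no central dispatcher and no explicit handler table. Each
# handler does its real work, then decodes the location of its
# successor from the byte code and "branches" there. The address map
# is a purely illustrative stand-in for concrete code addresses.
def make_threaded_handler(semantics):
    def handler(ctx):
        semantics(ctx)                                    # the handler's real work
        nxt = int.from_bytes(ctx["code"][ctx["vip"]:ctx["vip"] + 8],
                             "little")                    # inlined dispatch
        ctx["vip"] += 8
        return None if nxt == 0 else ctx["handlers"][nxt]
    return handler

def run_threaded(ctx, entry):
    handler = ctx["handlers"][entry]
    while handler is not None:   # every step was chosen by the previous handler
        handler = handler(ctx)
```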
OK, on to mixed Boolean-arithmetic, another obfuscation technique, not so widely used in modern obfuscation yet. To give you an idea how it works, just look at this term here. It looks a bit involved, and you might not be surprised when I tell you that there is a simpler version of this expression: it is simply the addition of the variables x and y. You might guess that for this even more involved term there's again a simpler variant, and indeed it is again the addition x + y. OK, you might believe me that verifying this is easy: throw the first term and the second term into your favorite SMT solver and you are able to prove equivalence. However, the interesting question is: how do we get to the second term if we only have the first? We can try to apply Boolean identities we know from school, maybe arithmetic identities; we might even go so far as to draw a Karnaugh map. However, it becomes evident that this doesn't really help us here. What is being used here is the concept of a mixed Boolean-arithmetic algebra, which gives us a ton of different operations: Boolean operators, comparisons and even arithmetic operators. It's worth noting that this mixed algebra includes both a Boolean algebra and an integer modular ring, and that no techniques exist to simplify expressions that contain both. We know how to reduce expressions in a Boolean algebra, and we know how to do it in the integer modular ring, but there's no underlying theory that helps us easily tackle both at the same time.
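As a hedged illustration (the exact term on the slide differs), the classic MBA rewrite of addition, together with the SMT-based equivalence check just mentioned, could look like this, assuming the z3 Python bindings:

```python
# The well-known MBA identity x + y == (x ^ y) + 2*(x & y) mixes the
# Boolean algebra (^, &) with the integer ring (+, *); an SMT solver
# proves the equivalence over 64-bit bit-vectors.
from z3 import BitVecs, prove

x, y = BitVecs("x y", 64)
prove(x + y == (x ^ y) + 2 * (x & y))   # prints "proved"
```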
OK, on to deobfuscation. In every devious scheme, in every tool we've seen, we get to a point where we employ a technique called symbolic execution. Consider the handler here on the left; this is the NOR handler we've seen earlier, and we now want to reason about it. What we do is execute it, not with concrete values, but with symbolic ones. That looks like this: we execute this mov symbolically and assign to RSI the value of the memory indexed by the register RBP. We continue in the same way with the second assignment. Then we've got a NOT operation on RSI, so we assign to RSI the negation of RSI, which is the same as the negation of the memory pointed to by RBP, and we go on. It gets interesting at the logical AND of RSI and RAX, which is the same as the logical AND of the negation of one memory value with the negation of the other. Because a symbolic execution engine is basically a computer algebra system, it includes identities from the Boolean algebra, and in this case it knows that this expression is equivalent to a NOR of both memory values. We can then just continue with the execution, and at some point the engine recognizes the next NOR. This works rather well, and we see that the semantics of our handler is just a NOR operation on two memory values, with the result stored to another memory location.
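A toy version of that walkthrough, assuming a hand-rolled term representation with a single De Morgan rewrite rule; real symbolic execution engines carry far more identities:

```python
# Toy forward symbolic execution of the NOR handler: registers map to
# symbolic terms instead of concrete values, and the engine applies
# De Morgan (~a & ~b == ~(a | b)) while building up expressions.
def sym_not(e):
    return ("not", e)

def sym_and(e1, e2):
    if e1[0] == "not" and e2[0] == "not":      # De Morgan rewrite
        return ("not", ("or", e1[1], e2[1]))   # ... which is NOR(a, b)
    return ("and", e1, e2)

state = {"rsi": ("mem", "rbp"), "rax": ("mem", "rbp+8")}  # symbolic loads
state["rsi"] = sym_not(state["rsi"])                      # not rsi
state["rax"] = sym_not(state["rax"])                      # not rax
state["rsi"] = sym_and(state["rsi"], state["rax"])        # and rsi, rax
print(state["rsi"])  # ('not', ('or', ('mem', 'rbp'), ('mem', 'rbp+8')))
```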
Apparently this works fine for this handler. But what is the result if we throw the whole concept of symbolic execution at a more obfuscated handler? Here's a rather simple example of an obfuscated NOR handler. You might be able to make out some parts, but once we let symbolic execution run over it, we get a ton of information — and the problem is, we don't want a ton of information. What we want is just the underlying semantics, which is rather simple: the NOR of two memory values, assigned to another memory location. Let's try the same with mixed Boolean-arithmetic: we take a rather complicated MBA expression, simply compile it into a binary program, run symbolic execution on the trace and take the resulting semantics. It didn't really fit on the slide — it goes on and on — but you might be able to make out that the resulting value is x, and we get this super complicated expression for it. Again, the underlying semantics is rather simple; we do not want the complicated expression, we want the simple semantics.

So, symbolic execution is nice because it allows us to capture the full semantics of the code we execute. It's also a computer algebra system, and as we've seen, it allows some degree of simplification of the intermediate expressions. However, its usability decreases as the syntactic complexity of the underlying code increases. For example, we can introduce artificial complexity by substituting instructions, by using opaque predicates or the other schemes we've seen earlier, or by employing techniques like mixed Boolean-arithmetic expressions. So it's obvious that we have a problem with syntactic complexity, and we want to handle it somehow. The interesting question is: what if we could reason about the semantics of the code only, instead of fighting the syntax and endlessly improving our tools to cope with more complicated syntax? This leads us to the topic of program synthesis, which Tim will tell you more about.

Yeah, thank you. So, we have seen some limitations caused by syntactic complexity. But keep in mind: obfuscation is semantics-preserving, meaning the obfuscated code has the same input/output behavior. So why not just use the function as a black box and observe what it's doing? For instance, we might generate some random inputs, feed them in and note the outputs, and do this several times more, for this input pair and that one and 20 others. Then we don't look at the code at all; we just look at all the behaviors we saw, and we learn a function that has the same behavior — we might learn x + y. The goal of program synthesis is to automatically learn such functions
based on I/O samples alone. So how do we do that? We use a stochastic approach: basically, we have an optimization problem. Picture a search surface, something like this: it has some deep points, some higher points, and a global maximum at the top, and that global maximum is a program with exactly the same input/output behavior as our black box. Every point on the surface has a concrete value, and the closer we are to the global maximum, the higher our score. In this optimization problem we basically start somewhere with a low score and keep increasing it until we find the global maximum. The algorithm we use is based on Monte Carlo Tree Search — one of the main reasons why AlphaGo was able to beat world-class human Go players last year.

So let's get practical and try to learn a + b. First we have to define how we generate programs at all, so we define a grammar. We have a non-terminal symbol U; non-terminal means that we can replace it with other symbols. For instance, we can replace U by U + U, or by U * U, or by a, or by b. We say that a and b are our input variables; we cannot derive them any further. A terminal program is therefore a candidate program, something like a, b, a * b, a + b, b + b and so on. In contrast, an intermediate program is a program that still contains at least one U, so it can be derived further.
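Before walking through the tree search, here is a minimal sketch of this grammar and of the random play-outs used in a moment, assuming exactly the two-variable grammar just described (U → U + U | U * U | a | b):

```python
# Sketch of the synthesis grammar and of a random play-out: expand
# non-terminals randomly until the program is terminal, then evaluate
# the terminal program on concrete inputs.
import random

def playout(depth=0):
    # bias toward terminals as depth grows, so play-outs stay finite
    if depth > 4 or random.random() < 0.5:
        return random.choice(["a", "b"])
    op = random.choice(["+", "*"])
    return (op, playout(depth + 1), playout(depth + 1))

def evaluate(expr, a, b):
    if expr == "a":
        return a
    if expr == "b":
        return b
    op, lhs, rhs = expr
    l, r = evaluate(lhs, a, b), evaluate(rhs, a, b)
    return l + r if op == "+" else l * r

expr = playout()                  # e.g. ('+', 'a', ('*', 'b', 'a'))
print(expr, evaluate(expr, 2, 3))
```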
With that as a foundation, we can do the tree search. We start with a tree that has just a root node, U, and then we randomly apply the rules of our grammar. Say we derive a: a is a terminal program, it cannot be derived further, so we can evaluate it and give it a score — how we calculate the score doesn't matter for now. Then we derive the next child, b, and also give it a score. This time we derive U * U, and we cannot score that directly, since it is an intermediate program. So what do we do? We do something called a random play-out: we apply the rules of the grammar randomly to the remaining Us until we get a terminal program that we can evaluate. So we obtain something like a + a * b + a and give it a score. Then we do the same with U + U: again an intermediate program, we derive something from it in a play-out, score it, and go back. Now there are no further rules to apply, so we choose the best child, in a probabilistic manner — in this case, say, U + U — and do the same thing again: derive, play out, score, go back. Each node's score basically represents the average of all play-out scores below it, so when we get a new score down here, we update the nodes on the way back up. And we repeat: choose the best node, play out, score, update, and so on.

Now, say for U + a we have a really high score and for U + b a rather low one, and we want to synthesize a + b. That means U + b had a bad play-out and U + a had a good one, because its score is better. So we update and go back, and again we choose the better node, and so on. As you can see, we always explore this area more, and the reason is that U + U simply seems way more promising than U * U. However, we also keep some probabilistic behavior, meaning we sometimes explore different paths, just to find out whether we might be missing something. So in this case we would normally choose U + a again, but we deliberately don't: we do a play-out on the other node, give it a score, go back up — well, it wasn't really good. Because of that, in the next step we go back to U + a, and there we derive b + a. Now we have a terminal program that cannot be derived any further, so we can score it directly.
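The "choose the best child, but sometimes explore" behavior just described is the classic MCTS selection step. Here is a hedged sketch using the standard UCT formula; the talk does not spell out the exact policy used:

```python
# Hedged sketch of MCTS node selection: exploit children with a high
# average play-out score, but occasionally visit rarely-tried ones.
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    expr: object                     # the (partial) program at this node
    visits: int = 0
    total_score: float = 0.0
    children: list = field(default_factory=list)

def select_child(node, c=math.sqrt(2)):
    def uct(ch):
        if ch.visits == 0:
            return float("inf")      # try unvisited children first
        exploit = ch.total_score / ch.visits
        explore = c * math.sqrt(math.log(max(node.visits, 1)) / ch.visits)
        return exploit + explore
    return max(node.children, key=uct)
```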
OK, we give it a score of 1. Why 1? Because b + a has exactly the same behavior as a + b — in other terms, we are finished with our program synthesis at this point.

So how do we score, how do we calculate these values? During a play-out we generate random inputs, query our black box and observe the output — say, 2 plus 2 is 4. Then we use the same inputs, query our intermediate program and take its output, and we calculate the similarity of the two; I'll come back to how on the next slide. We don't do this for only one input pair, we do it for many. For this one, for instance, we have the same output, which means the similarity is 1, because they are exactly equal; and we do it several times more. Finally, the score for the node is the average of all these similarities.

How do we calculate the similarity? Well, we are operating in the bit-vector domain, so we compare the similarity of bit vectors. For one, we can check whether they are in the same range: we look at the leading zeros or ones and the trailing zeros or ones and count how many are the same. Then we can compare how many bits differ, the Hamming distance; we use that, for instance, for overflows in additions, or for XOR and AND operations and things like that. And then we still have a numeric distance — for an a + b without overflow, say — where we look at how close the two values are arithmetically. So we use these different metrics to capture the different behaviors in the bit-vector domain, such as overflows and the like, and we take, again, the average of all these metrics.
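Here is a hedged sketch of such a similarity score; the exact metrics and their weighting in the actual tool may differ:

```python
# Compare an oracle output with a candidate's output in the bit-vector
# domain, averaging several distance measures.
def similarity(x, y, bits=64):
    mask = (1 << bits) - 1
    x, y = x & mask, y & mask
    if x == y:
        return 1.0
    hamming  = 1.0 - bin(x ^ y).count("1") / bits    # differing bits
    leading  = common_prefix(x, y, bits) / bits      # same value range
    trailing = common_suffix(x, y, bits) / bits
    numeric  = 1.0 - abs(x - y) / mask               # arithmetic closeness
    return (hamming + leading + trailing + numeric) / 4

def common_prefix(x, y, bits):
    n = 0
    for i in range(bits - 1, -1, -1):                # from the MSB down
        if (x >> i) & 1 != (y >> i) & 1:
            break
        n += 1
    return n

def common_suffix(x, y, bits):
    n = 0
    for i in range(bits):                            # from the LSB up
        if (x >> i) & 1 != (y >> i) & 1:
            break
        n += 1
    return n
```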
OK, how do we use all this to synthesize obfuscated binary code? Basically, we have an execution path, perhaps obtained from an instruction trace, perhaps from static analysis, whatever you want. We emulate it and thereby derive the inputs and outputs: we say that everything is an input that is read before it is written — in this case, the two memory cells at the top — and everything is an output that is written to last. This register, for instance, is not modified anywhere afterwards, so it is an output, and the same holds here. Then we treat this code, again, as a black box: we generate random inputs, feed them into the black box and observe the outputs, and we do this many times, something like 20 or 30 times. Then we synthesize each output. We learn that this one is the NOR of M0 and M1, the first two memory cells; we do the same for RSI and learn the same kind of operation there; and finally we do it for the last one, where we again learn the NOR. To conclude, we learn the semantics of the NORs and the negations. What we don't learn in this handler are the push operation for the VM flags and the flags that are written back into the memory pointed to by RBP. Why don't we learn that? Because we don't care: we deliberately designed our grammar in such a way that it captures the abstract, higher-level semantics and ignores the bit-level flag semantics — in other terms, we don't consider flags as inputs at all.

We do random sampling over the code — but what if there's a conditional branch in it? If we feed it random inputs, we might trigger different behavior, such that the emulator decides to choose another path. What we do here, again, is ignore the flags and force the execution down the path that we originally observed: if the trace went to A, we force it to go to A; if we want it to go to B, we force it to go to B.
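A sketch of this sampling stage; the emulate() helper and the location names are hypothetical stand-ins for an emulator-backed implementation:

```python
# Hedged sketch of random I/O sampling over one dumped handler.
# emulate(inputs) would set up registers and memory, run the handler
# while forcing the observed path, and return the locations written
# last, e.g. {"rsi": ...}.
import random

MASK = (1 << 64) - 1

def sample_io(emulate, input_locs, n=20):
    samples = []
    for _ in range(n):
        inputs = {loc: random.getrandbits(64) for loc in input_locs}
        outputs = emulate(inputs)        # black-box query
        samples.append((inputs, outputs))
    return samples

# stand-in oracle with NOR semantics, as in the handler above:
def fake_handler(inputs):
    return {"rsi": ~(inputs["m0"] | inputs["m1"]) & MASK}

samples = sample_io(fake_handler, ["m0", "m1"])
```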
So how do we implement something like this? Basically, you have a lot of different options. What you need is some way to execute the code: for instance, emulation, as with the Unicorn engine; or dynamic binary instrumentation, as in Pin or DynamoRIO; or you can translate the code into an intermediate language, symbolically execute it, and evaluate the symbolic expressions while feeding them concrete inputs — though that is normally much, much slower, especially for large expressions, but in general it is possible. What we did is implement everything in our framework called Syntia. Syntia basically synthesizes programs; it is a framework for deobfuscation, written in Python; it performs random sampling of assembly code based on the Unicorn engine, and it implements the Monte Carlo Tree Search-based program synthesis. We published the paper and the tool, and this is the link where you can get it.

So, we'll just do a quick demo. First, something about the synthesizer itself. Basically, we define an oracle — that is, the function we treat as a black box. Say we want to synthesize x + x + y + y. We just start it, and what you can see is that it starts with different nodes, tries out different things, and always assigns a score to them. You also see that, in this case, it somehow learned that there must be a lot of additions, and at some point we get to something like this: basically input one plus input one plus input two plus input two. This isn't the simplest form, but it is simple enough that we can easily simplify it to two times the first input plus two times the second input. We can run this another time and might get there via a completely different path, because it's all probabilistic — perhaps one play-out was better than another, and then we choose another path. It might take from five seconds up to a minute;
it basically depends on the quality of our inputs, on how we choose the paths, and things like that. But in the end we will usually reach our goal, and if not, we just start again. The first run took nearly 20 seconds; this one took 24 seconds, and you see we again have a more or less different version in its non-simplified form. OK.

So how do we use this to synthesize obfuscated code? Let's have a look at a source file we obfuscate. At the top is the plain code: it takes five inputs, performs an addition and a multiplication, and returns the value. Note that the final value will be stored in x. We obfuscated this with Tigress, MBA-based, and we get something like this — a little bit longer. Then we compile that and get an assembly listing of about 760 lines. We don't symbolically execute this here, because that would really take a while; instead, we just learn it. First we do the random input sampling: we give it a code file — this is our instruction trace, which also defines the architecture — tell it to take 20 random samples, and write to an output file. As you see, this takes less than five seconds. If we look at the file, we see the different outputs: this is one output, this is another; we have 18 outputs in total, since you start counting at zero. We are only interested in x, so we will synthesize just that one in a minute. And you see here our memory inputs — the parameters of the function — and also some registers that are treated as inputs.

What we then do is point the synthesizer at this sampling output — it is basically the input file from before — and I wrote index 0 here, because we only want to synthesize the first output now. So we
just start it. It takes the sampling file as input, plus an output file, and again this might take between a few seconds and a minute — it depends. So, we are done and can have a look at the result. The top non-terminal expression is some part that has not been fully derived, plus a concrete variable; and the best terminal program — the best program that cannot be derived any further — is a memory cell times a memory cell plus a memory cell. So this is our final expression: we learned that the whole semantics is the first memory read times the second memory read times the third memory read, plus the fifth memory read. OK.

Coming back to our slides: how can we use this to break VM-based obfuscation? Quickly recall that the goal of VM-based obfuscation was to introduce a new scheme that you cannot easily analyze with the tools you have, so that you are forced to manually reverse engineer the handlers and so on. One simple thing we can do is learn the semantics of the arithmetic handlers — for instance, whether a handler's semantics is an ADD or something like that. We discussed four different hardening techniques, and we will talk about how we can break all of them: first, obfuscating the individual handlers with more complex obfuscation schemes; second, duplicating handlers and transforming them such that they don't look alike; third, having no central VM dispatcher, so there is no dispatch loop and each handler instead inlines the concrete calculation of the next handler; and fourth, having no explicit handler table.

OK, what you see here is a handler of Themida. It looks slightly more complex, and you don't easily see — not even with symbolic execution — what it is doing. But if we ask the synthesizer, it tells us the semantics: it is an addition of the 13th and the 14th VM register, stored into a 64-bit value. OK, that solves the first problem.
The other problem: duplicated VM handlers. Assume we have a handler table where we can see each handler, or an instruction trace where we know where each handler is located. Given that we know where the handlers are, how can we find the duplicated ones? Well, at least for simple semantics, we can learn, based on our grammar, whether a handler is an ADD, an XOR, some shift-left, things like that. And then of course we find the duplicates, because we learn the semantics, and if two handlers have the same semantics, we know it.

OK, coming back to our Themida handler, and to the scheme without a central dispatcher. Have a look at the end of the handler; let's zoom in. This part at the end is the code that calculates the next handler's address. That calculation is a bit too complicated for us — but it doesn't hurt our semantic approach, because we only want the semantics of the handler itself, and since the semantics of the next-address calculation is too complex, we simply don't learn it. And one other thing: have a look at the jmp rax. One advantage — or disadvantage, depending on how you see it — of inlining the dispatch like this is that it gives us a heuristic for finding the ends of handlers: they end with a return or some jmp rax. So if we split the trace at indirect control-flow transfers, we split it into the separate VM handlers.

This also helps us with the last hardening step: without an explicit handler table, we don't know by static analysis where each handler is. What we can do is just execute the code and record the instruction trace. Then we apply the rule from above and split the trace everywhere we observe indirect control flow, and we get something like this, where each color is a different handler. Then we can take these single components and synthesize each on its own.
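A hedged sketch of that splitting rule, using a crude textual check on the disassembly; a real implementation would inspect decoded operand types:

```python
# Split an instruction trace into handler windows at every indirect
# control-flow transfer (jmp/call via register or memory, or ret),
# since inlined-dispatch handlers end with such a transfer.
INDIRECT_PREFIXES = ("jmp r", "jmp q", "call r", "call q", "ret")

def split_trace(trace):
    """trace: list of (address, disassembly-string) pairs."""
    handlers, current = [], []
    for addr, insn in trace:
        current.append((addr, insn))
        if insn.startswith(INDIRECT_PREFIXES):
            handlers.append(current)      # handler boundary found
            current = []
    if current:
        handlers.append(current)
    return handlers

demo = [(0x1000, "mov rsi, [rbp]"), (0x1003, "not rsi"),
        (0x1006, "jmp rax"), (0x2000, "add rdi, rsi"), (0x2003, "jmp rdx")]
print(len(split_trace(demo)))             # -> 2 handlers
```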
In this case, we automatically learned large portions of the instruction trace. In essence we built a semantic disassembler for the new architecture, if you want to see it that way. We also released our code, and furthermore we released all of our samples, so feel free to use them and play around.

To conclude: we talked about obfuscation techniques, mainly opaque predicates, VM-based obfuscation schemes and their hardening, and mixed Boolean-arithmetic — and of course they can all be used together; for our approach it is all mainly the same. Then we discussed symbolic execution: for syntactic deobfuscation it is really great, but it has drawbacks once the code is syntactically too complex. So we asked ourselves: what about not looking at the code level at all, but instead looking at the semantic level, using the code as a black box to obtain I/O samples and learning the semantics from those? That is program synthesis. And finally, as mentioned, you'll find everything in our repository. Thank you very much.

Thank you very much indeed for this creative solution to an existing problem. We now have about 10 minutes for Q&A. You know the drill: there are two microphones in the middle and one to the left; please line up. Also, Signal Angel, do you have questions from the Internet? OK, none for now. Other questions from the audience? One — OK, microphone one in the back, please.

All right, yes. I was wondering: when one takes a look at these obfuscation techniques, wouldn't it be possible for a closed-source operating system to just prevent any debugging of code? Say, it could encrypt the code with some key that only the operating system has access to, and then only shovel it into a protected memory area?

Well, what you might do is execute it inside a virtual machine or something like that. Assuming you can do that, you can take memory snapshots and set
breakpoints at different points of the execution. What you can then do is this: if you know — in combination with reverse engineering — that at this memory area now lies the decrypted, deobfuscated code (and you know it was just decrypted, because it is being used for execution), you can take a memory snapshot, dump the code, and then feed it into an approach like ours.

To add to this, there's also a kind of non-technical argument to make here, similar to the argument on the slides about outsourcing your code: to get hard guarantees for such a scheme, you might want an HSM or some other hardware mechanism to enforce it, which might not always be feasible — imagine a corporate environment where you would need to hand out USB dongles, which might be in violation of some corporate policy. So we've been looking only at things we can do in the white-box scenario, where the attacker has everything he needs to analyze the code, which is the least restrictive scenario.

Hello, I would like to ask: these techniques also seem to be very useful for code optimization in compilers — is research being done in this direction? — Yes, a lot of research. Basically, there is one big project called STOKE, from the Stanford people. They do stochastic superoptimization, and this was also one of the inspirations for our approach.

Do we have a question on the left? OK, go ahead. — If I understood you correctly, you focus primarily on arithmetic functions, on arithmetic expressions. So couldn't you simply use techniques from numerical fitting? — Well, that might be possible; we haven't looked into it. We mainly chose these expressions because we needed something to evaluate our approach on.
So we designed our grammar to learn exactly things like that. But we are free to choose a more complex grammar and learn much deeper expressions; it's just that what we wanted to deobfuscate here was rather easy on the semantic level. The approach itself is much more powerful. — Thank you.

I hear that the Internet is still awake and we have a question from the Signal Angel. — Yes, we do: have you tried synthesizing intermediate representations? — No, we haven't, but it should be possible; there's no reason why you could not do this with our approach. It depends on what you want to synthesize and on your pipeline: if you decide you want to synthesize some IR, you can do it.

Because we're moving so fast, I think we still have time for one more question at microphone two in the back. — Does this work? Yeah. OK, so you've been using kind of fancy words, but in the end I think what you're doing is essentially an optimized search for an expression that looks like it gives the same semantics. And I'm wondering: what makes you believe that this is faster than just randomly trying expressions? Why do you think that child nodes in the same subtree are somehow better if the metric of the parent node is better? — Well, basically, we have a really large search space. If you just have a + b, that is rather simple semantics, but normally we have something like 20 to 50 inputs and outputs, and not only addition or multiplication but a much, much larger grammar: 8-bit input variables, 64-bit input variables, downcasts, zero-extends, bit shifts, arithmetic shifts, XOR, and also modulo and division and things like that. So we have lots of components, and for a plain a + b random guessing is
probably faster, but for anything more complex it is not that fast anymore. So we really have a huge search space, and if you use some kind of guided search, that helps a lot.

Final question, microphone one. — All right. This looks like very interesting work to me, but my question is: what if the similarity function does not respect what the code is doing? What if, for instance, it's not the case that the similarity increases the closer you get, and so on — could you break your system by violating the assumptions of your similarity function? — Well, of course. If you make the code semantically harder, for instance, it won't work at some point anymore. What you can do, for instance, is apply some form of local encoding or encryption of the values, and that might break it, for sure. But one of the observations we made is that this is not really done yet, which was a really interesting fact. And one of the main points is how you choose the boundaries of the window that you analyze, where you start and where you end: in this case we looked at the handler level, but there's no reason not to combine several simple handlers, or to split a handler into two halves, or things like that. So choosing different window boundaries might also be a way to tackle this. — Thanks.

OK. Those who didn't get a chance, I'm sure you can approach the speakers next to the stage once they leave. Again, thank you very much for the interesting talk, and let's conclude with a round of applause.