wiki:rev1

Differences

This shows you the differences between two versions of the page.

wiki:rev1 [2020/03/10 19:37]
abigpickle created
wiki:rev1 [2020/03/10 19:41] (current)
abigpickle
Line 7: Line 7:
 <p>The text that turns into your program is called the <b>source code</b> of your program and is of high value to someone reverse engineering it. It is the highest-level representation of what your program does, and if you're a responsible programmer, it even contains comments kindly explaining your thought process. If the reverse engineer has that, then that's pretty much half of their job done already. In most cases, however, this is not reality. The source code of a program is rarely released with the published program itself, but you will learn that through some clever techniques, we can often reconstruct a pretty damn good approximation. But in most of <i>those</i> cases, it will be a rough approximation: gibberish variable names, weird-looking program flow, and zero comments. This is why it is absolutely <b>essential</b> that you know fairly low-level programming concepts like the back of your hand. The code you get back will be a pain to understand, and if you already have a hard time understanding documented source code, this will be nearly impossible. But I digress. Back to what a compiler does.</p>
 <p>At its most basic level, a compiler takes in your source code, does some optimizations because it knows more about speed and less about how shiny and cool code looks than you do, and then spits out some form of low-level instructions for the executor to read. If you're working with a <b>native</b> language such as C, C++, or Rust, then the instructions the compiler spits out are machine instructions. These are the commands that your CPU will directly read in and execute: the fastest, lowest-level form of instructions for your system. When your CPU reads them, they are an array of bytes unreadable to humans, so we usually represent them through what's called assembly language. This is nothing but a text representation of the machine instructions, and in many cases will be the best thing you'll get when reverse engineering something. The problem is that compilers are good at what they do. They mangle your precious source into something as fast as you allow it to be.</p>
-<p>Disregarding languages such as Python and JavaScript where you basically run the source code, there are also languages in the middle of the language tree, like C# and Java. These languages are "compiled", but not to machine instructions. Instead, they are compiled to what's called an <b>immediate representation</b> or <b>immediate language</b> (IL). These instructions are like machine code, but instead of being directly run by the CPU there is what's called a <b>runtime</b> that reads first reads in your program and before it can be run spits out some machine code tailored for that system. There are arguments to be made for both sides, native versus not, but for us reverse engineers, the important distinction between the two is how close IL is to the source.</p>
+<p>Disregarding languages such as Python and JavaScript, where you basically run the source code, there are also languages in the middle of the language tree, like C# and Java. These languages are "compiled", but not to machine instructions. Instead, they are compiled to what's called an <b>intermediate representation</b> or <b>intermediate language</b> (IL). These instructions are like machine code, but instead of being run directly by the CPU, they go through what's called a <b>runtime</b>, which first reads in your program and, before it can be run, spits out some machine code tailored for that system. There are arguments to be made for both sides, native versus not, but for us reverse engineers, the important distinction between the two is how close IL is to the source.</p>
 <center><img src="https://i.imgur.com/IUUByXF.png"><p>A hello world program in C.</p>
 <img src="https://i.imgur.com/CRWb526.png"><p>Compiled result to x86 assembly.</p>
Line 13: Line 13:
 <img src="https://i.imgur.com/2wge1gG.png"><p>Compiled result to CIL (Common Intermediate Language).</p></center>
 <h2>Pulling Source From Nothing</h2>
 +(WIP)
 </html> </html>
wiki/rev1.txt · Last modified: 2020/03/10 19:41 by abigpickle