Topic: March 02, 2008 - Compiler Design (Read 18811 times)

Anthony · « **on:** March 03, 2008, 06:53:48 am »

Hey guys,
Today's blog is more of an interest piece than a progress report. Sorry, I didn't get much time to work today, personal stuff, boring. Oh, I got to see my dad dance, drunk off his ass, with a belly dancer. Long story. Oh, right, programming! So, as I mentioned in a previous post, programming is creating a set of instructions to be executed. Well, when you actually send a program to a compiler, the code is translated into machine code that the system hardware can execute. Now, let's say we are designing a compiler for a language with 3 things: preprocessor directives, functions and variables. Keep in mind that the code I am showing is pseudocode and obviously not intended for real use.

Take this program in our custom language:

Code: [Select]

#include "io.lang"
function test()
{
    var x=5;
    end_program();
}

Please ignore the obvious pointlessness of this program. The compiler would do something along these lines:
1) Find any preprocessor directives and splice the included definitions into the source.
2) Look for any variables at global scope and store their values in memory.
3) Search for all functions and repeat steps 2-3 inside each one; since this is recursive, anything found there is at block scope.
4) Take the gathered information and convert it into working machine code.
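To make steps 1 to 3 a bit more concrete, here is a rough Python sketch run against the sample program above. It is only an illustration, not how a real compiler is written: the `INCLUDES` table, its `io_ready` variable, and both function names are made up for the example, and step 4 (emitting machine code) is left out entirely.

```python
import re

# Hypothetical stand-in for the text of included files (assumption for this sketch).
INCLUDES = {"io.lang": "var io_ready=1;"}

def expand_includes(source):
    """Step 1: replace each #include "name" with the named file's text."""
    def splice(match):
        return INCLUDES[match.group(1)]
    return re.sub(r'#include "([^"]+)"', splice, source)

def collect_variables(source):
    """Steps 2-3: record each `var name=value;` and the scope it was found in."""
    found = []
    depth = 0  # how many { } blocks we are currently inside
    for token in re.finditer(r'[{}]|var\s+(\w+)\s*=\s*(\d+)\s*;', source):
        text = token.group(0)
        if text == "{":
            depth += 1
        elif text == "}":
            depth -= 1
        else:
            scope = "global" if depth == 0 else "block"
            found.append((token.group(1), int(token.group(2)), scope))
    return found

program = '''#include "io.lang"
function test()
{
    var x=5;
    end_program();
}
'''

symbols = collect_variables(expand_includes(program))
print(symbols)  # [('io_ready', 1, 'global'), ('x', 5, 'block')]
```

Tracking brace depth is a shortcut for the recursion described in step 3; a real compiler would build a proper parse tree instead of scanning with a regular expression.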

While it seems easy, talk is cheap. Compilers often involve WEEKS of headache as well as gaining a ton of weight. The steps above don't necessarily apply to all languages, but they should give you an idea of how it works. Now, that stuff mostly applies if someone were to design a language that compiles straight down to machine code. I am interested in something a bit different.

JIT is a term that people usually just go, "... oh yeah, JIT, indeed!" The fact is that it's a fairly simple acronym. It stands for "just-in-time." A lot of great languages use a JIT compiler, but instead of compiling straight to machine code, the source is first compiled into ALMOST machine code (usually called bytecode), which then gets translated the rest of the way when the program runs. Look at this for instance:

Code: [Select]

enum OPCodes
{
    OPCode_Function = 0,
    OPCode_Variable
};

Then what happens is the compiler actually writes a file like this:
OPCode_Byte Identifier_String Code

So a variable named test with a value of 5 would look like this:
0x01 "test" 0x00000005

The string "test" would actually be stored as raw bytes, but it doesn't matter. The point is, the code isn't tied to any one system, but is rather something that can EASILY be translated JUST in time.
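Here is a small Python sketch of writing and reading one record in that OPCode_Byte / Identifier_String / Code layout. The post doesn't say how the string is delimited or what byte order the value uses, so the 1-byte length prefix and the big-endian 32-bit value are assumptions made for this example, as are the function names.

```python
import struct

# Values taken from the OPCodes enum above.
OPCODE_FUNCTION = 0
OPCODE_VARIABLE = 1

def encode_variable(name, value):
    """Pack one variable record: opcode, name length, name bytes, 32-bit value."""
    name_bytes = name.encode("ascii")
    header = struct.pack(">BB", OPCODE_VARIABLE, len(name_bytes))  # length prefix is an assumption
    return header + name_bytes + struct.pack(">I", value)

def decode_variable(record):
    """Unpack a record produced by encode_variable."""
    opcode, name_length = struct.unpack_from(">BB", record, 0)
    name = record[2:2 + name_length].decode("ascii")
    (value,) = struct.unpack_from(">I", record, 2 + name_length)
    return opcode, name, value

record = encode_variable("test", 5)
print(record.hex(" "))          # 01 04 74 65 73 74 00 00 00 05
print(decode_variable(record))  # (1, 'test', 5)
```

The bytes `74 65 73 74` are just the ASCII codes for "test", which is what "in hex or something" amounts to.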

What is the difference between JIT and low-level system compiled languages? Well, you usually get a trade-off. JIT languages are easy to design around, and far more compatible across platforms. The interpreter gets built for each platform, but the code that goes in will always be the same. This gives coders using your language peace of mind that they won't have to do a bunch of crap to make their stuff Linux/Windows/Mac compatible. Take this code for example in a pseudo JIT language:

Code: [Select]

#include "core.h"
print("blah\n");

Now here is what it would look like in a language compiled to machine code:

Code: [Select]

#ifdef WINDOWS
#include "win32core.h"
#elif defined(LINUX)
#include "linuxcore.h"
#else
#include "maccore.h"
#endif

// lol MAC decides to use a different function name for print
#ifdef MAC
print_to_console("blah\n");
#else
printf("blah\n");
#endif

Oh, did I mention the fact that you need to build for each and every platform with machine code based languages? While this all seems to make JIT out to be a hero, it's not ALL better. The fact is that machine code will generally be faster than going through an interpreter first. But yeah, hope you learned something from my rant.
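For a feel of where that interpreter overhead comes from, here is a toy Python loop that reads variable records (opcode byte, name, value) and acts on them. The exact byte layout (1-byte name length, big-endian 32-bit value) and the `run` function are assumptions for this sketch, and only the variable opcode is handled.

```python
import struct

OPCODE_FUNCTION = 0   # values from the OPCodes enum in the post
OPCODE_VARIABLE = 1

def run(bytecode):
    """Walk the records one at a time and build a table of variables."""
    variables = {}
    position = 0
    while position < len(bytecode):
        # Every record has to be decoded before anything useful happens.
        opcode, name_length = struct.unpack_from(">BB", bytecode, position)
        position += 2
        name = bytecode[position:position + name_length].decode("ascii")
        position += name_length
        if opcode == OPCODE_VARIABLE:
            (value,) = struct.unpack_from(">I", bytecode, position)
            position += 4
            variables[name] = value
        else:
            raise ValueError(f"opcode {opcode} is not handled in this sketch")
    return variables

# Two records: test = 5 and x = 7.
bytecode = bytes.fromhex("01047465737400000005") + bytes.fromhex("01017800000007")
print(run(bytecode))  # {'test': 5, 'x': 7}
```

The same `bytecode` bytes would work on any platform this loop runs on, which is the portability win. The decode-and-branch work on every record is the price compared to machine code the CPU executes directly.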

Your coder,
Anthony Iacono

VC · « **Reply #1 on:** March 03, 2008, 07:26:38 am »

When are we going to start peer-review coding on B3, so I can start watching and learning and fixing shotgun? And I was promised a C/EXP 2005 mirror...

Anthony · « **Reply #2 on:** March 03, 2008, 02:33:47 pm »

Sorry about the mirror, but I just keep forgetting. E-Mail me at anthonyiacono@gmail.com with a reminder. And today I'll do a blog about B3 code.

Konrad Beerbaum · « **Reply #3 on:** March 03, 2008, 04:57:27 pm »

Lol, I don't understand anything in that blog.

Mercury · « **Reply #4 on:** March 03, 2008, 10:19:49 pm »

You may be a good programmer, but please, for the love of God and all that is holy, don't ever become a teacher.

I'm a programmer, and I could barely understand anything you were saying.

killermonkey · « **Reply #5 on:** March 03, 2008, 11:39:54 pm »

He is basically ranting about the fact that he hates Linux because it won't compile the B3 code properly because it uses different system functions than Windows. That is what he is trying to say in not so many words.

Anthony · « **Reply #6 on:** March 04, 2008, 12:30:28 am »

Mercury: There are three reasons you may not understand what I was talking about.
1) You don't understand how code actually works when it's at a system level.
2) It was like 1 in the morning my time when I wrote it.
3) I wasn't clear about what I was saying.

Either way, I am sorry if it was an iffy blog, but I am not requiring anyone to read it.

Mercury · « **Reply #7 on:** March 04, 2008, 12:53:24 am »

Well, for a start you could make your examples relate to what you're actually explaining. Your first two pieces of code are not related at all to what you're talking about.

A compiler takes the high-level language and translates those instructions into machine code.

For example, a simple line of code like:

Code: [Select]

 someVariable = someVariable + 5

can actually consist of many steps for the processor:
1. fetching the value of someVariable from the correct location in memory into a processor register
2. loading the constant value 5 into a processor register
3. executing an add instruction using those two registers
4. copying the result into the memory location of someVariable

The compiler will assign a memory address to someVariable, and translate the simple line of code above into machine code which may resemble the 4 steps above, using the memory location it decided on earlier.
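As a rough illustration of those four steps, here is a tiny made-up register machine in Python. The instruction names (`LOAD`, `LOADI`, `ADD`, `STORE`), the register names, and the address `0x10` are all invented for the example and don't correspond to any real processor.

```python
def run(instructions, memory):
    """Execute a list of (operation, operands...) tuples against a memory dict."""
    registers = {}
    for operation, *operands in instructions:
        if operation == "LOAD":       # memory -> register
            register, address = operands
            registers[register] = memory[address]
        elif operation == "LOADI":    # constant -> register
            register, constant = operands
            registers[register] = constant
        elif operation == "ADD":      # destination = left + right
            destination, left, right = operands
            registers[destination] = registers[left] + registers[right]
        elif operation == "STORE":    # register -> memory
            register, address = operands
            memory[address] = registers[register]
        else:
            raise ValueError(f"unknown operation {operation}")
    return memory

SOME_VARIABLE = 0x10  # hypothetical address the compiler picked for someVariable

# someVariable = someVariable + 5
program = [
    ("LOAD", "r0", SOME_VARIABLE),    # step 1
    ("LOADI", "r1", 5),               # step 2
    ("ADD", "r0", "r0", "r1"),        # step 3
    ("STORE", "r0", SOME_VARIABLE),   # step 4
]

memory = {SOME_VARIABLE: 37}
run(program, memory)
print(memory[SOME_VARIABLE])  # 42
```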

It gets more complicated for variables declared within functions - they can have variable memory addresses, and just keeping track of which variable lives where in memory gets very complicated.
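One common way to handle this is to give each function call its own block of memory (a stack frame) and address locals relative to the start of that block. The Python sketch below fakes that idea; the frame size, the starting address of 100, and the function itself are arbitrary choices for the example, not how any particular compiler lays things out.

```python
FRAME_SIZE = 4       # hypothetical: each call reserves 4 memory cells
LOCAL_N_OFFSET = 0   # the local `n` always sits at frame base + 0

seen_addresses = []

def countdown(n, frame_base=100):
    """Record where the local `n` would live for this particular call."""
    address_of_n = frame_base + LOCAL_N_OFFSET
    seen_addresses.append(address_of_n)
    if n > 0:
        # The nested call gets a fresh frame right after this one.
        countdown(n - 1, frame_base + FRAME_SIZE)

countdown(2)
print(seen_addresses)  # [100, 104, 108]
```

The offset of `n` inside the frame is fixed at compile time, but its actual address changes with every call, which is why the same local variable can end up at several places in memory.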

You can take a four-month university course on compilers and still barely scratch the surface of everything they do.

CCsaint10 · « **Reply #8 on:** March 04, 2008, 07:22:51 am »

jargon jargon jargon...hahaha! So lost...but at least I know work is getting done

You're the best Anthony!

GoldenEye: Source Forums