pouët.net

CR-ASM by The Cronies

                                                                       ____   
      ,----..  ,-.----.                ,---,       .--.--.           ,'  , `. 
     /   /   \ \    /  \              '  .' \     /  /    '.      ,-+-,.' _ | 
    |   :     :;   :    \     ,---,. /  ;    '.  |  :  /`. /   ,-+-. ;   , || 
    .   |  ;. /|   | .\ :   ,'  .' |:  :       \ ;  |  |--`   ,--.'|'   |  ;| 
    .   ; /--` .   : |: | ,---.'   ,:  |   /\   \|  :  ;_    |   |  ,', |  ': 
    ;   | ;    |   |  \ : |   |    ||  :  ' ;.   :\  \    `. |   | /  | |  || 
    |   : |    |   : .  / :   :  .' |  |  ;/  \   \`----.   \'   | :  | :  |, 
    .   | '___ ;   | |  \ :   |.'   '  :  | \  \ ,'__ \  \  |;   . |  ; |--'  
    '   ; : .'||   | ;\  \`---'     |  |  '  '--' /  /`--'  /|   : |  | ,     
    '   | '/  ::   ' | \.'          |  :  :      '--'.     / |   : '  |/      
    |   :    / :   : :-'            |  | ,'        `--'---'  ;   | |`-'       
     \   \ .'  |   |.'              `--''                    |   ;/           
      `---`    `---'                                         '---'            

                                                Prod:     CR-ASM
                                                Size:     512b
                                                Type:     Demotool?
                                                Platform: MS-DOS (386+)
                                                Group:    The Cronies
                                                Date:     July 6, 2015
                                                Contact:  [email protected]

-~=<|$| Introduction |$|>=~-----------------------------------------------------

    This is a self-hosting assembler (i.e. it can assemble it's own source code)
in 512 bytes! It assembles a Turing-complete subset of the x86 instruction set.
It accepts the source file via STDIN and writes the machine code to STDOUT. Feel
free to give it a go:

    cr-asm.com <cr-asm.asm >cr-asm2.com

Notice that cr-asm2.com is identical to cr-asm.com. You can do

    cr-asm2.com <cr-asm.asm >cr-asm3.com

ad infinitum (if you're insane and you get your jollies from that sort of thing)

-~=<|$| Caveats |$|>=~----------------------------------------------------------

    There are some caveats to the CR-ASM assembly language. Obviously, this was
never intended to be seriously used, but is rather a proof of concept. I'll
enumerate the short-comings up-front for the sake of fairness:
    
    1. As mentioned earlier, it only implements a small subset of the x86
       instruction set.

    2. It is CaSe-SeNsItIvE. All alphabetic characters must be capitalized.
    
    3. It does not tolerate any extraneous white-space (no blank lines, etc.)
    
    4. There is no support for comments or labels :( This makes it about as fun
       to code in as MS-DOS DEBUG.
    
    5. There is no support for decimal values. All constants are assumed to be
       hexadecimal.
       
    6. The source file must end with two blank lines (this signals to the parser
       to exit). I know. It's super lame.
       
    7. If there is a syntax error in the source code, CR-ASM typically hangs.
       Again, super lame, but what do you want out of a 512 byte assembler?
       
    8. No support for UNIX style lines. <CR><LF> must immediately follow each
       instruction (no white-space after the instruction).
    
    9. Only registers AX, CX, DX, BX, SP, BP, SI, and DI are supported. No
       support for segment registers.
       
    10. Many of the mnemonics that CR-ASM uses are not the official mnemonics.
        This was to make parsing the instructions easier.    
    
    11. Other snags that I'm not thinking of at the moment.
       
-~=<|$| Supported instructions |$|>=~-------------------------------------------

    Each instruction must be entered EXACTLY as described here. No extra
white-space ANYWHERE! In particular, watch out for white space after the end of
the instruction.

MOV XX, YYYY

    Move the hexadecimal value YYYY into the register XX. YYYY must be exactly
    four characters long. Use leading zeros where needed. This is the only form
    of the MOV instruction that is supported. If you need to move values between
    registers, use pushes and pops.
    
INT YY

    Call interrupt YY where YY is a hexadecimal value that is exactly two
    characters long.
    
PSH XX

    Push the register XX onto the stack. Notice the mnemonic deviates from the
    official "PUSH".
    
POP XX

    Pop the stack and store in register XX.
    
INC XX

    Increment register XX.
    
DEC XX

    Decrement register XX.
    
CMPD

    Double-word compare DS:SI to ES:DI and increment SI and DI. Notice the
    mnemonic deviates from the official "CMPSD".
    
CMPW

    Word compare DS:SI to ES:DI and increment SI and DI. Notice the mnemonic
    deviates from the official "CMPSW".
    
LODW

    Read a word from DS:SI into AX. Notice the mnemonic deviates from the
    official "LODSW".
    
LODB

    Read a byte from DS:SI into AL. Notice the mnemonic deviates from the
    official "LODSB".
    
STOW

    Write a word from AX to ES:DI. Notice the mnemonic deviates from the
    official "STOSW".
    
STOB

    Write a byte from AL to ES:DI. Notice the mnemonic deviates from the
    official "STOSB".
    
JNE YY

    If the zero flag is not set, then add YY to the instruction pointer where
    YY is a hexadecimal value that is exactly two characters long.
    
JMP YYYY

    Add YYYY to the instruction pointer where YYYY is a hexadecimal value that
    is exactly four characters long.
    
CAL YYYY

    Push the current instruction pointer and add YYYY to the instruction
    pointer where YYYY is a hexadecimal value that is exactly four characters
    long. Notice the mnemonic deviates from the official "CALL".
    
RETN

    Pop the stack and copy the value to the instruction pointer (near return).
    Notice the mnemonic deviates from the official "RET".
    
-~=<|$| Pseudo-instructions |$|>=~----------------------------------------------

    In addition to the aforementioned instructions, CR-ASM supports two commands
which allow the user to write data directly to the output. Notice these commands
will write the characters which follow precisely (that includes white-space,
new-lines, etc.).

DSW XX

    Write the characters XX to the output file.
    
DSD XXXX

    Write the characters XXXX to the output file.
    
-~=<|$| Bye-bye! |$|>=~---------------------------------------------------------

    Well, that about wraps it up. Thanks for taking the time to check out my
little labour of love. I'd love to see someone pull this off in 256b. If you do,
make sure to drop me a line. Greets go out to sensenstahl, jmph, frag,
g0blinish, Rrrola, YOLP, and all size-coders.

                                                               orbitaldecay 2015