JavaAssembler - Robo Wiki -= Collecting Robocode Knowledge =-

So I have several Java tools:

Compiler
Decompiler
Disassembler

Notice what's missing? Everything I Google up is dead. And writing my own is currently a little over the top (would be a fun project though...). -- Jonathan

Eh... I'm not sure I understand what this is about. What's a Java assembler? -- PEZ

Just like any other assembler (perhaps you should look that up), but for Java. ;-) -- Jonathan

It's strange; it looks like the existing ones were deliberately removed from the web. If you find one, please share. I wouldn't mind experimenting with Java bytecode... -- ABC

Well, I am an old assembler programmer, so no need to look that up. =) But I thought the best assembly language for Java was Java itself. I never realized there were assemblers for it. And it seems you are having trouble finding one too. What's the difference between the decompiler and the disassembler? Can you show some sample output from the latter? -- PEZ

Your example:

public class Test {
	public static void main(String[] args) {
		System.out.println("You gave me " + args.length + (args.length==1?" argument":" arguments"));
		for(int i=0; i<args.length; i++) {
			printArg(i, args[i]);
		}
	}
	
	static void printArg(int num, String str) {
		System.out.println((num * 2 + 2) / 2 - 1 + ": " + str);
	}
}

Disassembled using javap:

Compiled from "Test.java"
public class Test extends java.lang.Object{
public Test();
  Code:
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return

public static void main(java.lang.String[]);
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   new     #3; //class StringBuffer
   6:   dup
   7:   invokespecial   #4; //Method java/lang/StringBuffer."<init>":()V
   10:  ldc     #5; //String You gave me 
   12:  invokevirtual   #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
   15:  aload_0
   16:  arraylength
   17:  invokevirtual   #7; //Method java/lang/StringBuffer.append:(I)Ljava/lang/StringBuffer;
   20:  aload_0
   21:  arraylength
   22:  iconst_1
   23:  if_icmpne       31
   26:  ldc     #8; //String  argument
   28:  goto    33
   31:  ldc     #9; //String  arguments
   33:  invokevirtual   #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
   36:  invokevirtual   #10; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
   39:  invokevirtual   #11; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   42:  iconst_0
   43:  istore_1
   44:  iload_1
   45:  aload_0
   46:  arraylength
   47:  if_icmpge       63
   50:  iload_1
   51:  aload_0
   52:  iload_1
   53:  aaload
   54:  invokestatic    #12; //Method printArg:(ILjava/lang/String;)V
   57:  iinc    1, 1
   60:  goto    44
   63:  return

static void printArg(int,java.lang.String);
  Code:
   0:   getstatic       #2; //Field java/lang/System.out:Ljava/io/PrintStream;
   3:   new     #3; //class StringBuffer
   6:   dup
   7:   invokespecial   #4; //Method java/lang/StringBuffer."<init>":()V
   10:  iload_0
   11:  iconst_2
   12:  imul
   13:  iconst_2
   14:  iadd
   15:  iconst_2
   16:  idiv
   17:  iconst_1
   18:  isub
   19:  invokevirtual   #7; //Method java/lang/StringBuffer.append:(I)Ljava/lang/StringBuffer;
   22:  ldc     #13; //String : 
   24:  invokevirtual   #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
   27:  aload_1
   28:  invokevirtual   #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
   31:  invokevirtual   #10; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
   34:  invokevirtual   #11; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
   37:  return

}

In many cases javac generates nearly perfect bytecode, but it doesn't know how to use the stack effectively. The best bytecode reference I found is at http://cat.nyu.edu/~meyer/jvmref/ (yes, the assembler that gets mentioned there results in a 403). I don't think I will use a Java assembler a lot, but at least I want to try it. -- Jonathan

http://www.sable.mcgill.ca/software/#soot jasmin sable 1.27 is downloadable there. --Goofy

http://bmrc.berkeley.edu/courseware/cs164/spring98/proj/jasmin/src/jasmin/ is the jasmin source, I think. --Goofy

Perfect bytecode? In what sense? Speed? Code size? Usually jikes produces smaller code than javac. Anyway, it might become something of a nightmare to maintain code that you've hand-shrunk using an assembler. And that assembly doesn't look anywhere near as "user friendly" as assembly for other processors does. Also, something must be missing from it. I can't see how assembling it would produce a program equivalent to the one produced by compiling the Java source file... -- PEZ

Speed, code size, file size, everything you want the way you want it, but only if it can be done better in assembly. For example

static double multiply(double first, double second) {
    return first*second;
}

can't be done more efficiently (apart from the fact that x*y is more efficient than multiply(x, y)).

Java works with a stack, which works a bit like Forth. You can 'push' things on the stack and 'pop' them off. If you want to subtract an int from another, first push the 2 ints on the stack and then use isub (no need for parentheses to get the right order!). The two input values are then popped off, and the result is pushed on the stack. You can duplicate the top item with dup, pop it off, swap the two top items, etc. If you call a method (which is done by first pushing the object on the stack, then calling the method), the top items are used as arguments (the number of items depends on the method). Complex function calls are really easy that way: first push the object, then do everything needed to get the first argument, leave it on the stack, get the second argument, etc., and finally call the method. It looks like this (I put the stack contents between parentheses):

 (stack is empty)
 push object you want to call (object)
 push 100 (object 100)
 duplicate (object 100 100)
 push 50 (object 100 100 50)
 subtract (object 100 50)
 divide (object 2)
 duplicate (object 2 2)
 push some object (object 2 2 otherobject)
 get some field of that object which currently contains 4 (object 2 2 4)
 multiply (object 2 8)
 call method which takes 2 ints, so the 3rd from top element is the object that is called (resultvalue)
 pop (empty)

And then everything you did is gone. ;-) Note that in the Java language you can't reuse the first argument's value in the second (or you need to compute it again). Do you now see how it can be equivalent, and better?
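The walk-through above can be sketched in plain Java. This is only a toy model of an operand stack, not how the JVM is implemented; the names StackWalkthrough, Target, Other, combine and binary are hypothetical, and combine simply adds its two arguments so there is a result to look at.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntBinaryOperator;

// Hypothetical illustration: replays the stack walk-through on a toy operand stack.
class StackWalkthrough {
    // Stand-in for "some object" whose field currently contains 4.
    static class Other { int field = 4; }

    // Stand-in for the object being called; its method takes 2 ints.
    static class Target {
        int combine(int first, int second) { return first + second; }
    }

    // Pops the two top ints, applies op, and pushes the result (like isub, idiv, imul).
    static void binary(Deque<Object> stack, IntBinaryOperator op) {
        int right = (Integer) stack.pop();
        int left = (Integer) stack.pop();
        stack.push(op.applyAsInt(left, right));
    }

    static int run() {
        Deque<Object> stack = new ArrayDeque<>();   // (stack is empty)
        stack.push(new Target());                   // (object)
        stack.push(100);                            // (object 100)
        stack.push(stack.peek());                   // duplicate: (object 100 100)
        stack.push(50);                             // (object 100 100 50)
        binary(stack, (a, b) -> a - b);             // subtract: (object 100 50)
        binary(stack, (a, b) -> a / b);             // divide: (object 2)
        stack.push(stack.peek());                   // duplicate: (object 2 2)
        stack.push(new Other());                    // (object 2 2 otherobject)
        Other other = (Other) stack.pop();
        stack.push(other.field);                    // get field: (object 2 2 4)
        binary(stack, (a, b) -> a * b);             // multiply: (object 2 8)
        int second = (Integer) stack.pop();         // the call pops both arguments...
        int first = (Integer) stack.pop();
        Target target = (Target) stack.pop();       // ...and the object underneath them
        stack.push(target.combine(first, second));  // (resultvalue)
        return (Integer) stack.pop();               // pop: (empty)
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 10 with the stand-in combine above
    }
}
```

In Java source the same call would need either a local variable or the expression written twice, as in target.combine(100 / (100 - 50), 100 / (100 - 50) * other.field), which is the "get it again" case mentioned above.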

About this:

   33:  invokevirtual   #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;

I think the names are stored somewhere at the beginning of the file with an ID (the constant pool), and referenced by that ID to save space when used more than once (and it is of course faster). What you see above is first a label (the bytecode offset), then the instruction name, then that ID, and then a comment by the disassembler saying that method append (#6) is called on a StringBuffer, takes a String and returns a StringBuffer. I'm sure many Java assemblers (if there are many?) take a name, not necessarily the ID.
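To make the "names stored at the beginning of the file with an ID" idea concrete, here is a minimal sketch that lists the text entries and their IDs. It assumes the standard class file layout (magic number, version, then the constant pool); the names ConstantPoolPeek, utf8Entries and sample are hypothetical, and sample() is a hand-made fragment rather than a complete class file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps constant pool IDs to the text (Utf8) entries they hold.
class ConstantPoolPeek {
    static Map<Integer, String> utf8Entries(byte[] classBytes) {
        Map<Integer, String> names = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) throw new IllegalArgumentException("not a class file");
            in.readUnsignedShort();              // minor version
            in.readUnsignedShort();              // major version
            int count = in.readUnsignedShort();  // entries are numbered 1 .. count-1
            for (int id = 1; id < count; id++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: names.put(id, in.readUTF()); break;   // Utf8: length + text
                    case 5: case 6: in.skipBytes(8); id++; break; // long/double use two IDs
                    case 15: in.skipBytes(3); break;              // method handle
                    case 7: case 8: case 16: case 19: case 20:
                        in.skipBytes(2); break;                   // one index into the pool
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4); break;                   // int/float or two indexes
                    default: throw new IllegalArgumentException("unknown tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Hand-made fragment: header plus a two-entry pool, just enough for this sketch.
    static byte[] sample() {
        return new byte[] {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic
            0, 0, 0, 52,                                         // minor, major version
            0, 3,                                                // pool count (IDs #1..#2)
            1, 0, 6, 'a', 'p', 'p', 'e', 'n', 'd',               // #1: Utf8 "append"
            8, 0, 1                                              // #2: String pointing at #1
        };
    }

    public static void main(String[] args) throws IOException {
        // Pass a path such as Test.class to inspect a real file; otherwise use the fragment.
        byte[] bytes = args.length > 0 ? Files.readAllBytes(Paths.get(args[0])) : sample();
        System.out.println(utf8Entries(bytes)); // for the fragment: {1=append}
    }
}
```

Instructions such as invokevirtual #6 only carry the ID; the disassembler follows it through the pool to print the readable comment.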

Btw, don't get me wrong. The language makes large projects much more maintainable. If you can easily follow assembler code (remember that you can indent it, and can use any labels you want), or you want to try it, or are very limited in size/speed (NanoBots!), it is useful. -- Jonathan