Notice what's missing? Everything I Google up is dead. And writing my own is currently a little over the top (would be a fun project though...). -- Jonathan
Eh... I'm not sure I understand what this is about. What's a Java assembler? -- PEZ
Just like any other assembler (perhaps you should look that up), but for Java. ;-) -- Jonathan
It's strange, it looks like the existing ones were deliberately removed from the web. If you find one, please share. I wouldn't mind experimenting with Java bytecode... -- ABC
Well, I am an old assembler programmer, so no need to look that up. =) But I thought the best assembly language for Java was Java itself. I never realized there were assemblers for it. And it seems you have trouble finding one too. What's the difference between the decompiler and the disassembler? Can you show some sample output from the latter? -- PEZ
Your example:
public class Test {
    public static void main(String[] args) {
        System.out.println("You gave me " + args.length + (args.length==1?" argument":" arguments"));
        for(int i=0; i<args.length; i++) {
            printArg(i, args[i]);
        }
    }
    static void printArg(int num, String str) {
        System.out.println((num * 2 + 2) / 2 - 1 + ": " + str);
    }
}
Disassembled using javap:
Compiled from "Test.java"
public class Test extends java.lang.Object{
public Test();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: new #3; //class StringBuffer
6: dup
7: invokespecial #4; //Method java/lang/StringBuffer."<init>":()V
10: ldc #5; //String You gave me
12: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
15: aload_0
16: arraylength
17: invokevirtual #7; //Method java/lang/StringBuffer.append:(I)Ljava/lang/StringBuffer;
20: aload_0
21: arraylength
22: iconst_1
23: if_icmpne 31
26: ldc #8; //String argument
28: goto 33
31: ldc #9; //String arguments
33: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
36: invokevirtual #10; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
39: invokevirtual #11; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
42: iconst_0
43: istore_1
44: iload_1
45: aload_0
46: arraylength
47: if_icmpge 63
50: iload_1
51: aload_0
52: iload_1
53: aaload
54: invokestatic #12; //Method printArg:(ILjava/lang/String;)V
57: iinc 1, 1
60: goto 44
63: return
static void printArg(int,java.lang.String);
Code:
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: new #3; //class StringBuffer
6: dup
7: invokespecial #4; //Method java/lang/StringBuffer."<init>":()V
10: iload_0
11: iconst_2
12: imul
13: iconst_2
14: iadd
15: iconst_2
16: idiv
17: iconst_1
18: isub
19: invokevirtual #7; //Method java/lang/StringBuffer.append:(I)Ljava/lang/StringBuffer;
22: ldc #13; //String :
24: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
27: aload_1
28: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
31: invokevirtual #10; //Method java/lang/StringBuffer.toString:()Ljava/lang/String;
34: invokevirtual #11; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
37: return
}
In many cases javac generates nearly perfect bytecode, but it doesn't know how to use the stack effectively. The best bytecode reference I found is at http://cat.nyu.edu/~meyer/jvmref/ (yes, the assembler that gets mentioned there results in a 403). I don't think I will use a Java assembler a lot, but at least I want to try it. -- Jonathan
http://www.sable.mcgill.ca/software/#soot jasmin sable 1.27 is downloadable there. --Goofy
http://bmrc.berkeley.edu/courseware/cs164/spring98/proj/jasmin/src/jasmin/ is the jasmin source, I think --Goofy
Perfect bytecode? In what sense? Speed? CodeSize? Usually jikes produces smaller code than javac. Anyway, it might become something of a nightmare to maintain code that you've hand-shrunk using an assembler. And that assembly doesn't look anywhere near as "user friendly" as assembly for other processors does. Also, something must be missing from it. I can't see how assembling it would produce a program equivalent to the program produced by compiling the Java source file... -- PEZ
Speed, CodeSize, file size, everything you want the way you want, but only if it can be done better in assembly. For example
double multiply(double first, double second) {
    return first*second;
}
can't be done more efficiently (besides that x*y is more efficient than multiply(x, y)).
Java works with a stack, a bit like Forth. You can 'push' things on the stack and 'pop' them off. If you want to subtract one int from another, first push the 2 ints on the stack and then use isub (no need for parentheses to get the right order!). The two input values are then popped off, and the result is pushed on the stack. You can duplicate the top item with dup, pop it off, swap the two top items, etc. If you call a method (which is done by first pushing the object on the stack, then calling the method), the top items are used as arguments (the number of items depends on the method). Complex function calls are really easy that way: first push the object, then do everything needed to get the first argument, leave it on the stack, get the second argument, etc., and finally call the method. It looks like this (I put the stack contents between parentheses):
(stack is empty)
push object you want to call (object)
push 100 (object 100)
duplicate (object 100 100)
push 50 (object 100 100 50)
subtract (object 100 50)
divide (object 2)
duplicate (object 2 2)
push some object (object 2 2 otherobject)
get some field of that object which currently contains 4 (object 2 2 4)
multiply (object 2 8)
call method which takes 2 ints, so the 3rd from top element is the object that is called (resultvalue)
pop (empty)
And then all you did is gone. ;-) Note that in the Java language you can't reuse the first argument's value in the second (or you need to compute it again). Do you now see how it can be equivalent, and better?
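The walkthrough above can be replayed as a small Java sketch. This is only a toy model of an int operand stack, not how the VM is implemented: the class name StackWalkthrough, the method target (standing in for the "method which takes 2 ints") and the parameter fieldValue are made up for illustration, and the receiver object is left out to keep the stack int-only.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the walkthrough above: an int-only operand stack, not a real JVM.
class StackWalkthrough {
    // Hypothetical stand-in for the "method which takes 2 ints".
    static int target(int first, int second) {
        return first + second;
    }

    // fieldValue plays the role of "some field of that object".
    static int run(int fieldValue) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(100);                  //             (100)
        stack.push(stack.peek());         // duplicate   (100 100)
        stack.push(50);                   //             (100 100 50)
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left - right);         // subtract    (100 50)
        right = stack.pop();
        left = stack.pop();
        stack.push(left / right);         // divide      (2)
        stack.push(stack.peek());         // duplicate   (2 2)
        stack.push(fieldValue);           // field value (2 2 4) when fieldValue is 4
        right = stack.pop();
        left = stack.pop();
        stack.push(left * right);         // multiply    (2 8)
        int second = stack.pop();         // arguments come off in reverse order
        int first = stack.pop();
        return target(first, second);     // the call leaves (resultvalue)
    }

    public static void main(String[] args) {
        System.out.println(run(4)); // prints 10, since target(2, 8) = 2 + 8
    }
}
```

The second duplicate is the step plain Java source cannot express directly: the value 2 is computed once and then used both as the first argument and as an input to the second.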
About this:
33: invokevirtual #6; //Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
I think the names are stored somewhere at the beginning of the file with an ID, and referred to by that ID to save space when used more than once (and it is of course faster). What you see above is first a label, then the instruction name, then that ID, and then a comment by the disassembler saying that method append (#6) of StringBuffer is called, which takes a String and returns a StringBuffer. I'm sure many Java assemblers (if there are many?) take a name, not necessarily the ID.
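That guess matches the class file layout: the table of names is the constant pool, and it sits right after the magic number and version fields at the start of the file. A minimal sketch that reads just that header is below. The class name ClassHeaderPeek and the method readHeader are made up for illustration, and the sketch assumes java/lang/Object.class can be loaded as a system resource, which may not hold in every runtime.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: reads only the fixed header at the start of a class file.
class ClassHeaderPeek {
    // Returns {magic, major_version, constant_pool_count} for a class file
    // found on the class path, e.g. "java/lang/Object.class".
    static int[] readHeader(String resourceName) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (raw == null) {
                throw new IOException("class file not found: " + resourceName);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();               // always 0xCAFEBABE
            in.readUnsignedShort();                 // minor_version, not needed here
            int major = in.readUnsignedShort();     // major_version
            int poolCount = in.readUnsignedShort(); // one more than the number of pool entries
            return new int[] {magic, major, poolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/Object.class");
        System.out.printf("magic=%08X major=%d constant_pool_count=%d%n",
                header[0], header[1], header[2]);
    }
}
```

The pool entries themselves follow immediately after the count. An operand like #6 is simply an index into that table, which is why the same method name costs only two bytes each time it is used again.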
Btw, don't get me wrong. The language makes large projects much more maintainable. If you can easily follow assembler code (remember that you can indent it, and can use any labels you want), or you want to try it, or are very limited in size/speed (NanoBots!), it is useful. -- Jonathan