Hacker News | barahilia's comments

Thank you.


As explained in the post, a function call is a full expression for which the appropriate function must be found. If there are two functions with the same name and parameter types, it would be impossible to compile such an expression on its own.


Not necessarily, you'd just need a syntactic mechanism to disambiguate.


And you already have one, since Java either throws away the return value or assigns it to a variable whose type is known.

Edit: here's the edge case though. You can call a function and use the return value directly as a parameter: foo(bar()). It's possible to have two overloads of foo, one for each possible bar return type, at which point the compiler is stuck.

It could require a cast in this instance, however. The more I think about this the more I wonder why this isn't possible.


> here's the edge case though.

That's no more an edge case than the cases where you throw away the return value or you're binding to an ambiguous type e.g. `A getFoo()`, `B getFoo()`, `Object foo = getFoo()`.

> The more I think about this the more I wonder why this isn't possible.

The Java spec does not say; for C++, Stroustrup states it's

> to keep resolution for an individual operator or function call context-independent.

The Java reason is likely also some sort of Principle of Least Surprise claim.
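
For illustration, here is a minimal sketch of one place where same-name, same-parameter methods that differ only in return type show up even in ordinary compiled Java: the bridge method javac generates for a covariant override. The class names `BridgeDemo`, `Base`, and `Derived` are hypothetical and not from the post; the sketch assumes the code is compiled with javac.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BridgeDemo {
    static class Base {
        Object get() { return "base"; }
    }

    static class Derived extends Base {
        // Covariant return type: the source has a single get(),
        // but javac also emits a synthetic bridge "Object get()".
        @Override
        String get() { return "derived"; }
    }

    // Lists the return type of every method named "get" declared in Derived.
    static List<String> returnTypesOfGet() {
        List<String> types = new ArrayList<>();
        for (Method m : Derived.class.getDeclaredMethods()) {
            if (m.getName().equals("get")) {
                String suffix = m.isBridge() ? " (bridge)" : "";
                types.add(m.getReturnType().getSimpleName() + suffix);
            }
        }
        Collections.sort(types); // reflection order is unspecified
        return types;
    }

    public static void main(String[] args) {
        System.out.println(returnTypesOfGet());
    }
}
```

In the class file both methods coexist because the bytecode refers to a method by name plus full descriptor, return type included. Java source has no way to declare the two by hand.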


Indeed. The lower-level language must be more expressive by definition. It is more difficult to write but allows fine-grained control. This is why some optimization and obfuscation tricks are done in assembly (in the native world, not Java), and hence why a disassembler simply cannot translate it back.


There are two valid ways for a disassembler to mitigate this: a) decompile to a language in which the bytecode can be expressed (in a concise/expressive manner; Java would always be a "possible" target because of Turing completeness), or b) accommodate the fact that there could be signature collisions in Java, e.g. by prefixing/suffixing the method name.


If you change the method name, you end up with code that acts differently. Just imagine code that does something like this pseudocode:

if (!new Exception().getStackTrace().getSha1sum().startsWith("0000")) alert("hello decompiler")

Your comment about Java and Turing completeness doesn't make sense, unless you want the decompiler to basically output a Java implementation of a JVM?
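
A runnable sketch of the same idea as the pseudocode above, simplified to a name comparison instead of a hash. The names `NameCheckDemo`, `licenseCheck`, and `looksUntampered` are made up for illustration; the only real API used is `Throwable.getStackTrace()`.

```java
public class NameCheckDemo {
    // Reads the name of the currently executing method from a stack trace.
    static String licenseCheck() {
        StackTraceElement[] trace = new Exception().getStackTrace();
        return trace[0].getMethodName(); // frame 0 is where the Exception was created
    }

    // True only while the method above still carries its original name.
    static boolean looksUntampered() {
        return licenseCheck().equals("licenseCheck");
    }

    public static void main(String[] args) {
        System.out.println(looksUntampered());
    }
}
```

If a decompiler renamed `licenseCheck` to avoid a signature collision, the recompiled program would take the other branch, which is the behavior change described above.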


In basic blocks (no conditionals or loops), a disassembler mostly does a mechanical job of translating opcodes to the appropriate Java code and aggregating expressions. But it doesn't rename functions or classes. They are left as they were in the bytecode.


Yes. Because of overloading there may be 2 functions with the same name and the same return type but different arguments. So you need to include them too.


No. Java itself handles the differing arguments.


Well, one obfuscation technique is to change the argument types. So for instance, you might have foo(String). But after obfuscation, you move the code of foo(String) into foo(Object). So a decompiler ends up showing:

  String s = "123";
  foo(s);

Java will call foo(String) if this is recompiled, which is wrong. The decompiler would need to show foo((Object)s).

On the CLR you can go even further and erase a lot of the type info and just pass objects around. That is, you can just change all your method signatures to foo(object, object, object) no matter which classes (not structs) you pass. Or have even more fun and randomize the classes. So now foo(string) becomes foo(DBConnectionInfo).
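
The Java case above can be sketched as follows, assuming both overloads exist in the obfuscated class. `OverloadDemo` and the string return values are hypothetical and only serve to make the selected overload visible.

```java
public class OverloadDemo {
    static String foo(String s) { return "foo(String)"; }
    static String foo(Object o) { return "foo(Object)"; }

    public static void main(String[] args) {
        String s = "123";
        // The compiler picks the most specific overload for the static type String.
        System.out.println(foo(s));          // prints "foo(String)"
        // The cast changes the static type, so the Object overload is selected.
        System.out.println(foo((Object) s)); // prints "foo(Object)"
    }
}
```

Overload selection happens at compile time from the static argument types, so a decompiler has to emit the cast whenever the bytecode targets the less specific overload.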


For that specific example that would actually be fine in Java. Everything is an Object, so even if you have foo(Object) you can just pass it a String and that is then fine both for the JVM and the Java compiler.

Since the JVM identifies methods by their name + argument types this would most likely break overloading unless done very carefully. I've decompiled quite a few Java applications and haven't seen anything like this either, but perhaps I've just had the luck to not yet run into it.


It wouldn't be fine. It'd call the wrong method. The bytecode invokes the foo(object) one, but a naive decompilation would invoke foo(string). A cast would be required to get it to resolve correctly. Or am I misunderstanding?


I'm talking about decompiled code, which would turn out looking something like http://ideone.com/HCOLU4 which compiles and runs just fine. Maybe I am misunderstanding what you mean?


Very interesting. Technically this should work in Dalvik opcodes too. But I do not know if there are any runtime checks done by the VM. I have never seen such a technique used on Android.

