4

I used to think Java can be decompiled because it compiles into byte code and not object code. This is wrong because of the implicit assumption that byte code is somehow "more human readable" than object code. Why can programs written in Java be decompiled so easily, even keeping the same identifiers (variable names)? I heard that C/C++ programs can only be disassembled to assembly, not decompiled back to source code. Why is that?

Peter Lawrey
Celeritas
  • 2
    Because C++ rules!!! Seriously now, C++ can also be decompiled. – Luchian Grigore Sep 07 '12 at 07:36
  • [Here](http://stackoverflow.com/q/205059/650012) is a related SO question on C++ decompiling – Anders Gustafsson Sep 07 '12 at 07:38
  • 1
    http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html C rules, but C++ not so much. ;) – Peter Lawrey Sep 07 '12 at 07:38
  • 2
    The question makes absolutely no sense. You cannot "decompile a language". You can decompile a *program* perhaps, or *binary code*. And perhaps you can express the result as an equivalent program in some high-level language. – Kerrek SB Sep 07 '12 at 07:39
  • @PeterLawrey you take that back!!! – Luchian Grigore Sep 07 '12 at 07:39
  • @KerrekSB fine I'll change the title – Celeritas Sep 07 '12 at 07:40
  • 2
    Perhaps you should consider this title instead: Why can programs compiled with certain compilers be decompiled while others (practically) can't? – eq- Sep 07 '12 at 07:43
  • @PeterLawrey It's such a pleasing sight to see that red downward arrow on Java! :D – UltraInstinct Sep 07 '12 at 07:48
  • 1
    The suggested title was also a hint: certain compilers or settings (same language or not) leave more of the original information in whatever output they produce, and scramble the logic less; of course certain language features may also require some of this to happen. However, not really a question fit for SO (IMO). – eq- Sep 07 '12 at 07:50
  • @Thrustmaster There appears to be a shift to Objective-C whose best feature might be that it runs on an iPhone. :P – Peter Lawrey Sep 07 '12 at 07:52

1 Answer

8

Java compilers keep most of the original information and do very little optimisation when producing the byte code. The compiler's task is to validate the code so it can be dynamically optimised. Note: Excelsior JET compiles to native code, and I imagine the result would be difficult to decompile (at least that's what their marketing says ;)
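To make the "keeps most of the original information" point concrete, here is a minimal sketch (not part of the original answer). The class `IdentifierDemo`, its nested `Account` class, and the names `accountBalance` and `applyInterest` are all hypothetical, chosen only for illustration. It uses standard reflection to read field and method names back out of the compiled class, which is the same information a decompiler has to work with.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

final class IdentifierDemo {

    // Hypothetical class, used only to have some identifiers to look for.
    static class Account {
        private long accountBalance;

        long applyInterest(int ratePercent) {
            accountBalance += accountBalance * ratePercent / 100;
            return accountBalance;
        }
    }

    // Collects the declared field names stored in the class file,
    // skipping any compiler-generated (synthetic) members.
    static List<String> fieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic()) {
                names.add(field.getName());
            }
        }
        return names;
    }

    // Same idea for declared methods.
    static List<String> methodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (!method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("fields:  " + fieldNames(Account.class));
        System.out.println("methods: " + methodNames(Account.class));
    }
}
```

Field and method names have to be present in the class file because the JVM links classes by name at run time. Local variable names are a different matter: `javac` only records them when debug information is requested (for example with `-g`), so a decompiler may have to invent names for locals.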

C/C++ is compiled and optimised as much as possible, discarding a lot of the original information (with the exception of debug information). This makes it much more difficult to untangle into sensible C or C++.

Note: these are features of the compilers commonly used for those languages, not features of the languages themselves.

In terms of the difference in languages, all you can say is that Java is relatively feature poor compared with C++. Fewer features mean fewer compiled patterns to understand and reverse engineer.

Peter Lawrey
  • By dynamically optimized you mean the JVM does all the optimization at run time? – Celeritas Sep 07 '12 at 07:40
  • Java and C++ are *languages*. They don't "do" anything, especially not producing code. – Kerrek SB Sep 07 '12 at 07:42
  • 1
    @Celeritas: That is true; the JVM does all the optimization. The reason for this is platform independence; if the compiler optimized, it would be for only one platform. – Random42 Sep 07 '12 at 07:42
  • @KerrekSB: Java is a platform, and Java is a language. Both are named Java. – Dietrich Epp Sep 07 '12 at 07:45
  • The JVM does 99% of the optimisation. The compiler unfortunately does a small number of optimisations such as constant propagation which are more likely to catch you out than be useful (because they are relatively rare) – Peter Lawrey Sep 07 '12 at 07:45
  • 1
    @KerrekSB you already said that and it was a grammatical mistake I said. I think everyone gets the picture. – Celeritas Sep 07 '12 at 07:46
  • "Java" can refer to a language, a development environment, the libraries which come with the JVM. Most of what you need to learn about Java is in its libraries, you couldn't expect to be a Java Developer knowing just the language itself. – Peter Lawrey Sep 07 '12 at 07:47
  • @KerrekSB Your objection is way off base. Java has a formal specification that determines to a large degree what implementations do. – Jim Balter Sep 07 '12 at 07:49
  • C++ by comparison is a much richer language offering many ways of doing the same thing, but comes with a much simpler core library set. For this reason, most C++ developers use additional libraries like Boost to give the sort of functionality Java has in its core libraries. – Peter Lawrey Sep 07 '12 at 07:49
  • @PeterLawrey I have trouble seeing why in Java you can import any package but in C++ you may need to download a library (e.g. boost) before including it. – Celeritas Sep 07 '12 at 07:54
  • 1
    It's the same in Java and C++. There are libraries which are available as standard and those which are additional. Here are some examples of additional libraries for Java http://java-source.net/. The difference is that Java provides more functionality in core libraries so something which is core in Java might be additional in C++. – Peter Lawrey Sep 07 '12 at 08:01
  • I like using maven for builds for Java because it will automatically download and use any additional library&version you need (well almost all of them ;) This saves you having to find, download or compile additional libraries yourself. If you have an IDE it can find these libraries for you. e.g. You use a class which it doesn't know about and it will give you auto-fix suggestions as to which library&version you would need to add to use that class (and adds it) – Peter Lawrey Sep 07 '12 at 08:04
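
Returning to the comment above about the small number of optimisations the Java compiler does perform: the sketch below is an illustration added here, not code from the thread. The class `ConstantFoldingDemo` and its members are hypothetical names. It assumes that the "constant propagation" mentioned corresponds to the folding of compile-time constant expressions described in the Java Language Specification, and it uses reference comparison (`==`) on purpose to make the effect observable.

```java
final class ConstantFoldingDemo {

    // Both operands are literals, so this is a compile-time constant
    // expression; the compiler stores the single string "Hello, world".
    static final String GREETING = "Hello, " + "world";

    // Compares references deliberately: compile-time constant strings
    // are interned, so both sides refer to the same object.
    static boolean foldedAtCompileTime() {
        String literal = "Hello, world";
        return GREETING == literal;
    }

    // 'prefix' is not final, so 'prefix + "world"' is not a constant
    // expression; the concatenation happens at run time and yields a
    // newly created String object.
    static boolean builtAtRunTime() {
        String prefix = "Hello, ";
        String joined = prefix + "world";
        return joined == "Hello, world";
    }

    public static void main(String[] args) {
        System.out.println("folded at compile time: " + foldedAtCompileTime());
        System.out.println("built at run time:      " + builtAtRunTime());
    }
}
```

The first method returns `true` and the second `false`. The same mechanism is a plausible reading of why such constants can "catch you out": a `static final` constant is copied into the classes that use it, so changing the constant without recompiling its users leaves the old value in place.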