Author: asklxf

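As we all know, the Java compiler compiles .java files into .class files. A class file stores Java bytecode and is independent of the .java source: if you are willing to write a compiler, source code written in another language can be compiled into .class files as well. This article dissects the internal structure of a class file in detail, then reads that structure and prints it out.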

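The class file format is defined by the JVM specification and consists of the following parts, in this order:
1. magic number: must be 0xCAFEBABE; used to quickly check whether a file is a class file.
2. version: a minor and a major version number (minor is stored first). If the version is higher than the JVM supports, the JVM refuses to load the class.
3. constant pool: holds all constants used by the class.
4. access flags: define the access permissions and kind of the class (public, final, interface, abstract and so on).
5. this class and super class: constant pool indexes that indicate where to find this class and its super class.
6. interfaces: lists all interfaces.
7. fields: lists all fields.
8. methods: lists all methods.
9. attributes: lists all attributes.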
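
To make the start of this layout concrete, here is a minimal sketch that decodes only the first ten bytes (magic, minor version, major version, constant pool count) from a hand-built byte array. The class name HeaderSketch and the method describeHeader are hypothetical, and the sketch uses java.nio.ByteBuffer (big-endian by default) rather than a stream purely to keep it short.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only.
class HeaderSketch {

    // Decodes magic, minor_version, major_version and constant_pool_count
    // from the first 10 bytes of a class file image.
    static String describeHeader(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        int magic = buf.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buf.getShort() & 0xFFFF; // minor is stored before major
        int major = buf.getShort() & 0xFFFF;
        int constantPoolCount = buf.getShort() & 0xFFFF;
        return "version " + major + "." + minor
                + ", constant_pool_count " + constantPoolCount;
    }

    public static void main(String[] args) {
        // Hand-built header: magic, minor 0, major 48, constant_pool_count 22.
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x30, 0x00, 0x16
        };
        System.out.println(describeHeader(header));
        // prints "version 48.0, constant_pool_count 22"
    }
}
```

The masks with 0xFFFF matter because every two-byte item in a class file is unsigned, while Java's short is signed.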

First, write TestClass.java:

package example.test;

public final class TestClass {
    public int id = 1234567;
    public void test() {}
}

Then compile it and place the result at C:\example\test\TestClass.class.

We use Java to read and analyze the class file. ClassAnalyzer reads TestClass.class, analyzes its structure, and prints it:

package classfile.format;

import java.io.*;

public class ClassAnalyzer {

    public static void main(String[] args) {
        // try-with-resources closes the stream even if analyze() throws:
        try (DataInputStream input = new DataInputStream(new BufferedInputStream(
                new FileInputStream("C:\\example\\test\\TestClass.class")))) {
            analyze(input);
        } catch (Exception e) {
            System.out.println("Analyze failed: " + e);
        }
    }

    public static void analyze(DataInputStream input) throws IOException {
        // read magic number:
        int magic = input.readInt();
        if (magic == 0xCAFEBABE) {
            System.out.println("magic number = 0xCAFEBABE");
        } else {
            throw new RuntimeException("Invalid magic number!");
        }
        // read minor version and major version (both unsigned 16-bit):
        int minor_ver = input.readUnsignedShort();
        int major_ver = input.readUnsignedShort();
        System.out.println("Version = " + major_ver + "." + minor_ver);
        // read constant pool (the count is unsigned 16-bit):
        int const_pool_count = input.readUnsignedShort();
        System.out.println("constant pool size = " + const_pool_count);
        // read each constant (valid indexes are 1 .. count - 1):
        for (int i = 1; i < const_pool_count; i++) {
            int flag = input.readUnsignedByte();
            analyzeConstant(input, i, flag);
            // Long (5) and Double (6) each occupy two constant pool slots:
            if (flag == 5 || flag == 6) {
                i++;
            }
        }
    }

    public static void analyzeConstant(
            DataInputStream input, int index, int flag) throws IOException {
        // for read:
        byte n8;
        short n16;
        int n32;
        long n64;
        float f;
        double d;
        byte[] buffer;
        System.out.println("\nconst index = " + index + ", flag = " +
                           (int) flag);
        switch (flag) {
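        case 1: // utf-8 string
            System.out.println(" const type = Utf8");
            // the length is an unsigned 16-bit byte count:
            int length = input.readUnsignedShort();
            System.out.println("     length = " + length);
            buffer = new byte[length];
            input.readFully(buffer);
            // class files use modified UTF-8; decoding as UTF-8 gives the same
            // result for typical names (not for U+0000 or supplementary characters):
            System.out.println("      value = " + new String(buffer, "UTF-8"));
            break;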
        case 3: // integer
            System.out.println(" const type = Integer");
            n32 = input.readInt();
            System.out.println("      value = " + n32);
            break;
        case 4: // float
            System.out.println(" const type = Float");
            f = input.readFloat();
            System.out.println("      value = " + f);
            break;
        case 5: // long
            System.out.println(" const type = Long");
            n64 = input.readLong();
            System.out.println("      value = " + n64);
            break;
        case 6: // double
            System.out.println(" const type = Double");
            d = input.readDouble();
            System.out.println("      value = " + d);
            break;
        case 7: // class or interface reference
            System.out.println(" const type = Class");
            n16 = input.readShort();
            System.out.println("      index = " + n16 +
                               " (where to find the class name)");
            break;
        case 8: // string
            System.out.println(" const type = String");
            n16 = input.readShort();
            System.out.println("      index = " + n16);
            break;
        case 9: // field reference
            System.out.println(" const type = Fieldref");
            n16 = input.readShort();
            System.out.println("class index = " + n16 +
                               " (where to find the class)");
            n16 = input.readShort();
            System.out.println("nameAndType = " + n16 +
                               " (where to find the NameAndType)");
            break;
        case 10: // method reference
            System.out.println(" const type = Methodref");
            n16 = input.readShort();
            System.out.println("class index = " + n16 +
                               " (where to find the class)");
            n16 = input.readShort();
            System.out.println("nameAndType = " + n16 +
                               " (where to find the NameAndType)");
            break;
        case 11: // interface method reference
            System.out.println(" const type = InterfaceMethodref");
            n16 = input.readShort();
            System.out.println("class index = " + n16 +
                               " (where to find the interface)");
            n16 = input.readShort();
            System.out.println("nameAndType = " + n16 +
                               " (where to find the NameAndType)");
            break;
        case 12: // name and type reference
            System.out.println(" const type = NameAndType");
            n16 = input.readShort();
            System.out.println(" name index = " + n16 +
                               " (where to find the name)");
            n16 = input.readShort();
            System.out.println(" descripter = " + n16 +
                               " (where to find the descriptor)");
            break;
        default:
            throw new RuntimeException("Invalid constant pool flag: " + flag);
        }
    }
}
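
One detail of constant pool indexing is easy to miss: according to the JVM specification, a Long or Double constant occupies two consecutive constant pool indexes, so the index after it is skipped. The sketch below walks a small hand-built pool to show the effect. PoolIndexSketch and listEntries are hypothetical names, the byte array is made up for illustration, and only the tags handled in this article are supported.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class PoolIndexSketch {

    // Returns "index=tag" for each entry of a constant pool image that
    // starts at constant_pool_count.
    static List<String> listEntries(byte[] pool) {
        ByteBuffer buf = ByteBuffer.wrap(pool); // big-endian by default
        int count = buf.getShort() & 0xFFFF;
        List<String> entries = new ArrayList<>();
        for (int index = 1; index < count; index++) {
            int tag = buf.get() & 0xFF;
            entries.add(index + "=" + tag);
            switch (tag) {
            case 1: // Utf8: u2 length, then that many bytes
                int length = buf.getShort() & 0xFFFF;
                buf.position(buf.position() + length);
                break;
            case 3: case 4:                    // Integer, Float: 4 bytes
            case 9: case 10: case 11: case 12: // two u2 indexes
                buf.position(buf.position() + 4);
                break;
            case 5: case 6: // Long, Double: 8 bytes and two index slots
                buf.position(buf.position() + 8);
                index++;
                break;
            case 7: case 8: // Class, String: one u2 index
                buf.position(buf.position() + 2);
                break;
            default:
                throw new IllegalArgumentException("unsupported tag " + tag);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        byte[] pool = {
            0x00, 0x05,                                 // constant_pool_count = 5
            0x03, 0x00, 0x12, (byte) 0xD6, (byte) 0x87, // #1 Integer 1234567
            0x05, 0, 0, 0, 0, 0, 0, 0, 0x2A,            // #2 Long 42 (also uses #3)
            0x01, 0x00, 0x02, 'i', 'd'                  // #4 Utf8 "id"
        };
        System.out.println(listEntries(pool)); // prints [1=3, 2=5, 4=1]
    }
}
```

Index 3 never appears in the result because the Long at index 2 consumes it. TestClass has no long or double constants, so its pool has no such gaps.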

The output is:

magic number = 0xCAFEBABE
Version = 48.0
constant pool size = 22

const index = 1, flag = 1
 const type = Utf8
     length = 22
      value = example/test/TestClass

const index = 2, flag = 7
 const type = Class
      index = 1 (where to find the class name)

const index = 3, flag = 1
 const type = Utf8
     length = 16
      value = java/lang/Object

const index = 4, flag = 7
 const type = Class
      index = 3 (where to find the class name)

const index = 5, flag = 1
 const type = Utf8
     length = 2
      value = id

const index = 6, flag = 1
 const type = Utf8
     length = 1
      value = I

const index = 7, flag = 1
 const type = Utf8
     length = 6
      value = <init>

const index = 8, flag = 1
 const type = Utf8
     length = 3
      value = ()V

const index = 9, flag = 1
 const type = Utf8
     length = 4
      value = Code

const index = 10, flag = 12
 const type = NameAndType
 name index = 7 (where to find the name)
 descripter = 8 (where to find the descriptor)

const index = 11, flag = 10
 const type = Methodref
class index = 4 (where to find the class)
nameAndType = 10 (where to find the NameAndType)

const index = 12, flag = 3
 const type = Integer
      value = 1234567

const index = 13, flag = 12
 const type = NameAndType
 name index = 5 (where to find the name)
 descripter = 6 (where to find the descriptor)

const index = 14, flag = 9
 const type = Fieldref
class index = 2 (where to find the class)
nameAndType = 13 (where to find the NameAndType)

const index = 15, flag = 1
 const type = Utf8
     length = 15
      value = LineNumberTable

const index = 16, flag = 1
 const type = Utf8
     length = 18
      value = LocalVariableTable

const index = 17, flag = 1
 const type = Utf8
     length = 4
      value = this

const index = 18, flag = 1
 const type = Utf8
     length = 24
      value = Lexample/test/TestClass;

const index = 19, flag = 1
 const type = Utf8
     length = 4
      value = test

const index = 20, flag = 1
 const type = Utf8
     length = 10
      value = SourceFile

const index = 21, flag = 1
 const type = Utf8
     length = 14
      value = TestClass.java
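
The "Version = 48.0" line at the top of this output is the class file version, not a JDK version. As a rough sketch, the major number can be mapped to the Java release that introduced it. The name VersionSketch is hypothetical, and the mapping relies on the convention that major 45 belongs to JDK 1.0.2/1.1 and each later release increments it by one.

```java
// Hypothetical helper for illustration only.
class VersionSketch {

    // Maps a class file major version to the Java release that introduced it.
    static String javaRelease(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("unknown major version " + major);
        }
        switch (major) {
        case 45: return "1.1"; // also used by JDK 1.0.2
        case 46: return "1.2";
        case 47: return "1.3";
        case 48: return "1.4";
        default: return String.valueOf(major - 44); // 49 -> 5, 52 -> 8, ...
        }
    }

    public static void main(String[] args) {
        System.out.println(javaRelease(48)); // prints 1.4
        System.out.println(javaRelease(52)); // prints 8
    }
}
```

So the TestClass.class analyzed here was compiled for JDK 1.4.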

For now we only read as far as the constant pool; the rest will follow next time :)

Reference: Inside the Java Virtual Machine

-------------------------------------------

Author: asklxf

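Today we continue analyzing the structure of the class file.

Last time we read the constant pool. Immediately after it come the access flags of the class or interface. The access flags defined for a class or interface (as of the class file version 48 used here) are: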

private static final short ACC_PUBLIC = 0x0001;
private static final short ACC_FINAL = 0x0010;
private static final short ACC_SUPER = 0x0020;
private static final short ACC_INTERFACE = 0x0200;
private static final short ACC_ABSTRACT = 0x0400;

// read access flags:
short access_flags = input.readShort();
System.out.print("Access flags:");
if ((access_flags & ACC_PUBLIC) == ACC_PUBLIC) {
    System.out.print(" public");
}
if ((access_flags & ACC_FINAL) == ACC_FINAL) {
    System.out.print(" final");
}
if ((access_flags & ACC_SUPER) == ACC_SUPER) {
    System.out.print(" super");
}
if ((access_flags & ACC_INTERFACE) == ACC_INTERFACE) {
    System.out.print(" interface");
}
if ((access_flags & ACC_ABSTRACT) == ACC_ABSTRACT) {
    System.out.print(" abstract");
}
System.out.println();

Next come this class and super class:

// read this class and super class:
short this_class_index = input.readShort();
short super_class_index = input.readShort();
System.out.println("This class = " + this_class_index);
System.out.println("Super class = " + super_class_index);

The this class index can be used to look up the class's entry in the constant pool, and the same goes for the super class. Every class has a super class; only java.lang.Object has a super class index of 0.
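
The lookup takes two steps: the index points at a Class entry, and that entry holds the index of a Utf8 entry containing the name. Here is a minimal sketch of that resolution, using the four relevant entries from the constant pool output printed in the first part. ClassNameSketch, its two maps, and resolveClassName are hypothetical; a real reader would fill such tables while parsing the pool.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
class ClassNameSketch {

    // In-memory stand-ins for the parsed pool, holding only entries 1 to 4.
    static final Map<Integer, String> UTF8 = new HashMap<>();
    static final Map<Integer, Integer> CLASS_NAME_INDEX = new HashMap<>();

    static {
        UTF8.put(1, "example/test/TestClass");
        CLASS_NAME_INDEX.put(2, 1); // Class entry #2 names Utf8 #1
        UTF8.put(3, "java/lang/Object");
        CLASS_NAME_INDEX.put(4, 3); // Class entry #4 names Utf8 #3
    }

    // Follows a Class entry to its Utf8 name; index 0 means "no super class".
    static String resolveClassName(int classIndex) {
        if (classIndex == 0) {
            return "(none)";
        }
        Integer nameIndex = CLASS_NAME_INDEX.get(classIndex);
        if (nameIndex == null) {
            throw new IllegalArgumentException(
                    "index " + classIndex + " is not a Class entry");
        }
        return UTF8.get(nameIndex);
    }

    public static void main(String[] args) {
        System.out.println("This class = " + resolveClassName(2));
        System.out.println("Super class = " + resolveClassName(4));
    }
}
```

With this class = 2 and super class = 4, the sketch resolves the names example/test/TestClass and java/lang/Object.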

Next come the number of interfaces this class implements and the constant pool index of each interface:

// read interfaces count:
short interfaces_count = input.readShort();
System.out.println("Interfaces count = " + interfaces_count);
// read each interface:
for (int i = 1; i <= interfaces_count; i++) {
    short interface_index = input.readShort();
    System.out.println("No. " + i + " interface index = " + interface_index);
}

The result is:

    Access flags: public final super
    This class = 2
    Super class = 4
    Interfaces count = 0
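
As a worked example of the flag arithmetic, "public final super" corresponds to 0x0001 | 0x0010 | 0x0020 = 0x0031. The sketch below decodes such a value back into names. AccessFlagsSketch and describe are hypothetical names, and only the five flags listed in this article are covered.

```java
// Hypothetical helper for illustration only.
class AccessFlagsSketch {

    private static final int[] MASKS = {0x0001, 0x0010, 0x0020, 0x0200, 0x0400};
    private static final String[] NAMES =
            {"public", "final", "super", "interface", "abstract"};

    // Returns the names of all flags set in accessFlags, separated by spaces.
    static String describe(int accessFlags) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                if (sb.length() > 0) {
                    sb.append(' ');
                }
                sb.append(NAMES[i]);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x0031)); // prints "public final super"
    }
}
```

Each flag is a single bit, so testing `(accessFlags & mask) != 0` is equivalent to the `== mask` comparison used in the listing above.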