Java Memory Model
What’s in the memory of a running Java application?
Before we look into the details, we can guess that the following components are needed for an application to run:
- The code: Java code is not compiled to native code ahead of time. It is compiled into class files (bytecode) that the JVM's execution engine runs, so the classes must be loaded into memory before they can execute.
- JVM: the Java Virtual Machine itself.
- The stack of the current thread: it stores the current state of execution, including the running method, its local variables, and the chain of calls that led to it.
- Data objects: as Java code runs, objects are instantiated to hold the data. They cannot live in the stack area, because they may outlive the method call that created them, so another area must be reserved for them.
This reasoning already gets us close to how the JVM actually lays out memory. The JVM organizes its memory into the following areas:
- Method area
This area stores all class-level information, such as the class name, the name of its immediate parent class, and its method and field information. There is one method area per JVM, and it is shared by all threads.
- Heap area
This area holds all objects that are instantiated. There is also one heap area per JVM. For example, string literals are interned in a string pool that lives on the heap (since Java 7), so they can be referenced and shared.
- Stack area
Each thread has its own stack. A new frame is pushed onto it each time a method is invoked and popped when the method returns.
- PC Register
This area stores the address of the instruction a thread is currently executing. There is also one per thread.
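The areas above can be observed from ordinary code. The following is a minimal sketch; the class `MemoryAreasDemo` and its method names are made up for illustration. It shows that each method call adds a stack frame and that identical string literals share one object on the heap.

```java
final class MemoryAreasDemo {
    // Each call pushes a new frame onto the calling thread's stack.
    // The frame holds the parameter n until the call returns.
    static int depth(int n) {
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    // Identical string literals are interned, so both variables
    // refer to the same object on the heap.
    static boolean literalsShareOneObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) always allocates a separate heap object,
    // while intern() returns the shared pooled instance.
    static boolean newStringIsDistinct() {
        String a = "hello";
        String b = new String("hello");
        return a != b && a.equals(b) && a == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(depth(100));               // prints 100
        System.out.println(literalsShareOneObject()); // prints true
        System.out.println(newStringIsDistinct());    // prints true
    }
}
```

The local variables `a` and `b` are only references stored in the stack frame; the string objects they point to live on the heap.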
How does Java GC work?
A Java program runs in the Java Virtual Machine, which manages memory allocation and reclamation for it. In C/C++, the program must call malloc() and free() itself to allocate and release memory. The JVM does this on the program's behalf, and it introduces a garbage collector to reclaim memory that is no longer used.
The most critical question is how Java knows whether a memory area is ready to be recycled. The GC starts from a set of roots, such as the local variables on each thread's stack and static fields, and follows references from there. If an object in the heap is reachable from a root, it is still being referred to and must not be recycled. Otherwise, its memory can be reclaimed.
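Reachability can be probed with a weak reference, which does not keep its target alive. This is only a sketch: the class `ReachabilityDemo` is hypothetical, and `System.gc()` is just a request that the JVM is free to ignore, so the second method may return either value.

```java
import java.lang.ref.WeakReference;

final class ReachabilityDemo {
    // While a local variable (a stack root) still refers to the object,
    // the collector must keep it, so the weak reference is not cleared.
    static boolean keptWhileReachable() {
        Object data = new Object();
        WeakReference<Object> weak = new WeakReference<>(data);
        System.gc(); // a hint only
        return weak.get() == data;
    }

    // Once the last strong reference is dropped, the object becomes
    // eligible for collection. Whether it has actually been collected
    // by the time we check is up to the JVM.
    static boolean clearedAfterUnreachable() {
        Object data = new Object();
        WeakReference<Object> weak = new WeakReference<>(data);
        data = null;
        System.gc(); // a hint only
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("kept while reachable: " + keptWhileReachable());
        System.out.println("cleared after unreachable: " + clearedAfterUnreachable());
    }
}
```

Only the first result is guaranteed: an object that is still reachable through a strong reference is never collected.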
What are some of the algorithms that the Java GC uses?
Mark and Sweep
In phase I (mark), the GC traverses the object graph from the roots and marks every object that is still reachable.
In phase II (sweep), the GC treats everything left unmarked as recyclable, and that memory can be overwritten the next time an allocation request arrives.
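The two phases can be sketched on a toy object graph. This is an illustration of the idea, not how the JVM implements it: `ToyMarkSweep`, `Node`, and the list standing in for the heap are all invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

final class ToyMarkSweep {
    // A stand-in for a heap object: a name, outgoing references, a mark bit.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    // Phase I: starting from the roots, mark everything reachable.
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue; // already visited, also guards against cycles
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    // Phase II: remove every unmarked node from the toy heap and
    // clear the marks of the survivors for the next cycle.
    static List<String> sweep(List<Node> heap) {
        List<String> freed = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext();) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;
            } else {
                freed.add(n.name);
                it.remove();
            }
        }
        return freed;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b); // a -> b
        c.refs.add(d); // c -> d and d -> c form a cycle that no root points to
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a)); // pretend only a is referenced from a stack
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, d]
    }
}
```

Note that `c` and `d` are freed even though they reference each other. Reachability from the roots, not a reference count, decides what survives.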
Does the Java GC compact memory to speed up later allocations? Yes, it does. The mark-compact algorithm moves the objects that are still referenced toward the start of the heap, so that free memory forms one contiguous chunk and new allocations are faster.
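The effect of compaction can be sketched with an array standing in for the heap, where `null` means a free slot. The class `ToyCompaction` is hypothetical, and a real collector must also update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

final class ToyCompaction {
    // Slides live entries (non-null) to the front of the array, keeping
    // their order, and returns the index of the first free slot.
    static int compact(String[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String live = heap[i];
                heap[i] = null;
                heap[free++] = live;
            }
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"a", null, "b", null, null, "c"};
        int next = compact(heap);
        System.out.println(Arrays.toString(heap)); // prints [a, b, c, null, null, null]
        System.out.println(next);                  // prints 3
    }
}
```

After compaction, allocating is just a matter of advancing the returned index, because all free space sits in one block behind it.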
The problem with the Mark and Sweep algorithm is that it can become quite inefficient. If the heap grows large or the program becomes complex, traversing the object graph and cleaning up can take a long time.
Generational Garbage Collection
Generational Garbage Collection is added on top of the Mark and Sweep algorithm. It relies on the observation that most Java objects are short-lived, while the objects that do survive for a while tend to live for a long time.
Generational GC divides the heap into a few areas: the young generation, the old generation, and the permanent generation.
Objects are first allocated in the young generation, which contains the Eden space and two survivor spaces, S1 and S2. When it fills up, a minor GC is triggered to reclaim memory, and objects that keep surviving are gradually pushed to the old generation.
The old generation stores long-lived objects. An object is promoted to the old generation only after it has reached a certain age in the young generation, that is, after it has survived a number of minor GCs. A major GC collects the old generation.
The permanent generation stores the metadata the JVM needs to run, for example, class metadata loaded from class files.
From Java 8 on, the permanent generation was removed and replaced with Metaspace, which lives in native memory rather than in the heap. The class metadata of the method area is now stored there.
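The generations can be inspected at runtime through the standard `java.lang.management` API. The sketch below, with the made-up class name `MemoryPools`, lists the memory pools the running JVM reports. The pool names depend on the JVM and the collector in use, so no specific output is shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class MemoryPools {
    // Returns the names of all memory pools of the given type
    // (HEAP or NON_HEAP) that the running JVM exposes.
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a HotSpot JVM from Java 8 on, the heap list typically contains Eden, survivor, and old generation pools, and Metaspace appears among the non-heap pools.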
Java Memory Tuning
Java GC types
Java provides different types of GC:
- The Serial GC
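This collector runs in a single thread and performs its GC tasks serially. It is the default only on small (client-class) machines. On server-class machines, the default is the Parallel GC up to Java 8 and G1 from Java 9 on.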
- The parallel GC
This collector uses multiple threads and performs its GC work in parallel.
- The Concurrent Mark Sweep Collector
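This collector handles the tenured (old) generation and does most of its work concurrently with the application threads to keep pauses short. It was deprecated in Java 9 and removed in Java 14.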
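To check which collector a JVM is actually using, the standard `GarbageCollectorMXBean` interface can be queried. This is a minimal sketch; the class name `ActiveCollectors` is an assumption for illustration, and the reported names vary by JVM version and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // Returns the names of the collectors the running JVM reports,
    // usually one for the young and one for the old generation.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated time (ms) since JVM start.
            System.out.println(gc.getName()
                    + ": runs=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Starting the JVM with `-XX:+UseSerialGC`, `-XX:+UseParallelGC`, or `-XX:+UseG1GC` selects a collector explicitly, and the names printed here change accordingly.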
How do we tune java memory?
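- -Xmx: set the maximum heap size
- -Xms: set the initial heap size
- -XX:MaxPermSize: set the maximum size of the permanent generation (Java 7 and below)
- -XX:MetaspaceSize: set the Metaspace size at which a GC is first triggered (Java 8 and above); -XX:MaxMetaspaceSize caps its growth
- -verbose:gc: print to the console whenever a garbage collection runs
- -Xmn: set the size of the young generation
- -XX:+HeapDumpOnOutOfMemoryError: create a heap dump file when an OutOfMemoryError is thrown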
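To confirm that the flags above took effect, the running program can report its own heap limits and startup arguments. The sketch below uses only standard APIs; the class name `HeapSettings` is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class HeapSettings {
    // Upper bound the heap may grow to, governed by -Xmx.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently reserved by the JVM; starts near -Xms.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // JVM options passed on the command line, such as -Xmx1g.
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):       " + maxHeapBytes() / mb);
        System.out.println("committed heap (MB): " + committedHeapBytes() / mb);
        System.out.println("JVM flags:           " + jvmFlags());
    }
}
```

For example, running it as `java -Xms256m -Xmx1g -verbose:gc HeapSettings` should report a maximum of roughly 1024 MB. The exact figure can be slightly lower, depending on the collector.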