Java Memory Management

Java Memory Model

What’s in the memory of a running Java application?

Before we look into the details, we can guess that the following components are needed for an application to run:

  1. The code: Java code is not compiled to native code ahead of time. It is compiled into class files (bytecode) and run by the JVM’s execution engine, so the code must be loaded into memory before it can execute.
  2. JVM: the Java Virtual Machine itself.
  3. The stack of the current thread: it stores the current state of execution, including the running method, the local variables, and the call trace.
  4. Data objects: when Java code runs, objects are instantiated to hold the data. They should not live in the stack area, so another area must be reserved for them.

The above reasoning already gets us pretty close to the Java memory model. The JVM divides its memory into the following areas:

  • Method area

This area stores all class-level information, including the class name and its immediate parent class name. There is one method area per JVM, and it is shared by all threads.

  • Heap area

This area is used for all objects that are instantiated. There is also one heap area per JVM. For example, string literals live in the string pool on the heap (since Java 7) so that they can be referenced and shared.

  • Stack area

There is one run-time stack per thread. A new frame is pushed each time a method is invoked and popped when the method returns.

  • PC Register

This area stores the address of the instruction that a thread is currently executing. There is also one PC register per thread.
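As a rough illustration of the areas above, the following sketch annotates where each piece of a small program conceptually lives. The class and method names (MemoryAreasDemo, depth) are made up for this example, and the comments describe the conceptual model rather than the exact layout of any particular JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only to illustrate where things live at run time.
final class MemoryAreasDemo {
    // Class metadata (name, parent class, methods) belongs to the method area.
    // The ArrayList object that SHARED refers to is on the heap.
    static final List<String> SHARED = new ArrayList<>();

    static int depth(int n) {
        // Each call pushes a new frame, holding the parameter n,
        // onto the calling thread's stack.
        return n == 0 ? 0 : 1 + depth(n - 1);
    }

    public static void main(String[] args) throws InterruptedException {
        // The StringBuilder object is on the heap; 'local' is a reference in main's frame.
        StringBuilder local = new StringBuilder("heap object");
        SHARED.add(local.toString());

        // A second thread gets its own stack and PC register but shares the heap.
        Thread worker = new Thread(() -> SHARED.add("depth=" + depth(100)));
        worker.start();
        worker.join(); // join() makes the worker's write visible to main

        System.out.println(SHARED); // [heap object, depth=100]
    }
}
```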

Java GC

How does Java GC work?

A Java program runs in the Java Virtual Machine, which manages memory allocation and reclamation for the program. In C/C++ programming, the program needs to call functions such as malloc() and free() to allocate and release memory. The JVM handles this for the program and introduces the garbage collector to reclaim unused memory.

The most critical question is: how does Java know whether a memory area is ready to be recycled? The GC starts from a set of roots, such as the references held in the stack area (static fields are roots as well), and follows them into the heap. If an object in the heap is reachable from a root, it is still being referenced and must not be recycled; otherwise, its memory can be recycled.

What are some of the algorithms that Java GC uses?

Mark and Sweep

In phase I of this algorithm, the Java GC traverses the heap area to decide which objects are still reachable.

In phase II of this algorithm, the Java GC marks the rest of the area as recyclable, and these areas will be overwritten the next time an allocation request comes in.
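The two phases can be sketched as a toy simulation over an object graph. This is only a model of the idea, not how the JVM implements it; ToyHeap, Node, roots and the other names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Toy model of mark-and-sweep; all names here are made up for illustration.
final class ToyHeap {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    final List<Node> objects = new ArrayList<>(); // everything ever allocated
    final List<Node> roots = new ArrayList<>();   // e.g. references held on thread stacks

    Node allocate(String name) {
        Node n = new Node(name);
        objects.add(n);
        return n;
    }

    // Phase I: mark every object reachable from the roots.
    Set<Node> mark() {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {        // first visit only, so cycles terminate
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    // Phase II: sweep, i.e. reclaim every object that was not marked.
    List<String> sweep(Set<Node> marked) {
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Node> it = objects.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (!marked.contains(n)) {
                reclaimed.add(n.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        ToyHeap heap = new ToyHeap();
        Node a = heap.allocate("a");
        Node b = heap.allocate("b");
        Node c = heap.allocate("c");
        Node d = heap.allocate("d");
        a.refs.add(b);      // a -> b
        c.refs.add(d);      // c <-> d form a cycle that no root reaches
        d.refs.add(c);
        heap.roots.add(a);
        System.out.println(heap.sweep(heap.mark())); // [c, d]
    }
}
```

Note that the unreachable cycle between c and d is still reclaimed, which is one reason tracing collectors use reachability rather than reference counts.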

Does Java GC compact the memory to improve later allocation performance? Yes, it does. The mark-compact algorithm moves the objects that are still referenced to the start of the heap, so that later allocations come from one contiguous chunk of memory and are faster.

The problem with the Mark and Sweep algorithm is that it can become quite inefficient. If the heap grows large or the program becomes complex, traversing the object graph and cleaning it up can take a long time.

Generational Garbage Collection

Generational Garbage Collection is built on top of the Mark and Sweep algorithm. It relies on the observation that most Java objects are short-lived, while objects that survive several collections are likely to stay alive for a long time.

Generational GC divides the Heap into a few areas: the younger generation, the old generation and the Permanent generation.

Objects are first allocated in the young generation, which contains the Eden, S1 and S2 (survivor) areas. When it fills up, a minor GC is triggered to reclaim the space, and objects that survive are gradually promoted to the old generation.

The old generation stores long-lived objects. Only when an object in the young generation reaches a certain age is it moved to the old generation. A major GC is run to collect the old generation.

The permanent generation is used to store the metadata required for the JVM to run, for example, class metadata.

From Java 8 on, the permanent generation is removed and replaced with Metaspace, which is allocated from native memory, and the method area is part of it.
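One way to see these areas on a running JVM is to list its memory pools through the standard java.lang.management API. This is a small sketch; the class name MemoryPoolsDemo is made up, and the pool names that get printed depend on the JVM version and on the collector in use, so no specific output is assumed here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: lists the memory pools the current JVM exposes.
final class MemoryPoolsDemo {
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // HEAP pools hold objects (young/old generation areas);
            // NON_HEAP pools hold things such as class metadata and compiled code.
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(pool.getName() + " (" + kind + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```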

Java Memory Tuning

Java GC types

Java provides different types of GC:

  • The Serial GC

This GC runs in a single thread and performs the different GC tasks serially. It is typically the default only on small, client-class machines.

  • The parallel GC

This GC uses multiple threads and runs the GC tasks in parallel.

  • The Concurrent Mark Sweep Collector

This GC collects the tenured (old) generation, mostly concurrently with the application threads. It was deprecated in Java 9 and removed in Java 14.

How do we tune java memory?

  • -Xmx: set the maximum heap size
  • -Xms: set the starting heap size
  • -XX:MaxPermSize: set the maximum permanent generation size (Java 7 and below)
  • -XX:MetaspaceSize: set the initial Metaspace size threshold (Java 8 and above)
  • -verbose:gc: print to the console when a garbage collection is run
  • -Xmn: set the size of the young generation
  • -XX:+HeapDumpOnOutOfMemoryError: create a heap dump file when an OutOfMemoryError is thrown
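To check what a given set of flags actually did, a program can ask the JVM for its heap limits and collectors. The sketch below uses only Runtime and the java.lang.management beans; the class name HeapSettingsDemo is made up. Running it with, for example, java -Xms256m -Xmx512m HeapSettingsDemo should report a maximum heap close to 512 MiB, though the exact figure can differ slightly by collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class: prints heap limits and the active collectors.
final class HeapSettingsDemo {
    private static final long MIB = 1024L * 1024L;

    // Upper bound the heap may grow to (roughly what -Xmx sets).
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap:       " + maxHeapMiB() + " MiB");
        System.out.println("committed heap: " + rt.totalMemory() / MIB + " MiB");
        System.out.println("free in heap:   " + rt.freeMemory() / MIB + " MiB");

        // Collector names depend on the GC selected for this JVM.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```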

Manager Training: Feedback and Hard Conversations

How do you give and ask for feedback? What’s your strategy for structuring feedback so that it is better received? What if your peer pushes back on your feedback?
One of the most interesting topics in becoming a manager is feedback. Your feedback has to be timely, specific and actionable. At the same time, the strategy for structuring the feedback is equally important.
One such strategy is called SBI, where S stands for Situation, B stands for Behavior, and I stands for Impact. It suggests structuring your feedback into three parts:

  • In the Situation part, you describe the situation in which the behavior was observed.
  • In the Behavior part, you describe the behaviors that you think could be improved.
  • In the Impact part, you describe the impact of these unexpected behaviors.

The SBI structure helps you focus on behavior, on examples and on impact. Of course, you should often expect pushback from the other side. In these situations, try to stand your ground. Remember, it’s not your job to please someone; it’s your job to communicate clearly what behavior is expected.
SBI can also be used in hard conversations, for example, about a performance improvement plan. For a hard conversation, it is suggested that we don’t avoid the real topic by having a hidden agenda. It is better to tell the other party directly that you are about to have a hard conversation with them and what your concern is. Then you share the SBI for these behaviors and ask for their perspective.
It is very common to receive pushback during hard conversations, and people might come back to you for all sorts of reasons. But you need to remember to stand your ground (again) and push it through. Some tips: send out a document beforehand so that it can be digested, and try to use ‘and’ instead of ‘but’ in your conversation. You need to let people know that you totally get their concern, and that the behavior didn’t match the expectation and is a problem for you.

Manager Tools Podcast: Prioritizing Your Relationship With Your Boss

  • No one likes to be managed. So don’t try to manage your boss; prioritize your relationship instead.
  • Three major topics:
  • Communicate with your boss FIRST. Whether it is good news or bad news, your boss doesn’t want to be the last person to know. Communicate with your boss before you communicate with other people in the organization. Hearing news about the organization from someone else makes him/her feel bad, which hurts the relationship.
  • Make changes when you get feedback. When you get feedback, whether positive or negative, strongly consider making a change. If you blow off the feedback every time, people will stop sharing feedback with you.
  • Maintain confidentiality: keep the confidential information your boss shares with you confidential. Don’t risk ruining the trust.
  • Why does how my boss feels matter? He/she controls your performance review and the projects you get.
  • Where does the relationship come from? The little, daily things build up the relationship, not the big things.

MIT PoM 1: Introduction to Microeconomics

  • Economics is all about scarcity and constrained optimization: given certain constraints, how do individuals and firms trade off different alternatives to make the optimal choice? Actually, all engineering is about constrained optimization.
  • This course focuses on two types of actors, consumers and producers, and we are going to build models to explain the behavior of these actors. The models need to be tractable and able to explain the real world.
  • Consumers optimize utility; firms, on the other hand, optimize profits.
  • There are three fundamental questions in microeconomics: What goods and services should be produced? How are these goods and services produced? Who should get these goods and services? Prices determine what gets produced, how it’s produced, and who gets the goods that are produced.
  • The first distinction: theoretical vs. empirical economics. Theoretical economics is the process of building models to explain the world, while empirical economics is the process of testing these models to see how well they explain the world.
  • Another distinction: positive vs. normative economics. Positive economics is about the way things are; normative economics is about the way things should be.
  • Supply + demand: water is essential to life but has a large supply, so its price is low; diamonds are not essential to life but have a much lower supply, and thus a much higher price.
  • Consumer theory: preferences and constraints.

Java Stream

What is stream?

A stream is an abstraction over data operations. It takes its input from collections, arrays, or I/O channels.

From imperative to declarative

For example, given a list of people, find the first 10 people whose age is less than or equal to 18.

The following solution is the imperative approach:

public void imperativeApproach() throws IOException {
        List<Person> people = MockData.getPeople();

        // Collect at most 10 people whose age is <= 18.
        List<Person> peopleUpTo18 = new ArrayList<>();
        for (Person person : people) {
            if (person.getAge() <= 18) {
                peopleUpTo18.add(person);
                if (peopleUpTo18.size() == 10) {
                    break;
                }
            }
        }

        for (Person person : peopleUpTo18) {
            System.out.print(person);
        }
}

The following is the declarative approach:

public void declarativeApproach() throws IOException {
        List<Person> people = MockData.getPeople();
        people.stream()
                // a lambda expression used as the filter predicate
                .filter(person -> person.getAge() <= 18)
                .limit(10)
                .collect(Collectors.toList())
                .forEach(System.out::print);
}

Abstraction

We mentioned that a stream is an abstraction over data manipulation. It abstracts the work into the following steps:

  • Concrete: can be the Sets, Lists, Maps, etc
  • Stream: can be filter, map, etc.
  • Concrete: collect the data to make it concrete again.

Intermediate and Terminal Operations

Java streams have two kinds of operations:

  • Intermediate operations: map, filter, sorted
  • Terminal operations: collect, forEach, reduce

Intermediate operations are lazy: each one returns a new stream, and nothing is executed until a terminal operation is invoked.
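Laziness is easy to observe with peek(), which records the elements that actually flow through the pipeline. The sketch below is an illustration only; LazyStreamDemo and evens are made-up names, and only standard java.util.stream calls are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class showing that intermediate operations are lazy.
final class LazyStreamDemo {
    // Builds the pipeline but does not run it: there is no terminal operation yet.
    static Stream<Integer> evens(List<Integer> input, List<Integer> visited) {
        return input.stream()
                .peek(visited::add)           // records each element that is actually pulled
                .filter(n -> n % 2 == 0);
    }

    public static void main(String[] args) {
        List<Integer> visited = new ArrayList<>();
        Stream<Integer> pipeline = evens(Arrays.asList(1, 2, 3, 4, 5, 6), visited);
        System.out.println("before terminal op: " + visited); // still empty: []

        // collect() is the terminal operation that finally runs the pipeline.
        List<Integer> firstTwo = pipeline.limit(2).collect(Collectors.toList());
        System.out.println("result:  " + firstTwo);  // [2, 4]
        System.out.println("visited: " + visited);   // elements pulled before limit(2) was satisfied
    }
}
```

Because limit(2) is short-circuiting, a sequential stream can stop pulling elements once two matches are found, so the visited list is usually shorter than the input.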

Range

With IntStream.range(), you can create a stream over a fixed range of integers (the end bound is exclusive), for example:

    public void rangeIteratingLists() throws Exception {
        List<Person> people = MockData.getPeople();

        // Use int stream to loop through the list and print the object.
        IntStream.range(0, 10).forEach(i -> System.out.println(people.get(i)));

        // The same result, using limit() to take the first 10 elements
        people.stream().limit(10).forEach(System.out::println);
    }

You can also generate a stream by applying a function repeatedly, a given number of times:

    public void intStreamIterate() throws Exception {
        // iterate() builds each element by applying the function to the previous one,
        // starting from the seed; limit(10) cuts off the otherwise infinite stream.
        IntStream.iterate(0, operand -> operand + 1).limit(10).forEach(System.out::println);
    }

Max, Min and Comparators

Java streams provide built-in min/max functions that support custom comparators. For example:

    public void min() throws Exception {
        final List<Integer> numbers = ImmutableList.of(1, 2, 3, 100, 23, 93, 99);

        int min = numbers.stream().min(Comparator.naturalOrder()).get();

        System.out.println(min);
    }
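The example above uses the natural order. A custom comparator works the same way, as in this sketch that picks the longest and shortest string by length. ComparatorDemo, longest and shortest are made-up names; the methods return the Optional instead of calling get(), so an empty input does not throw.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class for min/max with a custom comparator.
final class ComparatorDemo {
    // max() with a comparator that orders strings by their length.
    static Optional<String> longest(List<String> words) {
        return words.stream().max(Comparator.comparingInt(String::length));
    }

    // min() with the same comparator.
    static Optional<String> shortest(List<String> words) {
        return words.stream().min(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("kiwi", "banana", "fig");
        System.out.println(longest(words).orElse("<empty>"));   // banana
        System.out.println(shortest(words).orElse("<empty>"));  // fig
        // An empty stream yields Optional.empty rather than an exception.
        System.out.println(longest(Collections.<String>emptyList()).isPresent()); // false
    }
}
```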

Distinct

Sometimes we would like to keep only the distinct elements of a stream. In that case we can use the stream’s distinct API:

  public void distinct() throws Exception {
    final List<Integer> numbers = ImmutableList.of(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 9, 9, 9);

    List<Integer> distinctNumbers = numbers.stream()
        .distinct()
        .collect(Collectors.toList());

    // Print the de-duplicated list.
    System.out.println(distinctNumbers);
  }

Filtering and Transformation

The stream filter API lets you keep only the elements that match a condition, for example:

    public void understandingFilter() throws Exception {
        ImmutableList<Car> cars = MockData.getCars();

        // A Predicate is a function that returns true or false for each element
        final Predicate<Car> carPredicate = car -> car.getPrice() < 20000;

        List<Car> carsFiltered = cars.stream()
            .filter(carPredicate)
            .collect(Collectors.toList());
    }

The map API lets you transform each element. For example, we can define another class and transform the given stream into a stream of that type:

    public void ourFirstMapping() throws Exception {
        // transform from one data type to another
        List<Person> people = MockData.getPeople();

        List<PersonDTO> dtos = people.stream()
            .map(p -> new PersonDTO(p.getId(), p.getFirstName(), p.getAge()))
            .collect(Collectors.toList());
    }

Group Data

One common operation in SQL queries is data grouping, for example:

SELECT COUNT(*), TYPE FROM JOB WHERE USER_ID = 123 GROUP BY TYPE

Java streams provide similar functionality:

  public void groupingAndCounting() throws Exception {
    ArrayList<String> names = Lists
        .newArrayList(
            "John",
            "John",
            "Mariam",
            "Alex",
            "Mohammado",
            "Mohammado",
            "Vincent",
            "Alex",
            "Alex"
        );

    Map<String, Long> counting = names.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

    counting.forEach((name, count) -> System.out.println(name + " > " + count));
  }

Reduce and FlatMap

This is very similar to a Hadoop Map/Reduce job, where map takes care of transforming the data, while reduce collects the data and does the final computation. For example:

  public void reduce() throws Exception {
    Integer[] integers = {1, 2, 3, 4, 99, 100, 121, 1302, 199};

    // Compute the sum of the elements, with 0 as the initial value
    int sum = Arrays.stream(integers).reduce(0, (a, b) -> a + b);
    System.out.println(sum);

    // Use a method reference instead of the lambda
    int sum2 = Arrays.stream(integers).reduce(0, Integer::sum);
    System.out.println(sum2);

  }

flatMap differs from map in that it flattens the nested structure: each element is mapped to a stream, and the resulting streams are merged into one.

For example:

List<List<String>> list = Arrays.asList(
  Arrays.asList("a"),
  Arrays.asList("b"));
System.out.println(list);

System.out.println(list
  .stream()
  .flatMap(Collection::stream)
  .collect(Collectors.toList()));

The first statement prints [[a], [b]]. The stream produces a flat List<String> and prints [a, b].

The Consumption Function

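When we decide how much money to spend on consumption, which factors do we take into account? In other words, what determines our consumption, and how does economics quantify this decision process? The consumption function proposed by Keynes says that consumption depends only on current income: C = C0 + aY. Here Y is income, C0 is autonomous consumption (the consumption that does not depend on income), and a is the marginal propensity to consume, that is, the fraction of each additional unit of income that is spent on consumption. Keynes’s consumption formula has several important implications: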

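  • Consumption grows as income grows, but by less than the growth in income: the marginal propensity to consume is greater than 0 and less than 1.
  • The average propensity to consume, C/Y, falls as income grows.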

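This theory of the consumption function was confirmed by studies of how the propensity to consume varies with income in the short run. According to Keynes’s theory, as income rises, a smaller fraction of each additional unit of income is consumed, so consumption falls as a share of total income. This leads to a predicted dilemma: in the end, an ever smaller share of income would be spent on consumption. The problem is that this predicted dilemma never materialized. Studies of the long-run propensity to consume show that it does not change with income; in other words, Keynes’s formula does not hold in the long run.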

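Many economists have studied this problem. Fisher’s research shows that when people decide how much to consume, they do not consider only their current income. They also consider future income, and they smooth their consumption over the long run through saving and borrowing. For example, they borrow to consume earlier when they expect their future income to be stable or growing, or they save to raise later consumption when they expect their income to fall after retirement. In other words, people rationally adjust the ratio of consumption to saving in order to keep long-run consumption as smooth as possible.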

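Fisher’s model introduces the effect of the interest rate. In an intertemporal choice, because of interest, future income is worth less than the same amount of current income. To compare combinations of current and future consumption, Fisher’s model also introduces indifference curves: every consumption combination on the same curve gives the consumer the same level of satisfaction. An increase in income lets the consumer reach a higher indifference curve, which raises both current and future consumption. At the same time, the consumer is subject to an intertemporal budget constraint: current consumption is current income minus saving, and future consumption is paid for by future income plus that saving with interest. Once interest is taken into account, the present value of current and future consumption equals the present value of current and future income.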

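Building on Fisher’s theory, Franco Modigliani proposed the life-cycle hypothesis. In Fisher’s model, consumption depends on a person’s lifetime income. Modigliani further emphasized that income varies systematically over a person’s life, and that people use saving to move income from high-income periods of life to low-income periods. Compared with Keynes’s model, Modigliani’s model adds personal wealth W. The consumer’s total resources consist of initial wealth and lifetime income; spreading them evenly over the remaining years gives the consumption function C = aW + bY, where a is the marginal propensity to consume out of wealth and b is the marginal propensity to consume out of income. The average propensity to consume then becomes C/Y = a(W/Y) + b. So when we look across individuals or at short-run data, wealth is roughly fixed and higher income brings a lower average propensity to consume. In the long run, however, wealth grows, the consumption function shifts upward, and this keeps the average propensity to consume from falling as income rises.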

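Friedman proposed another theory to explain the long-run consumption function. He assumed that current income can be split into two parts: permanent income and transitory income. Permanent income is the average income over a lifetime, and transitory income is the random deviation around that average. For example, a higher level of education brings a higher average income, while luck and similar factors bring different transitory income. Friedman’s conclusion is that the consumption function can be approximated as C = aYp, where a is a constant that measures the fraction of permanent income that is consumed. According to the permanent-income hypothesis, the Keynesian consumption function uses the wrong variable, and the average propensity to consume depends on the ratio of permanent income to current income. When current income temporarily rises above permanent income, the average propensity to consume temporarily falls; in the opposite case it rises.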

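The evolution of the consumption function in economics reminds me of how the laws of physics are revised over and over: an initial model or hypothesis turns out not to apply to a new domain, so a new model is proposed, and some even try to use a single model to explain several different regimes, such as the macroscopic and the microscopic, or the classical and the quantum. Building models in economics is similar: it is a process of continually identifying the factors that really drive the outcome and revising the model in use. While studying economics I have also come to appreciate the importance of mathematics: mathematical models are indispensable for quantitative discussion and research.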

Lambda Expression in Java/Kotlin

 

Higher-order function

In computer science, a higher-order function is a function that does at least one of the following:

  • Takes one or more functions as arguments
  • Returns a function as its result.

All other functions are first-order functions.
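In Java, both cases can be sketched with java.util.function.Function. The names HigherOrderDemo, twice and adder are made up for this illustration.

```java
import java.util.function.Function;

// Hypothetical demo class for higher-order functions in Java.
final class HigherOrderDemo {
    // Takes a function as an argument AND returns a new function.
    static <T> Function<T, T> twice(Function<T, T> f) {
        return x -> f.apply(f.apply(x));
    }

    // Returns a function as its result.
    static Function<Integer, Integer> adder(int n) {
        return x -> x + n;
    }

    public static void main(String[] args) {
        Function<Integer, Integer> addTen = twice(adder(5)); // applies "+5" two times
        System.out.println(addTen.apply(1)); // 11
    }
}
```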

Anonymous class

In Java, an anonymous class lets you declare and instantiate a class at the same time. If you only need a local class once, an anonymous class is a good fit. For example, to define a Runnable that executes a task:

Executors.newSingleThreadExecutor().execute(new Runnable() {
    @Override
    public void run() {
        // Your task execution.
    }
});

As you can see in the above example, Runnable is an interface with a single method, run, and the anonymous class implements that interface.

Lambda Expression

Besides anonymous classes, Java also supports anonymous functions, called lambda expressions.

While an anonymous class lets you declare a new class in a single statement, it is still not concise enough when the class has only one method.

For the example in the above section, we can simplify its implementation with a lambda expression:

Executors.newSingleThreadExecutor().execute(() -> {
    // Your task execution.
});

A lambda expression provides a few capabilities:

  • It lets you treat functionality as a method argument, or code as data.
  • It is a function that can be created without belonging to any class.
  • It can be passed around as if it were an object and executed on demand.
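As a small sketch of "code as data", lambdas can be stored in a map and executed on demand later. LambdaAsDataDemo and operations are made-up names; IntBinaryOperator is the standard functional interface from java.util.function.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntBinaryOperator;

// Hypothetical demo class: lambdas stored as values and invoked later.
final class LambdaAsDataDemo {
    static Map<String, IntBinaryOperator> operations() {
        // LinkedHashMap keeps insertion order, so the output order is predictable.
        Map<String, IntBinaryOperator> ops = new LinkedHashMap<>();
        ops.put("add", (a, b) -> a + b);  // lambda expression
        ops.put("mul", (a, b) -> a * b);
        ops.put("max", Math::max);        // method reference, same idea
        return ops;
    }

    public static void main(String[] args) {
        // Nothing is computed until applyAsInt is called on each stored lambda.
        operations().forEach((name, op) ->
                System.out.println(name + " -> " + op.applyAsInt(3, 4)));
        // add -> 7
        // mul -> 12
        // max -> 4
    }
}
```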

Meanwhile, functions are first-class in Kotlin. They can be stored in variables and data structures, passed as arguments to higher-order functions, and returned from them.

A Kotlin lambda expression follows this syntax:

  • It is always surrounded by curly braces.
  • Parameter declarations in the full syntactic form go inside the curly braces and have optional type annotations.
  • The body goes after an -> sign.
  • If the inferred return type of the lambda is not Unit, the last expression inside the body is treated as the return value.

As you can tell, a lambda expression can’t specify its return type. If you wish to define the return type explicitly, you can use an alternative: an anonymous function.

fun(x: Int, y: Int): Int = x + y

A major difference between Kotlin and Java is that Kotlin has dedicated types for functions, whereas Java represents lambdas through functional interfaces. For example:

val initFunction: (Int) -> Int

The above declaration means that initFunction has a function type: it takes an integer and returns an integer.

A function of this type can be written as a lambda:

val a = { i: Int -> i + 1 }