Tuesday, November 1, 2011

Garbage collection in Java

few important points about garbage collection in java:

1) objects are created on heap in Java  irrespective of there scope e.g. local or member variable. while its worth noting that class variables or static members are created in method area of Java memory space and both heap and method area is shared between different thread.
2) Garbage collection is a mechanism provided by Java Virtual Machine to reclaim heap space from objects which are eligible for Garbage collection.
3) Garbage collection relieves java programmer from memory management which is essential part of C++ programming and gives more time to focus on business logic.
4) Garbage Collection in Java is carried by a daemon thread called Garbage Collector.
5) Before removing an object from memory Garbage collection thread invokes finalize () method of that object and gives an opportunity to perform any sort of cleanup required.
6) You as Java programmer can not force Garbage collection in Java; it will only trigger if JVM thinks it needs a garbage collection based on Java heap size.
7) There are methods like System.gc () and Runtime.gc () which is used to send request of Garbage collection to JVM but it’s not guaranteed that garbage collection will happen.
8) If there is no memory space for creating new object in Heap Java Virtual Machine throws OutOfMemoryError or java.lang.OutOfMemoryError heap space
9) J2SE 5(Java 2 Standard Edition) adds a new feature called Ergonomics goal of ergonomics is to provide good performance from the JVM with minimum of command line tuning.



When an Object becomes Eligible for Garbage Collection

An Object becomes eligible for Garbage collection or GC if its not reachable from any live threads or any static refrences in other words you can say that an object becomes eligible for garbage collection if its all references are null. Cyclic dependencies are not counted as reference so if Object A has reference of object B and object B has reference of Object A and they don't have any other live reference then both Objects A and B will be eligible for Garbage collection.
Generally an object becomes eligible for garbage collection in Java on following cases:
1) All references of that object explicitly set to null e.g. object = null
2) Object is created inside a block and reference goes out scope once control exit that block.
3) Parent object set to null, if an object holds reference of another object and when you set container object's reference null, child or contained object automatically becomes eligible for garbage collection.
4) If an object has only live references via WeakHashMap it will be eligible for garbage collection. To learn more about HashMap see here How HashMap works in Java.

Heap Generations for Garbage Collection in Java

Java objects are created in Heap and Heap is divided into three parts or generations for sake of garbage collection in Java, these are called as Young generation, Tenured or Old Generation and Perm Area of heap.
New Generation is further divided into three parts known as Eden space, Survivor 1 and Survivor 2 space. When an object first created in heap its gets created in new generation inside Eden space and after subsequent Minor Garbage collection if object survives its gets moved to survivor 1 and then Survivor 2 before Major Garbage collection moved that object to Old or tenured generation.

Permanent generation of Heap or Perm Area of Heap is somewhat special and it is used to store Meta data related to classes and method in JVM, it also hosts String pool provided by JVM as discussed in my string tutorial why String is immutable in Java. There are many opinions around whether garbage collection in Java happens in perm area of java heap or not, as per my knowledge this is something which is JVM dependent and happens at least in Sun's implementation of JVM. You can also try this by just creating millions of String and watching for Garbage collection or OutOfMemoryError.

Types of Garbage Collector in Java

Java Runtime (J2SE 5) provides various types of Garbage collection in Java which you can choose based upon your application's performance requirement. Java 5 adds three additional garbage collectors except serial garbage collector. Each is generational garbage collector which has been implemented to increase throughput of the application or to reduce garbage collection pause times.

1) Throughput Garbage Collector: This garbage collector in Java uses a parallel version of the young generation collector. It is used if the -XX:+UseParallelGC option is passed to the JVM via command line options . The tenured generation collector is same as the serial collector.

2) Concurrent low pause Collector: This Collector is used if the -Xingc or -XX:+UseConcMarkSweepGC is passed on the command line. This is also referred as Concurrent Mark Sweep Garbage collector. The concurrent collector is used to collect the tenured generation and does most of the collection concurrently with the execution of the application. The application is paused for short periods during the collection. A parallel version of the young generation copying collector is sued with the concurrent collector. Concurrent Mark Sweep Garbage collector is most widely used garbage collector in java and it uses algorithm to first mark object which needs to collected when garbage collection triggers.

3) The Incremental (Sometimes called train) low pause collector: This collector is used only if -XX:+UseTrainGC is passed on the command line. This garbage collector has not changed since the java 1.4.2 and is currently not under active development. It will not be supported in future releases so avoid using this and please see 1.4.2 GC Tuning document for information on this collector.
Important point to not is that -XX:+UseParallelGC should not be used with -XX:+UseConcMarkSweepGC. The argument passing in the J2SE platform starting with version 1.4.2 should only allow legal combination of command line options for garbage collector but earlier releases may not find or detect all illegal combination and the results for illegal combination are unpredictable. It’s not recommended to use this garbage collector in java.

JVM Parameters for garbage collection in Java

Garbage collection tuning is a long exercise and requires lot of profiling of application and patience to get it right. While working with High volume low latency Electronic trading system I have worked with some of the project where we need to increase the performance of Java application by profiling and finding what causing full GC and I found that Garbage collection tuning largely depends on application profile, what kind of object application has and what are there average lifetime etc. for example if an application has too many short lived object then making Eden space wide enough or larger will reduces number of minor collections. you can also control size of both young and Tenured generation using JVM parameters for example setting -XX:NewRatio=3 means that the ratio among the young and tenured generation is 1:3 , you got to be careful on sizing these generation. As making young generation larger will reduce size of tenured generation which will force Major collection to occur more frequently which pauses application thread during that duration results in degraded or reduced throughput. The parameters NewSize and MaxNewSize are used to specify the young generation size from below and above. Setting these equal to one another fixes the young generation. In my opinion before doing garbage collection tuning detailed understanding of garbage collection in java is must and I would recommend reading Garbage collection document provided by Sun Microsystems for detail knowledge of garbage collection in Java. Also to get a full list of JVM parameters for a particular Java Virtual machine please refer official documents on garbage collection in Java. I found this link quite helpful though http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html

Full GC and Concurrent Garbage Collection in Java

Concurrent garbage collector in java uses a single garbage collector thread that runs concurrently with the application threads with the goal of completing the collection of the tenured generation before it becomes full. In normal operation, the concurrent garbage collector is able to do most of its work with the application threads still running, so only brief pauses are seen by the application threads. As a fall back, if the concurrent garbage collector is unable to finish before the tenured generation fill up, the application is paused and the collection is completed with all the application threads stopped. Such Collections with the application stopped are referred as full garbage collections or full GC and are a sign that some adjustments need to be made to the concurrent collection parameters. Always try to avoid or minimize full garbage collection or Full GC because it affects performance of Java application. When you work in finance domain for electronic trading platform and with high volume low latency systems performance of java application becomes extremely critical an you definitely like to avoid full GC during trading period.

Summary on Garbage collection in Java

1) Java Heap is divided into three generation for sake of garbage collection. These are young generation, tenured or old generation and Perm area.
2) New objects are created into young generation and subsequently moved to old generation.
3) String pool is created in Perm area of Heap, garbage collection can occur in perm space but depends upon JVM to JVM.
4) Minor garbage collection is used to move object from Eden space to Survivor 1 and Survivor 2 space and Major collection is used to move object from young to tenured generation.
5) Whenever Major garbage collection occurs application threads stops during that period which will reduce application’s performance and throughput.
6) There are few performance improvement has been applied in garbage collection in java 6 and we usually use JRE 1.6.20 for running our application.
7) JVM command line options –Xmx and -Xms is used to setup starting and max size for Java Heap. Ideal ratio of this parameter is either 1:1 or 1:1.5 based upon my experience for example you can have either both –Xmx and –Xms as 1GB or –Xms 1.2 GB and 1.8 GB.
8) There is no manual way of doing garbage collection in Java.

Polymorphism

What is polymorphism in Java

Polymorphism is an Oops concept which advice use of common interface instead of concrete implementation while writing code. When we program for interface our code is capable of handling any new requirement or enhancement arise in near future due to new implementation of our common interface. If we don't use common interface and rely on concrete implementation, we always need to change and duplicate most of our code to support new implementation. Its not only java but other object oriented language like C++ also supports polymorphism and it comes as fundamental along with encapsulation , abstraction and inheritance.

How Polymorphism supported in Java

Java has excellent support of polymorphism in terms of Inheritance, method overloading and method overriding. Method overriding allows java to invoke method based on a particular object at run-time instead of declared type while coding. To get hold of concept let's see an example of polymorphism in java:

class TradingSystem{
public String getDescription(){
return "electronic trading system";
}
}

class DirectMarketAccessSystem extends TradingSystem{
public String getDescription(){
return "direct market access system";
}
}

class CommodityTradingSystem extends TradingSystem{
public String getDescription(){
return "Futures trading system";
}
}

Here we have a super class called TradingSystem and there two implementation DirectMarketAccessSystem and CommodityTradingSystem and here we will write code which is flexible enough to work with any future implementation of TradingSystem we can achieve this by using polymorphism in java which we will see in further example.

Where to use Polymorphism in code

Probably this is the most important part of this java polymorphism tutorial and It’s good to know where you can use polymorphism in java while writing code. Its common practice to always replace concrete implementation with interface it’s not that easy and s comes with practice but here are some common places where I check for polymorphism:


1) Method argument:
Always use super type in method argument that will give you leverage to pass any implementation while invoking method. For example:

public void showDescription(TradingSystem tradingSystem){
tradingSystem.description();
}
If you have used concrete implementation e.g. CommodityTradingSystem or DMATradingSystem then that code will require frequent changes whenever you add new Trading system.



2) Variable names:
Always use Super type while you are storing reference returned from any Factory object, this gives you Flexibility to accommodate any new implementation from Factory. Here is an example of polymorphism while writing java code which you can use retrieving reference from Factory:

String systemName = Configuration.getSystemName();
TradingSystem system = TradingSystemFactory.getSystem(systemName);


Method overloading and method overriding in Java

Method is overloading and method overriding uses concept of polymorphism in java where method name remains same in two classes but actual method called by JVM depends upon object. Java supports both overloading and overriding of methods. In case of overloading method signature changes while in case of overriding method signature remains same and binding and invocation of method is decided on runtime based on actual object. This facility allows java programmer to write very flexibly and maintainable code using interfaces without worrying about concrete implementation. One disadvantage of using polymorphism in code is that while reading code you don't know the actual type which annoys while you are looking to find bugs or trying to debug program. But if you do java debugging in IDE you will definitely be able to see the actual object and the method call and variable associated with it.


Parameteric Polymorphism in Java

Java started to support parametric polymorphism with introduction of generic in JDK1.5. Collection classes in JDK 1.5 are written using Generic Type which allows Collections to hold any type of object in run time without any change in code and this has been achieved by passing actual Type as parameter. For example see the below code of a parametric cache written using generic which shows use of parametric polymorphism in Java.

interface cache{
public void put(K key, V value);
public V get(K key);
}

Method Overloading In Java

In Java it is possible to define two or more methods within the same class that share the same name, as long as their parameter declarations are different. When this is the case, the methods are said to be overloaded, and the process is referred to as method overloading. Method overloading is one of the ways that Java implements polymorphism.
If you have never used a language that allows the overloading of methods, then the concept may seem strange at first. But as you will see, method overloading is one of Java's most exciting and useful features. When an overloaded method is invoked, Java uses the type and/or number of arguments as its guide to determine which version of the overloaded method to actually call. Thus,
overloaded methods must differ in the type and/or number of their parameters. While overloaded methods may have different return types, the return type alone is insufficient to distinguish two versions of a method. When Java encounters a call to an overloaded method, it simply executes the version of the method whose parameters match the arguments used in the call. Here is a simple example that illustrates method overloading:
// Demonstrate method overloading.
class OverloadDemo {
void test() {
System.out.println("No parameters");
}
// Overload test for one integer parameter.
void test(int a) {
System.out.println("a: " + a);
}
// Overload test for two integer parameters.
void test(int a, int b) {
System.out.println("a and b: " + a + " " + b);
}
// overload test for a double parameter
double test(double a) {
System.out.println("double a: " + a);
return a*a;
}
}
class Overload {
public static void main(String args[]) {
OverloadDemo ob = new OverloadDemo();
double result;
// call all versions of test()
ob.test();
ob.test(10);
ob.test(10, 20);
result = ob.test(123.2);
System.out.println("Result of ob.test(123.2): " + result);
}
}
This program generates the following output:
No parameters
a: 10
a and b: 10 20
double a: 123.2
Result of ob.test(123.2): 15178.24
As you can see, test( ) is overloaded four times. The first version takes no parameters, the second takes one integer parameter, the third takes two integer parameters, and the fourth takes one double parameter. The fact that the fourth version of test( ) also returns a value is of no consequence relative to overloading, since return types do not play a role in overload resolution.
When an overloaded method is called, Java looks for a match between the arguments used to call the method and the method's parameters. However, this match need not always be exact. In some cases Java's automatic type conversions can play a role in overload resolution. For example, consider the following program:
// Automatic type conversions apply to overloading.
class OverloadDemo {
void test() {
System.out.println("No parameters");
}
// Overload test for two integer parameters.
void test(int a, int b) {
System.out.println("a and b: " + a + " " + b);
}
// overload test for a double parameter
void test(double a) {
System.out.println("Inside test(double) a: " + a);
}
}
class Overload {
public static void main(String args[]) {
OverloadDemo ob = new OverloadDemo();
int i = 88;
ob.test();
ob.test(10, 20);
ob.test(i); // this will invoke test(double)
ob.test(123.2); // this will invoke test(double)
}
}
This program generates the following output:
No parameters
a and b: 10 20
Inside test(double) a: 88
Inside test(double) a: 123.2
As you can see, this version of OverloadDemo does not define test(int). Therefore, when test( ) is called with an integer argument inside Overload, no matching method is found. However, Java can automatically convert an integer into a double, and this conversion can be used to resolve the call. Therefore, after test(int) is not found, Java elevates i to double and then calls test(double). Of course, if test(int) had been defined, it would have been called instead. Java will employ its automatic type conversions only if no exact match is found.
Method overloading supports polymorphism because it is one way that Java implements the "one interface, multiple methods" paradigm. To understand how, consider the following. In languages that do not support method overloading, each method must be given a unique name. However, frequently you will want to implement essentially the same method for different types of data. Consider the absolute value function. In languages that do not support overloading, there are usually three or more versions of this function, each with a slightly different name. For instance, in C, the function abs( )

returns the absolute value of an integer, labs( ) returns the absolute value of a long integer, and fabs() returns the absolute value of a floating-point value. Since C does not support overloading, each function has to have its own name, even though all three functions do essentially the same thing. This makes the situation more complex, conceptually, than it actually is. Although the underlying concept of each function is the same, you still have three names to remember. This situation does not occur in Java, because each absolute value method can use the same name. Indeed, Java's standard class library includes an absolute value method, called abs( ). This method is overloaded
by Java's Math class to handle all numeric types. Java determines which version of abs() to call based upon the type of argument. The value of overloading is that it allows related methods to be accessed by use of a common name. Thus, the name abs represents the general action which is being performed. It is left to the compiler to choose the right specific version for a particular circumstance. You, the programmer, need only remember the general operation being performed. Through the application of polymorphism, several names have been reduced to one. Although this example is fairly simple, if you expand the concept, you can see how overloading can help you manage greater complexity.
When you overload a method, each version of that method can perform any activity you desire. There is no rule stating that overloaded methods must relate to one another. However, from a stylistic point of view, method overloading implies a relationship. Thus, while you can use the same name to overload unrelated methods, you should not. For example, you could use the name sqr to create methods that return the square of an integer and the square root of a floating-point value. But these two operations are fundamentally different. Applying method overloading in this manner defeats its original purpose. In practice, you should only overload closely related operations.
This is an extract from the book Java: the complete reference. By Herbert Schildt