Strings and String literals in String Literal Pool

I have often been asked, typically in interviews at product companies, to count the number of String objects a statement creates. That prompted me to write this article.

Let us discuss the following topics:
  • String is immutable
  • String pool

String is immutable?


Strings are immutable objects.
Once created, they cannot be changed. We can, however, perform operations on them that create new String objects.
String str = "a" + "b";
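In the statement above, we conceptually take the string "a" and the string "b" and combine them into a new string "ab". The original strings "a" and "b" are not altered. (Because both operands here are literals, the compiler actually folds the expression into the single literal "ab"; see the section on constant operations further down.)
The same applies when we write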
String s1 = "a";
String s2 = s1.concat("b");
String object "a" is created, and so is String object "b". The concat() call does not modify "a"; it produces a new String object "ab" and leaves both originals untouched. In that sense the name concat() is something of a misnomer. One thing worth noting about concat(), according to the Javadoc, is:
If the length of the argument string is 0, then this String object is returned. Otherwise, a new String object is created, representing a character sequence that is the concatenation of the character sequence represented by this String object and the character sequence represented by the argument string.
Fairly simple but worth noting.
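
To make the two points above concrete, here is a minimal sketch. The class and method names are my own, chosen only for illustration. It checks that concat() with a non-empty argument hands back a different object while leaving the original untouched, and that concat("") hands back the receiver itself, as the Javadoc excerpt describes.

```java
class ImmutabilityDemo {

    // True when concat() with a non-empty argument yields a new object
    // and the original string still holds its old value.
    static boolean concatCreatesNewObject() {
        String s1 = "a";
        String s2 = s1.concat("b");
        return s2 != s1 && s1.equals("a") && s2.equals("ab");
    }

    // True when concat() with an empty argument returns the receiver itself.
    static boolean emptyConcatReturnsSameObject() {
        String s1 = "a";
        return s1.concat("") == s1;
    }

    public static void main(String[] args) {
        System.out.println("concat(\"b\") creates a new object: " + concatCreatesNewObject());
        System.out.println("concat(\"\") returns this: " + emptyConcatReturnsSameObject());
    }
}
```

Note that == is used here on purpose: we are comparing object identity, not contents.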

String literal pool

The String literal pool is a collection of references to String objects.
The String objects themselves are created on the heap, just like other objects; the JVM maintains the pool's references to them (say, in a table).

String literals and String pool

Since String objects are immutable, it is safe for multiple references to the same literal to share the same object.
String s1 = "harsh";
String s2 = "harsh";
The code above seems to create two String objects, each with the value "harsh". But in the JVM, both references s1 and s2 point to the same String object.
How do we test that?
System.out.println("Equals method: " + s1.equals(s2));
System.out.println("==: " + (s1 == s2));
Both statements print true: equals() confirms that the contents match, and == confirms that s1 and s2 refer to the very same object.

What happens behind the scenes is that the compiler records the string literals separately in the class file. When the classloader loads the class, the JVM goes through this literal table. For each literal, it searches the String pool references to see whether an equal String already exists on the heap. If one exists, all references to that literal in the class are replaced with a reference to the existing String object (the one the pool points to). If none exists, a new String object is created on the heap and a reference to it is added to the pool. Any subsequent reference to this literal is then automatically mapped to the existing String object on the heap.
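
The same "look it up in the pool, reuse it if present" behaviour can be observed from our own code through String.intern(). The sketch below is only an illustration and the class name is made up. It builds "harsh" at run time with a StringBuilder, so the result is a separate heap object, and then asks the pool for the canonical copy.

```java
class PoolLookupDemo {

    // True when a run-time built string is a distinct object, yet intern()
    // maps it back to the object already pooled for the literal.
    static boolean internFindsPooledLiteral() {
        String literal = "harsh";
        // Assembled at run time, so this is not the pooled object.
        String built = new StringBuilder("har").append("sh").toString();
        return built != literal
                && built.equals(literal)
                && built.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println("intern() returns the pooled object: " + internFindsPooledLiteral());
    }
}
```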

The 'new' operator and String literal pool

The 'new' operator always forces the JVM to create a new String object on the heap, even if an equal String is already in the pool. This object has no connection whatsoever with the objects referenced by the String literal pool. (A literal passed as the constructor argument is still pooled in the usual way; it is the object returned by 'new' that lives outside the pool.)
Hence, do not think 'String literal pool' when you encounter the 'new' operator.
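
A quick way to convince yourself is the sketch below (the class name is hypothetical). It compares a literal with a String created through 'new', and two 'new' Strings with each other.

```java
class NewOperatorDemo {

    // True when 'new' yields an object different from the pooled literal,
    // even though the contents are equal.
    static boolean newDiffersFromLiteral() {
        String s1 = "harsh";
        String s2 = new String("harsh");
        return s1 != s2 && s1.equals(s2);
    }

    // True when two separate 'new' expressions yield two separate objects.
    static boolean twoNewObjectsDiffer() {
        String a = new String("harsh");
        String b = new String("harsh");
        return a != b && a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println("new differs from literal: " + newDiffersFromLiteral());
        System.out.println("two new objects differ: " + twoNewObjectsDiffer());
    }
}
```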

Literal mathematics through constant operations

What about the String created through literal mathematics?
String s1 = "ab" + "c";
String s2 = "a" + "bc";
The statements above also result in a single literal "abc" in the String pool, and both references point to it. How? Both expressions consist only of literals, so the compiler evaluates them at compile time and stores just the result.
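
Here is a small sketch of that claim, with a class name of my own choosing. It also compares against the plain literal "abc" to show that all three references end up at one pooled object.

```java
class ConstantFoldingDemo {

    // True when both literal-only concatenations and the plain literal
    // all refer to the same pooled object.
    static boolean foldedLiteralsShareOneObject() {
        String s1 = "ab" + "c";
        String s2 = "a" + "bc";
        String s3 = "abc";
        return s1 == s2 && s2 == s3;
    }

    public static void main(String[] args) {
        System.out.println("folded literals share one object: " + foldedLiteralsShareOneObject());
    }
}
```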

Literal mathematics through objects

String s1 = "a";
String s2 = s1 + "bc";
String s3 = "a" + "bc";
In the statements above, s2 and s3 do not point to the same object.
That is, when we perform the concatenation through a reference, the compiler cannot determine the resulting string at compile time, and hence makes no literal pool entry for it.
The String objects referenced by s1 and s3 are in the literal pool, but the one referenced by s2 is not, as it is created at run time.
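
The sketch below (class and method names are hypothetical) checks the behaviour just described. It also shows one case not covered in the text above: if the variable is declared final and initialised with a literal, the language treats it as a compile-time constant, so the concatenation is folded and pooled after all.

```java
class RuntimeConcatDemo {

    // True when concatenation through an ordinary variable produces a new,
    // non-pooled object with the same contents as the pooled literal.
    static boolean runtimeConcatIsNotPooled() {
        String s1 = "a";
        String s2 = s1 + "bc";   // evaluated at run time
        String s3 = "a" + "bc";  // folded to "abc" at compile time
        return s2 != s3 && s2.equals(s3);
    }

    // True when a final variable initialised with a literal takes part
    // in compile-time folding, giving the pooled "abc".
    static boolean finalVariableIsFolded() {
        final String s1 = "a";
        String s2 = s1 + "bc";
        return s2 == "abc";
    }

    public static void main(String[] args) {
        System.out.println("runtime concat is not pooled: " + runtimeConcatIsNotPooled());
        System.out.println("final variable is folded: " + finalVariableIsFolded());
    }
}
```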

String literal garbage collection

An object is eligible for garbage collection when it is no longer referenced.
But String literals are always referenced by the literal pool, so in practice they are not eligible for garbage collection while the class that uses them remains loaded. They stay accessible through interning (String.intern()).
Objects created by the 'new' operator, on the other hand, become eligible as soon as they are no longer referenced, as they are never referred to by the pool.
 
