Collections Java Interview Questions | Eklavya Online

Collections Java Interview Questions

Here is the sample code that makes use of the default compareTo( ) provided in the String class as it implements the Comparable interface and the Collections utility class that provides a sorting method, which internally uses the efficient “merge sort” algorithm.

import java.util.Arrays;

import java.util.Collections;

import java.util.List;

 

public class Sort1 {

public static void main(String[] args) {

List<string> values = Arrays.asList(“JEE”, “Java”, “Servlets”, “JMS”, “JNDI”, “JDBC”, “JSP”, “EJB”);

Collections.sort(values); // uses the default compareTo(String anotherString)  in the String class

System.out.println(values);

}

}

Output:

[EJB, JDBC, JEE, JMS, JNDI, JSP, Java, Servlets]

// Sort

Collections.sort(list);

// Sort case-insensitive

Collections.sort(list, String.CASE_INSENSITIVE_ORDER);

// SortReverse-order

Collections.sort(list, Collections.reverseOrder ());

// Reverse-order sort case-insensitive

Define local, member and a class variable.

Within a method variables declared are called “local” variables.

Variables declared in the class i.e not in any methods are “member” variables (global variables).

Variables declared in the class i.e not in any methods and are called as “static” are class variables.

Name the different identifier states of a Thread.

Different types of identifiers of a Thread are:

R – Running or runnable thread

S – Suspended thread

CW – Thread waiting on a condition variable

MW – Thread waiting on a monitor lock

MS – Thread suspended waiting on a monitor lock

Define Vector class? Differentiate the Vector and ArrayList.

Vector canbe said a legacy class which has been introduced to implement the List interface since Java 2 platform v1.2

Vector is always synchronized but ArrayList is not.

When Vector class is synchronized, if we will run in multithreading environment we’ve to use ArrayList with Collections.

Vector has a default size i.e 10 while arrayList has no default size.

ArraayList is not having any method returning Enumerations where as vector list is having.

Differentiate between Enumeration and Iterator interface

In java.util package the Enumeration and Iterator are available.

The Enumeration interface is replicated by the Iterator interface.

In preference to Enumeration new implementations should consider using Iterator .

 

The difference of Iterators from enumerations are:

  • Enumeration has 2 methods namely hasMoreElements() & nextElement() where the Iterator contained three methods namely hasNext(), next(),remove().
  • An optional remove operation is added in Iterator,and has shorter method names. We Use remove() to delete the objects but the Enumeration interface does not support this feature.
  • The legacy classes use Enumeration interface .Vector.elements() & Hashtable.elements() method results Enumeration.
  • All Java Collections Framework classes returns iterator. java.util.Collection.iterator() method returning an instance of Iterator.

Collections.sort(list, String.CASE_INSENSITIVE_ORDER);

Collections.reverse(list);

Yes, 2 things — firstly, the above sort is case sensitive, that is the uppercase takes priority over lowercase pushing ‘Java’ after ‘JSP’. Secondly, if the collection had any null values, it will throw a NullpointerException.

These two issues can be rectified by providing a custom sorting implementation that ignores case and handles null values by pushing them to the end. The Collections class’s sort method takes a Comparator implementation as a second argument. In the code below, the Comparator has been implemented as an anonymous inner class. The compare(…) method will be called a number of times by the Collections.sort(list, comparator).

import java.util.Arrays;

import java.util.Collections;

import java.util.Comparator;

import java.util.List;

 

public class Sort2 {

 

public static void main(String[] args) {

List<String> values = Arrays.asList(“JEE”, “Java”, null,  “Servlets”, null, “JMS”, “JNDI”, “JDBC”, “JSP”, null,”EJB”);

 

//The comparator is defined as an anonymous inner class, but it can be

//defined in its own class. Handles nulls and ignores case

Collections.sort(values, new Comparator<String>() {

 

@Override

public int compare(String o1, String o2) {

//push the null values to the end

if(o1 == null){

if(o2 == null) {

return 0;

}

return 1;

}

else if(o2 == null){

return -1;

}

return o1.compareToIgnoreCase(o2);

}

}); // anonymous inner class end

 

System.out.println(values);

}

}

Hashtabel is original collection classes in java which was introduced as version 1.2 that HashMap permits null values in it, while Hashtable doesn’t.

Here is an example of a JavaTechnology custom object that implements a default sorting logic based on the rank (i.e popularity).

public class JavaTechnology implements Comparable<JavaTechnology>{

 

private String name;

private int rank;   // popularity lower value means more popular

 

public JavaTechnology(String name, int rank){

this.name = name;

this.rank = rank;

}

 

//default implementation by rank alone

@Override

public int compareTo(JavaTechnology technology) {

int rank1 = this.rank;

int rank2 = technology.rank;

if (rank1 > rank2){

return +1;

}else if (rank1 < rank2){

return -1;

}else{

return 0;

}

}

 

//required for printing, displaying, etc.

@Override

public String toString() {

return “(” + name + ” , ” + rank + “)”;

}

}

Now, a simple test class

import java.util.Arrays;

import java.util.Collections;

import java.util.List;

 

public class Sort3Test {

 

public static void main(String[] args) {

JavaTechnology jt1 = new JavaTechnology(“JEE”, 1);

JavaTechnology jt2 = new JavaTechnology(“Java”, 1);

JavaTechnology jt3 = new JavaTechnology(“Servlets”, 2);

JavaTechnology jt4 = new JavaTechnology(“JSP”, 2);

JavaTechnology jt5 = new JavaTechnology(“JNDI”, 3);

JavaTechnology jt6 = new JavaTechnology(“EJB”, 4);

JavaTechnology jt7 = new JavaTechnology(“JMS”, 5);

 

List<javatechnology> values = Arrays.asList(jt1, jt2, jt3, jt4, jt5, jt6, jt7);

 

Collections.sort(values); // invokes the compareTo(…) method in JavaTechnology a number of times

 

System.out.println(values);

 

}

 

}

Output:

[(JEE , 1), (Java , 1), (Servlets , 2), (JSP , 2), (JNDI , 3), (EJB , 4), (JMS , 5)]

Any number of Comparator classes can be created to sort them differently as shown below.

import java.util.Comparator;

 

public class JavaTechnologyComparator implements Comparator<JavaTechnology> {

 

@Override

public int compare(JavaTechnology t1, JavaTechnology t2) {

 

//handle null values here

Integer rank1 = t1.getRank();

Integer rank2 = t2.getRank();

int rankVal = rank1.compareTo(rank2);

int nameVal = t1.getName().toLowerCase().compareTo(t2.getName().toLowerCase());

 

//if same rank, then sort by name

if(rankVal == 0){

return nameVal;

}

//else sort by rank

return rankVal ;

}

}

Now, the Sort3Test class has been slightly modified to use a Comparator.

import java.util.Collections;

import java.util.List;

 

public class Sort3Test {

public static void main(String[] args) {

 

JavaTechnology jt1 = new JavaTechnology(“JEE”, 1);

JavaTechnology jt2 = new JavaTechnology(“Java”, 1);

JavaTechnology jt3 = new JavaTechnology(“Servlets”, 2);

JavaTechnology jt4 = new JavaTechnology(“JSP”, 2);

JavaTechnology jt5 = new JavaTechnology(“JNDI”, 3);

JavaTechnology jt6 = new JavaTechnology(“EJB”, 4);

JavaTechnology jt7 = new JavaTechnology(“JMS”, 5);

 

List<JavaTechnology> values = Arrays.asList(jt1, jt2, jt3, jt4, jt5, jt6, jt7);

 

Collections.sort(values, new JavaTechnologyComparator());

// invokes the compare(…) in JavaTechnologyComparator

// a number of times

System.out.println(values);

}

}

The output will be:

[(Java , 1), (JEE , 1), (JSP , 2), (Servlets , 2), (JNDI , 3), (EJB , 4), (JMS , 5)]

This should now enable you to sort any Java objects.

Remote method invocation is called RMI.

One can work with remote object using RMI.

It gives a impression that you are working with a object that resides within your own JVM though it is somewhere.

The protocol used by RMI is RMI-IIOP

Define a Collection API.

The set of classes and interfaces supporting the operation on collections of objects is the Collection API.

Than the vectors, arrays, and hashtables if effectively replaces,these classes and interfaces are more flexible, more powerful, and more regular

class examples: HashSet, TreeMap, ArrayList, LinkedList,HashMap and TreeMap.

interface examples: Set,List ,Collection and Map.

How many forms of Polymorphism are there?

polymorphism exists in three different forms in Java:

  • Method overloading
  • Method overriding through inheritance
  • Method overriding through the Java interface

Define the wrapper classes in Java and name a few.

Wrapper class is wraps around the primitive data type. List of the primitive types and the corresponding wrapper classes:

Primitive Wrapper

boolean java.lang.Boolean

byte java.lang.Byte

char java.lang.Character

double java.lang.Double

float java.lang.Float

int java.lang.Integer

long java.lang.Long

short java.lang.Short

void java.lang.Void

Differentiate between JDK ,JRE & JVM

JDK stands for Java Development Kit. It is the most widely used Java Software Development Kit.

JRE stands for Java Runtime Environment. It is an implementation of the Java Virtual Machine which executes Java programs

JVM stands for Java Virtual Machine. It is an interpreter.

The Collection interface provides support for the implementation of a mathematical bag – an unordered collection of objects that may contain duplicates.

A Java object is considered immutable when its state cannot change after it is created. Use of immutable objects is widely accepted as a sound strategy for creating simple, reliable code. Immutable objects are particularly useful in concurrent applications. Since they cannot change state, they cannot be corrupted by thread interference or observed in an inconsistent state. java.lang.String and java.lang.Integer classes are the Examples of immutable objects from the Java Development Kit. Immutable objects simplify your program due to following characteristics :

  • Immutable objects are simple to use test and construct.
  • Immutable objects are automatically thread-safe.
  • Immutable objects do not require a copy constructor.
  • Immutable objects do not require an implementation of clone.
  • Immutable objects allow hashCode to use lazy initialization, and to cache its return value.
  • Immutable objects do not need to be copied defensively when used as a field.
  • Immutable objects are good Map keys and Set elements (Since state of these objects must not change while stored in a collection).
  • Immutable objects have their class invariant established once upon construction, and it never needs to be checked again.

Immutable objects always have “failure atomicity” (a term used by Joshua Bloch) if an immutable object throws an exception, it’s never left in an undesirable or indeterminate state

There are six interfaces and come under two different inheritance group one which comes under the collection interface root and the other in the map interface root.

 

Collection

It’s the base of all collection classes. It provides a unified way to manipulate collection objects. Collection has group of object called as elements. These elements can be accessed and manipulated using Iterator. List

In List interface the elements are arranged sequentially. Elements can be inserted in to any location and you can also insert duplicate elements. In order to access the elements you need to use the “ListIterator”. Using “ListIterator” you can move back and forth which makes it unique as compared to other iterators.

 

Set

It represents a collection but no duplicates are allowed in this case.

 

SortedSet

It extends the Set interface and sorts the set in ascending order.

 

Map

Map stores association between keys and value pairs. So given a key you can easily find the value. One thing important to note is they do not implement iterable interface. But yes you can obtain a collection view of the map which allows you loop using for loop.

 

SortedMap

It extends Map so that the keys are maintained in ascending order.

The Collections API is a set of classes and interfaces that support operations on collections of objects.

The Iterator interface is used to step through the elements of a Collection.

The Iterator is an interface, used to traverse through the elements of a Collection. It is not advisable to modify the collection itself while traversing an Iterator.

Vector / HashTable are synchronized which means they are thread safe. Cost of thread safe is performance degradation. So if you are sure that you are not dealing with huge number of threads then you should use ArrayList / HashMap.But yes you can stillsynchronize List and Map’s using Collections provided methods :-

  • List OurList = Collections.synchronizedList (OurList);
  • Map OurMap = Collections.synchronizedMap (OurMap);

The List interface provides support for ordered collections of objects.

when the internal array fills up. The arrayList has to create a new array and copy all the elements there. The ArrayList has a growth algorithm of (n*3)/2+1, meaning that each time the buffer is too small it will create a new one of size (n*3)/2+1 where n is the number of elements of the current buffer. Hence if we can guess the number of elements that we are going to have, then it java.util.ArrayList and java.util.LinkedList are two Collections classes used for storing lists of object references Here are some key differences:

  • ArrayList uses primitive object array for storing objects whereas LinkedList is made up of a chain of nodes. Each node stores an element and the pointer to the next node. A singly linked list only has pointers to next. A doubly linked list has a pointer to the next and the previous element. This makes walking the list backward easier.
  • ArrayList implements the RandomAccess interface, and LinkedList does not. The commonly used ArrayList implementation uses primitive Object array for internal storage. Therefore an ArrayList is much faster than a LinkedList for random access, that is, when accessing arbitrary list elements using the get method. Note that the get method is implemented for LinkedLists, but it requires a sequential scan from the front or back of the list. This scan is very slow. For a LinkedList, there’s no fast way to access the Nth element of the list.
  • Adding and deleting at the start and middle of the ArrayList is slow, because all the later elements have to be copied forward or backward. (Using System.arrayCopy()) Whereas Linked lists are faster for inserts and deletes anywhere in the list, since all you do is update a few next and previous pointers of a node.
  • Each element of a linked list (especially a doubly linked list) uses a bit more memory than its equivalent in array list, due to the need for next and previous pointers.
  • ArrayList may also have a performance issue makes sense to create a arraylist with that capacity during object creation (using construtor new ArrayList(capacity)). Whereas LinkedLists should not have such capacity issues.

Map is Interface which is part of Java collections framework. This is to store Key Value pair, and Hashmap is class that implements that using hashing technique.

An enumeration is an object that generates elements one at a time, used for passing through a collection, usually of unknown size. The traversing of elements can only be done once per creation.

Best practices relating to Java Collection framework are as follow:

  • Choose the right type of data structure based on usage patterns like fixed size or required to grow, duplicates allowed or not, ordering is required to be maintained or not, traversal is forward only or bi-directional, inserts at the end only or any arbitrary position, more inserts or more reads, concurrently accessed or not, modification is allowed or not, homogeneous or heterogeneous collection, etc. Also, keep multi-threading, atomicity, memory usage and performance considerations discussed earlier in mind.
  • Don’t assume that your collection is always going to be small as it can potentially grow bigger with time. So your collection should scale well.
  • Program in terms of interface not implementation: For example, you might decide a LinkedList is the best choice for some application, but then later decide ArrayList might be a better choice for performance reason.
    • Bad:
      • ArrayList list = new ArrayList(100);
    • Good:

// program to interface so that the implementation can change

  • List list = new ArrayList(100);
  • List list2 = new LinkedList(100);
  • Return zero length collections or arrays as opposed to returning a null in the context of the fetched list is actually empty. Returning a null instead of a zero length collection is more error prone, since the programmer writing the calling method might forget to handle a return value of null.
    • List emptyList = Collections.emptyList( );
    • Set emptySet = Collections.emptySet( );
  • Use generics for type safety, readability, and robustness.
  • Encapsulate collections: In general, collections are not immutable objects. So care should be taken not to unintentionally expose the collection fields to the caller. The caller may not perform any necessary validation.

The basic difference between a Vector and an ArrayList is that, vector is synchronized while ArrayList is not. Thus whenever there is a possibility of multiple threads accessing the same instance, one should use Vector. While if not multiple threads are going to access the same instance then use ArrayList. Non synchronized data structure will give better performance than the synchronized one.

Many leave out Trees and Graphs. Trees and Graphs are very useful data structures as well.Whilst it is not recommended to write your own implementation when Java provides proven and tested implementations, the interviewer is testing your understanding on data structures. My book entitled “Core Java Career Essentials” covers this in more detail with diagrams, examples, and code.

Here are the common data structures.

Arrays are the most commonly used data structure. Arrays are of fixed size, indexed, and all containing elements are of the same type (i.e. a homogenous collection). For example, storing employee details just read from the database as EmployeeDetail[ ], converting and storing a string as a byte array for further manipulation or processing, etc. Wrap an array in a class to protect it from being inadvertently altered. This would be true for other data structures as well.

Lists are known as arrays that can grow. These data structures are generally backed by a fixed sized array and they re-size themselves as necessary. A list can have duplicateelements. For example,  adding new line items to an order that stores its line items as a list, removing all expired products from a list of products, etc. Initialize them with an appropriate initial size to minimize the number of re-sizes

Sets are like lists but they do not hold duplicate elements. Sets can be used when you have a requirement to store unique elements.

Stacks allow access to only one data item, which is the last item inserted (i.e. Last In First Out – LIFO). If you remove this item, you can access the next-to-last item inserted, and so on. The LIFO is achieved through restricting access only via methods like peek( ), push( ), and pop( ). This is useful in many programing situations like parsing mathematical expressions like (4+2) * 3, storing methods and exceptions in the order they occur, checking your source code to see if the brackets and braces are balanced properly, etc.

The LIFO access mechanism used by a stack has many practical uses.  For example, Evaluation of expressions / syntax Parsing, validating and parsing XML, undo sequence in a text editor, pages visited history in a web browser, etc. Java interview questions and answers on stack data structure.

Queues are somewhat like a stack, except that in a queue the first item inserted is the first to be removed (i.e. First In First Out – FIFO). The FIFO is achieved through restricting access only via methods like peek( ), offer( ), and poll( ). For example,  waiting in a line for a bus, a queue at the bank or super market teller, etc.

LinkedLists are data structures made of nodes, where each node contains data and a reference to the next node, and possibly to the previous node as well for a doubly linked list. For example, a stack or queue can be implemented with a linked list or a doubly linked list because you can insert and delete at both ends. There would also be other situations where data will be frequently inserted and deleted from the middle. The Apache library provides a TreeListimplementation, which is a good replacement for a LinkedList as it performs much better than a LinkedList at the expense of using a little more memory.  This means a LinkedList is rarely a good choice of implementation.

 

ArrayList is a good general purpose list implementation. An ArrayList is faster than a TreeList for most operations except inserting and removing in the middle of the list. A TreeList implementation utilizes a tree structure internally to ensure that all insertions and removals are O(log n). This provides much faster performance than both an ArrayList and a LinkedListwhere elements are inserted and removed repeatedly from anywhere.

 

 

class Link

{

private int id;                    // data

private String name;               // data

private Link next;                 // reference to next link

}

HashMaps are amortized constant-time access data structures that map keys to values. This data structure is backed by an array. It uses a hashing functionality to identify a location, and some type of collision detection algorithm is used to handle values that hash to the same location. For example, storing employee records with employee number as the key, storing name/value pairs read from a properties file, etc. Initialize them with an appropriate initial size to minimize the number of re-sizes.

 

Trees are the data structures that contain nodes with optional data elements and one or more child elements, and possibly each child element references the parent element to represent a hierarchical or ordered set of data elements.  For example, a hierarchy of employees in an organization, storing the XML data as a hierarchy, etc.  If every node in a tree can have utmost 2 children, the tree is called a binary tree. The binary trees are very common because the shape of a binary tree makes it easy to search and insert data. The edges in a tree represent quick ways to navigate from node to node.

list.

Java does not provide an implementation for this but it can be easily implemented as shown below. Just make a class Node with an ArrayList holding links to other Nodes.

package bigo;

import java.util.ArrayList;

import java.util.List;

 

public class Node {

private String name;

private List<node> children = new ArrayList<node>( );

private Node parent;

 

public Node getParent( ) {

return parent;

}

 

public void setParent(Node parent) {

this.parent = parent;

}

 

public Node(String name) {

this.name = name;

}

 

public void addChild(Node child) {

children.add(child);

}

 

public void removeChild(Node child) {

children.remove(child);

}

 

public String toString( ) {

return name;

}

}

Graphs are data structures that represent arbitrary relationships between members of any data sets that can be represented as networks of nodes and edges. A tree structure is essentially a more organized graph where each node can only have one parent node. Unlike a tree, a graph’s shape is dictated by a physical or abstract problem. For example,nodes (or vertices)  in a graph may represent cities, while edges may represent airline flight routes between the cities.

 

To make a Java graph class, you will have to work out the way in which the information can be stored and accessed. A graph will be using some of the data structures mentioned above. The Java API does not provide an implementation for this. But there are a number of third party libraries like  JUNG, JGraphT, and JDSL (does not seem to support generics).The book “Core Java Career Essentials” covers working examples using Java.

The java.lang.Object has two methods defined in it. They are – public boolean equals(Object obj) public int hashCode(). These two methods are used heavily when objects are stored in collections. There is a contract between these two methods which should be kept in mind while overriding any of these methods. The Java API documentation describes it in detail.

The hashCode() method returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable or java.util.HashMap. The general contract of hashCode is: Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application. If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result. It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.

However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables. As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects.

The equals(Object obj) method indicates whether some other object is “equal to” this one. The equals method implements an equivalence relation on non-null object references: It is reflexive: for any non-null reference value x, x.equals(x) should return true. It is symmetric: for any non-null reference values x and y, x.equals(y) should return true if and only if y.equals(x) returns true. It is transitive: for any non-null reference values x, y, and z, if x.equals(y) returns true and y.equals(z) returns true, then x.equals(z) should return true. It is consistent: for any non-null reference values x and y, multiple invocations of x.equals(y) consistently return true or consistently return false, provided no information used in equals comparisons on the objects is modified. For any non-null reference value x, x.equals(null) should return false.

The equals method for class Object implements the most discriminating possible equivalence relation on objects; that is, for any non-null reference values x and y, this method returns true if and only if x and y refer to the same object (x == y has the value true). Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.

A practical Example of hashcode() & equals(): This can be applied to classes that need to be stored in Set collections. Sets use equals() to enforce non-duplicates, and HashSet uses hashCode() as a first-cut test for equality. Technically hashCode() isn’t necessary then since equals() will always be used in the end, but providing a meaningful hashCode() will improve performance for very large sets or objects that take a long time to compare using equals().

Both Iterator and Enumeration are used to traverse Collection objects, in a sequential fashion. Enumeration can be applied to Vector and HashTable. Iterator can be used with most of the Collection objects. The main difference between the two is that Iterator is fail-safe. i.e,  If you are using an iterator to go through a collection you can be sure of no concurrent modifications in the underlying collection which may happen in multi-threaded environments.

Vector & ArrayList both classes are implemented using dynamically resizable arrays, providing fast random access and fast traversal. ArrayList and Vector class both implement the List interface. Both the classes are member of Java collection framework, therefore from an API perspective, these two classes are very similar. However, there are still some major differences between the two. Below are some key differences

  • Vector is a legacy class which has been retrofitted to implement the List interface since Java 2 platform v1.2

Vector is synchronized whereas ArrayList is not. Even though Vector class is synchronized, still when you want programs to run in multithreading There are multiple aspects to this decision:

  • The basic difference between a Hashtable and an HashMap is that, Hashtable is synchronized while HashMap is not. Thus whenever there is a possibility of multiple threads accessing the same instance, one should use Hashtable. While if not multiple threads are going to access the same instance then use HashMap. Non synchronized data structure will give better performance than the synchronized one.
  • If there is a possibility in future that – there can be a scenario when you may require to retain the order of objects in the Collection with key-value pair then HashMap can be a good choice. As one of HashMap’s subclasses is LinkedHashMap, so in the event that you’d want predictable iteration order (which is insertion order by default), you can easily swap out the HashMap for a LinkedHashMap. This wouldn’t be as easy if you were using Hashtable. Also if you have multiple thread accessing you HashMap then Collections.synchronizedMap() method can be leveraged. Overall HashMap gives you more flexibility in termsenvironment using ArrayList with Collections.synchronizedList() is recommended over Vector.
  • ArrayList has no default size while vector has a default size of 10.
  • The Enumerations returned by Vector’s elements method are not fail-fast. Whereas ArraayList does not have any method returning Enumerations.

Both Hashtable & HashMap provide key-value access to data. The Hashtable is one of the original collection classes in Java (also called as legacy classes). HashMap is part of the new Collections Framework, added with Java 2, v1.2. There are several differences between HashMap and Hashtable in Java as listed below

  • The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls. (HashMap allows null values as key and value whereas Hashtable doesn’t allow nulls).
  • HashMap does not guarantee that the order of the map will remain constant over time. But one of HashMap’s subclasses is LinkedHashMap, so in the event that you’d want predictable iteration order (which is insertion order by default), you can easily swap out the HashMap for a LinkedHashMap. This wouldn’t be as easy if you were using Hashtable.
  • HashMap is non synchronized whereas Hashtable is synchronized.
  • Iterator in the HashMap is fail-fast while the enumerator for the Hashtable isn’t. So this could be a design consideration.

Java Collections Framework provides a set of interfaces and classes that support operations on a collections of objects.

Each java collection implementation class have different performance for different methods, which makes them suitable for different programming needs.

Many developers are concerned about the performance difference between java.util.Array.sort() java.util.Collections.sort() methods. Both methods have same algorithm the only difference is type of input to them. Collections.sort() has a input as List so it does a translation of List to array and vice versa which is an additional step while sorting. So this should be used when you are trying to sort a list. Arrays.sort is for arrays so the sorting is done directly on the array. So clearly it should be used when you have a array available with you and you want to sort it.

TreeSet – It is the implementation of SortedSet interface.This implementation provides guaranteed log(n) time cost for the basic operations (add, remove and contains). The class is not synchronized.

Enumeration and Iterator are the interface available in java.util package. The functionality of Enumeration interface is duplicated by the Iterator interface. New implementations should consider using Iterator in preference to Enumeration. Iterators differ from enumerations in following ways:

  • Enumeration contains 2 methods namely hasMoreElements() & nextElement() whereas Iterator contains three methods namely hasNext(), next(),remove().
  • Iterator adds an optional remove operation, and has shorter method names. Using remove() we can delete the objects but Enumeration interface does not support this feature.
  • Enumeration interface is used by legacy classes. Vector.elements() & Hashtable.elements() method returns Enumeration. Iterator is returned by all Java Collections Framework classes. java.util.Collection.iterator() method returns an instance of Iterator.

Vector implements a dynamic array. It is similar to ArrayList, but with two differences:  Vector is synchronized, and it contains many legacy methods that are not part of the collections framework.

The Big-O notation simply describes how well an algorithm scales or performs in the worst case scenario as the number of elements in a data structure increases.  The Big-O notation can also be used to describe other behavior such as memory consumption. At times you may need to choose a slower algorithm because it also consumes less memory. Big-o notation can give a good indication about performance for large amounts of data, but the only real way to know for sure is to have a performance benchmark with large data sets to take into account things that are not considered in Big-O notation like paging as virtual memory usage grows, etc. Although benchmarks are better, they aren’t feasible during the design process, so Big-O complexity analysis is the choice.

The algorithms used by various data structures for different operations like search, insert and delete fall into the following performance groups like constant-time O(1), linear O(n), logarithmic O (log n), exponential O (c to the power n), polynomial O(n to the power c), quadratic O (n to the power 2) and factorial O (n!) where n is the number of elements in the data structure. It is generally a tradeoff between performance and memory usage. Here are some examples.

Example 1: Finding an element in a HashMap is usually a constant-time, which is  O(1) . This is a constant time because a hashing function is used to find an element, and computing a hash value does not depend on the number of elements in the HashMap.

Example 2:  Linear search of an array, list, and LinkedList is linear, which is O(n). This is linear because you will have to search the entire list. This means that if a list is twice as big, searching it will take twice as long.

Example 3: An algorithm that needs to compare every element in an array to sort the array has polynomial complexity, which is O (n2). A nested for loop is O (n2).  An example is shown under sorting algorithms.

Example 4: Binary search of a sorted array or ArrayList is logarithmic, which is O(log n). Searching an element in a LinkedList normally requires O(n). This is one of the disadvantages of LinkedList over the other data structures like an ArrayList or array offering a O (log n) performance, which offers better performance than O(n)  as the number of elements increases. A logarithmic running times mean, if 10 items take at most x amount of time, 100 items will take say at most 2x amount of time, and 10,000 items will take at most  4x. If you plot this on a graph, the time decreases as  n (i.e. number of items) increases.

What can you tell about the performance of a HashMap compared to a TreeMap? Which one would you prefer?

A balanced tree does have O (log n) performance. The TreeMap class in Java maintains key/value objects in a sorted order by using a red-black tree. A red-black tree is a balanced binary tree. Keeping the binary tree balanced ensures the fast insertion, removal, and look-up time of O (log n). This is not as fast as a HashMap, which is O(1) ,  but the TreeMaphas the advantage of that the keys are in sorted order which opens up a lot of other capabilities.

 

Which one to choose?

The decision as to using an unordered collection like a HashSet  or HasMap versus   using a sorted data structure like aTreeSet  or TreeMap depends mainly on the usage pattern, and to some extent on the data size and the environment you run it on.   The practical reason for keeping the elements in sorted order is for frequent and faster retrieval of sorted data if the inserts and updates are frequent. If the need for a sorted result is infrequent like prior to producing a report or running a batch process, then maintaining an unordered collection and sorting them only when it is really required with Collections.sort(…) could sometimes be more efficient than maintaining the ordered elements. This is only an opinion, and no one can offer you a correct answer. Even the complexity theories like Big-O notation like O(n) assume possibly large values of n.  In practice, a O(n) algorithm can be much faster than a O(log n) algorithm, provided the data set that is handled is sufficiently small. One algorithm might perform better on an AMD processor than on an Intel. If your system is set up to swap, disk performance need to be considered. The only way to confirm the efficient usage is to test and measure both performance and memory usage with the right data size. Measure both the approaches on your chosen hardware to determine, which is more appropriate.

Java Collections framework API is a unified architecture for representing and manipulating collections. The API contains Interfaces, Implementations & Algorithm to help java programmer in everyday programming. In nutshell, this API does 6 things at high level

  • Reduces programming efforts. – Increases program speed and quality.
  • Allows interoperability among unrelated APIs.
  • Reduces effort to learn and to use new APIs.
  • Reduces effort to design new APIs.
  • Encourages & Fosters software reuse.

To be specific, There are six collection java interfaces. The most basic interface is Collection. Three interfaces extend Collection: Set, List, and SortedSet. The other two collection interfaces, Map and SortedMap, do not extend Collection, as they represent mappings rather than true collections.

ArrayList is a part of the Collection Framework. We can store any type of objects, and we can deal with only objects. It is growable

Hashtable

An instance of Hashtable has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. Note that the hash table is open: in the case of a “hash collision”, a single bucket stores multiple entries, which must be searched sequentially. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. The initial capacity and load factor parameters are merely hints to the implementation. The exact details as to when and whether the rehash method is invoked are implementation-dependent.

HashMap

This implementation provides constant-time [ Big O Notation is O(1) ] performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the “capacity” of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it’s very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

TreeMap

The TreeMap implementation provides guaranteed log(n) [ Big O Notation is O(log N) ] time cost for the containsKey, get, put and remove operations.

LinkedHashMap

A linked hash map has two parameters that affect its performance: initial capacity and load factor. They are defined precisely as for HashMap. Note, however, that the penalty for choosing an excessively high value for initial capacity is less severe for this class than for HashMap, as iteration times for this class are unaffected by capacity.

Java has implementation of BlockingQueue available since Java 1.5. Blocking Queue interface extends collection interface, which provides you power of collections inside a queue. Blocking Queue is a type of Queue that additionally supports operations that wait for the queue to become non-empty when retrieving an element, and wait for space to become available in the queue when storing an element. A typical usage example would be based on a producer-consumer scenario. Note that a BlockingQueue can safely be used with multiple producers and multiple consumers. An ArrayBlockingQueue is a implementation of blocking queue with an array used to store the queued objects. The head of the queue is that element that has been on the queue the longest time. The tail of the queue is that element that has been on the queue the shortest time. New elements are inserted at the tail of the queue, and the queue retrieval operations obtain elements at the head of the queue. ArrayBlockingQueue requires you to specify the capacity of queue at the object construction time itself. Once created, the capacity cannot be increased. This is a classic “bounded buffer” (fixed size buffer), in which a fixed-sized array holds elements inserted by producers and extracted by consumers. Attempts to put an element to a full queue will result in the put operation blocking; attempts to retrieve an element from an empty queue will be blocked.

You should use ArrayList over Vector because you should default to non-synchronized access. Vector synchronizes each individual method. That’s almost never what you want to do. Generally you want to synchronize a whole sequence of operations. Synchronizing individual operations is both less safe (if you iterate over a Vector, for instance, you still need to take out a lock to avoid anyone else changing the collection at the same time) but also slower (why take out a lock repeatedly when once will be enough)? Of course, it also has the overhead of locking even when you don’t need to. It’s a very flawed approach to have synchronized access as default. You can always decorate a collection using Collections.synchronizedList – the fact that Vector combines both the “resized array” collection implementation with the “synchronize every operation” bit is another example of poor design; the decoration approach gives cleaner separation of concerns. Vector also has a few legacy methods around enumeration and element retrieval which are different than the List interface, and developers (especially those who learned Java before 1.2) can tend to use them if they are in the code. Although Enumerations are faster, they don’t check if the collection was modified during iteration, which can cause issues, and given that Vector might be chosen for its syncronization – with the attendant access from multiple threads, this makes it a particularly pernicious problem. Usage of these methods also couples a lot of code to Vector, such that it won’t be easy to replace it with a different List implementation. Despite all above reasons Sun may never officially deprecate Vector class.

At high level – Fail-fast is a property of a system or software with respect to its response to failures. A fail-fast system is designed to immediately report any failure or condition that is likely to lead to failure. Fail-fast systems are usually designed to stop normal operation rather than attempt to continue a possibly-flawed process. When a problem occurs, a fail-fast system fails immediately and visibly. Failing fast is a non-intuitive technique: “failing immediately and visibly” sounds like it would make your software more fragile, but it actually makes it more robust. Bugs are easier to find and fix, so fewer go into production. In Java, Fail-fast term can be related to context of iterators. If an iterator has been created on a collection object and some other thread tries to modify the collection object “structurally”, a concurrent modification exception will be thrown. It is possible for other threads though to invoke “set” method since it doesn’t modify the collection “structurally”. However, if prior to calling “set”, the collection has been modified structurally, “IllegalArgumentException” will be thrown.

From Sun FAQ Page: Many Collection implementations (including all of the ones provided by the JDK) will have a public clone method, but it would be mistake to require it of all Collections. For example, what does it mean to clone a Collection that’s backed by a terabyte SQL database? Should the method call cause the company to requisition a new disk farm? Similar arguments hold for serializable. If the client doesn’t know the actual type of a Collection, it’s much more flexible and less error prone to have the client decide what type of Collection is desired, create an empty Collection of this type, and use the addAll method to copy the elements of the original collection into the new one. Note on Some Important Terms

  • Synchronized means only one thread can modify a hash table at one point of time. Basically, it means that any thread before performing an update on a hashtable will have to acquire a lock on the object while others will wait for lock to be released.

Fail-fast is relevant from the context of iterators. If an iterator has been created on a collection object and some other thread tries to modify the collection object “structurally”, a concurrent modification exception will be thrown. It is possible for other threads though to invoke “set” method since it doesn’t modify the collection “structurally”. However, if prior to calling “set”, the collection has been modified structurally, “IllegalArgumentException” will be thrown

All standard implementations of collections List, Set and Map interface already implement java.io.Serializable. All the commonly used collection classes like java.util.ArrayList, java.util.Vector, java.util.Hashmap, java.util.Hashtable, java.util.HashSet, java.util.TreeSet do implement Serializable. This means you do not really need to write anything specific to serialize collection objects. However you should keep following things in mind before you serialize a collection object – Make sure all the objects added in collection are Serializable. – Serializing the collection can be costly therefore make sure you serialize only required data isntead of serializing the whole collection. – In case you are using a custom implementation of Collection interface then you may need to implement serialization for it.

An enumeration is an interface containing methods for accessing the underlying data structure from which the enumeration is obtained. It is a construct which collection classes return when you request a collection of all the objects stored in the collection. It allows sequential access to all the elements stored in the collection.

Some of the collection classes provide traversal of their contents via a java.util.Iterator interface. This interface allows you to walk through a collection of objects, operating on each object in turn. Remember when using Iterators that they contain a snapshot of the collection at the time the Iterator was obtained; generally it is not advisable to modify the collection itself while traversing an Iterator.

Iterator : Iterator takes the place of Enumeration in the Java collections framework. One can traverse throughr the the collection with the help of iterator in forward direction only and Iterators allow the caller to remove elements from the underlying collection during the iteration with well-defined semantics

ListIterator: An iterator for lists that allows one to traverse the list in either direction.modify the list during iteration, and obtain the iterator’s current position in the list. A ListIterator has no current element. its cursor position always lies between the element that would be returned by a call to previous() and the element that would be returned by a call to next(). In a list of length n, there are n+1 valid index values, from 0 to n, inclusive.

This functionality is provided by the Collections class, which is a wrapper implementation using the decorator design pattern.

public class ReadOnlyExample {

public static void main(String args[ ]) {

Set<string> set = new HashSet<string>( );

set.add(“Java”);

set.add(“JEE”);

set.add(“Spring”);

set.add(“Hibernate”);

set = Collections.unmodifiableSet(set);

set.add(“Ajax”);                                           // not allowed.

}

}

Though the Map interface is part of collections framework, it does not extend collection interface. This is by design, and the answer to this questions is best described in Sun’s FAQ Page: This was by design. We feel that mappings are not collections and collections are not mappings. Thus, it makes little sense for Map to extend the Collection interface (or vice versa). If a Map is a Collection, what are the elements? The only reasonable answer is “Key-value pairs”, but this provides a very limited (and not particularly useful) Map abstraction. You can’t ask what value a given key maps to, nor can you delete the entry for a given key without knowing what value it maps to. Collection could be made to extend Map, but this raises the question: what are the keys? There’s no really satisfactory answer, and forcing one leads to an unnatural interface. Maps can be viewed as Collections (of keys, values, or pairs), and this fact is reflected in the three “Collection view operations” on Maps (keySet, entrySet, and values). While it is, in principle, possible to view a List as a Map mapping indices to elements, this has the nasty property that deleting an element from the List changes the Key associated with every element before the deleted element. That’s why we don’t have a map view operation on Lists.

Iterator : Enables you to traverse through a collection in the forward direction only, for obtaining or removing elements

ListIterator : extends Iterator, and allows bidirectional traversal of list and also allows the modification of elements.

import java.util.ArrayList;

import java.util.LinkedHashSet;

import java.util.List;

 

public class CollectionFunction {

    public <e> List<e> function (List <e> list) {

          return new ArrayList<e>(new LinkedHashSet<e>(list));

    }

}

The above code removes duplicates from a supplied list by passing it through an implementation of a Set interface. In this case, a LinkedHashSet is used to honor the ordering by implementing a SortedSet interface. If ordering is not required, the LinkedHashSet can be replaced with a HashSet.

The functionality of Enumeration interface is duplicated by the Iterator interface. Iterator has a remove() method while Enumeration doesn’t. Enumeration acts as Read-only interface, because it has the methods only to traverse and fetch the objects, where as using Iterator we can manipulate the objects also like adding and removing the objects. So Enumeration is used when ever we want to make Collection objects as Read-only.

HashMap can be synchronized by Map m = Collections.synchronizedMap(hashMap);

Collections.synchronizedMap(new HashMap());

Collections.synchronizedList(List<T> list)