Does setting the serialVersionUID class field improve Java serialization performance?
Declaring an explicit serialVersionUID field in your classes saves some CPU time only the first time the JVM process serializes a given Class. However the gain is not significant, In case when you have not declared the serialVersionUID its value is computed by JVM once and subsequently kept in a soft cache for future use.
Why is Serialization required? What is the need to Serialize
Serialization is required for a variety of reasons. It is required to send across the state of an object over a network by means of a socket. One can also store an object’s state in a file. Additionally, manipulation of the state of an object as streams of bytes is required. The core of Java Serialization is the Serializable interface. When Serializable interface is implemented by your class it provides an indication to the compiler that java Serialization mechanism needs to be used to serialize the object.
What are the alternatives to Serialization? If Serialization is not used, is it possible to persist or transfer an object using any other approach
In case, Serialization is not used, Java objects can be serialized by many ways, some of the popular methods are listed below:
- Saving object state to database, this is most common technique used by most applications. You can use ORM tools (e.g. hibernate) to save the objects in a database and read them from the database.
- Xml based data transfer is another popular mechanism, and a lot of XML based web services use this mechanism to transfer data over network. Also a lot of tools save XML files to persist data/configurations.
- JSON Data Transfer – is recently popular data transfer format. A lot of web services are being developed in JSON due to its small footprint and inherent integration with web browser due to JavaScript format.
What is the Difference between Externalizable and Serializable Interfaces?
This is one of top serialization questions that is asked in many big companies to test your in-depth understanding of serialization. Serializable is a marker interface therefore you are not forced to implement any methods, however Externalizable contains two methods readExternal() and writeExternal() which must be implemented. Serializable interface provides a inbuilt serialization mechanism to you which can be in-efficient at times. However Externilizable interface is designed to give you greater control over the serialization mechanism. The two methods provide you immense opportunity to enhance the performance of specific object serialization based on application needs. Serializable interface provides a default serialization mechanism, on the other hand, Externalizable interface instead of relying on default Java Serialization provides flexibility to control this mechanism. One can drastically improve the application performance by implementing the Externalizable interface correctly. However there is also a chance that you may not write the best implementation, so if you are not really sure about the best way to serialize, I would suggest your stick to the default implementation using Serializable interface.
What changes are compatible and incompatible to the mechanism of java Serialization
This is one of a difficult and tricky questions and answering this correctly would mean you are an expert in Java Serialization concept. In an already serialized object, the most challenging task is to change the structure of a class when a new field is added or removed. As per the specifications of Java Serialization, addition of any method or field is considered to be a compatible change whereas changing of class hierarchy or non-implementation of Serializable interface is considered to be a non-compatible change. You can go through the Java serialization specification for the extensive list of compatible and non-compatible changes. If a serialized object need to be compatible with an older version, it is necessary that the newer version follows some rules for compatible and incompatible changes. A compatible change to the implementing class is one that can be applied to a new version of the class, which still keeps the object stream compatible with older version of same class. Some Simple Examples of compatible changes are:
- Addition of a new field or class will not affect serialization, since any new data in the stream is simply ignored by older versions. the newly added field will be set to its default values when the object of an older version of the class is un marshaled.
- The access modifiers change (like private, public, protected or default) is compatible since they are not reflected in the serialized object stream.
- Changing a transient field to a non-transient field is compatible change since it is similar to adding a field.
- Changing a static field to a non-static field is compatible change since it is also similar to adding a field.
- Some Simple Examples of incompatible changes are:
- Changing implementation from Serializable to Externalizable interface can not be done since this will result in the creation of an incompatible object stream.
- Deleting a existing Serializable fields will cause a problem.
- Changing a non-transient field to a transient field is incompatible change since it is similar to deleting a field.
- Changing a non-static field to a static field is incompatible change since it is also similar to deleting a field.
- Changing the type of a attribute within a class would be incompatible, since this would cause a failure when attempting to read and convert the original field into the new field.
- Changing the package of class is incompatible. Since the fully-qualified class name is written as part of the object byte stream.
Java serialization is one of the most commonly misunderstood areas. Many developers still think its only used for saving objects on the file system
What are transient variables? What role do they play in Serialization process
The transient keyword in Java is used to indicate that a field should not be serialized. Once the process of de-serialization is carried out, the transient variables do not undergo a change and retain their default value. Marking unwanted fields as transient can help you boost the serialization performance. Below is a simple example where you can see the use of transient keyword.
class MyVideo implements Serializable {
private Video video;
private transient Image thumbnailVideo;
private void generateThumbnail()
{
// Generate thumbnail.
}
private void readObject(ObjectInputStream inputStream)
throws IOException, ClassNotFoundException
{
inputStream.defaultReadObject();
generateThumbnail();
}
}
When will you use Serializable or Externalizable interface? and why
Most of the times when you want to do a selective attribute serialization you can use Serializable interface with transient modifier for variables not to be serialized. However, use of Externalizable interface can be really effective in cases when you have to serialize only some dynamically selected attributes of a large object. Lets take an example, Some times when you have a big Java object with hundreds of attributes and you want to serialize only a dozen dynamically selected attributes to keep the state of the object you should use Externalizable interface writeObject method to selectively serialize the chosen attributes. In case you have small objects and you know that most or all attributes are required to be serialized then you should be fine with using Serializable interface and use of transient variable as appropriate.
What is a connected RowSet? or What is the difference between connected RowSet and disconnected RowSet? or Connected vs Disconnected RowSet, which one should I use and when?
Connected RowSet
A RowSet object may make a connection with a data source and maintain that connection throughout its life cycle, in which case it is called a connected rowset. A rowset may also make a connection with a data source, get data from it, and then close the connection. Such a rowset is called a disconnected rowset. A disconnected rowset may make changes to its data while it is disconnected and then send the changes back to the original source of the data, but it must reestablish a connection to do so. Example of Connected RowSet: A JdbcRowSet object is a example of connected RowSet, which means it continually maintains its connection to a database using a JDBC technology-enabled driver.
Disconnected RowSet
A disconnected rowset may have a reader (a RowSetReader object) and a writer (a RowSetWriter object) associated with it. The reader may be implemented in many different ways to populate a rowset with data, including getting data from a non-relational data source. The writer can also be implemented in many different ways to propagate changes made to the rowset’s data back to the underlying data source.Example of Disconnected RowSet: A CachedRowSet object is a example of disconnected rowset, which means that it makes use of a connection to its data source only briefly. It connects to its data source while it is reading data to populate itself with rows and again while it is propagating changes back to its underlying data source. The rest of the time, a CachedRowSet object is disconnected, including while its data is being modified. Being disconnected makes a RowSet object much leaner and therefore much easier to pass to another component. For example, a disconnected RowSet object can be serialized and passed over the wire to a thin client such as a personal digital assistant (PDA).
Why does serialization NOT save the value of static class attributes? Why static variables are not serialized
The Java variables declared as static are not considered part of the state of an object since they are shared by all instances of that class. Saving static variables with each serialized object would have following problems
- It will make redundant copy of same variable in multiple objects which makes it in-efficient.
- The static variable can be modified by any object and a serialized copy would be stale or not in sync with current value.
What are the ways to speed up Object Serialization? How to improve Serialization performance
The default Java Serialization mechanism is really useful, however it can have a really bad performance based on your application and business requirements. The serialization process performance heavily depends on the number and size of attributes you are going to serialize for an object. Below are some tips you can use for speeding up the marshaling and un-marshaling of objects during Java serialization process.
- Mark the unwanted or non Serializable attributes as transient. This is a straight forward benefit since your attributes for serialization are clearly marked and can be easily achieved using Serialzable interface itself.
- Save only the state of the object, not the derived attributes. Some times we keep the derived attributes as part of the object however serializing them can be costly. Therefore consider calcualting them during de-serialization process.
- Serialize attributes only with NON-default values. For examples, serializing a int variable with value zero is just going to take extra space however, choosing not to serialize it would save you a lot of performance. This approach can avoid some types of attributes taking unwanted space. This will require use of Externalizable interface since attribute serialization is determined at runtime based on the value of each attribute.
- Use Externalizable interface and implement the readObject and writeObject methods to dynamically identify the attributes to be serialized. Some times there can be a custom logic used for serialization of various attributes.