Machine Learning Interview Questions – Set 02

Into which categories can the sequence learning process be divided?

  • Sequence prediction
  • Sequence generation
  • Sequence recognition
  • Sequential decision making

What is a classifier in machine learning?

A classifier in machine learning is a system that takes a vector of discrete or continuous feature values as input and outputs a single discrete value, the class.
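
As an illustration of that definition, here is a minimal sketch of one possible classifier, a nearest-centroid rule. The function names (`fit_centroids`, `classify`) and the toy data are assumptions made up for this example; the point is only that a feature vector goes in and one discrete label comes out.

```python
import math

def fit_centroids(X, y):
    """Average the feature vectors of each class to get one centroid per class."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def classify(centroids, features):
    """Input: a feature vector. Output: a single discrete value, the class."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))

# Hypothetical toy data: two continuous features, two classes.
X = [[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]]
y = ["small", "small", "large", "large"]
centroids = fit_centroids(X, y)
label = classify(centroids, [1.1, 0.9])
```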

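Explain the differences between the random forest and gradient boosting algorithms.

  • Random forest uses bagging: trees are trained independently on bootstrap samples and their predictions are combined. Gradient boosting (GBM) uses boosting: trees are trained sequentially, each one correcting the errors of the ensemble built so far.
  • Random forests mainly try to reduce variance, whereas GBM mainly reduces bias and, with suitable regularization, can reduce variance as well.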
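
The contrast between bagging and boosting can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a full random forest or GBM: it uses one-split regression stumps on 1-D data, omits the random feature selection a random forest adds, and all function names and the toy data are made up for the example.

```python
import random

def fit_stump(xs, ys):
    """Least-squares one-split stump on 1-D data: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not right:
            continue  # a split must leave points on both sides
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all x values identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def bagging(xs, ys, n_models, rng):
    """Bagging: independent stumps on bootstrap samples, predictions averaged."""
    n = len(xs)
    stumps = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(predict_stump(s, x) for s in stumps) / len(stumps)

def boosting(xs, ys, n_models, lr=0.5):
    """Boosting: each stump is fit to the residuals of the ensemble so far."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_models):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * predict_stump(s, x) for s in stumps)

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data: y = x^2 on a small grid.
xs = [i / 10 for i in range(20)]
ys = [x * x for x in xs]
bagged = bagging(xs, ys, 25, random.Random(0))
boosted_1 = boosting(xs, ys, 1)
boosted_20 = boosting(xs, ys, 20)
```

Comparing `mse(boosted_1, xs, ys)` with `mse(boosted_20, xs, ys)` shows the training error shrinking as boosting rounds are added, which is the bias reduction described above. The bagged model averages many weak stumps, so it stays a coarse fit on this data.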

What are five popular algorithms of machine learning?

  • Decision Trees
  • Neural networks (backpropagation)
  • Probabilistic networks
  • Nearest Neighbor
  • Support vector machines

What are two techniques of machine learning?

The two techniques of machine learning are:

  • Genetic Programming
  • Inductive Learning

What are the two approaches that let an SVM (Support Vector Machine) handle multiclass classification?

  • Combining several binary classifiers (for example, one-vs-rest or one-vs-one)
  • Modifying the binary formulation to incorporate multiclass learning
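
The first approach, combining binary classifiers, can be sketched as a one-vs-rest scheme. In this minimal sketch the binary scorers are hand-set linear functions standing in for trained binary SVMs; the names `linear_scorer` and `one_vs_rest_predict`, the weights, and the class labels are all hypothetical.

```python
def linear_scorer(weights, bias):
    """Stand-in for a trained binary SVM: a positive score means 'this class'."""
    return lambda x: sum(w * xi for w, xi in zip(weights, x)) + bias

def one_vs_rest_predict(scorers, x):
    """Ask every binary scorer, then pick the class with the largest score."""
    return max(scorers, key=lambda label: scorers[label](x))

# One binary scorer per class (weights chosen by hand for illustration).
scorers = {
    "left": linear_scorer([-1.0, 0.0], 0.0),
    "right": linear_scorer([1.0, 0.0], 0.0),
    "up": linear_scorer([0.0, 1.0], 0.0),
}
prediction = one_vs_rest_predict(scorers, [2.0, 0.5])
```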

What are the advantages and disadvantages of neural networks?

Advantages: Neural networks (specifically deep NNs) have led to performance breakthroughs on unstructured data such as images, audio, and video. Their flexibility allows them to learn patterns that many other ML algorithms struggle to capture.

Disadvantages: They require a large amount of training data to converge. It is also difficult to pick the right architecture, and the internal “hidden” layers are hard to interpret.

List the advantages and disadvantages of using neural networks.

Advantages:

Information is stored across the entire network instead of in a database. A neural network can keep working and give good accuracy even with incomplete information. It also has parallel processing ability and distributed memory.

Disadvantages:

Neural networks require processors that are capable of parallel processing. The unexplained functioning of the network is also an issue, because it reduces trust in the network in some situations, such as when we have to present a problem we noticed to the network. The training duration is mostly unknown in advance. We can only tell that training has finished by looking at the error value, and reaching that value does not guarantee optimal results.

How do we deal with sparsity issues in recommendation systems? How do we measure its effectiveness? Explain.

Singular value decomposition (SVD) can be used to generate the prediction matrix from the sparse rating matrix. RMSE (root mean squared error) is the measure that helps us understand how close the prediction matrix is to the original matrix.
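
To make the RMSE comparison concrete, here is a minimal sketch that scores a prediction matrix against a sparse rating matrix. It assumes missing ratings are stored as `None` and that only observed entries are compared. The prediction values are made up for illustration rather than produced by an actual SVD, and the name `rmse_observed` is hypothetical.

```python
import math

def rmse_observed(original, predicted):
    """RMSE over the entries that are observed (not None) in the original matrix."""
    squared_errors = [
        (o - p) ** 2
        for orig_row, pred_row in zip(original, predicted)
        for o, p in zip(orig_row, pred_row)
        if o is not None
    ]
    if not squared_errors:
        raise ValueError("no observed ratings to compare")
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical 2x2 rating matrix with one missing entry.
ratings = [[5, None], [3, 4]]
predictions = [[4.0, 2.0], [3.0, 2.0]]
score = rmse_observed(ratings, predictions)
```

A lower score means the prediction matrix is closer to the known ratings. The missing entry is skipped because there is no original value to compare against.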

Which distance do we measure in the case of KNN?

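KNN can use different distance measures to determine the nearest neighbours. Euclidean distance is the usual choice for continuous features, and Hamming distance is used for categorical or binary features. K-means uses Euclidean distance.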
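
Because KNN only needs a way to rank neighbours, the distance can be passed in as a function. The following is a minimal sketch with both a Euclidean and a Hamming distance. The function names and the toy data are assumptions made for the example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hamming(a, b):
    """Number of positions at which two feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train, query, k, distance):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical numeric data, compared with Euclidean distance.
numeric_train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
numeric_label = knn_predict(numeric_train, (1, 0), k=3, distance=euclidean)

# Hypothetical categorical data, compared with Hamming distance.
categorical_train = [
    (("red", "round"), "apple"),
    (("green", "round"), "apple"),
    (("yellow", "long"), "banana"),
]
categorical_label = knn_predict(categorical_train, ("yellow", "long"), k=1, distance=hamming)
```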