Machine Learning Interview Questions – Set 21

When does regularization become necessary in Machine Learning? Regularization becomes necessary when the model begins to overfit. This technique adds a penalty term to the objective function for bringing in more features. It therefore tries to push the coefficients of many variables toward zero, which reduces the penalty term. This helps to reduce … Read more
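As a minimal sketch of how a penalty can push coefficients to zero, the example below fits an L1-penalised (lasso) linear model by coordinate descent in plain Python. The excerpt does not say which penalty it has in mind, so the L1 choice, the function names, the toy data, and the value `lam=0.5` are all assumptions made for illustration.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return exactly 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, sweeps=50):
    """Minimise (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by coordinate descent.

    Assumes every feature column has at least one non-zero entry.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(d) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, lam) / z
    return w


# Toy data (made up): y depends strongly on feature 0 and only weakly on feature 1.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [3.1, -2.9, 2.9, -3.1]

weights = lasso_coordinate_descent(X, y, lam=0.5)
print(weights)
```

With this penalty strength the weak feature's coefficient is set to exactly zero, while the strong feature's coefficient is shrunk but kept.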

Machine Learning Interview Questions – Set 20

What is the difference between supervised and unsupervised machine learning? Supervised learning requires labeled training data. For example, in order to do classification (a supervised learning task), you’ll need to first label the data you’ll use to train the model to classify data into your labeled groups. Unsupervised learning, in contrast, does not require labeling … Read more

Machine Learning Interview Questions – Set 19

Differentiate between Boosting and Bagging. Bagging and Boosting are both ensemble techniques. Bootstrap aggregation, or bagging, is a method used to reduce the variance of algorithms that have very high variance. Decision trees are a particular family of classifiers that are susceptible to high variance. Decision trees are highly sensitive to … Read more

Machine Learning Interview Questions – Set 18

What ensemble technique is used by Random Forests? Bagging is the technique used by Random Forests. Random forests are a collection of trees, each trained on data sampled from the original dataset, with the final prediction being a majority vote or an average across all trees. Both being tree-based algorithms, how is Random Forest different from Gradient Boosting … Read more

Machine Learning Interview Questions – Set 17

What is Kernel SVM? Kernel SVM is short for kernel support vector machine. Kernel methods are a class of algorithms for pattern analysis, and the most common one is the kernel SVM. What are 3 data preprocessing techniques to handle outliers? Winsorize (cap at threshold). Transform to reduce skew (using Box-Cox or … Read more
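Winsorizing, the first outlier technique named above, can be sketched in a few lines. The function name, the sample values, and the fixed thresholds of 0 and 10 are hypothetical. In practice the caps are often taken from percentiles of the data.

```python
def winsorize(values, lower, upper):
    """Cap every value to the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


# Made-up sample with two extreme values.
data = [2, 3, 5, 4, 250, -90]
capped = winsorize(data, lower=0, upper=10)
print(capped)  # [2, 3, 5, 4, 10, 0]
```

Values inside the thresholds are left unchanged, so only the extremes are pulled in.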

Machine Learning Interview Questions – Set 16

What is OOB error and how does it occur? For each bootstrap sample, roughly one-third of the data is not used in the creation of the tree, i.e., it is out of the sample. This data is referred to as out-of-bag data. In order to get an unbiased measure of the accuracy … Read more
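The "one-third" figure comes from sampling with replacement: a given row is missed by all n draws with probability (1 - 1/n)^n, which approaches 1/e (about 0.368) as n grows. The sketch below checks this by simulation. The function name, the seed, the dataset size, and the number of repetitions are arbitrary choices made for illustration.

```python
import random


def oob_fraction(n_rows, rng):
    """Draw one bootstrap sample of size n_rows and return the share of rows left out."""
    in_bag = {rng.randrange(n_rows) for _ in range(n_rows)}
    return 1 - len(in_bag) / n_rows


rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows = 1000
n_repeats = 200

# Average out-of-bag share over many bootstrap samples.
simulated = sum(oob_fraction(n_rows, rng) for _ in range(n_repeats)) / n_repeats
# Probability that a given row is never drawn in n_rows draws.
expected = (1 - 1 / n_rows) ** n_rows

print(round(simulated, 3), round(expected, 3))
```

The two numbers should agree closely, which is why each tree has a sizeable held-out set available for estimating its error.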

Machine Learning Interview Questions – Set 15

Explain the differences between Random Forest and Gradient Boosting machines. Random forests build a large number of decision trees and pool their predictions at the end by averaging or majority vote. Gradient boosting machines also combine decision trees, but they start combining them at the beginning of the process rather than at the end. Random forest creates each tree independently of the others while … Read more

Machine Learning Interview Questions – Set 14

What’s the trade-off between bias and variance? Bias is error due to erroneous or overly simplistic assumptions in the learning algorithm you’re using. This can lead to the model underfitting your data, making it hard for it to have high predictive accuracy and for you to generalize your knowledge from the training set to the … Read more

Machine Learning Interview Questions – Set 13

Do you have experience with Spark or big data tools for machine learning? You’ll want to get familiar with the meaning of big data for different companies and the different tools they’ll want. Spark is the big data tool most in demand now, able to handle immense datasets with speed. Be honest if you don’t … Read more

Machine Learning Interview Questions – Set 12

Is the ARIMA model a good fit for every time series problem? No, the ARIMA model is not suitable for every type of time series problem. There are situations where the ARMA model and others also come in handy. ARIMA works best when several standard temporal structures need to be captured in time series data. What is inductive … Read more