R Interview Questions – Set 04

How can we find the mean of one column with respect to another?

The iris dataset has five columns: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species. To calculate the mean of Sepal.Length for each species of iris flower, we can use the formula interface of the mean() function from the mosaic package. Base R's mean() does not accept a formula, so the package must be loaded first.

library(mosaic)
mean(Sepal.Length ~ Species, data = iris)
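If mosaic is not available, the same group means can be computed in base R. This is a minimal sketch using tapply(); the variable name species_means is only illustrative.

```r
# Mean of Sepal.Length within each level of Species, using base R only
species_means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(species_means)
#     setosa versicolor  virginica
#      5.006      5.936      6.588
```

aggregate(Sepal.Length ~ Species, data = iris, FUN = mean) returns the same values as a data frame instead of a named vector.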

Explain the Chi-Square Test.

The Chi-Square Test is used to analyze a frequency table (i.e., a contingency table) formed by two categorical variables. It compares the observed cell counts with the counts expected if the two variables were independent, and so evaluates whether there is a statistically significant association between them.
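As an illustration, the test can be run with chisq.test() from base R. The counts below are made up for the example, not taken from any real study.

```r
# Hypothetical 2x2 contingency table: rows are groups, columns are outcomes
counts <- matrix(c(30, 10, 20, 40), nrow = 2,
                 dimnames = list(group = c("A", "B"),
                                 outcome = c("yes", "no")))

result <- chisq.test(counts)  # applies Yates' continuity correction for 2x2 tables
print(result$statistic)       # chi-squared test statistic
print(result$parameter)       # degrees of freedom: (2 - 1) * (2 - 1) = 1
print(result$p.value)         # a small p-value suggests the variables are associated
```

A p-value below the chosen significance level (commonly 0.05) leads us to reject the hypothesis that the two variables are independent.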

Explain RStudio.

RStudio is an integrated development environment (IDE) that lets us interact with R more readily. RStudio is similar to the standard RGui, but it is considered more user-friendly. The IDE has various drop-down menus, windows with multiple tabs, and many customization options. When we open RStudio for the first time, we see three windows; the fourth window is hidden by default.

What is the use of with() and by() functions in R?

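The with() function evaluates an expression using the columns of a dataset, so the columns can be referred to by name without the data$ prefix. The by() function splits the data by the levels of one or more factors and applies a function to each subset.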
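A short sketch of both functions on the built-in iris dataset; the variable names overall_mean and species_means are only illustrative.

```r
# with(): evaluate an expression using the columns of iris directly
overall_mean <- with(iris, mean(Sepal.Length))
print(overall_mean)

# by(): apply mean() to Sepal.Length separately for each level of Species
species_means <- by(iris$Sepal.Length, iris$Species, mean)
print(species_means)
```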

What are GGobi and iPlots?

GGobi is an open-source visualization program for exploring high-dimensional data, and iPlots is a package that provides interactive bar plots, mosaic plots, box plots, parallel plots, histograms, and scatter plots.

What are the full forms of SEM and CFA?

CFA stands for Confirmatory Factor Analysis, and SEM stands for Structural Equation Modeling.

Give the commands to create a histogram and to remove a vector from the R workspace.

The hist() function creates a histogram, and the rm() function removes an object, such as a vector, from the R workspace.
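For example, the following sketch draws a histogram of simulated values and then removes the vector. The vector name sample_values and the simulated data are assumptions made for illustration.

```r
set.seed(1)                   # make the simulated data reproducible
sample_values <- rnorm(100)   # 100 draws from a standard normal distribution

# hist() draws the plot and invisibly returns the bin counts
h <- hist(sample_values, main = "Histogram of sample_values")

rm(sample_values)             # remove the vector from the workspace
exists("sample_values")       # FALSE once the vector has been removed
```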

What is a Random Walk model?

A random walk is the simplest example of a non-stationary process. It has no fixed mean or variance, shows strong dependence over time, and its increments (the changes between consecutive values) are white noise. To simulate a random walk in R:

rw <- arima.sim(model = list(order = c(0, 1, 0)), n = 40)
ts.plot(rw)
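The link between a random walk and white noise can also be shown directly: a random walk is the cumulative sum of white-noise increments, so differencing it recovers the noise. This is a minimal sketch; the names noise and rw_manual are illustrative.

```r
set.seed(123)                 # make the simulation reproducible
noise <- rnorm(40)            # white-noise increments
rw_manual <- cumsum(noise)    # random walk: running sum of the increments

# Differencing the walk returns the increments (all but the first one)
all.equal(diff(rw_manual), noise[-1])

ts.plot(rw_manual)
```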

Explain Random Forest.

Random Forest, also known as a Decision Tree Forest, is one of the most popular decision tree-based ensemble models. It builds many decision trees on bootstrap samples of the data and combines their predictions, so its accuracy is usually higher than that of a single decision tree. The algorithm is used for both classification and regression.
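A minimal classification sketch follows. It assumes the randomForest package is installed; the choice of 100 trees and the iris data are illustrative only.

```r
# Assumes the randomForest package is installed:
# install.packages("randomForest")
library(randomForest)

set.seed(42)  # bootstrap sampling is random, so fix the seed

# Classify Species from the four measurement columns using 100 trees
rf_model <- randomForest(Species ~ ., data = iris, ntree = 100)

# Predict on the training data and compare with the true labels
predicted <- predict(rf_model, newdata = iris)
print(table(predicted, actual = iris$Species))
```

Predictions on the training data are optimistic; printing rf_model shows the out-of-bag error estimate, which is a fairer measure of accuracy.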

Give the full form of MANOVA and explain its use.

MANOVA stands for Multivariate Analysis of Variance. It is used to test whether group means differ across two or more dependent variables simultaneously.
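As a sketch, base R's manova() can test whether two iris measurements differ jointly across species. The choice of Sepal.Length and Petal.Length as dependent variables is only for illustration.

```r
# Two dependent variables are bound together on the left side of the formula
fit <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

# summary() reports Pillai's trace by default
print(summary(fit))
```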