Data Analytics Interview Questions – Set 02

What is the difference between joining and blending in Tableau?

Joining is used when you combine data from the same source, for example, worksheets in an Excel file or tables in an Oracle database. Blending, by contrast, requires two separately defined data sources in your report.

What is the Alternative Hypothesis?

To explain the alternative hypothesis, you can first explain what the null hypothesis is. The null hypothesis is the statement that the observed result is due to chance alone; a statistical test checks whether the data provide enough evidence to reject it.

After this, you can say that the alternative hypothesis is the statement contrary to the null hypothesis. It holds that the observations are the result of a real effect, with some amount of chance variation.
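A small worked example can make the two hypotheses concrete. The sketch below is only an illustration, using an exact permutation test on made-up numbers: the null hypothesis is that the group labels do not matter (any difference in means is chance), and the alternative is that the groups really differ. The function name, the sample values, and the 0.05 threshold are all assumptions chosen for the example.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means.

    Under the null hypothesis the labels are interchangeable, so every
    way of splitting the pooled values into two groups is equally likely.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Count relabelings at least as extreme as what was observed.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical measurements, invented for illustration only.
control = [1, 2, 3, 4, 5]
treated = [10, 11, 12, 13, 14]

p_value = exact_permutation_p_value(control, treated)
reject_null = p_value < 0.05  # assumed significance level
```

A small p-value means the observed difference would be unusual if only chance were at work, so the null hypothesis is rejected in favor of the alternative.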

What is imputation?

Missing data can cause serious problems in an analysis, and imputation is a methodology that helps avoid them. It is the process of replacing missing data with substituted values. Imputation helps prevent list-wise deletion of cases with missing values.
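As a minimal sketch of the idea, the snippet below uses mean imputation, which is one common choice among many, and is not named in the answer above. It replaces each missing entry (represented here as `None`, an assumption of this example) with the mean of the observed values, so no record has to be dropped.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing values.
ages = [25, None, 31, 40, None]
imputed = impute_mean(ages)
```

All five records remain usable after imputation, whereas list-wise deletion would have kept only three.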

Can you tell the difference between VAR X1 - X3 and VAR X1 -- X3?

When you specify a single dash between the variables, it refers to consecutively numbered variables. Similarly, if you specify a double dash between the variables, it refers to all the variables located between the two named variables, in the order in which they appear in the dataset.

For Example:
Consider the following data set:

Data Set: ID NAME X1 X2 Y1 X3

Then, X1 - X3 would return X1 X2 X3

and X1 -- X3 would return X1 X2 Y1 X3
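The two kinds of variable list can be mimicked outside SAS to check your understanding. The following Python sketch is an emulation, not SAS code: `numbered_range` and `positional_range` are hypothetical helper names, and the column order is taken from the example data set above.

```python
import re


def numbered_range(first, last):
    """Emulate a single-dash list: same prefix, consecutive numeric suffixes."""
    m_first = re.fullmatch(r"(.*?)(\d+)", first)
    m_last = re.fullmatch(r"(.*?)(\d+)", last)
    if not m_first or not m_last or m_first.group(1) != m_last.group(1):
        raise ValueError("both names need the same prefix and a numeric suffix")
    prefix = m_first.group(1)
    start, end = int(m_first.group(2)), int(m_last.group(2))
    return [f"{prefix}{i}" for i in range(start, end + 1)]


def positional_range(columns, first, last):
    """Emulate a double-dash list: every column from first to last, by position."""
    start, end = columns.index(first), columns.index(last)
    return columns[start:end + 1]


# Column order of the example data set.
columns = ["ID", "NAME", "X1", "X2", "Y1", "X3"]

single_dash = numbered_range("X1", "X3")
double_dash = positional_range(columns, "X1", "X3")
```

The single-dash version never looks at the column order, while the double-dash version depends on it entirely, which is why Y1 appears only in the second result.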

How to view underlying SQL Queries in Tableau?

To view the underlying SQL Queries in Tableau, we mainly have two options:

  • Use the Performance Recording feature: Create a performance recording to capture information about the main events that occur as you interact with the workbook. You can then view the performance metrics in a workbook that Tableau creates.
    Help -> Settings and Performance -> Start Performance Recording.
    Help -> Settings and Performance -> Stop Performance Recording.
  • Review the Tableau Desktop logs: The logs are located in the My Tableau Repository folder, typically at C:\Users\<username>\My Documents\My Tableau Repository. For a live connection to the data source, check the log.txt and tabprotosrv.txt files. For an extract, check the tdeserver.txt file.

What is a Pivot Table?

A Pivot Table is a Microsoft Excel feature used to summarize large datasets quickly. It sorts, reorganizes, counts, or groups data stored in a spreadsheet or database table. The summary can include sums, averages, or other statistics.
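To show what that summarization does, here is a rough sketch of a pivot computed by hand in Python rather than in Excel. The `pivot_sum` helper and the sales records are invented for illustration; the function groups rows by one field, spreads a second field across columns, and sums a third.

```python
from collections import defaultdict


def pivot_sum(rows, index, columns, values):
    """Group rows by `index`, spread `columns` across, and sum `values`."""
    table = defaultdict(lambda: defaultdict(float))
    for row in rows:
        table[row[index]][row[columns]] += row[values]
    # Convert nested defaultdicts to plain dicts for easy inspection.
    return {key: dict(cells) for key, cells in table.items()}


# Hypothetical sales records.
sales = [
    {"region": "East", "quarter": "Q1", "amount": 100.0},
    {"region": "East", "quarter": "Q1", "amount": 50.0},
    {"region": "East", "quarter": "Q2", "amount": 70.0},
    {"region": "West", "quarter": "Q1", "amount": 30.0},
]

summary = pivot_sum(sales, index="region", columns="quarter", values="amount")
```

Swapping the sum for a count or an average gives the other summaries a Pivot Table offers.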

Name the best tools used for data analysis.

A question about the most-used tools is something you will find in almost any data analytics interview.
The most useful tools for data analysis are:

  • Tableau
  • Google Fusion Tables
  • Google Search Operators
  • KNIME
  • RapidMiner
  • Solver
  • OpenRefine
  • NodeXL
  • io

Why is ‘naïve Bayes’ naïve?

It is naïve because it assumes that all features in the dataset are equally important and independent of one another given the class, which is rarely the case in a real-world scenario.
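The independence assumption is easiest to see in code: the score for a class is just the class prior multiplied by one likelihood per feature, with no term for how features interact. The sketch below is a toy categorical classifier on invented data; the function names are hypothetical, and the add-one smoothing assumes each feature has exactly two possible values.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> count
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts


def predict(row, class_counts, feature_counts):
    """Score each class as prior * product of per-feature likelihoods."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # class prior
        for i, value in enumerate(row):
            # Add-one smoothing; the 2 assumes two possible values per feature.
            score *= (feature_counts[label][i][value] + 1) / (n + 2)
        scores[label] = score
    return max(scores, key=scores.get), scores


# Hypothetical data: (outlook, windy) -> decision.
rows = [("sunny", "no"), ("sunny", "no"), ("rainy", "yes"),
        ("rainy", "yes"), ("sunny", "yes")]
labels = ["play", "play", "stay", "stay", "play"]

class_counts, feature_counts = train(rows, labels)
label, scores = predict(("sunny", "no"), class_counts, feature_counts)
```

Each feature contributes its factor separately, so correlated features get counted as if they were independent evidence.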

What is the KNN imputation method?

This method imputes a missing attribute value using the values of that attribute in the records that are most similar to the record in which it is missing. The similarity of two records is determined by using a distance function.
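A minimal sketch of the method, under a few assumptions not stated above: missing values are marked as `None`, similarity is Euclidean distance over the attributes the incomplete record does have, only fully complete records serve as neighbours, and the imputed value is the plain mean of the k nearest neighbours. The `knn_impute` name and the data are illustrative.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the column mean over the k nearest complete rows."""
    complete = [r for r in rows if None not in r]
    result = []
    for row in rows:
        if None not in row:
            result.append(list(row))
            continue
        # Compare only on the attributes this row actually has.
        observed = [i for i, v in enumerate(row) if v is not None]

        def distance(other):
            return math.sqrt(sum((row[i] - other[i]) ** 2 for i in observed))

        neighbours = sorted(complete, key=distance)[:k]
        filled = [
            v if v is not None else sum(n[i] for n in neighbours) / len(neighbours)
            for i, v in enumerate(row)
        ]
        result.append(filled)
    return result


# Hypothetical records; the last one is missing its third attribute.
records = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [8.0, 9.0, 50.0],
    [1.05, 2.05, None],
]
imputed = knn_impute(records, k=2)
```

The distant third record has no influence on the result, because only the two closest records are averaged.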

What is time series analysis?

Time series analysis can be done in two domains: the frequency domain and the time domain. In time series analysis, the output of a particular process can be forecast by analyzing its previous data with the help of methods such as exponential smoothing and log-linear regression.
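As an illustration of one of these methods, the sketch below implements simple exponential smoothing, where each smoothed value is a weighted average of the latest observation and the previous smoothed value. The smoothing factor `alpha`, the choice to start from the first observation, and the demand figures are assumptions made for the example.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]  # initialize with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical demand history.
demand = [10.0, 12.0, 14.0, 16.0]
smoothed = exponential_smoothing(demand, alpha=0.5)

# With this simple variant, the last smoothed value is the one-step-ahead forecast.
forecast = smoothed[-1]
```

A larger `alpha` makes the forecast react faster to recent values, while a smaller one averages over a longer history.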