Data Analytics Interview Questions – Set 01

There are 5 lanes on a race track. One needs to find the 3 fastest cars among a total of 25. Only the finishing order of each race is known (there is no stopwatch). Determine the minimum number of races that must be conducted to find the three fastest cars.

Start by considering the number of cars racing. Since there are 25 cars and 5 lanes, 5 races are conducted first, with each group having 5 cars. Next, a sixth race is conducted between the winners of the first 5 races. Call the top three finishers of that race X1, Y1, and Z1, in that order.

Since X1 won the sixth race, X1 is the fastest car among all 25. But how do we find the 2nd and 3rd fastest? We cannot assume that Y1 and Z1 are 2nd and 3rd, since the remaining cars from X1's group could be faster than Y1 and Z1. The groups whose winners finished 4th and 5th in the sixth race can be ruled out entirely, which leaves only five candidates. So a 7th race is conducted between Y1, Z1, the next two cars from X1's group (X2 and X3), and the second car from Y1's group (Y2).

The cars that finish 1st and 2nd in the 7th race are the 2nd and 3rd fastest cars among all 25, so the minimum number of races is 7.
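
The seven-race procedure above can be checked with a small simulation. This is a minimal sketch, not part of the original puzzle: the function names `race` and `fastest_three` and the hidden `speed` dictionary are assumptions made for illustration, and the procedure itself only ever looks at finishing orders.

```python
import random

def race(cars, speed):
    """Run one race (at most 5 lanes) and return the cars, fastest first."""
    assert len(cars) <= 5
    return sorted(cars, key=lambda c: speed[c], reverse=True)

def fastest_three(speed):
    """Find the 3 fastest of 25 cars using only the outcomes of 7 races."""
    cars = list(speed)
    races = 0
    # Races 1-5: five heats of five cars each.
    heats = []
    for i in range(0, 25, 5):
        heats.append(race(cars[i:i + 5], speed))
        races += 1
    # Race 6: the five heat winners. Reorder the heats by their winner's finish.
    winners = race([h[0] for h in heats], speed)
    races += 1
    heats.sort(key=lambda h: winners.index(h[0]))
    x, y, z = heats[0], heats[1], heats[2]
    # Race 7: X2, X3, Y1, Y2 and Z1 are the only candidates left for 2nd and 3rd.
    final = race([x[1], x[2], y[0], y[1], z[0]], speed)
    races += 1
    return [x[0], final[0], final[1]], races

# Hypothetical data: 25 cars with distinct hidden speeds.
rng = random.Random(0)
speed = dict(enumerate(rng.sample(range(1000), 25)))
top_three, races_used = fastest_three(speed)
print(top_three, races_used)
```

Comparing `top_three` against a direct sort of `speed` over many random draws is a quick way to convince yourself that the elimination argument holds.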

What are your communication strengths?

Communication is key in any position. Specifically, with a data analyst role, you will be expected to successfully present your findings and collaborate with the team. Assure them of your ability to communicate with an answer like this:

“My greatest communication strength would have to be my ability to relay information. I’m good at speaking in a simple, yet effective manner so that even people who aren’t familiar with the terms can grasp the overall concepts. I think communication is extremely valuable in a role like this, specifically when presenting my findings so that everyone understands the overall message.”

What are hash table collisions? How are they avoided?

A hash table collision happens when two different keys hash to the same slot. Two items cannot be stored in the same slot of the underlying array, so the table needs a rule for where the second item goes.

Collisions cannot be avoided entirely, but there are many techniques for resolving them. Here are two:

  • Separate chaining: each slot holds a secondary data structure, such as a linked list, that stores every item hashing to that slot.
  • Open addressing: when a slot is taken, the table probes other slots in a fixed sequence (for example, linear probing or a second hash function) and stores the item in the first empty slot it finds.
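
The two techniques can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than a production design: the class names `ChainedTable` and `ProbingTable` are hypothetical, the table size is fixed at 8, the open-addressing version uses linear probing, and neither class supports deletion or resizing.

```python
class ChainedTable:
    """Separate chaining: each slot holds a list of (key, value) pairs."""

    def __init__(self, size=8):
        self.slots = [[] for _ in range(size)]

    def put(self, key, value):
        bucket = self.slots[hash(key) % len(self.slots)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                  # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))       # collision or empty slot: append

    def get(self, key):
        bucket = self.slots[hash(key) % len(self.slots)]
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)


class ProbingTable:
    """Open addressing with linear probing (no deletion, no resizing)."""

    def __init__(self, size=8):
        self.slots = [None] * size

    def _probe(self, key):
        # Visit every slot once, starting at the key's home slot.
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def put(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:     # an empty slot ends the search
                break
            if self.slots[i][0] == key:
                return self.slots[i][1]
        raise KeyError(key)


# Small integers hash to themselves in CPython, so with 8 slots the keys
# 1, 9 and 17 all land in slot 1 and collide.
for table in (ChainedTable(), ProbingTable()):
    for key in (1, 9, 17):
        table.put(key, f"value-{key}")
    print(type(table).__name__, [table.get(k) for k in (1, 9, 17)])
```

With chaining, all three items end up in the list at slot 1. With linear probing, they spill into slots 1, 2 and 3.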

How can a Data Analyst highlight cells containing negative values in an Excel sheet?

A Data Analyst can use conditional formatting to highlight the cells that contain negative values in an Excel sheet. Here are the steps for conditional formatting:

  • First, select the range of cells you want to check for negative values.
  • Now, go to the Home tab and choose the Conditional Formatting option.
  • Then, go to Highlight Cells Rules and select the Less Than option.
  • In the final step, enter "0" as the value in the Less Than dialog box, pick a highlight format, and confirm.

How do you handle pressure and stress?

The best way to answer this question is to give an example of how you have handled stress in a previous job. That way, the interviewer can get a clear picture of how well you work in stressful situations. Avoid mentioning a time when you put yourself in a needlessly stressful situation. Rather, describe a time when you were given a difficult task or multiple assignments and rose to the occasion:

“I actually work better under pressure, and I’ve found that I enjoy working in a challenging environment. I thrive under quick deadlines and multiple projects. I find that when I’m under the pressure of a deadline, I can do some of my highest quality work. For example, I once had three large projects due in the same week, which was a lot of pressure. However, because I created a schedule that detailed how I would break down each project into small assignments, I completed all three projects ahead of time and avoided additional stress.”

What is a data collection plan?

A data collection plan is used to collect all the critical data in a system. It covers:

  • Type of data that needs to be collected or gathered
  • Different data sources for analyzing a data set

How have you dealt with messy data in the past? (Two Sigma)

Up to 80% of a data analyst’s time can be spent on cleaning data. That makes this a very important concept to understand. Even more important when you consider that, if your data is unclean and produces inaccurate insights, it could lead to costly company actions based on false information. Yikes. That could mean trouble for you.

You need to demonstrate not only that you understand the difference between messy data and clean data but also that you used that knowledge to cleanse the data. This article shows the sort of workflow you might be looking for in your response, as well as some methods for identifying inconsistent data and cleaning it.

Just as with any other question where you’re asked to describe a situation you’ve encountered in the past, it’s a good time to employ the STAR method: situation, task, action, result.

A client of ours was unhappy with our staffing reports, so I needed to pore over one to see what was causing their chagrin. I was looking at some data in a spreadsheet that contained information about when our call center employees went to break, took lunch, etc., and I noticed that the time stamps were inconsistent: some had a.m., some had p.m., some didn’t have any specifications for morning or night, and worst of all, many of these employees were located in different time zones, so this needed to be made more consistent as well.

To solve the a.m./p.m. dilemma, I made sure all times were specified in 24-hour (military) time. This had two benefits: first, it eliminated the strings in the data and made the whole column numeric; second, it removed any need to specify morning or night, as 24-hour time does this inherently. Next, I converted all times to UTC so that all of the data was in the same time zone. This was important for the report I was working on because otherwise the data would be presented out of order, which could confuse our client. Reorganizing the report's data this way helped improve our relationship with the client, who, due to the time discrepancies, previously believed we were understaffed at specific times of day.
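
The clean-up described in this answer can be sketched with Python's standard library. This is a minimal sketch, not the workflow actually used in the story: the function name `to_utc_24h`, the input formats, and the per-employee UTC offset are all assumptions for illustration. A time with no a.m./p.m. marker is assumed here to be in 24-hour form already; in real data such rows would need to be checked against another source first.

```python
from datetime import datetime, timedelta, timezone

def to_utc_24h(date, clock, utc_offset_hours):
    """Convert a local wall-clock reading to a 24-hour UTC string.

    date: 'YYYY-MM-DD'
    clock: e.g. '2:30 PM', '2:30 p.m.' or '14:30'
    utc_offset_hours: the employee's offset from UTC (hypothetical input)
    """
    cleaned = clock.replace(".", "").strip().upper()   # 'p.m.' -> 'PM'
    # Assumption: a reading without AM/PM is already in 24-hour form.
    # %p matching assumes an English-style AM/PM locale.
    fmt = "%I:%M %p" if cleaned.endswith(("AM", "PM")) else "%H:%M"
    local = datetime.strptime(f"{date} {cleaned}", f"%Y-%m-%d {fmt}")
    local = local.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M")

# Hypothetical break-time records: (date, local time, UTC offset in hours).
records = [
    ("2024-03-01", "2:30 p.m.", -5),
    ("2024-03-01", "11:15 PM", -8),
    ("2024-03-01", "09:00", 1),
]
for date, clock, offset in records:
    print(to_utc_24h(date, clock, offset))
```

Because every output shares one format and one time zone, the resulting strings sort chronologically, which is what keeps a staffing report in order.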

Can you make a Pivot Table from multiple tables?

Yes, we can create one Pivot Table from multiple different tables when there is a connection between these tables.

In Your Opinion, What Skills and Qualities Should a Successful Data Analyst Have?

There is not necessarily a right or wrong answer to this question, but it's good to be prepared for the possibility of it coming up. Being an analytical thinker and being a good problem solver are two examples of answers you could use for this type of question.

As mentioned earlier, these data analyst interview questions are just sample questions that may or may not be asked in a data analyst interview, and it would largely vary based on the skillsets and the experience level the interviewer would be looking for. So, you need to be prepared for all kinds of questions on the related topics, including probability and statistics, regression and correlation, Python, R and SAS programming, and more.


What is the ANYDIGIT function in SAS?

The ANYDIGIT function searches a character string for the first digit (0 through 9) and returns the position at which that digit is found. If the string contains no digit, it returns 0.
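
For readers more familiar with Python, the behavior can be mimicked as below. This is a rough Python analogue for illustration only, not SAS code: the function name `anydigit` is hypothetical, and the optional start-position argument that the SAS function accepts is left out.

```python
def anydigit(text):
    """Return the 1-based position of the first digit in text, or 0 if none."""
    for pos, ch in enumerate(text, start=1):
        if ch in "0123456789":
            return pos
    return 0

print(anydigit("ABC123"))      # first digit is the 4th character
print(anydigit("no digits"))   # no digit present
```

Positions are 1-based to match SAS conventions, which is why "not found" can safely be reported as 0.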