✅ Top 50 Data Analytics Interview Questions – Part 1 📊🔥

1️⃣ What is the difference between Data Analysis and Data Analytics?
Data Analysis focuses on inspecting, cleaning, and summarizing data to extract insights. Data Analytics is broader: it also covers data collection, transformation, modeling, and the use of algorithms to support decision-making.

2️⃣ Explain your data cleaning process.
⦁ Identify and handle missing values (impute or remove)
⦁ Remove duplicate records
⦁ Correct inconsistent data entries
⦁ Standardize data formats (e.g., date/time)
⦁ Validate data types and ranges
⦁ Ensure data integrity and quality

3️⃣ How do you handle missing or duplicate data?
⦁ Missing Data: Impute with the mean or median, predict the missing values with a model, or drop the affected records.
⦁ Duplicates: Identify them using unique identifiers, then either remove them or keep the most relevant version based on business logic.

4️⃣ What is a primary key in a database?
A primary key is a column (or combination of columns) that uniquely identifies each record in a table. No two rows can share the same primary key value, and the value cannot be NULL, which helps maintain data integrity.

5️⃣ Write a SQL query to find the 2nd highest salary from a table employees.
SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees);
This returns the second-highest distinct salary, or NULL if every row has the same salary.

6️⃣ What is the difference between INNER JOIN and LEFT JOIN?
⦁ INNER JOIN: Returns only the rows that have a match in both tables.
⦁ LEFT JOIN: Returns all rows from the left table plus the matching rows from the right table (right-side columns are NULL where there is no match).

7️⃣ What are outliers? How do you detect and handle them?
Outliers are values that deviate significantly from the rest of the data.
Detection Methods:
⦁ IQR (Interquartile Range): commonly, flag values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR
⦁ Z-score: commonly, flag values more than 3 standard deviations from the mean
Handling Methods:
⦁ Remove outliers
⦁ Cap values
⦁ Use a transformation (e.g., log scale)

8️⃣ What is a Pivot Table?
A pivot table is a data summarization tool in spreadsheets and BI tools that lets you quickly group and aggregate data. It's useful for spotting patterns and trends.

9️⃣ How do you validate a data model?
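⦁ Split data into training and testing sets
⦁ Use cross-validation (e.g., k-fold)
⦁ Evaluate metrics suited to the task: Accuracy, Precision, Recall, and F1-Score for classification; RMSE for regression

🔟 What is Hypothesis Testing? What is the difference between a t-test and a z-test?
Hypothesis testing is a statistical method that uses sample data to decide whether there is enough evidence to reject an assumption (the null hypothesis) about a population.
⦁ T-test: Used when the population variance is unknown and has to be estimated from the sample. This matters most for small samples (commonly n < 30).
⦁ Z-test: Used when the population variance is known, or when the sample is large enough for the normal approximation to hold.

💬 Tap ❤️ for Part 2!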
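
📝 Example for 3️⃣ (missing and duplicate data): a minimal Python sketch using only the standard library. The records, the `id` and `updated` fields, and the "keep the newest row" rule are all made up for illustration; real business logic may differ.

```python
import statistics

# Hypothetical records; `id` is assumed to be the unique identifier.
records = [
    {"id": 1, "age": 34, "updated": "2024-01-05"},
    {"id": 2, "age": None, "updated": "2024-01-06"},
    {"id": 2, "age": 41, "updated": "2024-02-01"},  # newer duplicate of id 2
    {"id": 3, "age": None, "updated": "2024-01-09"},
    {"id": 4, "age": 29, "updated": "2024-01-10"},
]

# Duplicates: keep the most recently updated row per id (one possible rule).
# ISO-formatted date strings compare correctly as plain strings.
latest = {}
for row in records:
    kept = latest.get(row["id"])
    if kept is None or row["updated"] > kept["updated"]:
        latest[row["id"]] = row
deduped = list(latest.values())

# Missing data: fill missing ages with the median of the observed ages.
median_age = statistics.median(r["age"] for r in deduped if r["age"] is not None)
cleaned = [
    {**r, "age": median_age if r["age"] is None else r["age"]} for r in deduped
]

print([(r["id"], r["age"]) for r in cleaned])
# [(1, 34), (2, 41), (3, 34), (4, 29)]
```

Deduplicating before imputing keeps the stale duplicate from influencing the median.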
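
📝 Example for 5️⃣ (2nd highest salary): a small sketch that runs the query from the answer against an in-memory SQLite database through Python's built-in `sqlite3` module. The table name `employees` and the column `salary` come from the question; the `name` column and the sample rows are invented for the demo.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ana", 90000), ("Ben", 120000), ("Cy", 120000), ("Dee", 75000)],
)

query = """
    SELECT MAX(salary)
    FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
"""

second_highest = conn.execute(query).fetchone()[0]
print(second_highest)  # 90000: the tie at 120000 counts as one salary level

# With only one distinct salary left there is no second highest,
# so MAX() over zero rows yields NULL (None in Python).
conn.execute("DELETE FROM employees WHERE salary < 120000")
only_one_level = conn.execute(query).fetchone()[0]
print(only_one_level)  # None
```

Interviewers often follow up by asking about ties and the "no second value" case, which is why both are shown here.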
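
📝 Example for 6️⃣ (INNER JOIN vs LEFT JOIN): a sketch with two hypothetical tables, `customers` and `orders`, again using in-memory SQLite. The table names, columns, and rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 50), (11, 1, 20), (12, 3, 75);
""")

# INNER JOIN: only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# LEFT JOIN: every customer; Ben has no orders, so his amount is NULL.
left_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

print(inner_rows)  # [('Ana', 50), ('Ana', 20), ('Cy', 75)]
print(left_rows)   # [('Ana', 50), ('Ana', 20), ('Ben', None), ('Cy', 75)]
```

The only difference between the two results is the extra row for the customer without a match.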
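
📝 Example for 7️⃣ (outliers with IQR): a minimal sketch of the IQR method plus capping, using Python's `statistics.quantiles`. The data values are made up, and the 1.5 × IQR multiplier is a common convention rather than a fixed rule. Different quartile conventions can shift the fences slightly.

```python
import statistics

values = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]  # hypothetical measurements

# Quartiles (default "exclusive" method), then the usual 1.5 * IQR fences.
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Detect: anything outside the fences.
outliers = [v for v in values if v < lower or v > upper]

# Handle (one option): cap values at the fences instead of dropping them.
capped = [min(max(v, lower), upper) for v in values]

print(lower, upper)  # 8.0 18.0
print(outliers)      # [102]
print(capped[-1])    # 18.0
```

Capping keeps the row count unchanged, which can matter when the other columns of that record are still valid.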
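
📝 Example for 🔟 (t-test vs z-test): a rough one-sample sketch showing that both statistics share the same shape and differ in where the standard deviation comes from. The sample, the hypothesized mean `mu0`, and the "known" `sigma` are all invented for illustration.

```python
import math
import statistics

sample = [52, 48, 55, 51, 49, 53, 50, 54]  # hypothetical measurements
mu0 = 50                                   # hypothesized population mean (assumed)
n = len(sample)
mean = statistics.mean(sample)

# t-test: population variance unknown, so use the sample standard deviation.
# The statistic is compared against a t distribution with n - 1 degrees of freedom.
s = statistics.stdev(sample)
t_stat = (mean - mu0) / (s / math.sqrt(n))

# z-test: population standard deviation assumed known (sigma = 3 is made up).
sigma = 3
z_stat = (mean - mu0) / (sigma / math.sqrt(n))
p_z = 2 * (1 - statistics.NormalDist().cdf(abs(z_stat)))  # two-sided p-value

print(round(t_stat, 3), round(z_stat, 3), round(p_z, 3))
# 1.732 1.414 0.157
```

The standard library has no t distribution, so the sketch stops at the t statistic; a statistics package such as SciPy would be needed for its p-value.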