Data Analyst Interview Questions with Answers: Part-3 🧠📊

21. What is correlation vs causation?
• Correlation is a statistical relationship between two variables (e.g., ice cream sales and temperature).
• Causation means one variable directly affects another (e.g., smoking causes lung disease).
Correlation doesn't imply causation.

22. What is regression analysis?
It's used to predict the value of a dependent variable based on one or more independent variables.
Example: Predicting sales based on ad spend using linear regression.

23. What is hypothesis testing?
A statistical method to determine if sample data gives enough evidence to support a claim about a population. It involves:
• Null hypothesis (H0): no effect
• Alternative hypothesis (H1): there is an effect
Results are judged against a significance level (usually 0.05).

24. What is p-value and its importance?
The p-value is the probability of getting results at least as extreme as the observed ones, assuming H0 is true.
• Low p-value (< 0.05) → Reject H0 → Statistically significant result
• High p-value (≥ 0.05) → Fail to reject H0
It helps assess whether an observed difference could plausibly be due to chance.

25. What is A/B testing?
A/B testing compares two versions (A and B) to see which performs better.
Common in marketing and UX: e.g., comparing two landing page designs for conversion rates.

26. What is a confidence interval?
It gives a range within which we expect a population parameter to fall, with a certain level of confidence (e.g., 95%).
Example: "We're 95% confident the average age of users is between 24 and 27."

27. What is outlier detection and how do you handle it?
Outliers are data points that deviate significantly from the rest of the data.
Methods to detect:
• Z-score
• IQR method
• Box plots
Handle by:
• Removing
• Imputing
• Investigating the cause

28. Explain standard deviation and variance
• Variance is the average squared deviation from the mean, so it measures how far values spread out.
• Standard deviation is the square root of variance, representing dispersion in the original units.
Low SD → data close to the mean; High SD → data more spread out.

29. What is a pivot table?
A pivot table summarizes data for analysis, often used in Excel or Power BI. You can group, filter, and aggregate data (e.g., total sales by region and product).

30. How do you visualize time series data?
Use line charts, area charts, or other time-based plots. Add trend lines, moving averages, and seasonal decomposition to analyze patterns over time.

💬 Tap ❤️ for Part-4!
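Bonus for Q27 and Q28: a minimal Python sketch of the Z-score and IQR methods, using only the standard library. The function names, the sample data, and the cutoffs (|z| > 3 and 1.5 × IQR) are illustrative assumptions based on common conventions, not fixed rules.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold (assumed cutoff)."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs(v - mean) / sd > threshold]

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is a common convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sample: eleven typical values and one extreme value
data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 12, 13, 95]

variance = statistics.pvariance(data)  # average squared deviation from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print("Variance:", round(variance, 2))
print("Standard deviation:", round(std_dev, 2))
print("Z-score outliers:", zscore_outliers(data))
print("IQR outliers:", iqr_outliers(data))
```

Note that in very small samples the Z-score method can miss even an obvious outlier, because the extreme value inflates the standard deviation itself. The IQR method is less sensitive to that.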
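Bonus for Q23 to Q26: a rough sketch of how an A/B test on conversion rates might be checked with a two-proportion z-test. The z-test is one common choice, not the only one. The function names and the visitor/conversion counts are made up for illustration, and the normal approximation assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for H0: both versions have the same conversion rate."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # shared rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

def diff_confidence_interval(conv_a, n_a, conv_b, n_b, level=0.95):
    """Normal-approximation confidence interval for the rate difference (B minus A)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    diff = p_b - p_a
    return diff - z_crit * se, diff + z_crit * se

# Hypothetical data: page A converts 200 of 2000 visitors, page B converts 260 of 2000
z, p_value = two_proportion_ztest(200, 2000, 260, 2000)
low, high = diff_confidence_interval(200, 2000, 260, 2000)

alpha = 0.05  # significance level from Q23
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
print(f"95% CI for the difference: ({low:.4f}, {high:.4f})")
```

If the confidence interval for the difference excludes 0, that lines up with rejecting H0 at the matching significance level.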