✅ Data Analyst Interview Questions with Answers: Part-5

41. What is data cleaning?
Data cleaning is the process of fixing or removing incorrect, incomplete, or inconsistent data.
Examples: removing duplicate customer records, fixing wrong date formats.

42. How do you handle missing data?
Common methods:
- Remove rows (if only a few are missing)
- Replace with the mean, median, or 0
- Use forward or backward fill
Example (SQL): SELECT COALESCE(sales, 0) AS sales FROM orders;

43. How do you treat outliers?
- Identify them using sorting, box plots, or Z-scores
- Remove or cap extreme values
Example: Sales of 10,000, 12,000, 15,000, and 100,000. The 100,000 value is the outlier.

44. What is data normalization?
Scaling data to a fixed range, usually 0 to 1 (min-max scaling).
Example: Normalized value = (x - min) / (max - min)
Used in ML and when comparing values measured on different scales.

45. What is data standardization?
Rescaling data so it has mean 0 and standard deviation 1.
Example: Z = (x - mean) / std

46. How do you check data quality?
Check the data against these dimensions:
- Accuracy
- Completeness
- Consistency
- Validity
- Timeliness
Example: Sales should never be negative.

47. What is duplicate data?
The same record appearing more than once.
Example: The same customer ID repeated multiple times.

48. How do you validate source data?
- Compare with source systems
- Check row counts
- Verify key metrics
Example: Total revenue in the report should equal total revenue in the database.

49. What is data transformation?
Converting data into a usable format.
Examples:
- Converting dates
- Creating new columns
- Aggregating values

50. Why is data preparation important?
Clean data = correct insights. Poor data leads to wrong decisions.
Example: Wrong sales data → wrong business strategy.

Double Tap ♥️ For Part-6
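
Questions 42 to 45 can be tried out with a rough Python sketch that uses only the standard library. The function names (fill_missing, iqr_outliers, min_max, z_scores) and the sample list are made up for illustration; the sales figures reuse the example from question 43 with one missing value added. The outlier check assumes the 1.5 × IQR rule that box plots use, which is one common choice, not the only one.

```python
from statistics import mean, median, pstdev, quantiles


def fill_missing(values):
    """Replace None with the median of the values that are present."""
    present = [v for v in values if v is not None]
    fill_value = median(present)
    return [fill_value if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (the box plot rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def min_max(values):
    """Normalization: (x - min) / (max - min), giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values equal, avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_scores(values):
    """Standardization: (x - mean) / std, using the population std dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # all values equal, avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical sample: the sales figures from question 43 plus one gap.
raw_sales = [10_000, 12_000, None, 15_000, 100_000]

filled = fill_missing(raw_sales)      # the None becomes the median
outliers = iqr_outliers(filled)       # values beyond the box plot fences
normalized = min_max(filled)          # smallest -> 0.0, largest -> 1.0
standardized = z_scores(filled)       # mean 0, std dev 1

print("filled:", filled)
print("outliers:", outliers)
print("normalized:", normalized)
print("standardized:", standardized)
```

The IQR rule is used here instead of a Z-score cutoff because, with only five values, no point can reach a Z-score of 3, so the 100,000 sale would slip through. On larger datasets either check works.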