Myntra interview questions for Data Analyst 2024.

1. You have a dataset with missing values. How would you use a combination of Pandas and NumPy to fill missing values based on the mean of the column?
2. How would you create a new column in a Pandas DataFrame by normalizing an existing numeric column using NumPy’s np.min() and np.max()?
3. Explain how to group a Pandas DataFrame by one column and apply a NumPy function, like np.std() (standard deviation), to each group.
4. How can you convert a time-series column in a Pandas DataFrame to NumPy’s datetime64 format for faster time-based calculations?
5. How would you identify and remove outliers from a Pandas DataFrame using the Z-score method (for example SciPy’s scipy.stats.zscore, or the same calculation done directly in NumPy)?
6. How would you use NumPy’s percentile() function to calculate specific quantiles for a numeric column in a Pandas DataFrame?
7. How would you use NumPy's polyfit() function to perform linear regression on a dataset stored in a Pandas DataFrame?
8. How can you use a combination of Pandas and NumPy to transform categorical data into dummy variables (one-hot encoding)?
9. How would you use both Pandas and NumPy to split a dataset into training and testing sets based on a random seed?
10. How can you apply NumPy's vectorize() function to a Pandas Series, and how does its performance compare with true vectorized operations?
11. How would you optimize a Pandas DataFrame containing millions of rows by converting columns to NumPy arrays? Explain the benefits in terms of memory and speed.
12. How can you perform complex mathematical operations, such as matrix multiplication, using NumPy on a subset of a Pandas DataFrame?
13. Explain how you can use np.select() to perform conditional column operations in a Pandas DataFrame.
14. How can you handle time series data in Pandas and use NumPy to perform statistical analysis like rolling variance or covariance?
15. How can you use NumPy's random module (np.random) to generate random numbers and add them as a new column in a Pandas DataFrame?
16. Explain how you would use Pandas' applymap() function (DataFrame.map() in pandas 2.1 and later) together with NumPy’s vectorized operations to transform all elements in a DataFrame.
17. How can you apply mathematical transformations (e.g., square root, logarithm) from NumPy to specific columns in a Pandas DataFrame?
18. How would you efficiently perform element-wise operations between a Pandas DataFrame and a NumPy array of different dimensions?
19. How can you use NumPy functions like np.linalg.inv() or np.linalg.det() for linear algebra operations on numeric columns of a Pandas DataFrame?
20. Explain how you would compute the covariance matrix between multiple numeric columns of a DataFrame using NumPy.
21. What are the key differences between a Pandas DataFrame and a NumPy array? When would you use one over the other?
22. How can you convert a NumPy array into a Pandas DataFrame, and vice versa? Provide an example.

You can find the answers here

Hope this helps you 😊
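As a starting point for questions 1 to 3, here is a minimal sketch of one possible approach. The sample data and the column names (`category`, `price`) are made up for illustration and are not part of the original questions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "category": ["a", "a", "b", "b", "b"],
    "price": [10.0, np.nan, 30.0, 50.0, 40.0],
})

# Q1: fill missing values with the column mean (np.nanmean ignores NaN).
price_mean = np.nanmean(df["price"].to_numpy())
df["price"] = df["price"].fillna(price_mean)

# Q2: min-max normalization to the range [0, 1] with np.min() and np.max().
values = df["price"].to_numpy()
lo, hi = np.min(values), np.max(values)
df["price_norm"] = (values - lo) / (hi - lo)

# Q3: per-group standard deviation with np.std().
# np.std defaults to ddof=0 (population), while pandas' own .std()
# defaults to ddof=1 (sample). Wrapping np.std in a lambda makes sure
# NumPy's version is the one that actually runs.
group_std = df.groupby("category")["price"].agg(lambda s: np.std(s.to_numpy()))

print(df)
print(group_std)
```

In an interview it is worth mentioning the ddof difference in question 3, since the two libraries can give different numbers for the same group.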
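For questions 5 and 6, the sketch below computes the Z-score directly with NumPy instead of calling scipy.stats.zscore, so it runs without SciPy. The `amount` column, the sample values, and the cutoff of 3 are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 19 ordinary values and one extreme value.
df = pd.DataFrame({"amount": [50.0] * 10 + [52.0] * 9 + [500.0]})

values = df["amount"].to_numpy()

# Q5: Z-score computed with NumPy. values.std() uses ddof=0,
# which is also the default of scipy.stats.zscore.
z = (values - values.mean()) / values.std()

# Keep rows whose absolute Z-score is below 3 (a common convention,
# assumed here; the right cutoff depends on the data).
cleaned = df[np.abs(z) < 3]

# Q6: quartiles and the 95th percentile of the cleaned column.
q25, q50, q75, p95 = np.percentile(cleaned["amount"], [25, 50, 75, 95])

print(len(df), "rows before,", len(cleaned), "rows after")
print(q25, q50, q75, p95)
```

One caveat: on very small samples the Z-score of a single point cannot get arbitrarily large, so a fixed cutoff of 3 may never trigger. An IQR-based rule built on np.percentile() is a common alternative.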
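For questions 13 and 22, a rough sketch of np.select() and of converting between a DataFrame and a NumPy array. The `order_value` column, the thresholds, and the segment labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical order values; thresholds below are illustrative only.
df = pd.DataFrame({"order_value": [250, 1200, 4800, 90]})

# Q13: np.select checks the conditions in order and the first match wins,
# so the stricter condition goes first.
conditions = [
    df["order_value"] >= 2000,
    df["order_value"] >= 500,
]
choices = ["high", "medium"]
df["segment"] = np.select(conditions, choices, default="low")

# Q22: DataFrame -> NumPy array, then back to a DataFrame.
arr = df[["order_value"]].to_numpy()          # 2-D array, one column
restored = pd.DataFrame(arr, columns=["order_value"])

print(df)
print(arr.shape)
```

Note that to_numpy() drops the index and column labels, so they have to be passed again when rebuilding the DataFrame.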