Coforge/TCS: Data Engineering Interview Questions
1 You have a huge text file. How would you replicate a given row "n" times? Write code for this.
2 Which storage level in persist() is optimized for storage space?
3 What steps would you take to tune a slow-running Spark application?
4 How do you implement SCD Type 2 in PySpark?
5 What are the limitations of the ForEach activity in ADF?
6 How is deployment done in your project? Explain the development, testing, and other stages.
7 SQL: Question about year-on-year growth in salary
8 Python: Reversing a string, removing a delimiter, and capitalizing
9 SQL: Anti and semi joins and their use cases
10 SQL: Total sales for each product in the last 12 months
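
For question 1, one possible approach is to stream the file line by line so it never has to fit in memory. This is a minimal sketch in plain Python, not a definitive answer; the function name `replicate_row`, the 1-based `row_number` parameter, and the choice to write to a separate output file are all assumptions made for illustration.

```python
import os
import tempfile


def replicate_row(src_path, dst_path, row_number, n):
    """Copy src_path to dst_path, writing line `row_number` (1-based) n times.

    Lines are streamed one at a time, so memory use does not grow with file size.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for current, line in enumerate(src, start=1):
            if current == row_number:
                # If the target is the last line and has no trailing newline,
                # add one so the copies land on separate lines.
                text = line if line.endswith("\n") else line + "\n"
                dst.write(text * n)
            else:
                dst.write(line)


def demo_replicate_row():
    """Small usage example on a temporary three-line file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.txt")
        dst = os.path.join(tmp, "out.txt")
        with open(src, "w", encoding="utf-8") as f:
            f.write("a\nb\nc\n")
        replicate_row(src, dst, row_number=2, n=3)
        with open(dst, "r", encoding="utf-8") as f:
            return f.read()
```

In Spark the same idea would typically be expressed as a transformation over the rows rather than a loop, but the streaming version shows the core logic.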
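
For question 7, year-on-year growth can be computed by pairing each year's salary with the previous year's. The sketch below runs the SQL through Python's built-in sqlite3 module; the `salaries` table, its columns, and the sample figures are hypothetical. A self-join is used here for portability, although a window function such as LAG is a common alternative.

```python
import sqlite3

YOY_SQL = """
SELECT cur.year,
       ROUND((cur.salary - prev.salary) * 100.0 / prev.salary, 1) AS growth_pct
FROM salaries AS cur
JOIN salaries AS prev
  ON prev.emp_id = cur.emp_id
 AND prev.year = cur.year - 1
ORDER BY cur.year
"""


def yoy_salary_growth():
    """Return (year, growth percentage) rows for a hypothetical employee."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE salaries (emp_id INTEGER, year INTEGER, salary INTEGER);
        INSERT INTO salaries VALUES (1, 2021, 100), (1, 2022, 110), (1, 2023, 132);
    """)
    rows = conn.execute(YOY_SQL).fetchall()
    conn.close()
    return rows
```

The first year has no previous row to join against, so it does not appear in the result.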
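
Question 8 does not fix the order of the three operations, so the sketch below assumes one reading: remove the delimiter, reverse what is left, then capitalize the first character. The function name and the default delimiter are assumptions.

```python
def clean_reverse_capitalize(text, delimiter=","):
    """Remove every `delimiter`, reverse the result, and capitalize it."""
    without_delimiter = text.replace(delimiter, "")
    reversed_text = without_delimiter[::-1]  # slicing with step -1 reverses a string
    # str.capitalize() uppercases the first character and lowercases the rest.
    return reversed_text.capitalize()
```

For example, `clean_reverse_capitalize("a-b-c", delimiter="-")` gives `"Cba"`.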
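
For question 9, a semi join keeps the left-side rows that have at least one match on the right, and an anti join keeps the left-side rows that have none; in both cases only left-side columns are returned and rows are not duplicated. The sketch below expresses them with EXISTS and NOT EXISTS through Python's sqlite3 module. The `customers` and `orders` tables and their data are hypothetical.

```python
import sqlite3

SEMI_JOIN_SQL = """
SELECT c.name FROM customers AS c
WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
ORDER BY c.name
"""

ANTI_JOIN_SQL = """
SELECT c.name FROM customers AS c
WHERE NOT EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
ORDER BY c.name
"""


def semi_and_anti_join():
    """Return (customers with orders, customers without orders)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
    """)
    semi = [row[0] for row in conn.execute(SEMI_JOIN_SQL)]
    anti = [row[0] for row in conn.execute(ANTI_JOIN_SQL)]
    conn.close()
    return semi, anti
```

Asha has two orders but appears once in the semi join, which is the main difference from an inner join. A typical use case for the anti join is finding records with no counterpart, such as customers who never ordered.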
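
For question 10, one way to read "last 12 months" is a rolling window ending on a reference date. The sketch below takes that date as a parameter so the result is reproducible; the `sales` table, its columns, and the sample rows are hypothetical, and the date arithmetic uses SQLite's `date()` function, which differs from the syntax in other databases.

```python
import sqlite3

SALES_SQL = """
SELECT product, SUM(amount) AS total_sales
FROM sales
WHERE sale_date > date(:ref, '-12 months')
  AND sale_date <= :ref
GROUP BY product
ORDER BY product
"""


def sales_last_12_months(reference_date):
    """Total sales per product in the 12 months up to `reference_date` (YYYY-MM-DD)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (product TEXT, sale_date TEXT, amount INTEGER);
        INSERT INTO sales VALUES
            ('A', '2023-05-01', 100),  -- outside the window for 2024-06-30
            ('A', '2023-09-15', 50),
            ('A', '2024-01-10', 25),
            ('B', '2024-06-01', 40);
    """)
    rows = conn.execute(SALES_SQL, {"ref": reference_date}).fetchall()
    conn.close()
    return rows
```

In an interview it is worth confirming whether the window should be rolling, as assumed here, or aligned to calendar months.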