Coforge/TCS: Data Engineering Interview Questions

1 You have a huge text file. How would you replicate a given row n times? Write code for this.
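
One possible approach to question 1, as a minimal sketch: stream the file line by line so the whole file never has to fit in memory, and write the chosen row n times to a new file. The function name replicate_row, its parameters, and the reading of "replicate n times" as "the row appears n times in total" are assumptions made for illustration.

```python
def replicate_row(src_path, dst_path, row_number, n):
    """Copy src_path to dst_path, writing the 1-based line `row_number` n times in total.

    Hypothetical helper: the file is read lazily, one line at a time,
    so memory use stays constant even for a very large input.
    """
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for current, line in enumerate(src, start=1):
            if current == row_number:
                # A final line may lack a trailing newline; add one so the copies stay on separate lines.
                if not line.endswith("\n"):
                    line += "\n"
                dst.write(line * n)
            else:
                dst.write(line)
```

For files too large for a single machine, the same idea would normally be expressed in Spark instead; this sketch only covers the single-machine case.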

2 Which storage level in persist() is optimized for storage space?

3 What steps would you take to tune a slow-running Spark application?

4 How do you implement SCD Type 2 in PySpark?

5 What are the limitations of the ForEach activity in ADF?

6 How is deployment done in your project? Explain development, testing, etc.

7 SQL: Question about year-on-year growth in salary
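
The exact wording of question 7 is not given, so the following is only a sketch of one common form of it: per-employee year-on-year salary growth using the LAG window function. The table salaries(emp_id, year, salary) and the helper yoy_growth are hypothetical, and the query is run through Python's built-in sqlite3 module (which needs SQLite 3.25 or later for window functions) purely so the example is runnable.

```python
import sqlite3

# Hypothetical schema: salaries(emp_id, year, salary), one row per employee per year.
YOY_SQL = """
SELECT emp_id, year, salary,
       ROUND(100.0 * (salary - prev_salary) / prev_salary, 2) AS yoy_growth_pct
FROM (
    SELECT emp_id, year, salary,
           LAG(salary) OVER (PARTITION BY emp_id ORDER BY year) AS prev_salary
    FROM salaries
)
ORDER BY emp_id, year
"""

def yoy_growth(rows):
    """Load (emp_id, year, salary) tuples into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE salaries (emp_id INTEGER, year INTEGER, salary REAL)")
    conn.executemany("INSERT INTO salaries VALUES (?, ?, ?)", rows)
    result = conn.execute(YOY_SQL).fetchall()
    conn.close()
    return result
```

The first year for each employee has no previous row, so its growth is NULL (None in Python). LAG compares against the previous available row, so if years can be missing, the comparison is not strictly year-on-year and a self-join on year - 1 may be the safer choice.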

8 Python: Reversing a string, removing a delimiter, and capitalizing
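
The question does not specify the order of operations or the delimiter, so this sketch assumes one reading: remove the delimiter, reverse the remaining characters, then capitalize the first letter. The function name and the default delimiter "-" are assumptions.

```python
def clean_reverse_capitalize(text, delimiter="-"):
    """Remove every occurrence of `delimiter`, reverse the string, capitalize the result."""
    without_delimiter = text.replace(delimiter, "")
    reversed_text = without_delimiter[::-1]  # slicing with step -1 reverses the string
    # str.capitalize() upper-cases the first character and lower-cases the rest
    return reversed_text.capitalize()
```

If the interviewer wants every word capitalized instead, str.title() would replace str.capitalize(); reversed() with "".join() is an alternative to slicing.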

9 SQL: Anti and semi joins and their use cases
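
As an illustration of question 9: a semi join returns the rows of the left table that have at least one match in the right table, and an anti join returns the left rows that have none; neither adds columns from the right table. Standard SQL usually expresses them with EXISTS and NOT EXISTS. The tables customers and orders and the helper semi_and_anti are hypothetical, and sqlite3 is used only to make the sketch runnable.

```python
import sqlite3

# Semi join: customers that have at least one order (each customer appears once).
SEMI_JOIN_SQL = """
SELECT c.id, c.name FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
ORDER BY c.id
"""

# Anti join: customers that have no orders at all.
ANTI_JOIN_SQL = """
SELECT c.id, c.name FROM customers c
WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
ORDER BY c.id
"""

def semi_and_anti(customers, orders):
    """Return (semi_join_rows, anti_join_rows) for the hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", customers)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
    semi = conn.execute(SEMI_JOIN_SQL).fetchall()
    anti = conn.execute(ANTI_JOIN_SQL).fetchall()
    conn.close()
    return semi, anti
```

Typical use cases are filtering to records that exist in a reference set (semi) and finding missing or unmatched records, such as customers who never ordered (anti). In PySpark the same joins are available as the "left_semi" and "left_anti" join types of DataFrame.join.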

10 SQL: Total sales for each product in the last 12 months
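
A possible shape for question 10, as a sketch under assumed names: a hypothetical table sales(product, sale_date, amount) with ISO-formatted dates, filtered to the 12 months ending at a reference date and aggregated per product. The reference date is passed in explicitly (as_of) so the result is reproducible; date arithmetic uses SQLite's date() function, and other databases have their own syntax for this.

```python
import sqlite3

# Hypothetical schema: sales(product, sale_date, amount), sale_date stored as 'YYYY-MM-DD' text.
TOTAL_SALES_SQL = """
SELECT product, SUM(amount) AS total_sales
FROM sales
WHERE sale_date > date(:as_of, '-12 months')
  AND sale_date <= :as_of
GROUP BY product
ORDER BY product
"""

def total_sales_last_12_months(rows, as_of):
    """Sum `amount` per product for sales in the 12 months ending at `as_of` (inclusive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, sale_date TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = conn.execute(TOTAL_SALES_SQL, {"as_of": as_of}).fetchall()
    conn.close()
    return result
```

Products with no sales in the window are omitted from the result; if they must appear with a total of zero, a LEFT JOIN from a products table would be needed instead.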