I have a pyspark dataframe with 30 rows and an array of 6 elements. The version of pyspark as taken from MS Fabric is 3.4.
Let's say the array is [5,4,3,4,1,0]. I need to create a column that repeats these 6 numbers 5 times. That is, I want a column with the elements [5, 4, 3, 4, 1, 0, 5, 4, 3, 4, 1, 0, 5, 4, 3, 4, 1, 0, ...], column-bound to the initial dataframe.
The repeat function does not help because it repeats the full array as new arrays. It creates [5,4,3,4,1,0], [5,4,3,4,1,0], ...
How can I create this column?
As of pyspark 2.4.0, you can use a combination of array_repeat and flatten to obtain the desired result.