Use awk to walk a tree expressed via indentation

spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app.kubernetes.io/name: myapp
      app.kubernetes.io/instance: myapp

I would like to walk the tree with (POSIX) awk generating all paths to each key:

spec
spec:replicas
spec:strategy
spec:strategy:rollingUpdate
spec:strategy:rollingUpdate:maxSurge
spec:strategy:rollingUpdate:maxUnavailable
spec:selector
spec:selector:matchLabels
spec:selector:matchLabels:app.kubernetes.io/name
spec:selector:matchLabels:app.kubernetes.io/instance

Ideally in this depth-first search pre-order ordering.

I found this related question:

awk to insert after nth occurrence with indentation

But that solution is so far from what I'm after that I wasn't able to repurpose it with my pitiful knowledge of awk.

I've now written

match($0, /[^[:space:]]/) {
    arr[RSTART]=$1;
    for (i=1; i<RSTART; i+=1) {
        printf "%s", arr[i]
    };
    print sub(/:$/, "", arr[RSTART])
}

But the output is a bizarre

1
spec1
spec1
specstrategy1
specstrategyrollingUpdate1
specstrategyrollingUpdate1
spec1
specselector1
specselectormatchLabels1
specselectormatchLabels1

instead of what I was expecting. I think that's because sub is in-place replacement instead of outputting the new value? But I have no idea where the 1s come from.

Daweo On BEST ANSWER

I would use GNU AWK for this task in the following way. Let the content of file.txt be

spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app.kubernetes.io/name: myapp
      app.kubernetes.io/instance: myapp

then

awk 'match($0,/[[:alpha:]]/){arr[RSTART]=$1;for(i=1;i<RSTART;i+=1){printf "%s",arr[i]};print gensub(/:/,"",1,arr[RSTART])}' file.txt

gives output

spec
spec:replicas
spec:strategy
spec:strategy:rollingUpdate
spec:strategy:rollingUpdate:maxSurge
spec:strategy:rollingUpdate:maxUnavailable
spec:selector
spec:selector:matchLabels
spec:selector:matchLabels:app.kubernetes.io/name
spec:selector:matchLabels:app.kubernetes.io/instance

Explanation: I use the match string function to find the position of the first alphabetic character. If one is found, I store the first field in the array arr, using that position as the key. Then, for every index lower than that position, I print the value stored under it (not every index has to exist; for a missing one, printf "%s" prints nothing). Finally, I print the value stored under the position itself with the : removed. Disclaimer: this solution assumes that a space never appears in a path part, that : appears exactly once, at the end of each part, and that every path part starts with a letter.

(tested in GNU Awk 5.1.0)
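
Since the question asks for POSIX awk and gensub is a GNU extension, the same idea can probably be expressed with sub alone. sub edits its target in place and returns the number of substitutions it made, which would explain the 1s in the question's output, so this sketch strips the colon from a copy of the key and prints the copy. The function name yaml_paths and the variable key are made up for illustration, and the same assumptions as above apply (space-indented input, no spaces inside keys, one trailing : per key).

```shell
# POSIX awk sketch of the approach above, without gensub.
# yaml_paths is a hypothetical helper name; it reads the named files, or stdin if none are given.
yaml_paths() {
    awk '
    match($0, /[^[:space:]]/) {
        arr[RSTART] = $1             # remember the key (colon included) by its column
        for (i = 1; i < RSTART; i++)
            printf "%s", arr[i]      # columns that were never set print as empty strings
        key = $1                     # work on a copy so arr keeps the colon as separator
        sub(/:$/, "", key)           # sub() changes key in place and returns a count
        print key
    }' "$@"
}

# Small sample piped in on stdin
printf '%s\n' 'spec:' '  replicas: 1' '  strategy:' '    rollingUpdate:' | yaml_paths
```

Run as yaml_paths file.txt, this should print the same ten paths as the GNU version. Matching the first non-space character instead of the first letter also covers keys that start with a digit or punctuation.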

pmf On

A YAML processor would give you robust results:

yq -r 'paths | join(":")' input.yaml
yq '.* | .. | path | join(":")' input.yaml

Given your sample input, both output:

spec
spec:replicas
spec:strategy
spec:strategy:rollingUpdate
spec:strategy:rollingUpdate:maxSurge
spec:strategy:rollingUpdate:maxUnavailable
spec:selector
spec:selector:matchLabels
spec:selector:matchLabels:app.kubernetes.io/name
spec:selector:matchLabels:app.kubernetes.io/instance

Kaz On

In TXR Lisp:

$ txr paths.tl < data
spec
spec:replicas
spec:strategy
spec:strategy:rollingUpdate
spec:strategy:rollingUpdate:maxSurge
spec:strategy:rollingUpdate:maxUnavailable
spec:selector
spec:selector:matchLabels
spec:selector:matchLabels:app:kubernetes:io/name
spec:selector:matchLabels:app:kubernetes:io/instance

Code:

(let (path)
  (while-true-match-case (get-line)
    (`@{spaces #/(  )*/}@{key}:@nil`
     (let ((indent (trunc (len spaces) 2))
           (npath (spl "." key)))
       (set [path indent..:] npath)
       (put-line (join-with ":" path))))))