How to fix line breaks in a text CDR file?


I have the following output from a CDR file:

2021-10-27 00:06:53,203:16344:5:0:4:573192000019::6:0:4:573160001511:*999#:1:1:573160032001:1:6:732123904909775:1:1:573156068892:::::SUCCESS:PULL:2021-10-2700:06:53.203:101630:20076482:28389:2
,2:aa8c2b31-ac49-4c16-9e2f-f8a83ba63cd6
2021-10-27 00:06:57,120:16344:5:0:4:573192000019::6:0:4:573160001511:*111#:1:1:573160032002:1:6:732123907508180:1:1:573134396303:::::SUCCESS:PULL:2021-10-27 00:06:57.12:101631:26706476:11566:3,3,3192244169:d21e7dca-6dfa-43e6-8bcd-b95ebd35cdea

As the snippet shows, the second line should be part of the first, but a line break splits the transaction across two lines. The required output is the following:

2021-10-27 00:06:53,203:16344:5:0:4:573192000019::6:0:4:573160001511:*999#:1:1:573160032001:1:6:732123904909775:1:1:573156068892:::::SUCCESS:PULL:2021-10-2700:06:53.203:101630:20076482:28389:2,2:aa8c2b31-ac49-4c16-9e2f-f8a83ba63cd6
2021-10-27 00:06:57,120:16344:5:0:4:573192000019::6:0:4:573160001511:*111#:1:1:573160032002:1:6:732123907508180:1:1:573134396303:::::SUCCESS:PULL:2021-10-27 00:06:57.12:101631:26706476:11566:3,3,3192244169:d21e7dca-6dfa-43e6-8bcd-b95ebd35cdea

How can this be done in an already generated file? There are thousands of similar lines in the CDR file that need to be corrected. TIA


Ed Morton On BEST ANSWER
$ awk '{printf "%s%s", (/^[0-9]{4}(-[0-9]{2}){2}/ ? ors : ""), $0; ors=ORS} END{print ""}' file
2021-10-27 00:06:53,203:16344:5:0:4:573192000019::6:0:4:573160001511:*999#:1:1:573160032001:1:6:732123904909775:1:1:573156068892:::::SUCCESS:PULL:2021-10-2700:06:53.203:101630:20076482:28389:2,2:aa8c2b31-ac49-4c16-9e2f-f8a83ba63cd6
2021-10-27 00:06:57,120:16344:5:0:4:573192000019::6:0:4:573160001511:*111#:1:1:573160032002:1:6:732123907508180:1:1:573134396303:::::SUCCESS:PULL:2021-10-27 00:06:57.12:101631:26706476:11566:3,3,3192244169:d21e7dca-6dfa-43e6-8bcd-b95ebd35cdea
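
To apply this to a file that already exists, one option is to write the joined output to a temporary file and overwrite the original only if awk succeeds. The sketch below is an illustration, not a definitive implementation: the function name `fix_cdr_in_place` and the `.bak` backup suffix are assumptions, and the date test is spelled out as `[0-9][0-9][0-9][0-9]-...` instead of `{4}` intervals because some older awk builds do not support interval expressions.

```shell
# Hypothetical helper: join broken CDR lines and rewrite the file in place.
# Assumes every complete record starts with a date of the form YYYY-MM-DD.
fix_cdr_in_place() {
    file=$1
    # Temporary file next to the original, so it lives on the same filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1

    # Print a newline before each line that starts with a date (except the
    # first line), and no separator before continuation lines.
    if awk '
        {
            printf "%s%s", (/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ ? ors : ""), $0
            ors = ORS
        }
        END { print "" }
    ' "$file" > "$tmp"
    then
        # Keep a backup, then overwrite the content so that the original
        # permissions and owner of the file are preserved.
        cp -p "$file" "${file}.bak" && cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi

    rm -f "$tmp"
    return "$status"
}
```

Called as `fix_cdr_in_place file`, this leaves the untouched original in `file.bak`, so the result can be compared before the backup is deleted.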
tshiono On

Would you please try the following:

awk '
/^[[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}/ {       # if the record starts with a date
    if (line) print line                                # flush the line buffer
    line = ""
}
{
    line = line $0                                      # append the current record to the line buffer
}
END {
    if (line) print line                                # flush the line buffer at the end of the file
}
' file.txt

Output with the provided example:

2021-10-27 00:06:53,203:16344:5:0:4:573192000019::6:0:4:573160001511:*999#:1:1:573160032001:1:6:732123904909775:1:1:573156068892:::::SUCCESS:PULL:2021-10-2700:06:53.203:101630:20076482:28389:2,2:aa8c2b31-ac49-4c16-9e2f-f8a83ba63cd6
2021-10-27 00:06:57,120:16344:5:0:4:573192000019::6:0:4:573160001511:*111#:1:1:573160032002:1:6:732123907508180:1:1:573134396303:::::SUCCESS:PULL:2021-10-27 00:06:57.12:101631:26706476:11566:3,3,3192244169:d21e7dca-6dfa-43e6-8bcd-b95ebd35cdea
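
One way to check the result is to count the lines that do not start with a date, before and after the fix. This is a minimal sketch, assuming every complete record starts with `YYYY-MM-DD`; the function name `count_non_date_lines` is made up for the example.

```shell
# Hypothetical helper: count lines that do not begin with YYYY-MM-DD.
# Such lines are the leftover pieces of records split by a stray line break
# (blank lines are counted too).
count_non_date_lines() {
    awk '
        !/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ { n++ }
        END { print n + 0 }    # n + 0 prints 0 when no such line was seen
    ' "$@"
}
```

Running `count_non_date_lines file.txt` on the original file gives the number of broken pieces; piping the output of the awk script above into `count_non_date_lines` should report 0 if every fragment was joined.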
Cyrus On
awk -F ':' 'NF!=37 { curr=$0; getline; $0 = curr $0 }1' file

Output:

2021-10-27 00:06:53,203:16344:5:0:4:573192000019::6:0:4:573160001511:*999#:1:1:573160032001:1:6:732123904909775:1:1:573156068892:::::SUCCESS:PULL:2021-10-2700:06:53.203:101630:20076482:28389:2,2:aa8c2b31-ac49-4c16-9e2f-f8a83ba63cd6
2021-10-27 00:06:57,120:16344:5:0:4:573192000019::6:0:4:573160001511:*111#:1:1:573160032002:1:6:732123907508180:1:1:573134396303:::::SUCCESS:PULL:2021-10-27 00:06:57.12:101631:26706476:11566:3,3,3192244169:d21e7dca-6dfa-43e6-8bcd-b95ebd35cdea

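I assume that each complete row has 37 colon-separated columns and that a row is broken at most once, since only the one following line is appended.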
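
The 37-column assumption can be checked against the real file before relying on it. A small sketch (the function name `field_count_histogram` is hypothetical) that prints each distinct field count together with the number of lines that have it:

```shell
# Hypothetical helper: histogram of colon-separated field counts per line.
# Output format: "<number of fields> <number of lines>", sorted by field count.
field_count_histogram() {
    awk -F ':' '
        { count[NF]++ }                              # tally lines by field count
        END { for (n in count) print n, count[n] }   # order is unspecified here
    ' "$@" | sort -n                                 # so sort numerically
}
```

If `field_count_histogram file` shows 37 as the dominant count and only a few smaller counts, the smaller ones are likely the two halves of broken rows. Counts above 37, or rows split into more than two pieces, would mean the one-liner above needs adjusting.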

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR

potong On

This might work for you (GNU sed):

sed ':a;N;/\n....-..-.. ..:..:..,/!s/\n//;ta;P;D' file

Open a two-line window in the pattern space.

If the second line does not begin with a date and time, remove the newline (appending line 2 to line 1), then append the next line and repeat until the substitution fails.

Print and delete the first of the two lines, then repeat.
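
For readability, the same script can be spread over several lines with comments. This is only a sketch of the one-liner above and still assumes GNU sed; the function name `join_cdr_lines` is made up for the example.

```shell
# Hypothetical wrapper around the same sed script, one command per line.
join_cdr_lines() {
    sed '
        # Label to loop back to.
        :a
        # Append the next input line, giving a two-line window.
        N
        # If the appended line does not start with a date and time,
        # remove the newline so that it joins the previous line.
        /\n....-..-.. ..:..:..,/!s/\n//
        # If a newline was removed, loop and pull in another line.
        ta
        # Otherwise print the first line of the window ...
        P
        # ... delete it, and restart with the remaining line.
        D
    ' "$@"
}
```

Usage is the same as before, for example `join_cdr_lines file > file.fixed`.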