I am wondering whether there is a way to get paragraphs of text by number (the source file would be a pyx file), the way sed does with lines:
sed -n ${i}p
At the moment I'm interested in using awk with:
awk '/custom-pyx-tag\(/,/\)custom-pyx-tag/'
but I can't find documentation or examples about that.
I'm also trying to replace "\r\n" with "; " using gsub(/\r\n/,"; ") in the same awk command, but it doesn't work and I can't figure out why.
Any hint would be much appreciated, thanks.
EDIT:
This is just one example, not my exact need, but I need to know how to do it for a multipurpose project.
Suppose I have exported the ID3 tags of a huge collection of audio files and stored them in a pyx-like format, so that I end up with one big file in which this pattern repeats for each audio file in the collection:
audio-genre(
blablabla
)audio-genre
audio-artist(
bla.blabla
)audio-artist
audio-album(
bla-bla-bla
)audio-album
audio-track-num(
0x
)audio-track-num
audio-track-title(
bla.bla-bla
)audio-track-title
audio-lyrics(
blablablablabla
bla.bla.bla.bla
blah-blah-blah
blabla-blabla
)audio-lyrics
...
Now if I want to extract the artist of the 1234th audio file I can use:
awk '/audio-artist\(/,/\)audio-artist/' | sed '/audio-artist/d' | sed -n 1234p
Since the artist is a single line, sed can pick it out, but I don't know how to get an entire paragraph given its index. For example, how could I get the lyrics of the 6543rd file?
In the end it is just a question of whether there is a command equivalent to
sed -n ${num}p
but for paragraphs.
One liner:
Using awk on the file called audioartist, we consume the file as a single record by setting the record separator (RS) to "" (paragraph mode, which reads the whole file as one record as long as it contains no blank lines). We then split the whole file into an array arr, using audio-artist as the separator. We loop through arr starting at index 2, in steps of 2, until the end of the array, and strip the opening and closing brackets from each element, building another array called arts with an incrementing count as the index and the stripped artist as the value. At the end we print the element of arts selected by the passed indx variable (in this case 1234).
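A minimal sketch of the approach described above, generalized to take the tag as a parameter so that multi-line paragraphs such as the lyrics work too. The function name nth_block and its argument order are made up for illustration. It assumes the file contains no blank lines (otherwise RS="" yields several records and a block could be cut in two) and that the tag text never occurs inside a block body. It also keeps the whole file in memory, which may matter for a very large collection.

```shell
#!/bin/sh
# nth_block TAG N FILE
# Print the body of the N-th "TAG( ... )TAG" block found in FILE.
# nth_block is a hypothetical helper name, not part of awk or sed.
nth_block() {
    awk -v tag="$1" -v indx="$2" '
        BEGIN { RS = "" }   # paragraph mode: a file without blank lines is one record
        {
            gsub(/\r/, "")                # drop carriage returns from CRLF files
            n = split($0, arr, tag)       # even-numbered pieces are the block bodies
            for (i = 2; i <= n; i += 2) {
                body = arr[i]
                sub(/^\(\n?/, "", body)   # strip the opening "(" and its newline
                sub(/\n?\)$/, "", body)   # strip the closing ")" and the newline before it
                arts[++count] = body
            }
        }
        END { if (indx in arts) print arts[indx] }
    ' "$3"
}
```

With this in place, nth_block audio-artist 1234 audioartist would print the 1234th artist, and nth_block audio-lyrics 6543 audioartist would print the whole lyrics paragraph of the 6543rd file, provided the assumptions above hold.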