Sorting 2 lines at a time with sed




I would like to use sed to sort a large playlist file. Each playlist item spans 2 lines, so I need to sort on the string after tvg-name= in the first line (in the sample below, the key for the first item would be UK: Button 1 (SG)), keeping each first line together with the line that follows it (the one that begins with http).





I've seen very similar regex examples but the syntax is beyond me to reverse engineer to fit my file. Can anyone help?


#EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts
#EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts




3 Answers



If you want to do this with a tool like sed, you must either A) have faith that your fields don't contain anything malignant like tvg-id="http://..., or B) write extremely painstaking scripts.
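Before relying on option A, it may be worth checking that the file really has the shape assumed here. The following is only a sketch: check_playlist is a hypothetical helper name, and it assumes the file consists purely of alternating #EXTINF and http lines (so a leading #EXTM3U header line, if the file has one, has already been removed).

```shell
# Hypothetical helper: verify the two-line record layout assumed in this answer.
# Prints the offending line numbers and returns non-zero if the layout is violated.
check_playlist() {
    awk '
        # Odd lines must be #EXTINF lines and must not contain http anywhere.
        NR % 2 == 1 && ($0 !~ /^#EXTINF/ || $0 ~ /http/) {
            print "line " NR ": expected an #EXTINF line without http"; bad = 1
        }
        # Even lines must be the URL line.
        NR % 2 == 0 && $0 !~ /^http/ {
            print "line " NR ": expected a line starting with http"; bad = 1
        }
        END {
            if (NR % 2 == 1) { print "odd number of lines"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    ' "$1"
}
```

Something like check_playlist filename && echo "layout looks fine" would then tell you whether the crude approach is safe to try on your file.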





I would try something crude but effective, like this. First, combine two lines into one:


sed 'N;s/\n//'



Then copy the tvg-name field to the front of the line:




sed 's/\(.*tvg-name=\)\("[^"]*"\)/\2\1\2/'



Then sort:


sort



Then remove the field I'd added:


sed 's/^"[^"]*"//'



Then split the one line back into two:


sed 'h;s/http.*//;p;g;s/.*http/http/'



Putting it all together:


sed 'N;s/\n//;s/\(.*tvg-name=\)\("[^"]*"\)/\2\1\2/' filename | sort | sed 's/^"[^"]*"//;h;s/http.*//;p;g;s/.*http/http/'
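To try the combined pipeline without touching the real playlist, it can be wrapped in a function and run on the two sample records from the question, deliberately given in reverse order. This is a sketch only: sort_playlist_sed is a hypothetical name, the temporary file stands in for your playlist, and LC_ALL=C is my addition to make the sort order predictable.

```shell
# Hypothetical wrapper around the pipeline above; $1 is the playlist file.
# Steps: join each pair of lines, copy the quoted tvg-name value to the front,
# sort on that prefix, strip the prefix, split back into #EXTINF and http lines.
sort_playlist_sed() {
    sed 'N;s/\n//;s/\(.*tvg-name=\)\("[^"]*"\)/\2\1\2/' "$1" |
        LC_ALL=C sort |
        sed 's/^"[^"]*"//;h;s/http.*//;p;g;s/.*http/http/'
}

# Demo: the sample records from the question, London first.
playlist=$(mktemp)
cat > "$playlist" <<'EOF'
#EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts
#EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts
EOF
sort_playlist_sed "$playlist"
rm -f "$playlist"
```

The Button record should come out before the London record, each still followed by its own http line. The split step relies on http appearing only in the URL line, which is the "faith" mentioned at the top.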



Without better sample input and the expected output it's a guess but this might be what you're trying to do:


$ awk -F'"' -v OFS='\t' 'NF>1{key=$4; cnt=0} {print key, ++cnt, $0}' file | sort -t$'\t' -k1,1 -k2,2n -s | cut -f3-
#EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
#http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts
##EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
#http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts



or if you wanted it sorted in reverse:


$ awk -F'"' -v OFS='\t' 'NF>1{key=$4; cnt=0} {print key, ++cnt, $0}' file | sort -t$'\t' -k1,1r -k2,2n -s | cut -f3-
##EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
#http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts
#EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
#http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts



It just prefixes each line with the key value from between the quotes, followed by a line counter that restarts at 1 whenever a new key is seen, then sorts by those added fields, then removes the added fields again. It's easiest to understand if you look at how it works in stages:



1) Add the 2 prefix fields to be used for sorting:


$ awk -F'"' -v OFS='\t' 'NF>1{key=$4; cnt=0} {print key, ++cnt, $0}' file
UK: Button 1 (SG) 1 #EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
UK: Button 1 (SG) 2 #http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts
UK: London (HD) (720p) 1 ##EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
UK: London (HD) (720p) 2 #http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts



2) Sort the output using the 2 fields added at step 1:


$ awk -F'"' -v OFS='\t' 'NF>1{key=$4; cnt=0} {print key, ++cnt, $0}' file |
sort -t$'\t' -k1,1r -k2,2n -s
UK: London (HD) (720p) 1 ##EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
UK: London (HD) (720p) 2 #http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts
UK: Button 1 (SG) 1 #EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
UK: Button 1 (SG) 2 #http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts



3) Remove the 2 fields added at step 1:


$ awk -F'"' -v OFS='\t' 'NF>1{key=$4; cnt=0} {print key, ++cnt, $0}' file |
sort -t$'\t' -k1,1r -k2,2n -s |
cut -f3-
##EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
#http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts
#EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
#http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts



Note that the above will work for any number of lines in multi-line blocks and the same approach works for sorting on multiple different substrings. It's also very efficient (stores almost nothing in memory) and easy to debug/enhance/maintain later.
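As an illustration of the multi-line claim, here is a sketch that runs the same pipeline on blocks of three lines. The extra #EXTVLCOPT line is a made-up addition to the question's sample, sort_blocks is a hypothetical function name, and LC_ALL=C is added only to make the ordering predictable. It assumes every block starts with a line containing double quotes and that the tvg-name values are unique.

```shell
# Hypothetical wrapper around the decorate/sort/undecorate pipeline above.
# $1 is the playlist file; a block is a quoted #EXTINF line plus any
# number of following lines that contain no double quotes.
sort_blocks() {
    tab=$(printf '\t')
    awk -F'"' -v OFS='\t' 'NF>1 {key=$4; cnt=0} {print key, ++cnt, $0}' "$1" |
        LC_ALL=C sort -t"$tab" -k1,1 -k2,2n -s |
        cut -f3-
}

# Demo: two three-line blocks, London first.
playlist=$(mktemp)
cat > "$playlist" <<'EOF'
#EXTINF:-1 tvg-id="London.uk" tvg-name="UK: London (HD) (720p)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: London (HD) (720p)
#EXTVLCOPT:network-caching=1000
http://website.info:8080/live/xxxxxxxx/xxxxxxxx/36011.ts
#EXTINF:-1 tvg-id="Button.uk" tvg-name="UK: Button 1 (SG)" tvg-logo="" group-title="ENTERTAINMENT & LIFESTYLE UK",UK: Button 1 (SG)
#EXTVLCOPT:network-caching=1000
http://website.info:8080/live/xxxxxxxx/xxxxxxxx/112233.ts
EOF
sort_blocks "$playlist"
rm -f "$playlist"
```

The Button block should come out first, with its three lines still in their original order. If two blocks could share the same tvg-name, their lines would interleave under this scheme, so a per-block counter would have to be added as a further sort field.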



This might work for you (GNU sed & sort):


sed 'N;s/^\(.*\)\n/\1\1/;2h;2!H;$!d;x;s/.*/echo '\''&'\'' | sort -k3,3/e;s/^\(.*\)\1/\1\n/Mg' file



Convert each two-line record into a single line by duplicating the first line and removing the newline between the first and second lines. Gather the entire file up in memory, in the hold space. At the end of the file, swap to the hold space and, using GNU sed's evaluate flag on the substitution command, sort the file on the third field of the single-line records. Reformat each single-line record back into a two-line record and print the results.





Hi, neither answer seems to work. The sort does not seem to alphabetise the rows. Also, when I split the lines again, the file is only half the size.
– john Wallace
20 hours ago





@johnWallace I left an extra command in by mistake and forgot to quote the echo command properly. With your (short) sample data it now does, to my eyes, what you wanted, but I am only guessing. Perhaps you could provide more data, before and after.
– potong
17 hours ago





