Is there a way to show duplicate counts while still printing every duplicated line?
For example, input:
AAAA XXXX
AAAA YYYY
BBBB ZZZZ
Expected output:
2 AAAA XXXX
2 AAAA YYYY
1 BBBB ZZZZ
The Linux program uniq does not show the duplicated line 2 AAAA YYYY.
Linux command used:
printf 'AAAA XXXX\nAAAA YYYY\nBBBB ZZZZ' | uniq --count --check-chars 4
2 AAAA XXXX
1 BBBB ZZZZ
The -D option of uniq prints all duplicate lines, but uniq rejects it in combination with --count as meaningless:
printf 'AAAA XXXX\nAAAA YYYY\nBBBB ZZZZ' | uniq --count -D --check-chars 4
uniq: printing all duplicated lines and repeat counts is meaningless
Try 'uniq --help' for more information.
In my actual use case, XXXX, YYYY and ZZZZ are file paths, and AAAA and BBBB are the md5 hashes of the file contents. If the hashes of XXXX and YYYY are identical, I need to check both files. However, I cannot get the file path of YYYY.
You can use join to combine the uniq output with the original input.
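A minimal sketch of that idea, assuming bash, GNU coreutils, and the sample data from the question. The temporary file names and the variable `result` are only for illustration. Both inputs to join must be sorted on the key, so the sketch sorts first and pins the locale so that sort and join agree on ordering. For real md5 hashes, `--check-chars` would be 32 instead of 4.

```shell
#!/usr/bin/env bash
# Use one collation order for both sort and join.
export LC_ALL=C

# Hypothetical sample input; in practice this would be "hash path" lines.
input=$(mktemp)
printf 'AAAA XXXX\nAAAA YYYY\nBBBB ZZZZ\n' | sort > "$input"

# Count lines per 4-character key, then reorder each line to "key count"
# so the key is the first field, which is what join matches on by default.
counts=$(mktemp)
uniq --count --check-chars 4 "$input" | awk '{print $2, $1}' > "$counts"

# Join the counts back onto every original line.
# -o 1.2,0,2.2 prints: count (file 1, field 2), the join key, path (file 2, field 2).
result=$(join -o 1.2,0,2.2 "$counts" "$input")
printf '%s\n' "$result"

rm -f "$input" "$counts"
```

Note that `-o 2.2` takes only the second whitespace-separated field, so a path containing spaces would be cut short; that case needs a different delimiter or a different approach.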