Posted by: Saajha August 14, 2009
--Comparing Text Strings--
I have two text files to compare: file A and file B.

file A is nicely formatted with sections, headings etc, for better visibility

file B is a linear raw list of strings (really a bunch of machine names)

I am trying to compare file A and file B and locate the strings in each file that don't exist in the other. In other words, I want to identify the strings unique to each file.

The UNIX utility *diff* works great, and so do Windows tools like 'ExamDiff', 'CompareIt!' etc., but they only match a single occurrence of each string pair and ignore the rest.

For instance, I have

List A        List B
------        ------
abc           bcd
def           def
def           ijk
ghi           jkl

The result from these tools will be:

List A        List B
------        ------
abc           bcd
def           ijk
ghi           jkl

(Note that only one 'def' from each list was eliminated, because the matching is one-to-one.)

While the expected result is:

List A        List B
------        ------
abc           bcd
ghi           ijk
              jkl

Here both occurrences of 'def' in List A are eliminated, which is a one-to-many comparison.

Can anyone suggest a solution? A tool, or the logic for a script?
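
For illustration, the one-to-many comparison described above could be sketched in Python with set differences. This is only a rough sketch under assumptions not stated in the post: each string sits on its own line, surrounding whitespace is not significant, and matching is case-sensitive. The function name unique_strings is hypothetical. Headings in file A would show up as "unique" unless they are filtered out first.

```python
def unique_strings(lines_a, lines_b):
    """Return (only_in_a, only_in_b) as sorted lists.

    Sets collapse duplicates, so every occurrence of a string is
    eliminated as soon as it appears at least once in the other list.
    """
    # Assumption: one string per line; blank lines are skipped.
    set_a = {line.strip() for line in lines_a if line.strip()}
    set_b = {line.strip() for line in lines_b if line.strip()}
    return sorted(set_a - set_b), sorted(set_b - set_a)


# The example lists from the post.
list_a = ["abc", "def", "def", "ghi"]
list_b = ["bcd", "def", "ijk", "jkl"]

only_a, only_b = unique_strings(list_a, list_b)
print(only_a)  # ['abc', 'ghi']
print(only_b)  # ['bcd', 'ijk', 'jkl']
```

The same function accepts open file objects in place of the lists, since iterating over a text file yields its lines.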

~@~