Saajha
Posted on 08-14-09 2:13 PM
I have two text files to compare: file A and file B
file A is nicely formatted with sections, headings etc, for better visibility
file B is a linear raw list of strings (really a bunch of machine names)
I am trying to compare file A and file B, and locate the strings in each file that don't exist in the other -- in other words, identify unique strings in each file.
UNIX utility *diff* works great, and so do Windows tools like 'ExamDiff', 'CompareIt!' etc.; but they only pair off a single occurrence of each matching string and ignore the rest.
For instance, I have
List A    List B
------    ------
abc       bcd
def       def
def       ijk
ghi       jkl
The result will be:
List A    List B
------    ------
abc       bcd
def       ijk
ghi       jkl
(Note that the eliminated strings were the ones that followed One-to-One matching)
While the expected result is:
List A    List B
------    ------
abc       bcd
ghi       ijk
          jkl
With both occurrences of 'def' being eliminated, i.e. a one-to-many comparison.
Can anyone suggest a solution? A tool or some script logic?
~@~
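One possible way to get the one-to-many result described above is to treat one list as a set of fixed, whole-line patterns and drop every matching line from the other list. This is only a minimal sketch: it assumes each file holds one string per line, and the file names (a.txt, b.txt, only_a.txt, only_b.txt) are made up for illustration.

```shell
# Sample data from the question (file names are hypothetical).
printf '%s\n' abc def def ghi > a.txt
printf '%s\n' bcd def ijk jkl > b.txt

# -F: fixed strings, -x: match whole lines only,
# -v: invert the match, -f FILE: read the patterns from FILE.
# Every line of a.txt that equals some line of b.txt is dropped,
# no matter how many times it occurs.
grep -Fxv -f b.txt a.txt > only_a.txt
grep -Fxv -f a.txt b.txt > only_b.txt

cat only_a.txt   # abc, ghi
cat only_b.txt   # bcd, ijk, jkl
```

Neither file needs to be sorted for this, and the original line order is kept. A nicely formatted file A with headings would first have to be reduced to bare strings, which this sketch does not attempt.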
techGuy
Posted on 08-14-09 4:37 PM
try http://www.scootersoftware.com/download.php
Last edited: 14-Aug-09 04:42 PM
Saajha
Posted on 08-14-09 4:59 PM
Just installed and tried it -- still the same issue ..it does the comparison, but only for a single occurrence. I haven't had a chance to look at the options yet though. Thanks! ~@~
parajn
Posted on 08-14-09 7:07 PM
Download the 30-day free trial of Araxis Merge. This tool works great. http://www.araxis.com/merge/
Saajha
Posted on 08-18-09 12:19 PM
Have you tried this tool for a similar purpose? Will take a look. Thanks! BTW, I was able to get it done in Excel (underrated, but worked great) by playing around with logical comparison formulas. ~@~
gidilat
Posted on 08-18-09 1:34 PM
Create two text files: x.txt with
abc
def
def
ghi
and y.txt with
bcd
def
ijk
jkl
Issue the following commands (at a cygwin prompt):
# sort and copy unique elements of x to x1
sort -u x.txt > x1.txt
# sort and copy unique elements of y to y1
sort -u y.txt > y1.txt
# copy lines that appear in both x1 and y1 to z
comm -1 -2 x1.txt y1.txt > z.txt
# output lines that appear in x1 only
comm -2 -3 x1.txt z.txt
# output lines that appear in y1 only
comm -2 -3 y1.txt z.txt
Works for this limited dataset. Give it a try.
Saajha
Posted on 08-18-09 4:37 PM
Thanks @gidilat for the advice, and welcome to the forum if you are a new member :-)
Just looked through the commands and spotted a minor gotcha. sort -u sorts the list and keeps only one copy of each element, so multiple occurrences within a file are collapsed. Once that's done, it's really a one-to-one comparison, no? In the above scenario, ALL occurrences of def in list A were eliminated, as they matched def in list B. But if I had, say, abc listed twice in list A and not at all in list B, sort -u would remove the second instance of abc in list A, correct? But the goal is to have every single occurrence of abc in list A if it's not present in list B. Thanks again for chiming in. ~@~
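The scenario described here can be checked with a small sketch: abc appears twice in list A and never in list B, so both copies should survive, while every def is dropped. The two-file awk idiom below is just one alternative that avoids sort -u; it assumes one string per line and a non-empty b.txt, and all file names are hypothetical.

```shell
# abc is duplicated in A and absent from B (hypothetical file names).
printf '%s\n' abc abc def def ghi > a.txt
printf '%s\n' bcd def ijk jkl > b.txt

# What sort -u does to list A: the second abc is already gone.
sort -u a.txt > a_unique.txt      # abc, def, ghi

# While reading the first file (b.txt), remember each line.
# While reading the second file (a.txt), print lines never seen in b.txt.
awk 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' b.txt a.txt > only_a.txt

cat only_a.txt                    # abc, abc, ghi
```

Swapping the two file arguments gives the strings found only in list B.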
gidilat
Posted on 08-18-09 7:02 PM
You are changing the goal from "in other words, identify unique strings in each file" to
"goal is to have every single occurrence of abc in list A if it's not present in list B". A problem needs to be defined properly before it can be solved.
Saajha
Posted on 08-19-09 11:08 AM
Hmm, so I said: "...the goal is to have every single occurrence of abc in list A if it's not present in list B". Doesn't that leave each list's strings/elements unique to those of the other list (and hence "...unique strings in each file")? When we compare two files and speak about 'uniqueness', I'd think the reference would be to uniqueness with respect to each other, and not within oneself. Regardless, I'll buy that 'unique strings in each file with respect to each other' or something similar would've made it a little more descriptive. Sorry -- Thanks!
So, just for the heck of it, I tried comm without first sorting and isolating the unique elements. It works just as well as the other alternatives (except Excel) that I tried in the past: it does the comparison, but only gets rid of a single occurrence of the match. ~@~
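The behaviour reported here is consistent with how comm works: it walks two sorted inputs in step and pairs equal lines one-to-one, so a duplicate with no partner in the other file is reported as unique. The sketch below reproduces that with the thread's sample data, then shows one possible workaround that uses comm only to find the shared strings and filters the original list afterwards. File names are hypothetical and one string per line is assumed.

```shell
# Sample data from the thread; both lists happen to be sorted already.
printf '%s\n' abc def def ghi > a.txt
printf '%s\n' bcd def ijk jkl > b.txt

# -2 -3 suppresses columns 2 and 3, leaving lines unique to a.txt.
# One def pairs with the def in b.txt; the other def is left over.
comm -2 -3 a.txt b.txt > comm_only_a.txt      # abc, def, ghi

# Workaround: deduplicate to find the shared strings, then remove
# every occurrence of those strings from the original list.
sort -u a.txt > a_u.txt
sort -u b.txt > b_u.txt
comm -1 -2 a_u.txt b_u.txt > common.txt       # def
grep -Fxv -f common.txt a.txt > only_a.txt    # abc, ghi
```

The last step reads a.txt in its original order, so duplicates of strings that are absent from list B are all kept.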