--Comparing Text Strings--

Saajha

Posted on 08-14-09 2:13 PM

I have two text files to compare: file A and file B.

File A is nicely formatted with sections, headings, etc., for better visibility.

File B is a linear raw list of strings (really a bunch of machine names).

I am trying to compare file A and file B and locate the strings in each file that don't exist in the other; in other words, identify unique strings in each file.

The UNIX utility *diff* works great, and so do Windows tools like 'ExamDiff', 'CompareIt!', etc., but they only match a single occurrence of each string pair and ignore the rest.

For instance, I have

List A       List B
------       ------
abc          bcd
def          def
def          ijk
ghi          jkl

The result will be:

List A       List B
------       ------
abc          bcd
def          ijk
ghi          jkl

(Note that only one 'def' was eliminated from list A, because the strings were matched one-to-one.)

While the expected result is:

List A       List B
------       ------
abc          bcd
ghi          ijk
             jkl

Here both occurrences of 'def' are eliminated, using a one-to-many comparison.

Can anyone suggest a solution? A tool or some script logic?

~@~
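One way the one-to-many comparison asked for above might be sketched in shell is with grep, treating every line of one file as a whole-line fixed-string pattern and printing the lines of the other file that match none of them. This is only a sketch under assumptions: the file names a.txt and b.txt, the scratch directory, and the variable names are made up for illustration, and headings or blank lines in a formatted file A would be compared like any other line.

```shell
# Sample data from the post above, written to a scratch directory.
# (a.txt, b.txt and workdir are hypothetical names for this sketch.)
workdir=$(mktemp -d)
printf '%s\n' abc def def ghi > "$workdir/a.txt"
printf '%s\n' bcd def ijk jkl > "$workdir/b.txt"

# -f FILE : read the patterns from FILE, one per line
# -F      : treat patterns as fixed strings, not regular expressions
# -x      : a pattern must match the whole line
# -v      : print the lines that match none of the patterns
only_in_a=$(grep -vxFf "$workdir/b.txt" "$workdir/a.txt")
only_in_b=$(grep -vxFf "$workdir/a.txt" "$workdir/b.txt")

printf 'Only in A:\n%s\n' "$only_in_a"
printf 'Only in B:\n%s\n' "$only_in_b"
```

Because nothing is sorted or de-duplicated, every occurrence of a string survives unless that string appears somewhere in the other file, so both 'def' lines drop out of list A here.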

techGuy

Posted on 08-14-09 4:37 PM

try http://www.scootersoftware.com/download.php

Last edited: 14-Aug-09 04:42 PM

Saajha

Posted on 08-14-09 4:59 PM

Just installed and tried it. Still the same issue: it does the comparison, but only for a single occurrence. I haven't had a chance to look at the options yet, though.

Thanks!

~@~

parajn

Posted on 08-14-09 7:07 PM

Download the 30-day free trial of Araxis Merge. This tool works great.
http://www.araxis.com/merge/

Saajha

Posted on 08-18-09 12:19 PM

Have you tried this tool for a similar purpose?
I'll take a look. Thanks!

BTW, I was able to get it done in Excel (underrated, but it worked great) by playing around with logical comparison formulas.

~@~

gidilat

Posted on 08-18-09 1:34 PM

Create two text files.
x.txt with
abc
def
def
ghi

and y.txt with
bcd
def
ijk
jkl

Issue the following commands (at a Cygwin prompt):
# sort x and copy its unique elements to x1
sort -u x.txt > x1.txt
# sort y and copy its unique elements to y1
sort -u y.txt > y1.txt

# copy the lines that appear in both x1 and y1 to z
comm -1 -2 x1.txt y1.txt > z.txt

# output the lines that appear in x1 only
comm -2 -3 x1.txt z.txt

# output the lines that appear in y1 only
comm -2 -3 y1.txt z.txt

Works for this limited dataset.
Give it a try.

Saajha

Posted on 08-18-09 4:37 PM

Thanks @gidilat, for the advice, and welcome to the forum - if you are a new member :-)

Just looked through the commands and spotted a minor gotcha.

When you run sort -u to sort the list and keep only its unique elements, you actually get rid of the multiple occurrences of each element. Once that's done, it's really a one-to-one comparison, no?

In the above scenario, ALL occurrences of def on list A were eliminated, as they matched def on list B.
But if I had, say, abc listed twice in list A and not at all in list B, sort -u would remove the second instance of abc in list A, correct? But the goal is to have every single occurrence of abc in list A if it's not present in list B.

Thanks again for chiming in.
~@~
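The gotcha raised above can be checked with a small experiment. The sketch below assumes a made-up data set (abc twice in list A, never in list B) and hypothetical file and variable names; it runs the de-duplicating approach and, for contrast, a grep filter that leaves duplicates alone.

```shell
# Hypothetical test data: abc appears twice in A and never in B.
workdir=$(mktemp -d)
printf '%s\n' abc abc def > "$workdir/a.txt"
printf '%s\n' def > "$workdir/b.txt"

# Approach 1: de-duplicate with sort -u first, then compare with comm.
# LC_ALL=C keeps the sort order and comm's expected order consistent.
LC_ALL=C sort -u "$workdir/a.txt" > "$workdir/a1.txt"
LC_ALL=C sort -u "$workdir/b.txt" > "$workdir/b1.txt"
dedup_result=$(LC_ALL=C comm -23 "$workdir/a1.txt" "$workdir/b1.txt")

# Approach 2: drop from A every line that appears anywhere in B,
# without sorting or de-duplicating A.
keep_result=$(grep -vxFf "$workdir/b.txt" "$workdir/a.txt")

printf 'sort -u + comm:\n%s\n' "$dedup_result"
printf 'grep -vxFf:\n%s\n' "$keep_result"
```

With this data the first approach reports abc once, while the second reports it twice, which is the difference the post is pointing at.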

gidilat

Posted on 08-18-09 7:02 PM

You are changing the goal from
"in other words, identify unique strings in each file"
to
"goal is to have every single occurrence of abc in list A if it's not present in list B"

A problem needs to be defined properly before it can be solved.

Saajha

Posted on 08-19-09 11:08 AM

Hmm, so - I said:
"...the goal is to have every single occurrence of abc in list A if it's not present in list B"

Doesn't that leave each list's strings/elements unique to those of the other list (and hence "...unique strings in each file")?

When we compare two files and speak about 'uniqueness', I'd think the reference would be toward uniqueness with respect to each other, and not within oneself.

Regardless, I'll buy the fact that 'unique strings in each file with respect to each other' or something similar would've made it a little more descriptive. Sorry -- Thanks!

So, just for the heck of it, I tried comm without the step that isolates the unique elements. It works just as well as any of the other alternatives (except Excel) that I tried in the past: it does the comparison, but only gets rid of a single occurrence of the match.

~@~
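The one-to-one behaviour described in this last post can be reproduced on the sample lists from the first post. This is a sketch only: the file names and variables are hypothetical, and plain sort (without -u) is applied first because comm expects sorted input, which may differ from how the command was actually run in the thread.

```shell
# Sample lists from the first post, in a scratch directory (hypothetical names).
workdir=$(mktemp -d)
printf '%s\n' abc def def ghi > "$workdir/a.txt"
printf '%s\n' bcd def ijk jkl > "$workdir/b.txt"

# Sort but keep duplicates; comm requires sorted input.
LC_ALL=C sort "$workdir/a.txt" > "$workdir/a.sorted"
LC_ALL=C sort "$workdir/b.txt" > "$workdir/b.sorted"

# comm pairs equal lines one-to-one, so only one 'def' in A is cancelled out.
left_over_a=$(LC_ALL=C comm -23 "$workdir/a.sorted" "$workdir/b.sorted")
left_over_b=$(LC_ALL=C comm -13 "$workdir/a.sorted" "$workdir/b.sorted")

printf 'Left over in A:\n%s\n' "$left_over_a"
printf 'Left over in B:\n%s\n' "$left_over_b"
```

The second 'def' stays in list A's output, matching the one-to-one result table in the original question rather than the expected one.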
