public inbox for gentoo-user@lists.gentoo.org
From: Alan McKinnon <alan@linuxholdings.co.za>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user]  Re: [OT]  Question about duplicate lines in file
Date: Mon, 12 Jun 2006 20:39:20 +0200
Message-ID: <200606122039.20722.alan@linuxholdings.co.za>
In-Reply-To: <86y7w2qmwh.fsf@poke.chrekh.se>

On Monday 12 June 2006 19:55, Christer Ekholm wrote:
> Teresa and Dale <teendale@vista-express.com> writes:
> > Thanks, read the man page, it was short so it didn't take long. 
> > I tried this:
> >
> > uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort
> >
> > It doesn't look like it did anything but copy the same thing
> > over. There are only 2 lines missing.  Do spaces count?  Some
> > put in a lot of spaces between the localhost and the web address.
> >  Maybe that has an effect?
>
> The problem with uniq is that it (according to the manpage),
>
>   "Discard all but one of successive identical lines"
>
> You need to have a sorted file for uniq to do what you want, or
> sort it with the -u  option
>
>   sort -u hosts > hostsort
>
> If you don't want to ruin your original order you have to do
> something else. This is one way of doing it with perl.
>
>   perl -ne 'print unless exists $h{$_}; $h{$_} = 1' hosts > hostsort


Almost there :-)

If /etc/hosts has these lines:
127.0.0.1 localhost
127.0.0.1  localhost
uniq will see these as different even though they are actually the 
same entry. So he needs something like tr to squash spaces. This will 
do it (as root):

cat /etc/hosts | tr -s ' ' | sort | uniq -i > /etc/hosts.new

If the new file is OK, use it to overwrite /etc/hosts

Explanation so Dale knows what I'm asking him to do:
cat sends the file to tr
tr -s ' ' finds every run of two or more consecutive spaces and
replaces it with one space
sort sorts the lines
uniq finds consecutive lines that are the same and throws away the
extra ones. The -i is there just in case two entries differ in case
only (as FQDNs are, strictly speaking, case insensitive). As mentioned
by others, uniq only matches consecutive dupes, so the list must be
sorted first
"> /etc/hosts.new" writes the final output to the named disk file
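
For anyone who wants to try the pipeline before touching the real file, here is a minimal sketch that runs it on a throwaway sample instead of /etc/hosts. Two details differ from the command as posted and are assumptions on my part, not something the thread settled: tabs are turned into spaces first (tr -s ' ' on its own only squeezes spaces), and sort gets -f so that entries differing only in case end up next to each other for uniq -i. The paths under $workdir and the sample host names are hypothetical.

```shell
#!/bin/sh
# Sketch: normalise whitespace, then drop duplicate hosts entries.
# Runs on a temporary sample file, never on /etc/hosts itself.

# Fix the locale so the sort order is the same everywhere.
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
hosts_in="$workdir/hosts"        # hypothetical sample input
hosts_out="$workdir/hosts.new"   # deduplicated result

# Sample data: one entry spelled four ways (single space, several
# spaces, a tab, different letter case) plus one distinct entry.
printf '127.0.0.1 localhost\n'      >  "$hosts_in"
printf '127.0.0.1    localhost\n'   >> "$hosts_in"
printf '127.0.0.1\tlocalhost\n'     >> "$hosts_in"
printf '127.0.0.1 LocalHost\n'      >> "$hosts_in"
printf '127.0.0.1 ads.example.com\n' >> "$hosts_in"

# 1. turn tabs into spaces
# 2. squeeze runs of spaces down to a single space
# 3. sort, ignoring case, so duplicates become adjacent
# 4. keep one line from each case-insensitive group
tr '\t' ' ' < "$hosts_in" | tr -s ' ' | sort -f | uniq -i > "$hosts_out"

cat "$hosts_out"
```

Which spelling of a case-only duplicate survives depends on the sort order, so don't rely on that. Comparing the input and output with diff before copying anything over the original is a cheap sanity check.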

Cheers,
alan

p.s. Those 15,000 entries in your hosts file are, um, a lot :-)


-- 
If only me, you and dead people understand hex, 
how many people understand hex?

Alan McKinnon
alan at linuxholdings dot co dot za
+27 82, double three seven, one nine three five
-- 
gentoo-user@gentoo.org mailing list



Thread overview: 12+ messages
2006-06-12 16:42 [gentoo-user] [OT] Question about duplicate lines in file Teresa and Dale
2006-06-12 16:54 ` Raymond Lewis Rebbeck
2006-06-12 17:19   ` Teresa and Dale
2006-06-12 17:32     ` Matthew Cline
2006-06-12 17:37     ` Raymond Lewis Rebbeck
2006-06-12 17:37     ` Neil Bothwick
2006-06-12 17:45     ` Mike Williams
2006-06-12 17:55     ` [gentoo-user] " Christer Ekholm
2006-06-12 18:39       ` Alan McKinnon [this message]
2006-06-12 19:15         ` Neil Bothwick
2006-06-12 22:52           ` Teresa and Dale
2006-06-12 23:23             ` Neil Bothwick
