From: Alan McKinnon
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: [OT] Question about duplicate lines in file
Date: Mon, 12 Jun 2006 20:39:20 +0200
Message-Id: <200606122039.20722.alan@linuxholdings.co.za>
In-Reply-To: <86y7w2qmwh.fsf@poke.chrekh.se>
References: <448D9974.9030000@vista-express.com> <448DA232.4080809@vista-express.com> <86y7w2qmwh.fsf@poke.chrekh.se>

On Monday 12 June 2006 19:55, Christer Ekholm wrote:
> Teresa and Dale writes:
> > Thanks, read the man page, it was short so it didn't take long.
> > I tried this:
> >
> > uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort
> >
> > It doesn't look like it did anything but copy the same thing
> > over. There are only 2 lines missing. Does spaces count?
> > Some put in a lot of spaces between the localhost and the web
> > address. Maybe that has a affect??
>
> The problem with uniq is that it (according to the manpage),
>
>   "Discard all but one of successive identical lines"
>
> You need to have a sorted file for uniq to do what you want, or
> sort it with the -u option
>
>   sort -u hosts > hostsort
>
> If you don't want to ruin your original order you have to do
> something else. This is one way of doing it with perl.
>
>   perl -ne 'print unless exists $h{$_}; $h{$_} = 1' hosts > hostsort

Almost there :-)

If /etc/hosts has these lines:

127.0.0.1      localhost
127.0.0.1 localhost

uniq will see these as different even though they are actually the
same entry. So he needs something like tr to squash spaces. This
will do it (as root):

cat /etc/hosts | tr -s ' ' | sort -f | uniq -i > /etc/hosts.new

If the new file is OK, use it to overwrite /etc/hosts.

Explanation so Dale knows what I'm asking him to do:

cat      sends the file to tr
tr       finds all cases of two or more consecutive spaces and
         replaces them with one space
sort     does a sort. The -f makes it ignore case, so that entries
         differing only in case end up next to each other
uniq     finds consecutive lines that are the same and throws away
         the extra ones. The -i is there just in case two entries
         differ in case only (as FQDNs are, strictly speaking, case
         insensitive). As mentioned by others, uniq only matches
         consecutive dupes, so the list must be sorted first
> /etc/hosts.new
         writes the final output to the named disk file

Cheers,
alan

p.s. Those 15,000 entries in your hosts file are, um, a lot :-)

--
If only me, you and dead people understand hex, how many people
understand hex?

Alan McKinnon
alan at linuxholdings dot co dot za
+27 82, double three seven, one nine three five
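
For anyone who wants both fixes at once (whitespace squashed and the
original line order kept), here is a minimal sketch in shell. It is not
from the thread: the function name dedupe_hosts is made up for
illustration, and it assumes a POSIX awk is available. Unlike tr -s ' ',
it also treats tabs as whitespace, which is an assumption on my part
about what the hosts file may contain. Lines are compared
case-insensitively and the first copy of each is kept.

```shell
#!/bin/sh
# dedupe_hosts FILE: hypothetical helper, not part of any package.
# Prints FILE with duplicate lines dropped, keeping the first occurrence
# of each so the original order survives. Two lines count as duplicates
# when they match after whitespace is squashed and case is ignored.
dedupe_hosts() {
    awk '{
        gsub(/[ \t]+/, " ")      # squash runs of spaces and tabs to one space
        key = tolower($0)        # compare without regard to case
        if (!(key in seen)) {    # first time this entry shows up?
            seen[key] = 1
            print                # print the squashed line, original case
        }
    }' "$1"
}

# Example (as root): write to a new file and check it before using it.
# dedupe_hosts /etc/hosts > /etc/hosts.new
```

Because nothing is sorted, the entries come out in the order they first
appeared, which the sort | uniq pipeline cannot promise.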