From: Raymond Lewis Rebbeck
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] [OT] Question about duplicate lines in file
Date: Tue, 13 Jun 2006 03:07:31 +0930
In-Reply-To: <448DA232.4080809@vista-express.com>
Message-Id: <200606130307.32051.dystopianray@gmail.com>

On Tuesday, 13 June 2006 2:49, Teresa and Dale wrote:
> Raymond Lewis Rebbeck wrote:
> >On Tuesday, 13 June 2006 2:12, Teresa and Dale wrote:
> >>Hi folks,
> >>
> >>I have batched a bunch of servers in my hosts file to block, for ads and
> >>all that crap. I got them from several different places, some I have
> >>found too, and am sure there are dups in there, same server but pasted
> >>from several sources. I am not a programer at all and don't even really
> >>know what to search for. I would like to remove the duplicate entries
> >>and then put them in alphabetical order if I could. I would gladly then
> >>make this available if someone wanted to host it. I don't have a place
> >>to host it.
> >>
> >>Oh, there is 15,000 entries in my hosts file. O_O
> >>
> >>Could someone tell me how this is done? May even learn something here.
> >>If I can do this, I'm sure I will.
> >>
> >>Thanks.
> >>
> >>Dale
> >>
> >>:-) :-)
> >
> >'uniq' and 'sort' should do what you're after, check out the man pages.
>
> Thanks, read the man page, it was short so it didn't take long. I tried
> this:
>
> uniq -u /home/dale/Desktop/hosts /home/dale/Desktop/hostsort
>
> It doesn't look like it did anything but copy the same thing over.
> There are only 2 lines missing. Does spaces count? Some put in a lot
> of spaces between the localhost and the web address. Maybe that has a
> affect??
>
> Thanks for the help. I had never seen that command before. I had heard
> of sort, never used it though. I do have those on my desktop. I'm
> playing with copies instead of my real hosts file.
>
> Thanks again.
>
> Dale
>
> :-) :-)

Yes, the spaces matter: uniq compares whole lines, so two entries that
differ only in spacing count as different lines. You could use 'tr' to
squeeze each run of repeated spaces into a single space.
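$ tr -s ' ' < filename

That should do it. Then you can pipe it through sort and then uniq, in
that order: uniq only drops duplicates that sit on adjacent lines, so the
sort has to come first. Also leave off the -u you tried, since 'uniq -u'
prints only the lines that are never repeated instead of keeping one copy
of each. After that you can do whatever else you want with it.

-- 
Raymond Lewis Rebbeck
-- 
gentoo-user@gentoo.org mailing list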
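
For what it's worth, the tr, sort and uniq steps mentioned above can be
strung together roughly like this. It is only a sketch: dedupe_hosts and
the sample entries are made up for illustration, and turning tabs into
spaces is an assumption about how the pasted hosts entries are laid out.

```shell
# Hypothetical helper (not from the thread): normalise spacing, sort so that
# duplicates end up on adjacent lines, then let uniq keep one copy of each.
dedupe_hosts() {
    tr '\t' ' ' < "$1" |   # assumption: some entries use tabs, make them spaces
        tr -s ' ' |        # squeeze each run of spaces into a single space
        sort |             # uniq only removes duplicates on adjacent lines
        uniq               # no -u here: keep one copy of every line
}

# Small made-up sample standing in for a copy of the real hosts file.
sample=$(mktemp)
printf '127.0.0.1    tracker.example.net\n' >  "$sample"
printf '127.0.0.1 ads.example.com\n'        >> "$sample"
printf '127.0.0.1\ttracker.example.net\n'   >> "$sample"
printf '127.0.0.1 ads.example.com\n'        >> "$sample"

dedupe_hosts "$sample"
rm -f "$sample"
```

On the sample, the four lines should collapse to two, with the ads entry
first. 'sort -u' would do the sort and uniq steps in one go. As before,
run it on a copy and redirect the output to a new file, not over the
real hosts file.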