From: Etaoin Shrdlu
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] [OT] - command line read *.csv & create new file
Date: Sun, 22 Feb 2009 21:15:31 +0100
Message-Id: <200902222115.31620.shrdlu@unlimitedmail.org>

On Sunday 22 February 2009, 20:06, Mark Knecht wrote:

> Hi,
>    Very off topic other than I'd do this on my Gentoo box prior to
> using R on my Gentoo box. Please ignore if not of interest.
>
>    I've got a really big data file in essentially a *.csv format.
> (comma delimited) I need to scan this file and create a new output
> file. I'm wondering if there is a reasonably easy command line way of
> doing this using something like sed or awk which I know nothing about.
> Thanks in advance.
>
>    The basic idea goes something like this:
>
> 1) The input file might look this the following where some of it is
> attributes (shown as letters) and other parts are results. (shown as
> numbers)
>
> A,B,C,D,1
> E,F,G,H,2
> I,J,K,L,3
> M,N,O,P,4
> Q,R,S,T,5
> U,V,W,X,6

Are the results always in the last field, and only a single field? Is the
total number of fields per line always fixed?

> 2) From the above data input file I want to take the attributes from a
> few preceeding lines (say 3 in this example) and write them to the
> output file along with the result on the last of the 3 lines. The
> output file might look like this:
>
> A,B,C,D,E,F,G,H,I,J,K,L,3
> E,F,G,H,I,J,K,L,M,N,O,P,4
> I,J,K,L,M,N,O,P,Q,R,S,T,5
> M,N,O,P,Q,R,S,T,U,V,W,X,6

Is the number of lines you pick for the operation always 3 or can it vary?
And, once you choose a number n of lines, should the whole file be
processed concatenating n lines at a time, and the resulting single line
be ended with the result of the nth line? In other words, does the
following hold for the output format:

...

With answers to the above questions, it's probably possible to hack
together a solution.
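
In the meantime, here is a rough sketch of one possible reading of the
requirements, written in Python rather than sed or awk. It rests on
assumptions that the questions above have not yet settled: the result is
always the single last field, no field contains an embedded comma, and the
window of n lines (3 by default) slides forward one line at a time, as the
sample output suggests. The function name sliding_join is made up for
illustration.

```python
from collections import deque


def sliding_join(lines, n=3):
    """Yield one output line for each window of n consecutive input lines.

    Assumptions (not confirmed by the original poster):
      - the result is the single last comma-separated field of each line
      - fields never contain embedded commas (plain split, no CSV quoting)
      - the window advances one line at a time
    """
    window = deque(maxlen=n)  # attribute fields of the most recent n lines
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        *attrs, result = line.split(",")
        window.append(attrs)
        if len(window) == n:
            # Flatten the attributes of all n lines, then append the
            # result taken from the last line of the window.
            joined = [field for row in window for field in row]
            yield ",".join(joined + [result])
```

To run it over a file, iterate over the open file object and print each
item that sliding_join yields. Because only n lines are held in memory at
any time, a really big input file should not be a problem. If the window
is meant to advance n lines at a time instead, the loop would need to
clear the window after each output line.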