From: Mark Knecht
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] [OT] - command line read *.csv & create new file
Date: Sun, 22 Feb 2009 17:54:03 -0800
Message-ID: <5bdc1c8b0902221754v30776ee0r3840ad1cb9a4c030@mail.gmail.com>
In-Reply-To: <20090223005753.GA12341@princeton.edu>
References: <5bdc1c8b0902221106h71a8783y698aa209ace59a6@mail.gmail.com>
 <20090222205938.GA455@princeton.edu>
 <5bdc1c8b0902221515g3b932654k47568031f45d76d8@mail.gmail.com>
 <20090223005753.GA12341@princeton.edu>

On Sun, Feb 22, 2009 at 4:57 PM, Willie Wong wrote:
> On Sun, Feb 22, 2009 at 03:15:09PM -0800, Penguin Lover Mark Knecht squawked:
>> 1) My actual input data starts with two fields, which are date & time. For
>> lines 2 & 3 I need to exclude the 2nd & 3rd date & time from the output
>> corresponding to line 1, so these 3 lines:
>>
>> Date1,Time1,A,B,C,D,0
>> Date2,Time2,E,F,G,H,1
>> Date3,Time3,I,J,K,L,2
>>
>> should generate
>>
>> Date1,Time1,A,B,C,D,E,F,G,H,,I,J,K,L,2
>>
>> Essentially Date & Time from line 1, results from line 3.
>>
>> 2) The second is that possibly I don't need attribute G in my output
>> file. I'm thinking that possibly a 3rd sed script that counts a
>> certain number of commas and then doesn't copy up through the next
>> comma? That's messy in the sense that I probably need to drop 10-15
>> columns out as my real data is maybe 100 fields wide so I'd have 10-15
>> additional scripts which is too much of a hack to be maintainable.
>> Anyway, I appreciate the ideas. What you sent worked great.
>>
>
> For both of these cases, since you are dropping columns and not
> re-organizing, you'd have a much easier time just piping the command
> through "cut". Try 'man cut' (it is only a few hundred words) for
> usage. But with the sample you gave me, you just need to post process
> with
>
> .... | cut -d , -f 1-6,9,10,12,15-
>
> and the Date2, Time2, G, Date3, Time3 columns will be dropped.

Thanks. I'll investigate that tomorrow.
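
For reference, a minimal sketch of what that cut call does. The joined
line below is a hypothetical stand-in: its layout (three input rows
concatenated, with the trailing field of rows 1 and 2 already dropped)
is an assumption inferred from the field numbers in the command, not
something shown in the thread.

```shell
# Hypothetical joined line; the layout is an assumption inferred from
# the field list given to cut, not actual output from the sed scripts.
joined='Date1,Time1,A,B,C,D,Date2,Time2,E,F,G,H,Date3,Time3,I,J,K,L,2'

# -d ,  : fields are separated by commas
# -f ... : keep fields 1-6, 9, 10, 12 and everything from 15 onward,
#          which drops Date2, Time2, G, Date3 and Time3
trimmed=$(printf '%s\n' "$joined" | cut -d , -f 1-6,9,10,12,15-)

printf '%s\n' "$trimmed"
```

With around 100 fields and only 10-15 to drop, listing the fields to
discard may be shorter than listing the ones to keep. GNU cut offers
--complement for that (for the sample above, --complement -f 7,8,11,13,14
should select the same columns), but it is a GNU extension and may not
be available in other cut implementations.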
>
> As to your problem with the first two lines being mangled: I suspect
> that the first two lines were formatted differently? Maybe stray
> control characters got into your file or maybe there are leading
> spaces? It's bizarre for both Etaoin's and my scripts to
> coincidentally mess up the same lines.
>
> (Incidentally, where did you get the csv files from? When I worked in
> a physics lab and collected data, I found that a lot of times the
> processing of data using basic command-line tools like sed, bash,
> perl, and bc can be done a lot more quickly if the initial datasets were
> formatted in a sensible fashion. Of course there are times when such
> luxury cannot be afforded.)

They are primarily coming from TradeStation. The data that I'm working
with is stock pricing data along with technical indicators coming off
of charts. Unfortunately I don't seem to have any control at all over
the order in which the columns show up. It doesn't seem to be based on
how I build the chart, and certain things on the chart that I don't need
are still output to the file. It's pretty much take 100% of what's on
the chart or take nothing. Fortunately the csv files are very good in
terms of not dropping out data. At least every row has all the data.

Cheers,
Mark

>
> Best,
>
> W
> --
> "What's the Lagrangian for a suction dart?"
> ~DeathMech, Some Student. P-town PHY 205
> Sortir en Pantoufles: up 807 days, 23:29
>
>
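
One way to test the suspicion above about stray control characters or
leading spaces is to let grep flag the offending lines. This is only a
sketch: the sample text is made up (a leading space and a carriage
return planted on the first row), and on the real file the same grep
pattern would be run against the first few lines of the csv.

```shell
# Made-up two-row sample: row 1 has a leading space and a trailing
# carriage return, row 2 is clean. Not data from the real export.
sample=$(printf ' Date1,Time1,A\r\nDate2,Time2,B\n')

# Count lines that start with whitespace or contain any control
# character (carriage returns, tabs, and so on).
suspect_count=$(printf '%s\n' "$sample" | grep -c -E '^[[:space:]]|[[:cntrl:]]')

printf 'suspect lines: %s\n' "$suspect_count"
```

Replacing -c with -n prints the matching lines with their line numbers
instead of a count, which shows whether only the first two rows are
affected.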