From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from pigeon.gentoo.org ([69.77.167.62] helo=lists.gentoo.org)
	by finch.gentoo.org with esmtp (Exim 4.60)
	(envelope-from <gentoo-user+bounces-91377-garchives=archives.gentoo.org@lists.gentoo.org>)
	id 1LbQ1G-0003kA-0X
	for garchives@archives.gentoo.org; Mon, 23 Feb 2009 01:54:06 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id C1D63E01DC;
	Mon, 23 Feb 2009 01:54:03 +0000 (UTC)
Received: from rv-out-0506.google.com (rv-out-0506.google.com [209.85.198.230])
	by pigeon.gentoo.org (Postfix) with ESMTP id 879C8E01DC
	for <gentoo-user@lists.gentoo.org>; Mon, 23 Feb 2009 01:54:03 +0000 (UTC)
Received: by rv-out-0506.google.com with SMTP id g9so1597932rvb.2
        for <gentoo-user@lists.gentoo.org>; Sun, 22 Feb 2009 17:54:03 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=gamma;
        h=domainkey-signature:mime-version:received:in-reply-to:references
         :date:message-id:subject:from:to:content-type
         :content-transfer-encoding;
        bh=8k/z3/DE3k8MOjuwg2vZbQFd2QzjpZSNUHvGJz7SdQw=;
        b=H+RB+mgGfiXb6Fi0TIPUd6T581XZZjFqBLlgUNUttXMUR5k3nYWMHJD0iEq5vywsDQ
         3smDThwzIET6ag0Ox6BVl3DmsjWvGs6rqK/YdyQzBHbxZzLjd65/itYl64q4szE060S+
         uYLKDjNLHYwWNsq9Gk0PJcyFElRCmvy9bMVxk=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :content-type:content-transfer-encoding;
        b=DhN91mDc8XSy0SMdyNQuvDgNZmw3RMsZlW9GePtVSsPSoVKTlRz5u235HA6ElBAZxO
         oNk43CR9V/GKkzunLogpNj/5jGf1Qo/cAycDHU9Z5x0O5FrmUh0BwV3z4JkLkrBSnlnl
         zAwk+nOiZFLbM9R/TX9TIoHm4oVJnDOsgvy1Q=
Precedence: bulk
List-Post: <mailto:gentoo-user@lists.gentoo.org>
List-Help: <mailto:gentoo-user+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-user+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-user+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-user.gentoo.org>
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
Received: by 10.143.2.19 with SMTP id e19mr1716774wfi.96.1235354043116; Sun, 
	22 Feb 2009 17:54:03 -0800 (PST)
In-Reply-To: <20090223005753.GA12341@princeton.edu>
References: <5bdc1c8b0902221106h71a8783y698aa209ace59a6@mail.gmail.com>
	 <20090222205938.GA455@princeton.edu>
	 <5bdc1c8b0902221515g3b932654k47568031f45d76d8@mail.gmail.com>
	 <20090223005753.GA12341@princeton.edu>
Date: Sun, 22 Feb 2009 17:54:03 -0800
Message-ID: <5bdc1c8b0902221754v30776ee0r3840ad1cb9a4c030@mail.gmail.com>
Subject: Re: [gentoo-user] [OT] - command line read *.csv & create new file
From: Mark Knecht <markknecht@gmail.com>
To: gentoo-user@lists.gentoo.org
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Archives-Salt: 46b2aae3-c7c6-4bf4-a14e-ec7453ed266a
X-Archives-Hash: c7b2280936e98e82526d23cd809a5a1d

On Sun, Feb 22, 2009 at 4:57 PM, Willie Wong <wwong@princeton.edu> wrote:
> On Sun, Feb 22, 2009 at 03:15:09PM -0800, Penguin Lover Mark Knecht squawked:
>> 1) My actual input data starts with two fields, which are date & time. For
>> lines 2 & 3 I need to exclude the 2nd & 3rd date & time from the output
>> corresponding to line 1, so these 3 lines:
>>
>> Date1,Time1,A,B,C,D,0
>> Date2,Time2,E,F,G,H,1
>> Date3,Time3,I,J,K,L,2
>>
>> should generate
>>
>> Date1,Time1,A,B,C,D,E,F,G,H,I,J,K,L,2
>>
>> Essentially Date & Time from line 1, results from line 3.
>>
>> 2) The second is that possibly I don't need attribute G in my output
>> file. I'm thinking that possibly a 3rd sed script that counts a
>> certain number of commas and then doesn't copy up through the next
>> comma? That's messy in the sense that I probably need to drop 10-15
>> columns out as my real data is maybe 100 fields wide so I'd have 10-15
>> additional scripts, which is too much of a hack to be maintainable.
>> Anyway, I appreciate the ideas. What you sent worked great.
>>
>
> For both of these cases, since you are dropping columns and not
> re-organizing, you'd have a much easier time just piping the command
> through "cut". Try 'man cut' (it is only a few hundred words) for
> usage. But with the sample you gave me, you just need to post process
> with
>
> .... | cut -d , -f 1-6,9,10,12,15-
>
> and the Date2, Time2, G, Date3, Time3 columns will be dropped.

Thanks. I'll investigate that tomorrow.
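
For reference, a minimal sketch of what that cut invocation does, assuming
the three rows have already been joined into one comma-separated line. The
sample line is made up from the field names in the example above, not taken
from real data:

```shell
# Hypothetical joined record: three input rows concatenated with commas.
line='Date1,Time1,A,B,C,D,Date2,Time2,E,F,G,H,Date3,Time3,I,J,K,L,2'

# Keep fields 1-6, 9, 10, 12 and everything from 15 onward.
# This drops Date2, Time2 (7-8), G (11) and Date3, Time3 (13-14).
result=$(printf '%s\n' "$line" | cut -d , -f 1-6,9,10,12,15-)

printf '%s\n' "$result"
```

Since cut takes a list of field ranges, dropping 10-15 columns out of 100
should only need a longer -f list, not extra scripts.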

>
> As to your problem with the first two lines being mangled: I suspect
> that the first two lines were formatted differently? Maybe stray
> control characters got into your file or maybe there are leading
> spaces? It's bizarre for both Etaoin's and my scripts to
> coincidentally mess up the same lines.
>
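
One way to test that suspicion, sketched on made-up data: the temporary file
below is an assumption, with a leading space and a carriage return planted
on its first two lines. Dump those lines byte by byte, then strip the
offending characters before running the scripts.

```shell
# Build a small sample file (hypothetical data, not the real export).
tmp=$(mktemp)
printf ' Date1,Time1,A,B,C,D,0\r\n Date2,Time2,E,F,G,H,1\r\nDate3,Time3,I,J,K,L,2\n' > "$tmp"

# Show the first two lines byte by byte; \r and leading blanks become visible.
head -n 2 "$tmp" | od -c

# Remove carriage returns and leading whitespace from every line.
cleaned=$(tr -d '\r' < "$tmp" | sed 's/^[[:space:]]*//')
printf '%s\n' "$cleaned"

rm -f "$tmp"
```
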
> (Incidentally, where did you get the csv files from? When I worked in
> a physics lab and collected data, I found that a lot of times the
> processing of data using basic command-line tools like sed, bash,
> perl, and bc can be done a lot more quickly if the initial datasets were
> formatted in a sensible fashion. Of course there are times when such
> luxury cannot be afforded.)

They are primarily coming from TradeStation. The data that I'm
working with is stock pricing data along with technical indicators
coming off of charts. Unfortunately I don't seem to have any control
at all as to the order that the columns show up. It doesn't seem to be
based on how I build the chart and certain things on the chart I don't
need are still output to the file. It's pretty much take 100% of
what's on the chart or take nothing.

Fortunately the csv files are very good in terms of not dropping out
data. At least every row has all the data.
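
A quick way to confirm that every row has the same number of fields is a
one-line awk check. This is only a sketch on made-up rows, with the third
row deliberately one field short; it assumes no field contains a quoted
comma:

```shell
# Hypothetical sample; the last row is missing one field on purpose.
sample='Date1,Time1,A,B,C,D,0
Date2,Time2,E,F,G,H,1
Date3,Time3,I,J,K,2'

# Remember the field count of row 1, then report any row that differs.
bad=$(printf '%s\n' "$sample" |
      awk -F, 'NR == 1 { n = NF } NF != n { print NR ": " NF " fields" }')

printf '%s\n' "$bad"
```

Empty output would mean all rows agree with the first one.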

Cheers,
Mark

>
> Best,
>
> W
> --
> "What's the Lagrangian for a suction dart?"
> ~DeathMech, Some Student. P-town PHY 205
> Sortir en Pantoufles: up 807 days, 23:29
>
>