public inbox for gentoo-user@lists.gentoo.org
From: Willie Wong <wwong@Princeton.EDU>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] [OT] - command line read *.csv & create new file
Date: Sun, 22 Feb 2009 19:57:53 -0500
Message-ID: <20090223005753.GA12341@princeton.edu>
In-Reply-To: <5bdc1c8b0902221515g3b932654k47568031f45d76d8@mail.gmail.com>

On Sun, Feb 22, 2009 at 03:15:09PM -0800, Penguin Lover Mark Knecht squawked:
> 1) My actual input data starts with two fields which date & time. For
> lines 2 & 3 I need exclude the 2nd & 3rd date & time from the output
> corresponding to line 1, so these 3 lines:
> 
> Date1,Time1,A,B,C,D,0
> Date2,Time2,E,F,G,H,1
> Date3,Time3,I,J,K,L,2
> 
> should generate
> 
> Date1,Time1,A,B,C,D,E,F,G,H,,I,J,K,L,2
> 
> Essentially Date & Time from line 1, results from line 3.
> 
> 2) The second is that possibly I don't need attribute G in my output
> file. I'm thinking that possibly a 3rd sed script that counts a
> certain number of commas and then doesn't copy up through the next
> comma? That's messy in the sense that I probably need to drop 10-15
> columns out as my real data is maybe 100 fields wide so I'd have 10-15
> addition scripts which is too much of a hack to be maintainable.
> Anyway, I appreciate the ideas. What you sent worked great.
> 

For both of these cases, since you are dropping columns and not
re-organizing them, you'd have a much easier time just piping the
output through "cut". Try 'man cut' (it is only a few hundred words)
for usage. With the sample you gave me, you just need to post-process
with

.... | cut -d , -f 1-6,9,10,12,15-

and the Date2, Time2, G, Date3, Time3 columns will be dropped. 
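
As a quick check of the field numbers, here is a minimal sketch. It
assumes the three sample lines have already been joined into one record
with all of the date/time pairs still present; the variable names
"joined" and "trimmed" are just placeholders of mine.

```shell
# Hypothetical joined record: the three sample lines merged into one,
# with the second and third date/time pairs still in place.
joined='Date1,Time1,A,B,C,D,Date2,Time2,E,F,G,H,Date3,Time3,I,J,K,L,2'

# Keep fields 1-6, 9, 10, 12, and everything from field 15 onward.
# Fields 7-8 (Date2,Time2), 11 (G) and 13-14 (Date3,Time3) are dropped.
trimmed=$(printf '%s\n' "$joined" | cut -d , -f 1-6,9,10,12,15-)

printf '%s\n' "$trimmed"
```

With 100 or so fields the list after -f gets longer, but it is still a
single list in a single command rather than one script per column.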

As to your problem with the first two lines being mangled: I suspect
that the first two lines were formatted differently? Maybe stray
control characters got into your file or maybe there are leading
spaces? It's bizarre for both Etaoin's and my scripts to
coincidentally mess up the same lines. 
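
One way to test that guess is to print the first couple of lines with
the invisible characters made visible. The sketch below builds a made-up
sample file (I have no idea what your real file contains) with a
carriage return on line 1 and a leading space on line 2, then inspects
it with sed's "l" command and counts the suspect lines with grep.

```shell
# Hypothetical sample: line 1 ends in a carriage return, line 2 starts
# with a space, line 3 is clean.
sample=$(mktemp)
printf 'Date1,Time1,A,B,C,D,0\r\n Date2,Time2,E,F,G,H,1\nDate3,Time3,I,J,K,L,2\n' > "$sample"

# sed's "l" command prints lines unambiguously: control characters
# appear as escapes such as \r or \t, and each line end is marked with $.
visible=$(sed -n '1,2l' "$sample")
printf '%s\n' "$visible"

# Count lines that begin with whitespace or contain a carriage return.
cr=$(printf '\r')
suspect=$(grep -c -e '^[[:space:]]' -e "$cr" "$sample")
printf 'suspect lines: %s\n' "$suspect"

rm -f "$sample"
```

On your real data you would replace the temporary file with the actual
csv file name and look at whichever lines come out mangled.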

(Incidentally, where did you get the csv files from? When I worked in
a physics lab and collected data, I found that a lot of the time,
processing data with basic command-line tools like sed, bash, perl,
and bc goes much more quickly if the initial datasets are formatted
in a sensible fashion. Of course there are times when such a luxury
cannot be afforded.)

Best, 

W
-- 
"What's the Lagrangian for a suction dart?"
~DeathMech, Some Student. P-town PHY 205
Sortir en Pantoufles: up 807 days, 23:29



