From: Etaoin Shrdlu <shrdlu@unlimitedmail.org>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] [OT] - command line read *.csv & create new file
Date: Sun, 22 Feb 2009 21:15:31 +0100
Message-ID: <200902222115.31620.shrdlu@unlimitedmail.org>
In-Reply-To: <5bdc1c8b0902221106h71a8783y698aa209ace59a6@mail.gmail.com>

On Sunday 22 February 2009, 20:06, Mark Knecht wrote:
> Hi,
>    Very off topic other than I'd do this on my Gentoo box prior to
> using R on my Gentoo box. Please ignore if not of interest.
>
>    I've got a really big data file in essentially a *.csv format.
> (comma delimited) I need to scan this file and create a new output
> file. I'm wondering if there is a reasonably easy command line way of
> doing this using something like sed or awk which I know nothing about.
> Thanks in advance.
>
>    The basic idea goes something like this:
>
> 1) The input file might look like the following, where some of it is
> attributes (shown as letters) and other parts are results (shown as
> numbers).
>
> A,B,C,D,1
> E,F,G,H,2
> I,J,K,L,3
> M,N,O,P,4
> Q,R,S,T,5
> U,V,W,X,6

Are the results always in the last field, and only a single field?
Is the total number of fields per line always fixed?

> 2) From the above data input file I want to take the attributes from a
> few preceding lines (say 3 in this example) and write them to the
> output file along with the result on the last of the 3 lines. The
> output file might look like this:
>
> A,B,C,D,E,F,G,H,I,J,K,L,3
> E,F,G,H,I,J,K,L,M,N,O,P,4
> I,J,K,L,M,N,O,P,Q,R,S,T,5
> M,N,O,P,Q,R,S,T,U,V,W,X,6

Is the number of lines you pick for the operation always 3, or can it 
vary? And, once you choose a number n of lines, should the whole file be 
processed by concatenating n lines at a time, with each resulting line 
ending in the result of the nth line? In other words, does the 
following hold for the output format:

<concatenation of attributes of lines 1..n> <result of line n>
<concatenation of attributes of lines 2..n+1> <result of line n+1>
<concatenation of attributes of lines 3..n+2> <result of line n+2>
<concatenation of attributes of lines 4..n+3> <result of line n+3>
...

With answers to the above questions, it's probably possible to hack 
together a solution.
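
For illustration only, here is a minimal Python sketch of that sliding
window. It assumes the answers to the questions above are all "yes": the
result is always the single last field, every line has the same number
of fields, and n is fixed (3 here). The function name sliding_rows and
the inline sample data are made up for the example, and it is not the
sed/awk one-liner that was asked for.

```python
import csv
import io
from collections import deque


def sliding_rows(rows, n=3):
    """Yield the attributes of n consecutive rows plus the last row's result.

    Assumes the result is the last field of every row (an assumption,
    not something confirmed in the thread).
    """
    window = deque(maxlen=n)  # keeps only the n most recent rows
    for row in rows:
        if not row:  # skip blank lines
            continue
        window.append(row)
        if len(window) == n:
            # All fields except the last one are attributes.
            attrs = [field for r in window for field in r[:-1]]
            yield attrs + [window[-1][-1]]


# Sample input taken from the original question.
sample = "A,B,C,D,1\nE,F,G,H,2\nI,J,K,L,3\nM,N,O,P,4\nQ,R,S,T,5\nU,V,W,X,6\n"

for out in sliding_rows(csv.reader(io.StringIO(sample)), n=3):
    print(",".join(out))
```

On the sample input this prints the four lines shown in the desired
output above. For a real file, the io.StringIO object would be replaced
by an open file handle, and since only n rows are held in memory at a
time, a very large input should not be a problem.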


