To: gentoo-user@lists.gentoo.org
From: Thufir
Subject: [gentoo-user] Piggy Bank as a screen scraper
Date: Thu, 2 Aug 2007 07:26:31 +0000 (UTC)

I glanced over an article about Piggy Bank, which interests me as a screen scraper. What I have in mind are the RSS feeds from Craigslist.

Now, I can set up Feed-on-Feeds so that lots of data from Craigslist downloads into the MySQL database. However, much of the useful detail is buried in the text :(

So this makes me think of screen scraping the Feed-on-Feeds interface. That's kind of backwards, and I'm sure others would come up with something more sophisticated, such as accessing the database directly, but...

Anyhow, my thinking is to use Piggy Bank to break down the text and get at some of the data. Then I can add that to the database to better track, well, whatever. Just kinda excited at the prospect of a new tool :)

While Piggy Bank may not really be Linux-specific, and definitely not Gentoo-specific, I just really like the way the different Linux magazines talk about software and tools, getting things done :)

-Thufir

-- 
gentoo-user@gentoo.org mailing list
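
For what it's worth, the idea above (pull the details buried in the item text out into their own database columns) can be sketched without scraping any interface at all. This is only a rough Python sketch under stated assumptions: the sample feed, the `listings` table, the function names, and the dollar-amount pattern are all made up for illustration, and sqlite3 stands in for the MySQL database mentioned above. It does not use Piggy Bank or Feed-on-Feeds.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed for illustration; real feed items will differ.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>sample listings</title>
    <item>
      <title>Road bike, good condition</title>
      <link>http://example.org/post/1</link>
      <description>Steel frame, 56cm. Asking $250 or best offer.</description>
    </item>
    <item>
      <title>Free couch</title>
      <link>http://example.org/post/2</link>
      <description>Must pick up this weekend.</description>
    </item>
  </channel>
</rss>"""

# Assumed pattern: a dollar sign followed by digits, optionally with cents.
PRICE_RE = re.compile(r"\$(\d+(?:\.\d{2})?)")


def extract_items(rss_text):
    """Return (title, link, price) tuples; price is None if no $ amount is found."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        text = item.findtext("description", default="")
        match = PRICE_RE.search(text)
        price = float(match.group(1)) if match else None
        rows.append((title, link, price))
    return rows


def store(rows, conn):
    """Write the extracted rows into a hypothetical 'listings' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings (title TEXT, link TEXT, price REAL)"
    )
    conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # sqlite3 stands in for MySQL here
    store(extract_items(SAMPLE_RSS), conn)
    for row in conn.execute("SELECT title, price FROM listings"):
        print(row)
```

The regular expression is the weak point: free-form listing text will need more patterns (or a smarter extractor) than a single dollar-amount match, so treat the price column as a best guess rather than clean data.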