public inbox for gentoo-soc@lists.gentoo.org
* [gentoo-soc] A few questions + Draft: Project IDFetch - Weekly report #1
@ 2010-06-02 21:12 mona
From: mona @ 2010-06-02 21:12 UTC
  To: gentoo-soc

Project IDFetch - Weekly report #1
==================================

The purpose of the project is to optimize the software installation
process by making the distfile fetcher more intelligent and by using the
network connection more effectively. The idea of the project is not to
rewrite the whole Portage system, but only the part that sits right at
the bottleneck of the network connection: the distfile fetcher.

For more information on the project, please see the website:
    http://soc.dev.gentoo.org/~simka/
or
    http://idfetch.isgreat.org

git repository for idfetch project:
    git://git.overlays.gentoo.org/proj/idfetch
    http://git.overlays.gentoo.org/gitroot/proj/idfetch

git repository for changes to Portage:
    git://git.overlays.gentoo.org/proj/portage-idfetch
    http://git.overlays.gentoo.org/gitroot/proj/portage-idfetch


You can share your ideas on idfetch by joining the IRC channel
#gentoo-idfetch on freenode or by just sending me an email.

====================
The progress report:
====================

1) I started by joining the mainstream and getting pretty nervous about
whether I can manage this (for some people seemingly easy) project.
After importing the chocolate and coffee modules I tried to switch to
more productive things ;)

2) The first thing to do was to export some data from the current
Portage system: basename, mirror URLs, size, checksums. I ended up with
some changes to the fetch.py file that give me the following data:

# list of pkgs to be installed
Tidfetch_pkg_list : list of Tidfetch_pkg;

Tidfetch_pkg : dict
    ['pkg_name'] : string;
    ['distfile_list'] : list of Tidfetch_distfile;

Tidfetch_distfile : dict
    ['name'] : string;
    ['url_list'] : list of string;
    ['size'] : int;
    ['RMD160'] : string;
    ['SHA1'] : string;
    ['SHA256'] : string;
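
As a rough illustration only, the layout above could be written down in
Python with typing.TypedDict. The names DistFile, Pkg and make_pkg are
hypothetical (they are not taken from fetch.py), and the checksum fields
are assumed to be hex strings.

```python
from typing import List, TypedDict

# One downloadable file, with its mirrors, size and checksums.
# The functional TypedDict syntax keeps the original key names as-is.
DistFile = TypedDict("DistFile", {
    "name": str,
    "url_list": List[str],
    "size": int,
    "RMD160": str,
    "SHA1": str,
    "SHA256": str,
})

# One package to be installed, with the distfiles it needs.
Pkg = TypedDict("Pkg", {
    "pkg_name": str,
    "distfile_list": List[DistFile],
})


def make_pkg(pkg_name: str, distfiles: List[DistFile]) -> Pkg:
    """Build a single package entry (hypothetical helper)."""
    return {"pkg_name": pkg_name, "distfile_list": list(distfiles)}


util_macros: DistFile = {
    "name": "util-macros-1.6.1.tar.bz2",
    "url_list": [
        "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
    ],
    "size": 62130,
    "RMD160": "ae50d9eaccb3cc6aa48669eb5ea44a2857e80952",
    "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
    "SHA256": "5efcc970b0ada0f8b5122e37ce8d02966999a4c8ece44df518f97c984134b645",
}

# The full export is a plain list of such package entries.
pkg_list: List[Pkg] = [make_pkg("util-macros-1.6.1", [util_macros])]
```

At runtime these are ordinary dicts and lists; the TypedDict
declarations only document the expected keys for a type checker.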

3) I started using the pickle module to exchange data between fetch.py
and twrapper (a threaded wrapper for simultaneous downloads).

4) Following advice from Robin H. Johnson (my mentor), I replaced the
pickle module with json [1]. The export file, pkg_list.list, now looks
like this:

[
    {
        "distfile_list": [
            {
                "RMD160": "10a19a10d0388bc084a7c1d3da845068d7169054", 
                "SHA1": "2a7198e8178b2e7dba87cb5794da515200b568f5",
                "SHA256":"0eb6f356119f2e49b2563210852e17f57f9dcc5755f350a69a46a0d641a0c401", 
                "name": "ghostscript-fonts-std-8.11.tar.gz", 
                "size": 3752871, 
                "url_list": [
                "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                "http://www.gtlib.gatech.edu/pub/gentoo/distfiles/ghostscript-fonts-std-8.11.tar.gz",
                ...........more mirrors - skipped ......
                ]
            }
        ], 
        "pkg_name": "gnu-gs-fonts-std-8.11"
    }, 
    {
        "distfile_list": [
            {
                "RMD160": "ae50d9eaccb3cc6aa48669eb5ea44a2857e80952", 
                "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
                "SHA256":"5efcc970b0ada0f8b5122e37ce8d02966999a4c8ece44df518f97c984134b645", 
                "name": "util-macros-1.6.1.tar.bz2", 
                "size": 62130, 
                "url_list": [
                "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                "ftp://ftp.lug.udel.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                "http://www.gtlib.gatech.edu/pub/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                ...........more mirrors - skipped ......
                ]
            }
        ], 
        "pkg_name": "util-macros-1.6.1"
    }
]
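
A minimal sketch of how such a file could be written and read back with
the standard json module. The function names are hypothetical, and the
indent and sort_keys settings are only a guess at what produces output
shaped like the listing above. Note that the listing is abridged (the
"more mirrors - skipped" lines), so it cannot be parsed as it stands.

```python
import json
import os
import tempfile


def save_pkg_list(pkg_list, path):
    """Write the package list as human-readable JSON."""
    with open(path, "w") as f:
        json.dump(pkg_list, f, indent=4, sort_keys=True)


def load_pkg_list(path):
    """Read the package list back as plain lists and dicts."""
    with open(path) as f:
        return json.load(f)


def roundtrip(pkg_list):
    """Save to a temporary file and load again, to check nothing is lost."""
    fd, path = tempfile.mkstemp(suffix=".list")
    os.close(fd)
    try:
        save_pkg_list(pkg_list, path)
        return load_pkg_list(path)
    finally:
        os.remove(path)


sample = [
    {
        "pkg_name": "util-macros-1.6.1",
        "distfile_list": [
            {
                "name": "util-macros-1.6.1.tar.bz2",
                "size": 62130,
                "SHA1": "d6c3ed6f0c0deab9ee4f6d63f7b2c7ce3cbae280",
                "url_list": [
                    "ftp://gentoo.mirrors.tds.net/gentoo/distfiles/util-macros-1.6.1.tar.bz2",
                ],
            }
        ],
    }
]
```

Unlike a pickle file, the result can be inspected in a text editor and
read by programs that are not written in Python.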

5) Development of a simple threaded twrapper has started. Twrapper reads
the data from the pkg_list.list file and downloads the distfiles from
the list simultaneously (the number of parallel downloads is limited by
MAX_ACTIVE_DOWNLOADS in idfetch_settings.py).
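
Step 5 could be sketched roughly as follows. This is not twrapper's
actual code: it swaps in the standard library's ThreadPoolExecutor to cap
the number of simultaneous downloads. The fetch argument is a
hypothetical callable that downloads one distfile and returns True on
success, and the value 3 for MAX_ACTIVE_DOWNLOADS is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed value; in the project this comes from idfetch_settings.py.
MAX_ACTIVE_DOWNLOADS = 3


def download_all(pkg_list, fetch, max_active=MAX_ACTIVE_DOWNLOADS):
    """Run fetch(distfile) for every distfile, max_active at a time.

    Returns a dict mapping distfile name to True (success) or False.
    """
    distfiles = [d for pkg in pkg_list for d in pkg["distfile_list"]]

    def safe_fetch(distfile):
        # A failing download must not stop the remaining ones.
        try:
            return bool(fetch(distfile))
        except Exception:
            return False

    # The pool never runs more than max_active worker threads at once.
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        outcomes = list(pool.map(safe_fetch, distfiles))

    return {d["name"]: ok for d, ok in zip(distfiles, outcomes)}
```

Passing the download function in as an argument keeps the scheduling
separate from the transfer itself, so the limit can be tried out with a
stub before any network code exists.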

5.1) Before downloading a file, it checks whether the file already
exists, is complete, and has correct checksums. If so, it skips this
distfile. Otherwise it downloads the file and then verifies its
checksums.
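
The check in 5.1 might look roughly like this with hashlib. The function
names are hypothetical and this is a sketch, not twrapper's code. RMD160
is left out here because hashlib only offers ripemd160 when the
underlying OpenSSL build provides it.

```python
import hashlib
import os

# Checksum keys verified by this sketch (RMD160 omitted, see above).
CHECKSUM_KEYS = ("SHA1", "SHA256")


def file_digest(path, algo, chunk_size=1 << 16):
    """Hash a file in chunks so large distfiles need not fit in memory."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_complete(path, distfile):
    """True if the file exists, has the expected size and checksums."""
    if not os.path.isfile(path):
        return False
    # Cheap size test first; hashing only happens if the size matches.
    if os.path.getsize(path) != distfile["size"]:
        return False
    return all(
        file_digest(path, key.lower()) == distfile[key].lower()
        for key in CHECKSUM_KEYS
        if key in distfile
    )


def needs_download(path, distfile):
    """Decide whether this distfile still has to be fetched."""
    return not is_complete(path, distfile)
```

The same is_complete call can be reused after a download to verify the
freshly fetched file.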

6) Interface development: it's possible to choose between curses
(USE_CURSES_FLAG=1) and simple log-like output (USE_CURSES_FLAG=0).
I'll probably also implement a tput-based interface, because the
log-like output is hard to follow and curses doesn't like buggy code :(

To see examples of the output, please follow these links:
     http://soc.dev.gentoo.org/~simka/curses.jpg
     http://soc.dev.gentoo.org/~simka/log-like.jpg

7) Robin H. Johnson suggested that the Portage changes might be better
kept in a separate repo that tracks the main Portage repo. Therefore, a
repository for the portage-idfetch project was created:
     git://git.overlays.gentoo.org/proj/portage-idfetch
     http://git.overlays.gentoo.org/gitroot/proj/portage-idfetch


[1] http://docs.python.org/library/json.html 




