From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by finch.gentoo.org (Postfix) with ESMTPS id 4BB90139694
    for ; Mon, 20 Mar 2017 23:12:17 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id 4B2A121C087;
    Mon, 20 Mar 2017 23:12:14 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by pigeon.gentoo.org (Postfix) with ESMTPS id 16DC521C087
    for ; Mon, 20 Mar 2017 23:12:12 +0000 (UTC)
Received: from oystercatcher.gentoo.org (unknown [IPv6:2a01:4f8:202:4333:225:90ff:fed9:fc84])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by smtp.gentoo.org (Postfix) with ESMTPS id 6379F340BEA
    for ; Mon, 20 Mar 2017 23:12:11 +0000 (UTC)
Received: from localhost.localdomain (localhost [IPv6:::1])
    by oystercatcher.gentoo.org (Postfix) with ESMTP id C74A06F3F
    for ; Mon, 20 Mar 2017 23:12:09 +0000 (UTC)
From: "Mart Raudsepp"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Mart Raudsepp"
Message-ID: <1490051470.285a95500835248045b0736469e382a1f73fc6be.leio@gentoo>
Subject: [gentoo-commits] proj/gentoo-bumpchecker:master commit in: modules/
X-VCS-Repository: proj/gentoo-bumpchecker
X-VCS-Files: modules/gnome_module.py
X-VCS-Directories: modules/
X-VCS-Committer: leio
X-VCS-Committer-Name: Mart Raudsepp
X-VCS-Revision: 285a95500835248045b0736469e382a1f73fc6be
X-VCS-Branch: master
Date: Mon, 20 Mar 2017 23:12:09 +0000 (UTC)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Archives-Salt: 9e43a4cd-b5b4-4c50-af08-554315fd8c8b
X-Archives-Hash: 78dff7dc323668bfd995a982c79c510a

commit:     285a95500835248045b0736469e382a1f73fc6be
Author:     Mart Raudsepp gentoo org>
AuthorDate: Mon Mar 20 22:52:44 2017 +0000
Commit:     Mart Raudsepp gentoo org>
CommitDate: Mon Mar 20 23:11:10 2017 +0000
URL:        https://gitweb.gentoo.org/proj/gentoo-bumpchecker.git/commit/?id=285a9550

gnome: make the cache.json requests parallel; reduces a run from 3m01 to 0m23 for me

This relies on the requests-futures package, which in turn relies on
python-3.2+ Futures (or a backport of it).
If the requests-futures import fails, it falls back to the old, slower
one-by-one fetching.

 modules/gnome_module.py | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/modules/gnome_module.py b/modules/gnome_module.py
index afba235..e6544b6 100644
--- a/modules/gnome_module.py
+++ b/modules/gnome_module.py
@@ -4,8 +4,16 @@
 # vim: set sts=4 sw=4 et tw=0 :
 
 import requests
+try:
+    from requests_futures.sessions import FuturesSession
+    parallel_requests = True
+except:
+    print("requests-futures not found for parallel fetching - will fallback to slower one-by-one version retrieval for latest version")
+    parallel_requests = False
+
 import package_module, clioptions_module
 
+MAX_WORKERS = 10
 DEBUG = False
 
 
@@ -34,12 +42,17 @@ class GNOME:
             gnome_release_list[1] = str(int(gnome_release_list[1]) + 1)
         self.gnome_release = ".".join(gnome_release_list[:2])
 
-        self.http = requests.session()
+        if parallel_requests:
+            self.http = FuturesSession(max_workers=MAX_WORKERS)
+        else:
+            self.http = requests.session()
         self.url_base = "https://download.gnome.org/"
         self.release_versions_file_path = self.url_base + 'teams/releng/'
 
     def generate_data_from_versions_markup(self, url):
         data = self.http.get(url)
+        if parallel_requests:
+            data = data.result()
         if not data:
             raise ValueError("Couldn't open %s" % url)
 
@@ -61,11 +74,20 @@ class GNOME:
 
     def generate_data_individual(self, release_packages):
         ret = []
+        # First query all results; if parallel_requests==True, this will run in parallel
+        for pkg in release_packages:
+            name = pkg.name.split('/')[-1]
+            if name in name_mapping:
+                name = name_mapping[name]
+            pkg.requests_result = self.http.get(self.url_base + '/sources/' + name + '/cache.json')
+
+        # And now handle the results - this is a separate loop for parallel fetch support
         for pkg in release_packages:
             name = pkg.name.split('/')[-1]
             if name in name_mapping:
                 name = name_mapping[name]
-            data = self.http.get(self.url_base + '/sources/' + name + '/cache.json')
+            # pkg.requests_result is a Future if parallel_requests (call result() on it to wait for/retrieve the Response), else the Response itself
+            data = pkg.requests_result.result() if parallel_requests else pkg.requests_result
             if not data:
                 print("Warning: Unable to read cache.json for %s" % pkg.name)
                 continue
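
The submit-first, collect-later structure of generate_data_individual can be looked at in isolation. What follows is only a minimal sketch under stated assumptions, not the code from this commit: it swaps requests-futures for the standard library's concurrent.futures.ThreadPoolExecutor, and it replaces the HTTP GET with a hypothetical fake_fetch stub so that it runs without network access. The names fake_fetch and fetch_all and the package names in the demo are illustrative and do not exist in gentoo-bumpchecker.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10  # same pool size as in the commit


def fake_fetch(name):
    """Hypothetical stand-in for an HTTP GET of <name>/cache.json."""
    time.sleep(0.05)  # simulate network latency
    return {"name": name, "status": 200}


def fetch_all(names, parallel=True):
    """Return {name: response}, issuing every request before reading any result."""
    if parallel:
        with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
            # Phase 1: submit everything; submit() returns a Future immediately.
            pending = {name: pool.submit(fake_fetch, name) for name in names}
            # Phase 2: result() blocks only until that one request has finished.
            return {name: future.result() for name, future in pending.items()}
    # Fallback: fetch one by one, as the code did before the commit.
    return {name: fake_fetch(name) for name in names}


if __name__ == "__main__":
    names = ["glib", "pango", "atk", "vte"]  # illustrative package names
    for parallel in (True, False):
        start = time.perf_counter()
        results = fetch_all(names, parallel=parallel)
        elapsed = time.perf_counter() - start
        print("parallel=%s: %d results in %.2fs" % (parallel, len(results), elapsed))
```

If result() were called inside the same loop that issues the request, each iteration would wait for its own response before the next request started, and the pool would bring no speedup. That is presumably why the commit splits the work into two loops.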