* [gentoo-portage-dev] Add caching to a few commonly used functions
@ 2020-06-27  6:34 Chun-Yu Shei
  2020-06-27  6:34 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Chun-Yu Shei
  ` (4 more replies)

0 siblings, 5 replies; 30+ messages in thread
From: Chun-Yu Shei @ 2020-06-27 6:34 UTC (permalink / raw)
To: gentoo-portage-dev

Hi,

I was recently interested in whether portage could be sped up, since
dependency resolution can sometimes take a while on slower machines.
After generating some flame graphs with cProfile and vmprof, I found 3
functions which seem to be called extremely frequently with the same
arguments: catpkgsplit, use_reduce, and match_from_list.

In the first two cases, it was simple to cache the results in dicts,
while match_from_list was a bit trickier, since it seems to be a
requirement that it return actual entries from the input
"candidate_list". I also ran into some test failures if I did the
caching after the mydep.unevaluated_atom.use and mydep.repo checks
towards the end of the function, so the caching is only done up to just
before that point.

The catpkgsplit change seems to definitely be safe, and I'm pretty sure
the use_reduce one is too, since anything that could possibly change the
result is hashed. I'm a bit less certain about the match_from_list one,
although all tests are passing.

With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world"
speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge
-ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec
(2.5% improvement). Since the upgrade case is far more common, this
would really help in daily use, and it shaves about 30 seconds off the
time you have to wait to get to the [Yes/No] prompt (from ~90s to ~60s)
on my old Sandy Bridge laptop when performing normal upgrades.

Hopefully at least some of these patches can be incorporated, and please
let me know if any changes are necessary.
Thanks,
Chun-Yu

^ permalink raw reply	[flat|nested] 30+ messages in thread
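The profiling step described above can be reproduced with the standard library's cProfile alone; this is a minimal sketch, where `hot_split` is a hypothetical stand-in for a frequently called function such as catpkgsplit:

```python
import cProfile
import io
import pstats

def hot_split(cpv):
    # Hypothetical stand-in for a hot function like catpkgsplit.
    return cpv.split("/", 1)

profiler = cProfile.Profile()
profiler.enable()
for _ in range(1000):
    hot_split("app-editors/vim-8.2.1234")
profiler.disable()

# Summarize by cumulative time; the same data can feed flame-graph tools.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
```

Functions that dominate the `cumtime` column with large `ncalls` counts, as catpkgsplit and use_reduce did here, are the natural caching candidates.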
* [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-06-27  6:34 [gentoo-portage-dev] Add caching to a few commonly used functions Chun-Yu Shei
@ 2020-06-27  6:34 ` Chun-Yu Shei
  2020-06-27 11:33   ` Michał Górny
  2020-06-29  1:58   ` Sid Spry
  2020-06-27  6:34 ` [gentoo-portage-dev] [PATCH 2/3] Add caching to use_reduce function Chun-Yu Shei
  ` (3 subsequent siblings)

4 siblings, 2 replies; 30+ messages in thread
From: Chun-Yu Shei @ 2020-06-27 6:34 UTC (permalink / raw)
To: gentoo-portage-dev; +Cc: Chun-Yu Shei

According to cProfile, catpkgsplit is called up to 1-5.5 million times
during "emerge -uDvpU --with-bdeps=y @world". Adding a dict to cache its
results reduces the time for this command from 43.53 -> 41.53 seconds --
a 4.8% speedup.
---
 lib/portage/versions.py | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/lib/portage/versions.py b/lib/portage/versions.py
index 0c21373cc..ffec316ce 100644
--- a/lib/portage/versions.py
+++ b/lib/portage/versions.py
@@ -312,6 +312,7 @@ def _pkgsplit(mypkg, eapi=None):
 
 _cat_re = re.compile('^%s$' % _cat, re.UNICODE)
 _missing_cat = 'null'
+_catpkgsplit_cache = {}
 
 def catpkgsplit(mydata, silent=1, eapi=None):
 	"""
@@ -331,6 +332,11 @@ def catpkgsplit(mydata, silent=1, eapi=None):
 			return mydata.cpv_split
 		except AttributeError:
 			pass
+
+	cache_entry = _catpkgsplit_cache.get(mydata)
+	if cache_entry is not None:
+		return cache_entry
+
 	mysplit = mydata.split('/', 1)
 	p_split = None
 	if len(mysplit) == 1:
@@ -343,6 +349,7 @@ def catpkgsplit(mydata, silent=1, eapi=None):
 	if not p_split:
 		return None
 	retval = (cat, p_split[0], p_split[1], p_split[2])
+	_catpkgsplit_cache[mydata] = retval
 	return retval
 
 class _pkg_str(_unicode):
-- 
2.27.0.212.ge8ba1cc988-goog
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-06-27  6:34 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Chun-Yu Shei
@ 2020-06-27 11:33   ` Michał Górny
  2020-06-29  1:58   ` Sid Spry

1 sibling, 0 replies; 30+ messages in thread
From: Michał Górny @ 2020-06-27 11:33 UTC (permalink / raw)
To: gentoo-portage-dev, Chun-Yu Shei

On June 27, 2020 6:34:13 AM UTC, Chun-Yu Shei <cshei@google.com> wrote:
>According to cProfile, catpkgsplit is called up to 1-5.5 million times
>during "emerge -uDvpU --with-bdeps=y @world". Adding a dict to cache its
>results reduces the time for this command from 43.53 -> 41.53 seconds --
>a 4.8% speedup.

Not saying caching is wrong for an interim solution, but this is the
kind of function where refactoring may yield even more gain.

>---
> lib/portage/versions.py | 7 +++++++
> 1 file changed, 7 insertions(+)
>
>diff --git a/lib/portage/versions.py b/lib/portage/versions.py
>index 0c21373cc..ffec316ce 100644
>--- a/lib/portage/versions.py
>+++ b/lib/portage/versions.py
>@@ -312,6 +312,7 @@ def _pkgsplit(mypkg, eapi=None):
> 
> _cat_re = re.compile('^%s$' % _cat, re.UNICODE)
> _missing_cat = 'null'
>+_catpkgsplit_cache = {}
> 
> def catpkgsplit(mydata, silent=1, eapi=None):
> 	"""
>@@ -331,6 +332,11 @@ def catpkgsplit(mydata, silent=1, eapi=None):
> 			return mydata.cpv_split
> 		except AttributeError:
> 			pass
>+
>+	cache_entry = _catpkgsplit_cache.get(mydata)
>+	if cache_entry is not None:
>+		return cache_entry
>+
> 	mysplit = mydata.split('/', 1)
> 	p_split = None
> 	if len(mysplit) == 1:
>@@ -343,6 +349,7 @@ def catpkgsplit(mydata, silent=1, eapi=None):
> 	if not p_split:
> 		return None
> 	retval = (cat, p_split[0], p_split[1], p_split[2])
>+	_catpkgsplit_cache[mydata] = retval
> 	return retval
> 
> class _pkg_str(_unicode):

-- 
Best regards,
Michał Górny
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-06-27  6:34 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Chun-Yu Shei
  2020-06-27 11:33   ` Michał Górny
@ 2020-06-29  1:58   ` Sid Spry
  2020-07-06 15:26     ` Francesco Riosa

1 sibling, 1 reply; 30+ messages in thread
From: Sid Spry @ 2020-06-29 1:58 UTC (permalink / raw)
To: gentoo-portage-dev

On Sat, Jun 27, 2020, at 1:34 AM, Chun-Yu Shei wrote:
> According to cProfile, catpkgsplit is called up to 1-5.5 million times
> during "emerge -uDvpU --with-bdeps=y @world". Adding a dict to cache its
> results reduces the time for this command from 43.53 -> 41.53 seconds --
> a 4.8% speedup.
> ---
>  lib/portage/versions.py | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/lib/portage/versions.py b/lib/portage/versions.py
> index 0c21373cc..ffec316ce 100644
> --- a/lib/portage/versions.py
> +++ b/lib/portage/versions.py
> @@ -312,6 +312,7 @@ def _pkgsplit(mypkg, eapi=None):
>  
>  _cat_re = re.compile('^%s$' % _cat, re.UNICODE)
>  _missing_cat = 'null'
> +_catpkgsplit_cache = {}
>  
>  def catpkgsplit(mydata, silent=1, eapi=None):
>  	"""
> @@ -331,6 +332,11 @@ def catpkgsplit(mydata, silent=1, eapi=None):
>  			return mydata.cpv_split
>  		except AttributeError:
>  			pass
> +
> +	cache_entry = _catpkgsplit_cache.get(mydata)
> +	if cache_entry is not None:
> +		return cache_entry
> +
>  	mysplit = mydata.split('/', 1)
>  	p_split = None
>  	if len(mysplit) == 1:
> @@ -343,6 +349,7 @@ def catpkgsplit(mydata, silent=1, eapi=None):
>  	if not p_split:
>  		return None
>  	retval = (cat, p_split[0], p_split[1], p_split[2])
> +	_catpkgsplit_cache[mydata] = retval
>  	return retval
>  
>  class _pkg_str(_unicode):
> -- 
> 2.27.0.212.ge8ba1cc988-goog
> 

There are libraries that provide decorators, etc., for caching and
memoization. Have you evaluated any of those? One is available in the
standard library:
https://docs.python.org/dev/library/functools.html#functools.lru_cache

I comment as this would increase code clarity.
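As a sketch of the suggestion above, decorating a function with functools.lru_cache replaces the hand-rolled dict; the function name and split logic here are illustrative stand-ins, not portage's actual catpkgsplit:

```python
from functools import lru_cache

@lru_cache(maxsize=1000)
def catsplit(cpv):
    # Hypothetical simplified category split, standing in for catpkgsplit.
    return tuple(cpv.split("/", 1))

first = catsplit("sys-apps/portage")
second = catsplit("sys-apps/portage")  # repeated call is served from the cache
info = catsplit.cache_info()           # one miss (first call), one hit (second)
```

Because lru_cache keys on the arguments' hashes, all arguments must be hashable -- which is exactly the constraint that comes up for use_reduce later in this thread.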
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-06-29  1:58 ` Sid Spry
@ 2020-07-06 15:26   ` Francesco Riosa
  [not found]     ` <cdb0d821-67c1-edb6-2cbc-f26eaa0d3d70@veremit.xyz>

0 siblings, 1 reply; 30+ messages in thread
From: Francesco Riosa @ 2020-07-06 15:26 UTC (permalink / raw)
To: gentoo-portage-dev, Sid Spry

On 29/06/20 03:58, Sid Spry wrote:
> There are libraries that provide decorators, etc, for caching and memoization.
> Have you evaluated any of those? One is available in the standard library:
> https://docs.python.org/dev/library/functools.html#functools.lru_cache
>
> I comment as this would increase code clarity.
>
I think portage developers try hard to avoid external dependencies --
I hope they do
[parent not found: <cdb0d821-67c1-edb6-2cbc-f26eaa0d3d70@veremit.xyz>]
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  [not found] ` <cdb0d821-67c1-edb6-2cbc-f26eaa0d3d70@veremit.xyz>
@ 2020-07-06 16:10   ` Francesco Riosa
  2020-07-06 17:30     ` Chun-Yu Shei

0 siblings, 1 reply; 30+ messages in thread
From: Francesco Riosa @ 2020-07-06 16:10 UTC (permalink / raw)
To: Michael 'veremitz' Everitt, gentoo-portage-dev, Sid Spry, Zac Medico

On 06/07/20 17:50, Michael 'veremitz' Everitt wrote:
> On 06/07/20 16:26, Francesco Riosa wrote:
>> On 29/06/20 03:58, Sid Spry wrote:
>>> There are libraries that provide decorators, etc, for caching and
>>> memoization.
>>> Have you evaluated any of those? One is available in the standard library:
>>> https://docs.python.org/dev/library/functools.html#functools.lru_cache
>>>
>>> I comment as this would increase code clarity.
>>>
>> I think portage developers try hard to avoid external dependencies
>> I hope they do
>>
> I think the key word here is 'external' - anything which is part of the
> python standard library is game for inclusion in portage, and has/does
> provide much needed optimisation. Many of the issues in portage are
> so-called "solved problems" in computing terms, and as such, we should
> take advantage of these to improve performance at every available
> opportunity. Of course, there are presently only one, two or three key
> developers able to make/test these changes (indeed at scale), so
> progress is often slower than desirable in current circumstances...
>
> [sent direct due to posting restrictions...]

Yes, I replied too fast and didn't notice Sid was referring to
_standard_ libraries (not even recent additions).

Sorry for the noise.

- Francesco
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-07-06 16:10 ` Francesco Riosa
@ 2020-07-06 17:30   ` Chun-Yu Shei
  2020-07-06 18:03     ` Zac Medico

0 siblings, 1 reply; 30+ messages in thread
From: Chun-Yu Shei @ 2020-07-06 17:30 UTC (permalink / raw)
To: gentoo-portage-dev
Cc: Michael 'veremitz' Everitt, Sid Spry, Zac Medico

I finally got a chance to try Sid's lru_cache suggestion, and the
results were really good. Simply adding it on catpkgsplit and moving the
body of use_reduce into a separate function (that accepts tuples instead
of unhashable lists/sets) and decorating it with lru_cache gets a
similar 40% overall speedup for the upgrade case I tested. It seems like
even a relatively small cache size (1000 entries) gives quite a speedup,
even though in the use_reduce case, the cache size eventually reaches
almost 20,000 entries if no limit is set. With these two changes, adding
caching to match_from_list didn't seem to make much/any difference.

The catch is that lru_cache only became available in Python 3.2, so
would it make sense to add a dummy lru_cache implementation for
Python < 3.2 that does nothing? There is also a
backports-functools-lru-cache package that's already available in the
Portage tree, but that would add an additional external dependency.

I agree that refactoring could yield an even bigger gain, but hopefully
this can be implemented as an interim solution to speed up the common
emerge case of resolving upgrades. I'm happy to submit new patches for
this, if someone can suggest how to best handle the Python < 3.2
case. :)

Thanks,
Chun-Yu

On Mon, Jul 6, 2020 at 9:10 AM Francesco Riosa <vivo75@gmail.com> wrote:
> On 06/07/20 17:50, Michael 'veremitz' Everitt wrote:
> > On 06/07/20 16:26, Francesco Riosa wrote:
> >> On 29/06/20 03:58, Sid Spry wrote:
> >>> There are libraries that provide decorators, etc, for caching and
> >>> memoization.
> >>> Have you evaluated any of those? One is available in the standard library:
> >>> https://docs.python.org/dev/library/functools.html#functools.lru_cache
> >>>
> >>> I comment as this would increase code clarity.
> >>>
> >> I think portage developers try hard to avoid external dependencies
> >> I hope they do
> >>
> > I think the key word here is 'external' - anything which is part of the
> > python standard library is game for inclusion in portage, and has/does
> > provide much needed optimisation. Many of the issues in portage are
> > so-called "solved problems" in computing terms, and as such, we should take
> > advantage of these to improve performance at every available opportunity.
> > Of course, there are presently only one, two or three key developers able
> > to make/test these changes (indeed at scale) so progress is often slower
> > than desirable in current circumstances...
> >
> > [sent direct due to posting restrictions...]
> yes I've replied too fast and didn't notice Sid was referring to
> _standard_ libraries (not even recent additions)
>
> sorry for the noise
>
> - Francesco
>
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-07-06 17:30 ` Chun-Yu Shei
@ 2020-07-06 18:03   ` Zac Medico
  2020-07-07  3:41     ` Zac Medico

0 siblings, 1 reply; 30+ messages in thread
From: Zac Medico @ 2020-07-06 18:03 UTC (permalink / raw)
To: gentoo-portage-dev, Chun-Yu Shei
Cc: Michael 'veremitz' Everitt, Sid Spry, Zac Medico

[-- Attachment #1.1: Type: text/plain, Size: 1649 bytes --]

On 7/6/20 10:30 AM, Chun-Yu Shei wrote:
> I finally got a chance to try Sid's lru_cache suggestion, and the
> results were really good. Simply adding it on catpkgsplit and moving
> the body of use_reduce into a separate function (that accepts tuples
> instead of unhashable lists/sets) and decorating it with lru_cache
> gets a similar 40% overall speedup for the upgrade case I tested. It
> seems like even a relatively small cache size (1000 entries) gives
> quite a speedup, even though in the use_reduce case, the cache size
> eventually reaches almost 20,000 entries if no limit is set. With
> these two changes, adding caching to match_from_list didn't seem to
> make much/any difference.

That's great!

> The catch is that lru_cache is only available in Python 3.2, so would
> it make sense to add a dummy lru_cache implementation for Python < 3.2
> that does nothing? There is also a backports-functools-lru-cache
> package that's already available in the Portage tree, but that would
> add an additional external dependency.
>
> I agree that refactoring could yield an even bigger gain, but
> hopefully this can be implemented as an interim solution to speed up
> the common emerge case of resolving upgrades. I'm happy to submit new
> patches for this, if someone can suggest how to best handle the Python
> < 3.2 case. :)
>
> Thanks,
> Chun-Yu

We can safely drop support for < Python 3.6 at this point. Alternatively
we could add a compatibility shim for Python 2.7 that does not perform
any caching, but I really don't think it's worth the trouble to support
it any longer.
-- 
Thanks,
Zac

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 981 bytes --]
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-07-06 18:03 ` Zac Medico
@ 2020-07-07  3:41   ` Zac Medico
  2020-07-09  7:03     ` Chun-Yu Shei

0 siblings, 1 reply; 30+ messages in thread
From: Zac Medico @ 2020-07-07 3:41 UTC (permalink / raw)
To: gentoo-portage-dev, Chun-Yu Shei
Cc: Michael 'veremitz' Everitt, Sid Spry

[-- Attachment #1.1: Type: text/plain, Size: 1799 bytes --]

On 7/6/20 11:03 AM, Zac Medico wrote:
> On 7/6/20 10:30 AM, Chun-Yu Shei wrote:
>> I finally got a chance to try Sid's lru_cache suggestion, and the
>> results were really good. Simply adding it on catpkgsplit and moving
>> the body of use_reduce into a separate function (that accepts tuples
>> instead of unhashable lists/sets) and decorating it with lru_cache
>> gets a similar 40% overall speedup for the upgrade case I tested. It
>> seems like even a relatively small cache size (1000 entries) gives
>> quite a speedup, even though in the use_reduce case, the cache size
>> eventually reaches almost 20,000 entries if no limit is set. With
>> these two changes, adding caching to match_from_list didn't seem to
>> make much/any difference.
>
> That's great!
>
>> The catch is that lru_cache is only available in Python 3.2, so would
>> it make sense to add a dummy lru_cache implementation for Python < 3.2
>> that does nothing? There is also a backports-functools-lru-cache
>> package that's already available in the Portage tree, but that would
>> add an additional external dependency.
>>
>> I agree that refactoring could yield an even bigger gain, but
>> hopefully this can be implemented as an interim solution to speed up
>> the common emerge case of resolving upgrades. I'm happy to submit new
>> patches for this, if someone can suggest how to best handle the Python
>> < 3.2 case. :)
>>
>> Thanks,
>> Chun-Yu
>
> We can safely drop support for < Python 3.6 at this point. Alternatively
> we could add a compatibility shim for Python 2.7 that does not perform
> any caching, but I really don't think it's worth the trouble to support
> it any longer.

We've dropped Python 2.7, so now the minimum version is Python 3.6.
-- 
Thanks,
Zac

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 981 bytes --]
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function
  2020-07-07  3:41 ` Zac Medico
@ 2020-07-09  7:03   ` Chun-Yu Shei
  2020-07-09  7:03     ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei
  2020-07-09 21:04     ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Alec Warner

0 siblings, 2 replies; 30+ messages in thread
From: Chun-Yu Shei @ 2020-07-09 7:03 UTC (permalink / raw)
To: gentoo-portage-dev

Awesome! Here's a patch that adds @lru_cache to use_reduce, vercmp, and
catpkgsplit. use_reduce was split into 2 functions, with the outer one
converting lists/sets to tuples so they can be hashed and creating a
copy of the returned list (since the caller seems to modify it
sometimes). I tried to select cache sizes that minimized the memory use
increase, while still providing about the same speedup compared to a
cache with unbounded size.

"emerge -uDvpU --with-bdeps=y @world" runtime decreases from 44.32s ->
29.94s -- a 48% speedup, while the maximum value of the RES column in
htop increases from 280 MB -> 290 MB. "emerge -ep @world" time slightly
decreases from 18.77s -> 17.93s, while the max observed RES value
actually decreases from 228 MB -> 214 MB (similar values observed across
a few before/after runs).

Here are the cache hit stats, max observed RES memory, and runtime in
seconds for various cache sizes in the update case. Caching for each
function was tested independently (only 1 function with caching enabled
at a time):

catpkgsplit:
CacheInfo(hits=1222233, misses=21419, maxsize=None, currsize=21419)   270 MB  39.217
CacheInfo(hits=1218900, misses=24905, maxsize=10000, currsize=10000)  271 MB  39.112
CacheInfo(hits=1212675, misses=31022, maxsize=5000, currsize=5000)    271 MB  39.217
CacheInfo(hits=1207879, misses=35878, maxsize=2500, currsize=2500)    269 MB  39.438
CacheInfo(hits=1199402, misses=44250, maxsize=1000, currsize=1000)    271 MB  39.348
CacheInfo(hits=1149150, misses=94610, maxsize=100, currsize=100)      271 MB  39.487

use_reduce:
CacheInfo(hits=45326, misses=18660, maxsize=None, currsize=18561)     407 MB  35.77
CacheInfo(hits=45186, misses=18800, maxsize=10000, currsize=10000)    353 MB  35.52
CacheInfo(hits=44977, misses=19009, maxsize=5000, currsize=5000)      335 MB  35.31
CacheInfo(hits=44691, misses=19295, maxsize=2500, currsize=2500)      318 MB  35.85
CacheInfo(hits=44178, misses=19808, maxsize=1000, currsize=1000)      301 MB  36.39
CacheInfo(hits=41211, misses=22775, maxsize=100, currsize=100)        299 MB  37.175

I didn't bother collecting detailed stats for vercmp, since the
inputs/outputs are quite small and don't cause much memory increase.

Please let me know if there are any other suggestions/improvements (and
thanks, Sid, for the lru_cache suggestion!).

Thanks,
Chun-Yu
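Statistics like the tables above come from the `cache_info()` method that lru_cache attaches to the wrapped function; a minimal sketch of collecting them (the numeric version comparison below is a simplified stand-in, not portage's vercmp):

```python
from functools import lru_cache

@lru_cache(maxsize=100)
def vercmp_demo(ver1, ver2):
    # Toy comparison of dotted numeric versions: -1, 0, or 1.
    a = [int(p) for p in ver1.split(".")]
    b = [int(p) for p in ver2.split(".")]
    return (a > b) - (a < b)

for _ in range(3):
    vercmp_demo("1.2.3", "1.2.10")

info = vercmp_demo.cache_info()  # one miss on the first call, then hits
hit_rate = info.hits / (info.hits + info.misses)
```

Re-running a workload while varying only `maxsize` and recording `cache_info()` plus peak RSS is exactly how tables like the ones above can be produced.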
* [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit
  2020-07-09  7:03 ` Chun-Yu Shei
@ 2020-07-09  7:03   ` Chun-Yu Shei
  2020-07-12 21:46     ` Zac Medico

1 sibling, 1 reply; 30+ messages in thread
From: Chun-Yu Shei @ 2020-07-09 7:03 UTC (permalink / raw)
To: gentoo-portage-dev; +Cc: Chun-Yu Shei

Each of these functions is called repeatedly with the same arguments
many times. Cache sizes were selected to minimize the memory use
increase, while still providing about the same speedup compared to a
cache with unbounded size.

"emerge -uDvpU --with-bdeps=y @world" runtime decreases from 44.32s ->
29.94s -- a 48% speedup, while the maximum value of the RES column in
htop increases from 280 MB -> 290 MB. "emerge -ep @world" time slightly
decreases from 18.77s -> 17.93s, while the max observed RES value
actually decreases from 228 MB -> 214 MB (similar values observed across
a few before/after runs).
---
 lib/portage/dep/__init__.py | 106 +++++++++++++++++++++---------------
 lib/portage/versions.py     |   3 +
 2 files changed, 66 insertions(+), 43 deletions(-)

diff --git a/lib/portage/dep/__init__.py b/lib/portage/dep/__init__.py
index 72988357a..4d91a411a 100644
--- a/lib/portage/dep/__init__.py
+++ b/lib/portage/dep/__init__.py
@@ -23,6 +23,7 @@ portage.proxy.lazyimport.lazyimport(globals(),
 	'portage.util:cmp_sort_key,writemsg',
 )
 
+from functools import lru_cache
 from portage import _encodings, _unicode_decode, _unicode_encode
 from portage.eapi import _get_eapi_attrs
 from portage.exception import InvalidAtom, InvalidData, InvalidDependString
@@ -404,49 +405,9 @@ def paren_enclose(mylist, unevaluated_atom=False, opconvert=False):
 			mystrparts.append(x)
 	return " ".join(mystrparts)
 
-def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \
-	eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False,
-	subset=None):
-	"""
-	Takes a dep string and reduces the use? conditionals out, leaving an array
-	with subarrays. All redundant brackets are removed.
-
-	@param depstr: depstring
-	@type depstr: String
-	@param uselist: Sequence of use enabled flags
-	@type uselist: Sequence
-	@param masklist: Sequence of masked flags (always treated as disabled)
-	@type masklist: Sequence
-	@param matchall: Treat all conditionals as active. Used by repoman.
-	@type matchall: Bool
-	@param excludeall: Sequence of flags for which negated conditionals are always treated as inactive.
-	@type excludeall: Sequence
-	@param is_src_uri: Indicates if depstr represents a SRC_URI
-	@type is_src_uri: Bool
-	@param eapi: Indicates the EAPI the dep string has to comply to
-	@type eapi: String
-	@param opconvert: Put every operator as first element into it's argument list
-	@type opconvert: Bool
-	@param flat: Create a flat list of all tokens
-	@type flat: Bool
-	@param is_valid_flag: Function that decides if a given use flag might be used in use conditionals
-	@type is_valid_flag: Function
-	@param token_class: Convert all non operator tokens into this class
-	@type token_class: Class
-	@param matchnone: Treat all conditionals as inactive. Used by digestgen().
-	@type matchnone: Bool
-	@param subset: Select a subset of dependencies conditional on the given flags
-	@type subset: Sequence
-	@rtype: List
-	@return: The use reduced depend array
-	"""
-	if isinstance(depstr, list):
-		if portage._internal_caller:
-			warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. " + \
-				"Pass the original dep string instead.") % \
-				('portage.dep.use_reduce',), DeprecationWarning, stacklevel=2)
-		depstr = paren_enclose(depstr)
-
+@lru_cache(1024)
+def use_reduce_cached(depstr, uselist, masklist, matchall, excludeall, is_src_uri, eapi, \
+	opconvert, flat, is_valid_flag, token_class, matchnone, subset):
 	if opconvert and flat:
 		raise ValueError("portage.dep.use_reduce: 'opconvert' and 'flat' are mutually exclusive")
 
@@ -769,6 +730,65 @@ def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), i
 
 	return stack[0]
 
+def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \
+	eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False,
+	subset=None):
+	"""
+	Takes a dep string and reduces the use? conditionals out, leaving an array
+	with subarrays. All redundant brackets are removed.
+
+	@param depstr: depstring
+	@type depstr: String
+	@param uselist: Sequence of use enabled flags
+	@type uselist: Sequence
+	@param masklist: Sequence of masked flags (always treated as disabled)
+	@type masklist: Sequence
+	@param matchall: Treat all conditionals as active. Used by repoman.
+	@type matchall: Bool
+	@param excludeall: Sequence of flags for which negated conditionals are always treated as inactive.
+	@type excludeall: Sequence
+	@param is_src_uri: Indicates if depstr represents a SRC_URI
+	@type is_src_uri: Bool
+	@param eapi: Indicates the EAPI the dep string has to comply to
+	@type eapi: String
+	@param opconvert: Put every operator as first element into it's argument list
+	@type opconvert: Bool
+	@param flat: Create a flat list of all tokens
+	@type flat: Bool
+	@param is_valid_flag: Function that decides if a given use flag might be used in use conditionals
+	@type is_valid_flag: Function
+	@param token_class: Convert all non operator tokens into this class
+	@type token_class: Class
+	@param matchnone: Treat all conditionals as inactive. Used by digestgen().
+	@type matchnone: Bool
+	@param subset: Select a subset of dependencies conditional on the given flags
+	@type subset: Sequence
+	@rtype: List
+	@return: The use reduced depend array
+	"""
+	if isinstance(depstr, list):
+		if portage._internal_caller:
+			warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. " + \
+				"Pass the original dep string instead.") % \
+				('portage.dep.use_reduce',), DeprecationWarning, stacklevel=2)
+		depstr = paren_enclose(depstr)
+
+	if uselist is not None:
+		uselist = tuple(uselist)
+	if masklist is not None:
+		masklist = tuple(masklist)
+	if excludeall is not None:
+		excludeall = tuple(excludeall)
+	if subset is not None:
+		subset = tuple(subset)
+
+	result = use_reduce_cached(depstr, uselist, masklist, matchall, excludeall, \
+		is_src_uri, eapi, opconvert, flat, is_valid_flag, token_class, \
+		matchnone, subset)
+
+	# The list returned by this function may be modified, so return a copy.
+	return result[:]
+
 def dep_opconvert(deplist):
 	"""
 	Iterate recursively through a list of deps, if the
diff --git a/lib/portage/versions.py b/lib/portage/versions.py
index 0c21373cc..9ede67230 100644
--- a/lib/portage/versions.py
+++ b/lib/portage/versions.py
@@ -25,6 +25,7 @@ portage.proxy.lazyimport.lazyimport(globals(),
 	'portage.repository.config:_gen_valid_repo',
 	'portage.util:cmp_sort_key',
 )
+from functools import lru_cache
 from portage import _unicode_decode
 from portage.eapi import _get_eapi_attrs
 from portage.exception import InvalidData
@@ -116,6 +117,7 @@ def ververify(myver, silent=1):
 		print(_("!!! syntax error in version: %s") % myver)
 		return False
 
+@lru_cache(1024)
 def vercmp(ver1, ver2, silent=1):
 	"""
 	Compare two versions
@@ -313,6 +315,7 @@ def _pkgsplit(mypkg, eapi=None):
 
 _cat_re = re.compile('^%s$' % _cat, re.UNICODE)
 _missing_cat = 'null'
+@lru_cache(10240)
 def catpkgsplit(mydata, silent=1, eapi=None):
 	"""
 	Takes a Category/Package-Version-Rev and returns a list of each.
-- 
2.27.0.383.g050319c2ae-goog
* Re: [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit
  2020-07-09  7:03 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei
@ 2020-07-12 21:46   ` Zac Medico
  2020-07-13  6:30     ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, Chun-Yu Shei

0 siblings, 1 reply; 30+ messages in thread
From: Zac Medico @ 2020-07-12 21:46 UTC (permalink / raw)
To: gentoo-portage-dev, Chun-Yu Shei

[-- Attachment #1.1: Type: text/plain, Size: 2476 bytes --]

On 7/9/20 12:03 AM, Chun-Yu Shei wrote:
> +def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \
> +	eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False,
> +	subset=None):
> +	"""
> +	Takes a dep string and reduces the use? conditionals out, leaving an array
> +	with subarrays. All redundant brackets are removed.
> +
> +	@param depstr: depstring
> +	@type depstr: String
> +	@param uselist: Sequence of use enabled flags
> +	@type uselist: Sequence
> +	@param masklist: Sequence of masked flags (always treated as disabled)
> +	@type masklist: Sequence
> +	@param matchall: Treat all conditionals as active. Used by repoman.
> +	@type matchall: Bool
> +	@param excludeall: Sequence of flags for which negated conditionals are always treated as inactive.
> +	@type excludeall: Sequence
> +	@param is_src_uri: Indicates if depstr represents a SRC_URI
> +	@type is_src_uri: Bool
> +	@param eapi: Indicates the EAPI the dep string has to comply to
> +	@type eapi: String
> +	@param opconvert: Put every operator as first element into it's argument list
> +	@type opconvert: Bool
> +	@param flat: Create a flat list of all tokens
> +	@type flat: Bool
> +	@param is_valid_flag: Function that decides if a given use flag might be used in use conditionals
> +	@type is_valid_flag: Function
> +	@param token_class: Convert all non operator tokens into this class
> +	@type token_class: Class
> +	@param matchnone: Treat all conditionals as inactive. Used by digestgen().
> +	@type matchnone: Bool
> +	@param subset: Select a subset of dependencies conditional on the given flags
> +	@type subset: Sequence
> +	@rtype: List
> +	@return: The use reduced depend array
> +	"""
> +	if isinstance(depstr, list):
> +		if portage._internal_caller:
> +			warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. " + \
> +				"Pass the original dep string instead.") % \
> +				('portage.dep.use_reduce',), DeprecationWarning, stacklevel=2)
> +		depstr = paren_enclose(depstr)
> +
> +	if uselist is not None:
> +		uselist = tuple(uselist)
> +	if masklist is not None:
> +		masklist = tuple(masklist)
> +	if excludeall is not None:
> +		excludeall = tuple(excludeall)
> +	if subset is not None:
> +		subset = tuple(subset)

The patch looks great, but maybe it's better if we use frozenset instead
of tuple for these.
-- 
Thanks,
Zac

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 981 bytes --]
* Re: [gentoo-portage-dev] [PATCH] Add caching to use_reduce,
  2020-07-12 21:46 ` Zac Medico
@ 2020-07-13  6:30   ` Chun-Yu Shei
  2020-07-13  6:30     ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei

0 siblings, 1 reply; 30+ messages in thread
From: Chun-Yu Shei @ 2020-07-13 6:30 UTC (permalink / raw)
To: gentoo-portage-dev

Sounds good, here's an updated patch that uses frozenset instead. I also
moved the lru_cache imports up with the other system imports and renamed
use_reduce_cached to _use_reduce_cached to fit the existing convention.
Memory usage and performance are essentially unchanged with the tuple ->
frozenset change.

Thanks,
Chun-Yu
* [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit 2020-07-13 6:30 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, Chun-Yu Shei @ 2020-07-13 6:30 ` Chun-Yu Shei 2020-07-13 17:28 ` Zac Medico 0 siblings, 1 reply; 30+ messages in thread From: Chun-Yu Shei @ 2020-07-13 6:30 UTC (permalink / raw To: gentoo-portage-dev; +Cc: Chun-Yu Shei Each of these functions is called repeatedly with the same arguments many times. Cache sizes were selected to minimize memory use increase, while still providing about the same speedup compared to a cache with unbounded size. "emerge -uDvpU --with-bdeps=y @world" runtime decreases from 44.32s -> 29.94s -- a 48% speedup, while the maximum value of the RES column in htop increases from 280 MB -> 290 MB. "emerge -ep @world" time slightly decreases from 18.77s -> 17.93, while max observed RES value actually decreases from 228 MB -> 214 MB (similar values observed across a few before/after runs). --- lib/portage/dep/__init__.py | 107 +++++++++++++++++++++--------------- lib/portage/versions.py | 3 + 2 files changed, 67 insertions(+), 43 deletions(-) diff --git a/lib/portage/dep/__init__.py b/lib/portage/dep/__init__.py index 72988357a..314338f7c 100644 --- a/lib/portage/dep/__init__.py +++ b/lib/portage/dep/__init__.py @@ -17,6 +17,7 @@ __all__ = [ import re, sys import warnings +from functools import lru_cache import portage portage.proxy.lazyimport.lazyimport(globals(), @@ -404,49 +405,10 @@ def paren_enclose(mylist, unevaluated_atom=False, opconvert=False): mystrparts.append(x) return " ".join(mystrparts) -def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \ - eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False, - subset=None): - """ - Takes a dep string and reduces the use? conditionals out, leaving an array - with subarrays. All redundant brackets are removed. 
- - @param depstr: depstring - @type depstr: String - @param uselist: Sequence of use enabled flags - @type uselist: Sequence - @param masklist: Sequence of masked flags (always treated as disabled) - @type masklist: Sequence - @param matchall: Treat all conditionals as active. Used by repoman. - @type matchall: Bool - @param excludeall: Sequence of flags for which negated conditionals are always treated as inactive. - @type excludeall: Sequence - @param is_src_uri: Indicates if depstr represents a SRC_URI - @type is_src_uri: Bool - @param eapi: Indicates the EAPI the dep string has to comply to - @type eapi: String - @param opconvert: Put every operator as first element into it's argument list - @type opconvert: Bool - @param flat: Create a flat list of all tokens - @type flat: Bool - @param is_valid_flag: Function that decides if a given use flag might be used in use conditionals - @type is_valid_flag: Function - @param token_class: Convert all non operator tokens into this class - @type token_class: Class - @param matchnone: Treat all conditionals as inactive. Used by digestgen(). - @type matchnone: Bool - @param subset: Select a subset of dependencies conditional on the given flags - @type subset: Sequence - @rtype: List - @return: The use reduced depend array - """ - if isinstance(depstr, list): - if portage._internal_caller: - warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. 
" + \ - "Pass the original dep string instead.") % \ - ('portage.dep.use_reduce',), DeprecationWarning, stacklevel=2) - depstr = paren_enclose(depstr) - +@lru_cache(1024) +def _use_reduce_cached(depstr, uselist, masklist, matchall, excludeall, \ + is_src_uri, eapi, opconvert, flat, is_valid_flag, token_class, \ + matchnone,subset): if opconvert and flat: raise ValueError("portage.dep.use_reduce: 'opconvert' and 'flat' are mutually exclusive") @@ -769,6 +731,65 @@ def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), i return stack[0] +def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \ + eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False, + subset=None): + """ + Takes a dep string and reduces the use? conditionals out, leaving an array + with subarrays. All redundant brackets are removed. + + @param depstr: depstring + @type depstr: String + @param uselist: Sequence of use enabled flags + @type uselist: Sequence + @param masklist: Sequence of masked flags (always treated as disabled) + @type masklist: Sequence + @param matchall: Treat all conditionals as active. Used by repoman. + @type matchall: Bool + @param excludeall: Sequence of flags for which negated conditionals are always treated as inactive. + @type excludeall: Sequence + @param is_src_uri: Indicates if depstr represents a SRC_URI + @type is_src_uri: Bool + @param eapi: Indicates the EAPI the dep string has to comply to + @type eapi: String + @param opconvert: Put every operator as first element into it's argument list + @type opconvert: Bool + @param flat: Create a flat list of all tokens + @type flat: Bool + @param is_valid_flag: Function that decides if a given use flag might be used in use conditionals + @type is_valid_flag: Function + @param token_class: Convert all non operator tokens into this class + @type token_class: Class + @param matchnone: Treat all conditionals as inactive. 
Used by digestgen(). + @type matchnone: Bool + @param subset: Select a subset of dependencies conditional on the given flags + @type subset: Sequence + @rtype: List + @return: The use reduced depend array + """ + if isinstance(depstr, list): + if portage._internal_caller: + warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. " + \ + "Pass the original dep string instead.") % \ + ('portage.dep.use_reduce',), DeprecationWarning, stacklevel=2) + depstr = paren_enclose(depstr) + + if uselist is not None: + uselist = frozenset(uselist) + if masklist is not None: + masklist = frozenset(masklist) + if excludeall is not None: + excludeall = frozenset(excludeall) + if subset is not None: + subset = frozenset(subset) + + result = _use_reduce_cached(depstr, uselist, masklist, matchall, \ + excludeall, is_src_uri, eapi, opconvert, flat, is_valid_flag, \ + token_class, matchnone, subset) + + # The list returned by this function may be modified, so return a copy. + return result[:] + def dep_opconvert(deplist): """ Iterate recursively through a list of deps, if the diff --git a/lib/portage/versions.py b/lib/portage/versions.py index 0c21373cc..100c8b84d 100644 --- a/lib/portage/versions.py +++ b/lib/portage/versions.py @@ -13,6 +13,7 @@ __all__ = [ import re import sys import warnings +from functools import lru_cache if sys.hexversion < 0x3000000: _unicode = unicode @@ -116,6 +117,7 @@ def ververify(myver, silent=1): print(_("!!! syntax error in version: %s") % myver) return False +@lru_cache(1024) def vercmp(ver1, ver2, silent=1): """ Compare two versions @@ -313,6 +315,7 @@ def _pkgsplit(mypkg, eapi=None): _cat_re = re.compile('^%s$' % _cat, re.UNICODE) _missing_cat = 'null' +@lru_cache(10240) def catpkgsplit(mydata, silent=1, eapi=None): """ Takes a Category/Package-Version-Rev and returns a list of each. -- 2.27.0.389.gc38d7665816-goog ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit 2020-07-13 6:30 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei @ 2020-07-13 17:28 ` Zac Medico 2020-07-13 18:54 ` Ulrich Mueller 0 siblings, 1 reply; 30+ messages in thread From: Zac Medico @ 2020-07-13 17:28 UTC (permalink / raw To: gentoo-portage-dev, Chun-Yu Shei [-- Attachment #1.1: Type: text/plain, Size: 979 bytes --] On 7/12/20 11:30 PM, Chun-Yu Shei wrote: > Each of these functions is called repeatedly with the same arguments > many times. Cache sizes were selected to minimize memory use increase, > while still providing about the same speedup compared to a cache with > unbounded size. "emerge -uDvpU --with-bdeps=y @world" runtime decreases > from 44.32s -> 29.94s -- a 48% speedup, while the maximum value of the > RES column in htop increases from 280 MB -> 290 MB. > > "emerge -ep @world" time slightly decreases from 18.77s -> 17.93, while > max observed RES value actually decreases from 228 MB -> 214 MB (similar > values observed across a few before/after runs). > --- > lib/portage/dep/__init__.py | 107 +++++++++++++++++++++--------------- > lib/portage/versions.py | 3 + > 2 files changed, 67 insertions(+), 43 deletions(-) Merged: https://gitweb.gentoo.org/proj/portage.git/commit/?id=d9ee5b09664ab2255b62c1d52d554721ef8b716a -- Thanks, Zac [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 981 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit 2020-07-13 17:28 ` Zac Medico @ 2020-07-13 18:54 ` Ulrich Mueller 2020-07-13 19:04 ` Chun-Yu Shei 0 siblings, 1 reply; 30+ messages in thread From: Ulrich Mueller @ 2020-07-13 18:54 UTC (permalink / raw To: Zac Medico; +Cc: gentoo-portage-dev, Chun-Yu Shei [-- Attachment #1: Type: text/plain, Size: 209 bytes --] >>>>> On Mon, 13 Jul 2020, Zac Medico wrote: > Merged: > https://gitweb.gentoo.org/proj/portage.git/commit/?id=d9ee5b09664ab2255b62c1d52d554721ef8b716a Looks like the author's copyright signoff is missing? [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 507 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit 2020-07-13 18:54 ` Ulrich Mueller @ 2020-07-13 19:04 ` Chun-Yu Shei 2020-07-13 19:24 ` Ulrich Mueller 0 siblings, 1 reply; 30+ messages in thread From: Chun-Yu Shei @ 2020-07-13 19:04 UTC (permalink / raw To: Ulrich Mueller; +Cc: Zac Medico, gentoo-portage-dev [-- Attachment #1: Type: text/plain, Size: 557 bytes --] Ah, I wasn't aware that I should have added that... I'm happy to say "Signed-off-by: Chun-Yu Shei <cshei@google.com>" somewhere if necessary. The patch has gone through Google's open source patching approval process and I'm able to agree to any CLA required. On Mon, Jul 13, 2020 at 11:54 AM Ulrich Mueller <ulm@gentoo.org> wrote: > >>>>> On Mon, 13 Jul 2020, Zac Medico wrote: > > > Merged: > > > > https://gitweb.gentoo.org/proj/portage.git/commit/?id=d9ee5b09664ab2255b62c1d52d554721ef8b716a > > Looks like the author's copyright signoff is missing? > [-- Attachment #2: Type: text/html, Size: 1105 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit 2020-07-13 19:04 ` Chun-Yu Shei @ 2020-07-13 19:24 ` Ulrich Mueller 0 siblings, 0 replies; 30+ messages in thread From: Ulrich Mueller @ 2020-07-13 19:24 UTC (permalink / raw To: Chun-Yu Shei; +Cc: gentoo-portage-dev, Zac Medico [-- Attachment #1: Type: text/plain, Size: 666 bytes --] >>>>> On Mon, 13 Jul 2020, Chun-Yu Shei wrote: > Ah, I wasn't aware that I should have added that... I'm happy to say > "Signed-off-by: Chun-Yu Shei <cshei@google.com>" somewhere if > necessary. Should be enough to say it here, because this mailing list is archived. We could of course add an empty commit with "Fixes" and "Signed-off-by", so we'd have the paper trail in the repo. > The patch has gone through Google's open source patching approval > process and I'm able to agree to any CLA required. We don't require a CLA. As long as you can signoff the copyright, things are fine. See https://www.gentoo.org/glep/glep-0076.html for all the details. Ulrich [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 507 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function 2020-07-09 7:03 ` Chun-Yu Shei 2020-07-09 7:03 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei @ 2020-07-09 21:04 ` Alec Warner 2020-07-09 21:06 ` Chun-Yu Shei 1 sibling, 1 reply; 30+ messages in thread From: Alec Warner @ 2020-07-09 21:04 UTC (permalink / raw To: gentoo-portage-dev [-- Attachment #1: Type: text/plain, Size: 2577 bytes --] On Thu, Jul 9, 2020 at 12:03 AM Chun-Yu Shei <cshei@google.com> wrote: > Awesome! Here's a patch that adds @lru_cache to use_reduce, vercmp, and > catpkgsplit. use_reduce was split into 2 functions, with the outer one > converting lists/sets to tuples so they can be hashed and creating a > copy of the returned list (since the caller seems to modify it > sometimes). I tried to select cache sizes that minimized memory use > increase, > while still providing about the same speedup compared to a cache with > unbounded size. "emerge -uDvpU --with-bdeps=y @world" runtime decreases > from 44.32s -> 29.94s -- a 48% speedup, while the maximum value of the > RES column in htop increases from 280 MB -> 290 MB. > > "emerge -ep @world" time slightly decreases from 18.77s -> 17.93, while > max observed RES value actually decreases from 228 MB -> 214 MB (similar > values observed across a few before/after runs). > > Here are the cache hit stats, max observed RES memory, and runtime in > seconds for various sizes in the update case. 
Caching for each > function was tested independently (only 1 function with caching enabled > at a time): > > catpkgsplit: > CacheInfo(hits=1222233, misses=21419, maxsize=None, currsize=21419) > 270 MB > 39.217 > > CacheInfo(hits=1218900, misses=24905, maxsize=10000, currsize=10000) > 271 MB > 39.112 > > CacheInfo(hits=1212675, misses=31022, maxsize=5000, currsize=5000) > 271 MB > 39.217 > > CacheInfo(hits=1207879, misses=35878, maxsize=2500, currsize=2500) > 269 MB > 39.438 > > CacheInfo(hits=1199402, misses=44250, maxsize=1000, currsize=1000) > 271 MB > 39.348 > > CacheInfo(hits=1149150, misses=94610, maxsize=100, currsize=100) > 271 MB > 39.487 > > > use_reduce: > CacheInfo(hits=45326, misses=18660, maxsize=None, currsize=18561) > 407 MB > 35.77 > > CacheInfo(hits=45186, misses=18800, maxsize=10000, currsize=10000) > 353 MB > 35.52 > > CacheInfo(hits=44977, misses=19009, maxsize=5000, currsize=5000) > 335 MB > 35.31 > > CacheInfo(hits=44691, misses=19295, maxsize=2500, currsize=2500) > 318 MB > 35.85 > > CacheInfo(hits=44178, misses=19808, maxsize=1000, currsize=1000) > 301 MB > 36.39 > > CacheInfo(hits=41211, misses=22775, maxsize=100, currsize=100) > 299 MB > 37.175 > > > I didn't bother collecting detailed stats for vercmp, since the > inputs/outputs are quite small and don't cause much memory increase. > Please let me know if there are any other suggestions/improvements (and > thanks Sid for the lru_cache suggestion!). > I don't see a patch attached; can you link to it? -A > > Thanks, > Chun-Yu > > > > [-- Attachment #2: Type: text/html, Size: 3384 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function 2020-07-09 21:04 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Alec Warner @ 2020-07-09 21:06 ` Chun-Yu Shei 2020-07-09 21:13 ` Alec Warner 0 siblings, 1 reply; 30+ messages in thread From: Chun-Yu Shei @ 2020-07-09 21:06 UTC (permalink / raw To: gentoo-portage-dev [-- Attachment #1: Type: text/plain, Size: 2979 bytes --] Hmm, that's strange... it seems to have made it to the list archives: https://archives.gentoo.org/gentoo-portage-dev/message/a4db905a64e3c1f6d88c4876e8291a65 (but it is entirely possible that I used "git send-email" incorrectly) On Thu, Jul 9, 2020 at 2:04 PM Alec Warner <antarus@gentoo.org> wrote: > > > On Thu, Jul 9, 2020 at 12:03 AM Chun-Yu Shei <cshei@google.com> wrote: > >> Awesome! Here's a patch that adds @lru_cache to use_reduce, vercmp, and >> catpkgsplit. use_reduce was split into 2 functions, with the outer one >> converting lists/sets to tuples so they can be hashed and creating a >> copy of the returned list (since the caller seems to modify it >> sometimes). I tried to select cache sizes that minimized memory use >> increase, >> while still providing about the same speedup compared to a cache with >> unbounded size. "emerge -uDvpU --with-bdeps=y @world" runtime decreases >> from 44.32s -> 29.94s -- a 48% speedup, while the maximum value of the >> RES column in htop increases from 280 MB -> 290 MB. >> >> "emerge -ep @world" time slightly decreases from 18.77s -> 17.93, while >> max observed RES value actually decreases from 228 MB -> 214 MB (similar >> values observed across a few before/after runs). >> >> Here are the cache hit stats, max observed RES memory, and runtime in >> seconds for various sizes in the update case. 
Caching for each >> function was tested independently (only 1 function with caching enabled >> at a time): >> >> catpkgsplit: >> CacheInfo(hits=1222233, misses=21419, maxsize=None, currsize=21419) >> 270 MB >> 39.217 >> >> CacheInfo(hits=1218900, misses=24905, maxsize=10000, currsize=10000) >> 271 MB >> 39.112 >> >> CacheInfo(hits=1212675, misses=31022, maxsize=5000, currsize=5000) >> 271 MB >> 39.217 >> >> CacheInfo(hits=1207879, misses=35878, maxsize=2500, currsize=2500) >> 269 MB >> 39.438 >> >> CacheInfo(hits=1199402, misses=44250, maxsize=1000, currsize=1000) >> 271 MB >> 39.348 >> >> CacheInfo(hits=1149150, misses=94610, maxsize=100, currsize=100) >> 271 MB >> 39.487 >> >> >> use_reduce: >> CacheInfo(hits=45326, misses=18660, maxsize=None, currsize=18561) >> 407 MB >> 35.77 >> >> CacheInfo(hits=45186, misses=18800, maxsize=10000, currsize=10000) >> 353 MB >> 35.52 >> >> CacheInfo(hits=44977, misses=19009, maxsize=5000, currsize=5000) >> 335 MB >> 35.31 >> >> CacheInfo(hits=44691, misses=19295, maxsize=2500, currsize=2500) >> 318 MB >> 35.85 >> >> CacheInfo(hits=44178, misses=19808, maxsize=1000, currsize=1000) >> 301 MB >> 36.39 >> >> CacheInfo(hits=41211, misses=22775, maxsize=100, currsize=100) >> 299 MB >> 37.175 >> >> >> I didn't bother collecting detailed stats for vercmp, since the >> inputs/outputs are quite small and don't cause much memory increase. >> Please let me know if there are any other suggestions/improvements (and >> thanks Sid for the lru_cache suggestion!). >> > > I don't see a patch attached; can you link to it? > > -A > > >> >> Thanks, >> Chun-Yu >> >> >> >> [-- Attachment #2: Type: text/html, Size: 4136 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function 2020-07-09 21:06 ` Chun-Yu Shei @ 2020-07-09 21:13 ` Alec Warner 0 siblings, 0 replies; 30+ messages in thread From: Alec Warner @ 2020-07-09 21:13 UTC (permalink / raw To: gentoo-portage-dev [-- Attachment #1: Type: text/plain, Size: 3209 bytes --] On Thu, Jul 9, 2020 at 2:06 PM Chun-Yu Shei <cshei@google.com> wrote: > Hmm, that's strange... it seems to have made it to the list archives: > https://archives.gentoo.org/gentoo-portage-dev/message/a4db905a64e3c1f6d88c4876e8291a65 > > (but it is entirely possible that I used "git send-email" incorrectly) > Ahhh it's visible there; I'll blame gMail ;) -A > > On Thu, Jul 9, 2020 at 2:04 PM Alec Warner <antarus@gentoo.org> wrote: > >> >> >> On Thu, Jul 9, 2020 at 12:03 AM Chun-Yu Shei <cshei@google.com> wrote: >> >>> Awesome! Here's a patch that adds @lru_cache to use_reduce, vercmp, and >>> catpkgsplit. use_reduce was split into 2 functions, with the outer one >>> converting lists/sets to tuples so they can be hashed and creating a >>> copy of the returned list (since the caller seems to modify it >>> sometimes). I tried to select cache sizes that minimized memory use >>> increase, >>> while still providing about the same speedup compared to a cache with >>> unbounded size. "emerge -uDvpU --with-bdeps=y @world" runtime decreases >>> from 44.32s -> 29.94s -- a 48% speedup, while the maximum value of the >>> RES column in htop increases from 280 MB -> 290 MB. >>> >>> "emerge -ep @world" time slightly decreases from 18.77s -> 17.93, while >>> max observed RES value actually decreases from 228 MB -> 214 MB (similar >>> values observed across a few before/after runs). >>> >>> Here are the cache hit stats, max observed RES memory, and runtime in >>> seconds for various sizes in the update case. 
Caching for each >>> function was tested independently (only 1 function with caching enabled >>> at a time): >>> >>> catpkgsplit: >>> CacheInfo(hits=1222233, misses=21419, maxsize=None, currsize=21419) >>> 270 MB >>> 39.217 >>> >>> CacheInfo(hits=1218900, misses=24905, maxsize=10000, currsize=10000) >>> 271 MB >>> 39.112 >>> >>> CacheInfo(hits=1212675, misses=31022, maxsize=5000, currsize=5000) >>> 271 MB >>> 39.217 >>> >>> CacheInfo(hits=1207879, misses=35878, maxsize=2500, currsize=2500) >>> 269 MB >>> 39.438 >>> >>> CacheInfo(hits=1199402, misses=44250, maxsize=1000, currsize=1000) >>> 271 MB >>> 39.348 >>> >>> CacheInfo(hits=1149150, misses=94610, maxsize=100, currsize=100) >>> 271 MB >>> 39.487 >>> >>> >>> use_reduce: >>> CacheInfo(hits=45326, misses=18660, maxsize=None, currsize=18561) >>> 407 MB >>> 35.77 >>> >>> CacheInfo(hits=45186, misses=18800, maxsize=10000, currsize=10000) >>> 353 MB >>> 35.52 >>> >>> CacheInfo(hits=44977, misses=19009, maxsize=5000, currsize=5000) >>> 335 MB >>> 35.31 >>> >>> CacheInfo(hits=44691, misses=19295, maxsize=2500, currsize=2500) >>> 318 MB >>> 35.85 >>> >>> CacheInfo(hits=44178, misses=19808, maxsize=1000, currsize=1000) >>> 301 MB >>> 36.39 >>> >>> CacheInfo(hits=41211, misses=22775, maxsize=100, currsize=100) >>> 299 MB >>> 37.175 >>> >>> >>> I didn't bother collecting detailed stats for vercmp, since the >>> inputs/outputs are quite small and don't cause much memory increase. >>> Please let me know if there are any other suggestions/improvements (and >>> thanks Sid for the lru_cache suggestion!). >>> >> >> I don't see a patch attached; can you link to it? >> >> -A >> >> >>> >>> Thanks, >>> Chun-Yu >>> >>> >>> >>> [-- Attachment #2: Type: text/html, Size: 4780 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* [gentoo-portage-dev] [PATCH 2/3] Add caching to use_reduce function 2020-06-27 6:34 [gentoo-portage-dev] Add caching to a few commonly used functions Chun-Yu Shei 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Chun-Yu Shei @ 2020-06-27 6:34 ` Chun-Yu Shei 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 3/3] Add partial caching to match_from_list Chun-Yu Shei ` (2 subsequent siblings) 4 siblings, 0 replies; 30+ messages in thread From: Chun-Yu Shei @ 2020-06-27 6:34 UTC (permalink / raw To: gentoo-portage-dev; +Cc: Chun-Yu Shei This function is called extremely frequently with similar arguments, so this optimization reduces "emerge -uDvpU --with-bdeps=y @world" runtime from 43.5 -> 34.5s -- a 25.8% speedup. --- lib/portage/dep/__init__.py | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/lib/portage/dep/__init__.py b/lib/portage/dep/__init__.py index 72988357a..df296dd81 100644 --- a/lib/portage/dep/__init__.py +++ b/lib/portage/dep/__init__.py @@ -404,6 +404,8 @@ def paren_enclose(mylist, unevaluated_atom=False, opconvert=False): mystrparts.append(x) return " ".join(mystrparts) +_use_reduce_cache = {} + def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), is_src_uri=False, \ eapi=None, opconvert=False, flat=False, is_valid_flag=None, token_class=None, matchnone=False, subset=None): @@ -440,6 +442,27 @@ def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), i @rtype: List @return: The use reduced depend array """ + uselist_key = None + masklist_key = None + excludeall_key = None + subset_key = None + if uselist is not None: + uselist_key = tuple(uselist) + if masklist is not None: + masklist_key = tuple(masklist) + if excludeall is not None: + excludeall_key = tuple(excludeall) + if subset is not None: + subset_key = tuple(subset) + cache_key = (depstr, uselist_key, masklist_key, matchall, excludeall_key, \ + is_src_uri, eapi, opconvert, flat, 
is_valid_flag, token_class, \ + matchnone, subset_key) + + cache_entry = _use_reduce_cache.get(cache_key) + if cache_entry is not None: + # The list returned by this function may be modified, so return a copy. + return cache_entry[:] + if isinstance(depstr, list): if portage._internal_caller: warnings.warn(_("Passing paren_reduced dep arrays to %s is deprecated. " + \ @@ -767,6 +790,9 @@ def use_reduce(depstr, uselist=(), masklist=(), matchall=False, excludeall=(), i raise InvalidDependString( _("Missing file name at end of string")) + # The list returned by this function may be modified, so store a copy. + _use_reduce_cache[cache_key] = stack[0][:] + return stack[0] def dep_opconvert(deplist): -- 2.27.0.212.ge8ba1cc988-goog ^ permalink raw reply related [flat|nested] 30+ messages in thread
* [gentoo-portage-dev] [PATCH 3/3] Add partial caching to match_from_list 2020-06-27 6:34 [gentoo-portage-dev] Add caching to a few commonly used functions Chun-Yu Shei 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Chun-Yu Shei 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 2/3] Add caching to use_reduce function Chun-Yu Shei @ 2020-06-27 6:34 ` Chun-Yu Shei 2020-06-27 7:35 ` [gentoo-portage-dev] Add caching to a few commonly used functions Fabian Groffen 2020-06-28 3:00 ` Zac Medico 4 siblings, 0 replies; 30+ messages in thread From: Chun-Yu Shei @ 2020-06-27 6:34 UTC (permalink / raw To: gentoo-portage-dev; +Cc: Chun-Yu Shei This function is called frequently with similar arguments, so cache as much of the partial results as possible. It seems that "match_from_list" must return a list containing actual entries from the "candidate_list" argument, so we store frozensets in "_match_from_list_cache" and test items from "candidate_list" for membership in these sets. The filtering performed by the mydep.unevaluated_atom.use and mydep.repo checks towards the end of the function is also not cached, since this causes some test failures. This results in a reduction of "emerge -uDvpU --with-bdeps=y @world" runtime from 43.53 -> 40.15 sec -- an 8.4% speedup. --- lib/portage/dep/__init__.py | 359 +++++++++++++++++++----------------- 1 file changed, 189 insertions(+), 170 deletions(-) diff --git a/lib/portage/dep/__init__.py b/lib/portage/dep/__init__.py index df296dd81..dbd23bb23 100644 --- a/lib/portage/dep/__init__.py +++ b/lib/portage/dep/__init__.py @@ -2174,6 +2174,8 @@ def best_match_to_list(mypkg, mylist): return bestm +_match_from_list_cache = {} + def match_from_list(mydep, candidate_list): """ Searches list for entries that matches the package. 
@@ -2197,209 +2199,226 @@ def match_from_list(mydep, candidate_list): if not isinstance(mydep, Atom): mydep = Atom(mydep, allow_wildcard=True, allow_repo=True) - mycpv = mydep.cpv - mycpv_cps = catpkgsplit(mycpv) # Can be None if not specific - build_id = mydep.build_id + cache_key = (mydep, tuple(candidate_list)) + key_has_hash = True + cache_entry = None + if mydep.build_id is None and key_has_hash: + try: + cache_entry = _match_from_list_cache.get(cache_key) + except TypeError: + key_has_hash = False - if not mycpv_cps: - ver = None - rev = None - else: - cat, pkg, ver, rev = mycpv_cps - if mydep == mycpv: - raise KeyError(_("Specific key requires an operator" - " (%s) (try adding an '=')") % (mydep)) - - if ver and rev: - operator = mydep.operator - if not operator: - writemsg(_("!!! Invalid atom: %s\n") % mydep, noiselevel=-1) - return [] + if cache_entry is not None: + # Note: the list returned by this function must contain actual entries + # from "candidate_list", so store frozensets in "_match_from_list_cache" + # and test items from "candidate_list" for membership in these sets. + mylist = [x for x in candidate_list if x in cache_entry] else: - operator = None - - mylist = [] + mycpv = mydep.cpv + mycpv_cps = catpkgsplit(mycpv) # Can be None if not specific + build_id = mydep.build_id - if mydep.extended_syntax: + if not mycpv_cps: + ver = None + rev = None + else: + cat, pkg, ver, rev = mycpv_cps + if mydep == mycpv: + raise KeyError(_("Specific key requires an operator" + " (%s) (try adding an '=')") % (mydep)) - for x in candidate_list: - cp = getattr(x, "cp", None) - if cp is None: - mysplit = catpkgsplit(remove_slot(x)) - if mysplit is not None: - cp = mysplit[0] + '/' + mysplit[1] + if ver and rev: + operator = mydep.operator + if not operator: + writemsg(_("!!! 
Invalid atom: %s\n") % mydep, noiselevel=-1) + return [] + else: + operator = None - if cp is None: - continue + mylist = [] - if cp == mycpv or extended_cp_match(mydep.cp, cp): - mylist.append(x) + if mydep.extended_syntax: - if mylist and mydep.operator == "=*": + for x in candidate_list: + cp = getattr(x, "cp", None) + if cp is None: + mysplit = catpkgsplit(remove_slot(x)) + if mysplit is not None: + cp = mysplit[0] + '/' + mysplit[1] - candidate_list = mylist - mylist = [] - # Currently, only \*\w+\* is supported. - ver = mydep.version[1:-1] + if cp is None: + continue - for x in candidate_list: - x_ver = getattr(x, "version", None) - if x_ver is None: - xs = catpkgsplit(remove_slot(x)) - if xs is None: - continue - x_ver = "-".join(xs[-2:]) - if ver in x_ver: + if cp == mycpv or extended_cp_match(mydep.cp, cp): mylist.append(x) - elif operator is None: - for x in candidate_list: - cp = getattr(x, "cp", None) - if cp is None: - mysplit = catpkgsplit(remove_slot(x)) - if mysplit is not None: - cp = mysplit[0] + '/' + mysplit[1] + if mylist and mydep.operator == "=*": - if cp is None: - continue + candidate_list = mylist + mylist = [] + # Currently, only \*\w+\* is supported. 
+            ver = mydep.version[1:-1]
-            if cp == mydep.cp:
-                mylist.append(x)
+            for x in candidate_list:
+                x_ver = getattr(x, "version", None)
+                if x_ver is None:
+                    xs = catpkgsplit(remove_slot(x))
+                    if xs is None:
+                        continue
+                    x_ver = "-".join(xs[-2:])
+                if ver in x_ver:
+                    mylist.append(x)
-    elif operator == "=": # Exact match
-        for x in candidate_list:
-            xcpv = getattr(x, "cpv", None)
-            if xcpv is None:
-                xcpv = remove_slot(x)
-            if not cpvequal(xcpv, mycpv):
-                continue
-            if (build_id is not None and
-                getattr(xcpv, "build_id", None) != build_id):
-                continue
-            mylist.append(x)
+        elif operator is None:
+            for x in candidate_list:
+                cp = getattr(x, "cp", None)
+                if cp is None:
+                    mysplit = catpkgsplit(remove_slot(x))
+                    if mysplit is not None:
+                        cp = mysplit[0] + '/' + mysplit[1]
-    elif operator == "=*": # glob match
-        # XXX: Nasty special casing for leading zeros
-        # Required as =* is a literal prefix match, so can't
-        # use vercmp
-        myver = mycpv_cps[2].lstrip("0")
-        if not myver or not myver[0].isdigit():
-            myver = "0"+myver
-        if myver == mycpv_cps[2]:
-            mycpv_cmp = mycpv
-        else:
-            # Use replace to preserve the revision part if it exists
-            # (mycpv_cps[3] can't be trusted because in contains r0
-            # even when the input has no revision part).
-            mycpv_cmp = mycpv.replace(
-                mydep.cp + "-" + mycpv_cps[2],
-                mydep.cp + "-" + myver, 1)
-        for x in candidate_list:
-            try:
-                x.cp
-            except AttributeError:
-                try:
-                    pkg = _pkg_str(remove_slot(x))
-                except InvalidData:
+                    continue
-                else:
-                    pkg = x
-            xs = pkg.cpv_split
-            myver = xs[2].lstrip("0")
-            if not myver or not myver[0].isdigit():
-                myver = "0"+myver
-            if myver == xs[2]:
-                xcpv = pkg.cpv
-            else:
-                # Use replace to preserve the revision part if it exists.
-                xcpv = pkg.cpv.replace(
-                    pkg.cp + "-" + xs[2],
-                    pkg.cp + "-" + myver, 1)
-            if xcpv.startswith(mycpv_cmp):
-                # =* glob matches only on boundaries between version parts,
-                # so 1* does not match 10 (bug 560466).
-                next_char = xcpv[len(mycpv_cmp):len(mycpv_cmp)+1]
-                if (not next_char or next_char in '._-' or
-                    mycpv_cmp[-1].isdigit() != next_char.isdigit()):
+                if cp == mydep.cp:
+                    mylist.append(x)
-    elif operator == "~": # version, any revision, match
-        for x in candidate_list:
-            xs = getattr(x, "cpv_split", None)
-            if xs is None:
-                xs = catpkgsplit(remove_slot(x))
-                if xs is None:
-                    raise InvalidData(x)
-            if not cpvequal(xs[0]+"/"+xs[1]+"-"+xs[2], mycpv_cps[0]+"/"+mycpv_cps[1]+"-"+mycpv_cps[2]):
-                continue
-            if xs[2] != ver:
-                continue
-            mylist.append(x)
+        elif operator == "=": # Exact match
+            for x in candidate_list:
+                xcpv = getattr(x, "cpv", None)
+                if xcpv is None:
+                    xcpv = remove_slot(x)
+                if not cpvequal(xcpv, mycpv):
+                    continue
+                if (build_id is not None and
+                    getattr(xcpv, "build_id", None) != build_id):
+                    continue
+                mylist.append(x)
-    elif operator in (">", ">=", "<", "<="):
-        for x in candidate_list:
-            if hasattr(x, 'cp'):
-                pkg = x
+        elif operator == "=*": # glob match
+            # XXX: Nasty special casing for leading zeros
+            # Required as =* is a literal prefix match, so can't
+            # use vercmp
+            myver = mycpv_cps[2].lstrip("0")
+            if not myver or not myver[0].isdigit():
+                myver = "0"+myver
+            if myver == mycpv_cps[2]:
+                mycpv_cmp = mycpv
+            else:
+                # Use replace to preserve the revision part if it exists
+                # (mycpv_cps[3] can't be trusted because in contains r0
+                # even when the input has no revision part).
+                mycpv_cmp = mycpv.replace(
+                    mydep.cp + "-" + mycpv_cps[2],
+                    mydep.cp + "-" + myver, 1)
+            for x in candidate_list:
+                try:
-                    pkg = _pkg_str(remove_slot(x))
-                except InvalidData:
-                    continue
+                    x.cp
+                except AttributeError:
+                    try:
+                        pkg = _pkg_str(remove_slot(x))
+                    except InvalidData:
+                        continue
+                    else:
+                        pkg = x
+
+                xs = pkg.cpv_split
+                myver = xs[2].lstrip("0")
+                if not myver or not myver[0].isdigit():
+                    myver = "0"+myver
+                if myver == xs[2]:
+                    xcpv = pkg.cpv
+                else:
+                    # Use replace to preserve the revision part if it exists.
+                    xcpv = pkg.cpv.replace(
+                        pkg.cp + "-" + xs[2],
+                        pkg.cp + "-" + myver, 1)
+                if xcpv.startswith(mycpv_cmp):
+                    # =* glob matches only on boundaries between version parts,
+                    # so 1* does not match 10 (bug 560466).
+                    next_char = xcpv[len(mycpv_cmp):len(mycpv_cmp)+1]
+                    if (not next_char or next_char in '._-' or
+                        mycpv_cmp[-1].isdigit() != next_char.isdigit()):
+                        mylist.append(x)
-            if pkg.cp != mydep.cp:
-                continue
-            try:
-                result = vercmp(pkg.version, mydep.version)
-            except ValueError: # pkgcmp may return ValueError during int() conversion
-                writemsg(_("\nInvalid package name: %s\n") % x, noiselevel=-1)
-                raise
-            if result is None:
-                continue
-            elif operator == ">":
-                if result > 0:
-                    mylist.append(x)
-            elif operator == ">=":
-                if result >= 0:
-                    mylist.append(x)
-            elif operator == "<":
-                if result < 0:
-                    mylist.append(x)
-            elif operator == "<=":
-                if result <= 0:
-                    mylist.append(x)
-            else:
-                raise KeyError(_("Unknown operator: %s") % mydep)
-    else:
-        raise KeyError(_("Unknown operator: %s") % mydep)
+        elif operator == "~": # version, any revision, match
+            for x in candidate_list:
+                xs = getattr(x, "cpv_split", None)
+                if xs is None:
+                    xs = catpkgsplit(remove_slot(x))
+                    if xs is None:
+                        raise InvalidData(x)
+                if not cpvequal(xs[0]+"/"+xs[1]+"-"+xs[2], mycpv_cps[0]+"/"+mycpv_cps[1]+"-"+mycpv_cps[2]):
+                    continue
+                if xs[2] != ver:
+                    continue
+                mylist.append(x)
-    if mydep.slot is not None:
-        candidate_list = mylist
-        mylist = []
-        for x in candidate_list:
-            x_pkg = None
-            try:
-                x.cpv
-            except AttributeError:
-                xslot = dep_getslot(x)
-                if xslot is not None:
+        elif operator in (">", ">=", "<", "<="):
+            for x in candidate_list:
+                if hasattr(x, 'cp'):
+                    pkg = x
+                else:
+                    try:
-                        x_pkg = _pkg_str(remove_slot(x), slot=xslot)
+                        pkg = _pkg_str(remove_slot(x))
+                    except InvalidData:
+                        continue
-                else:
-                    x_pkg = x
-            if x_pkg is None:
-                mylist.append(x)
-            else:
+                if pkg.cp != mydep.cp:
+                    continue
+                try:
-                    x_pkg.slot
+                    result = vercmp(pkg.version, mydep.version)
+                except ValueError: # pkgcmp may return ValueError during int() conversion
+                    writemsg(_("\nInvalid package name: %s\n") % x, noiselevel=-1)
+                    raise
+                if result is None:
+                    continue
+                elif operator == ">":
+                    if result > 0:
+                        mylist.append(x)
+                elif operator == ">=":
+                    if result >= 0:
+                        mylist.append(x)
+                elif operator == "<":
+                    if result < 0:
+                        mylist.append(x)
+                elif operator == "<=":
+                    if result <= 0:
+                        mylist.append(x)
+                else:
+                    raise KeyError(_("Unknown operator: %s") % mydep)
+        else:
+            raise KeyError(_("Unknown operator: %s") % mydep)
+
+        if mydep.slot is not None:
+            candidate_list = mylist
+            mylist = []
+            for x in candidate_list:
+                x_pkg = None
+                try:
+                    x.cpv
+                except AttributeError:
+                    xslot = dep_getslot(x)
+                    if xslot is not None:
+                        try:
+                            x_pkg = _pkg_str(remove_slot(x), slot=xslot)
+                        except InvalidData:
+                            continue
+                else:
+                    x_pkg = x
+
+                if x_pkg is None:
+                    mylist.append(x)
+                else:
-                if _match_slot(mydep, x_pkg):
+                    try:
+                        x_pkg.slot
+                    except AttributeError:
+                        mylist.append(x)
+                    else:
+                        if _match_slot(mydep, x_pkg):
+                            mylist.append(x)
+        if key_has_hash:
+            _match_from_list_cache[cache_key] = frozenset(mylist)
     if mydep.unevaluated_atom.use:
         candidate_list = mylist
--
2.27.0.212.ge8ba1cc988-goog

^ permalink raw reply related [flat|nested] 30+ messages in thread
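[Editorial note] The core pattern shared by all three patches is argument-level memoization with a graceful fallback when an argument cannot serve as a dict key. A minimal, self-contained sketch of that pattern — the `cached` decorator and `split` function here are illustrative stand-ins, not portage code:

```python
import functools

def cached(func):
    """Memoize func on its positional arguments, skipping the
    cache entirely whenever an argument is not hashable."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except TypeError:
            # Unhashable argument (e.g. a list): compute without caching.
            return func(*args)
        except KeyError:
            result = cache[args] = func(*args)
            return result

    return wrapper

calls = []

@cached
def split(atom):
    calls.append(atom)          # record real invocations
    return tuple(atom.split("/"))

split("sys-apps/portage")
split("sys-apps/portage")       # second call is served from the cache
```

Passing a list (or any other unhashable value) simply bypasses the cache via the `TypeError` branch, which mirrors the patches' need to keep working when a key component is not hashable.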
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-27 6:34 [gentoo-portage-dev] Add caching to a few commonly used functions Chun-Yu Shei ` (2 preceding siblings ...) 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 3/3] Add partial caching to match_from_list Chun-Yu Shei @ 2020-06-27 7:35 ` Fabian Groffen 2020-06-27 7:43 ` Chun-Yu Shei 2020-06-27 8:31 ` Kent Fredric 2020-06-28 3:00 ` Zac Medico 4 siblings, 2 replies; 30+ messages in thread From: Fabian Groffen @ 2020-06-27 7:35 UTC (permalink / raw To: gentoo-portage-dev [-- Attachment #1: Type: text/plain, Size: 1985 bytes --] Hi Chun-Yu, On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote: > Hi, > > I was recently interested in whether portage could be speed up, since > dependency resolution can sometimes take a while on slower machines. > After generating some flame graphs with cProfile and vmprof, I found 3 > functions which seem to be called extremely frequently with the same > arguments: catpkgsplit, use_reduce, and match_from_list. In the first > two cases, it was simple to cache the results in dicts, while > match_from_list was a bit trickier, since it seems to be a requirement > that it return actual entries from the input "candidate_list". I also > ran into some test failures if I did the caching after the > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the > function, so the caching is only done up to just before that point. > > The catpkgsplit change seems to definitely be safe, and I'm pretty sure > the use_reduce one is too, since anything that could possibly change the > result is hashed. I'm a bit less certain about the match_from_list one, > although all tests are passing. > > With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" > speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge > -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec > (2.5% improvement). 
Since the upgrade case is far more common, this > would really help in daily use, and it shaves about 30 seconds off > the time you have to wait to get to the [Yes/No] prompt (from ~90s to > 60s) on my old Sandy Bridge laptop when performing normal upgrades. > > Hopefully, at least some of these patches can be incorporated, and please > let me know if any changes are necessary. This sounds like a good job to me! Do you have any idea what the added memory pressure for these changes is? Thanks, Fabian > > Thanks, > Chun-Yu > > > -- Fabian Groffen Gentoo on a different level [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-27 7:35 ` [gentoo-portage-dev] Add caching to a few commonly used functions Fabian Groffen @ 2020-06-27 7:43 ` Chun-Yu Shei 2020-06-27 8:31 ` Kent Fredric 1 sibling, 0 replies; 30+ messages in thread From: Chun-Yu Shei @ 2020-06-27 7:43 UTC (permalink / raw To: gentoo-portage-dev Hi Fabian, Just eyeballing htop's RES column while emerge is running, the max value I see during "emerge -uDvpU --with-bdeps=y @world" increases from 272 MB to 380 MB, so it looks to be around 110 MB of extra memory usage on my system (with 1,094 total packages installed). Chun-Yu On Sat, Jun 27, 2020, 12:35 AM Fabian Groffen <grobian@gentoo.org> wrote: > > Hi Chun-Yu, > > On 26-06-2020 23:34:12 -0700, Chun-Yu Shei wrote: > > Hi, > > > > I was recently interested in whether portage could be speed up, since > > dependency resolution can sometimes take a while on slower machines. > > After generating some flame graphs with cProfile and vmprof, I found 3 > > functions which seem to be called extremely frequently with the same > > arguments: catpkgsplit, use_reduce, and match_from_list. In the first > > two cases, it was simple to cache the results in dicts, while > > match_from_list was a bit trickier, since it seems to be a requirement > > that it return actual entries from the input "candidate_list". I also > > ran into some test failures if I did the caching after the > > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the > > function, so the caching is only done up to just before that point. > > > > The catpkgsplit change seems to definitely be safe, and I'm pretty sure > > the use_reduce one is too, since anything that could possibly change the > > result is hashed. I'm a bit less certain about the match_from_list one, > > although all tests are passing. 
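[Editorial note] Eyeballing htop's RES column works, but peak RSS can also be sampled in-process for a more repeatable number. A sketch using the standard `resource` module (assumption: on Linux `ru_maxrss` is reported in kilobytes; macOS reports bytes, so the unit would differ there):

```python
import resource

def peak_rss_mb():
    # ru_maxrss is in kilobytes on Linux (bytes on macOS).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

before = peak_rss_mb()
# Simulate the kind of result cache the patches introduce.
cache = {i: ("category/pkg-%d" % i, str(i)) for i in range(100000)}
after = peak_rss_mb()
print("cache added roughly %.1f MB" % (after - before))
```

Sampling before and after populating a cache gives a rough per-cache cost, which is the figure Fabian is asking about.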
> > > > With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" > > speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge > > -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec > > (2.5% improvement). Since the upgrade case is far more common, this > > would really help in daily use, and it shaves about 30 seconds off > > the time you have to wait to get to the [Yes/No] prompt (from ~90s to > > 60s) on my old Sandy Bridge laptop when performing normal upgrades. > > > > Hopefully, at least some of these patches can be incorporated, and please > > let me know if any changes are necessary. > > This sounds like a good job to me! Do you have any idea what the added > memory pressure for these changes are? > > Thanks, > Fabian > > > > Thanks, > > Chun-Yu > > > > > > > > -- > Fabian Groffen > Gentoo on a different level ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-27 7:35 ` [gentoo-portage-dev] Add caching to a few commonly used functions Fabian Groffen 2020-06-27 7:43 ` Chun-Yu Shei @ 2020-06-27 8:31 ` Kent Fredric 1 sibling, 0 replies; 30+ messages in thread From: Kent Fredric @ 2020-06-27 8:31 UTC (permalink / raw To: gentoo-portage-dev On Sat, 27 Jun 2020 at 19:35, Fabian Groffen <grobian@gentoo.org> wrote: > > Hi Chun-Yu, > > arguments: catpkgsplit, use_reduce, and match_from_list. In the first > > two cases, it was simple to cache the results in dicts, while > > match_from_list was a bit trickier, since it seems to be a requirement > > that it return actual entries from the input "candidate_list". I also > > ran into some test failures if I did the caching after the > > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the > > function, so the caching is only done up to just before that point. You may also want to investigate the version aspect parsing logic where it converts versions into a data structure, partly because the last time I tried profiling portage, every sample seemed to turn up in there. And I'd expect to see a lot of commonality in this. # qlist -I --format "%{PV}" | wc -c 14678 # qlist -I --format "%{PV}" | sort -u | wc -c 8811 And given this version-parsing path is even handled for stuff *not* installed, I suspect the real-world implications are worse # find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PV}" | wc -l 32604 # find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PVR}" | sort -u | wc -l 10362 katipo2 ~ # find /usr/portage/ -name "*.ebuild" | sed 's|/usr/portage/||;s|/[^/]*/|/|;s|[.]ebuild$||' | xargs qatom -CF "%{PV}" | sort -u | wc -l 7515 Obviously this is very crude analysis, but you see there's room to potentially no-op half of all version parses. 
Though the speed/memory tradeoff may not be worth it. Note that this is not just "parse the version on the ebuild", which is fast, but my sampling seemed to indicate it was parsing the version afresh for every version comparison, which means internally, it was parsing the same version dozens of times over, which is much slower! -- Kent KENTNL - https://metacpan.org/author/KENTNL ^ permalink raw reply [flat|nested] 30+ messages in thread
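[Editorial note] Kent's observation — the same version string being re-parsed for every comparison — is exactly the case `functools.lru_cache` handles well: memoize the parse step so each unique string is parsed once, however many comparisons it participates in. A toy sketch (`parse_version` is a simplified stand-in for portage's real version parser, not `vercmp` itself):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def parse_version(ver):
    # Toy stand-in: handles only plain dotted versions,
    # unlike portage's real version grammar.
    return tuple(int(part) for part in ver.split("."))

def compare(a, b):
    pa, pb = parse_version(a), parse_version(b)
    return (pa > pb) - (pa < pb)    # -1, 0, or 1

# "1.2.10" takes part in three comparisons but is parsed only once.
results = [compare("1.2.10", other) for other in ("1.2.9", "1.2.11", "1.3.0")]
info = parse_version.cache_info()
```

`cache_info()` exposes the hit/miss counts, which makes it easy to check whether the no-op rate Kent estimates from `qlist`/`qatom` actually materializes.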
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-27 6:34 [gentoo-portage-dev] Add caching to a few commonly used functions Chun-Yu Shei ` (3 preceding siblings ...) 2020-06-27 7:35 ` [gentoo-portage-dev] Add caching to a few commonly used functions Fabian Groffen @ 2020-06-28 3:00 ` Zac Medico 2020-06-28 3:12 ` Michał Górny 4 siblings, 1 reply; 30+ messages in thread From: Zac Medico @ 2020-06-28 3:00 UTC (permalink / raw To: gentoo-portage-dev, Chun-Yu Shei [-- Attachment #1.1: Type: text/plain, Size: 2624 bytes --] On 6/26/20 11:34 PM, Chun-Yu Shei wrote: > Hi, > > I was recently interested in whether portage could be speed up, since > dependency resolution can sometimes take a while on slower machines. > After generating some flame graphs with cProfile and vmprof, I found 3 > functions which seem to be called extremely frequently with the same > arguments: catpkgsplit, use_reduce, and match_from_list. In the first > two cases, it was simple to cache the results in dicts, while > match_from_list was a bit trickier, since it seems to be a requirement > that it return actual entries from the input "candidate_list". I also > ran into some test failures if I did the caching after the > mydep.unevaluated_atom.use and mydep.repo checks towards the end of the > function, so the caching is only done up to just before that point. > > The catpkgsplit change seems to definitely be safe, and I'm pretty sure > the use_reduce one is too, since anything that could possibly change the > result is hashed. I'm a bit less certain about the match_from_list one, > although all tests are passing. > > With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" > speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. "emerge > -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec > (2.5% improvement). 
Since the upgrade case is far more common, this > would really help in daily use, and it shaves about 30 seconds off > the time you have to wait to get to the [Yes/No] prompt (from ~90s to > 60s) on my old Sandy Bridge laptop when performing normal upgrades. > > Hopefully, at least some of these patches can be incorporated, and please > let me know if any changes are necessary. > > Thanks, > Chun-Yu Using global variables for caches like these causes a form of memory leak for use cases involving long-running processes that need to work with many different repositories (and perhaps multiple versions of those repositories). There are at least a couple of different strategies that we can use to avoid this form of memory leak: 1) Limit the scope of the caches so that they have some sort of garbage collection life cycle. For example, it would be natural for the depgraph class to have a local cache of use_reduce results, so that the cache can be garbage collected along with the depgraph. 2) Eliminate redundant calls. For example, redundant calls to catpkgsplit can be avoided by constructing more _pkg_str instances, since catpkgsplit is able to return early when its argument happens to be a _pkg_str instance. -- Thanks, Zac [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 981 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
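[Editorial note] Zac's strategy 1 can be as small as binding a memoized wrapper to the object that owns the life cycle, so the cache is collected together with it. A sketch using a hypothetical `Depgraph` class — this is not the real portage depgraph API, and `expensive_use_reduce` merely stands in for `use_reduce`:

```python
from functools import lru_cache

def expensive_use_reduce(depstr):
    # Stand-in for portage's use_reduce(); just counts real calls.
    expensive_use_reduce.calls += 1
    return depstr.split()

expensive_use_reduce.calls = 0

class Depgraph:
    """Strategy 1: the memoized wrapper is an instance attribute,
    so its cache dies when the depgraph does."""
    def __init__(self):
        # Per-instance cache, not a module-level global.
        self._use_reduce = lru_cache(maxsize=None)(expensive_use_reduce)

    def resolve(self, depstr):
        return self._use_reduce(depstr)

g = Depgraph()
g.resolve(">=dev-lang/python-3.6 dev-libs/openssl")
g.resolve(">=dev-lang/python-3.6 dev-libs/openssl")  # cache hit
# Once `g` becomes unreachable, its cache is garbage collected with it.
```

Each `Depgraph` instance gets its own independent cache, which is precisely what prevents the unbounded growth a module-level dict would exhibit in a long-running process.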
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-28 3:00 ` Zac Medico @ 2020-06-28 3:12 ` Michał Górny 2020-06-28 3:42 ` Zac Medico 0 siblings, 1 reply; 30+ messages in thread From: Michał Górny @ 2020-06-28 3:12 UTC (permalink / raw To: gentoo-portage-dev, Zac Medico, Chun-Yu Shei Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico <zmedico@gentoo.org> napisał(a): >On 6/26/20 11:34 PM, Chun-Yu Shei wrote: >> Hi, >> >> I was recently interested in whether portage could be speed up, since >> dependency resolution can sometimes take a while on slower machines. >> After generating some flame graphs with cProfile and vmprof, I found >3 >> functions which seem to be called extremely frequently with the same >> arguments: catpkgsplit, use_reduce, and match_from_list. In the >first >> two cases, it was simple to cache the results in dicts, while >> match_from_list was a bit trickier, since it seems to be a >requirement >> that it return actual entries from the input "candidate_list". I >also >> ran into some test failures if I did the caching after the >> mydep.unevaluated_atom.use and mydep.repo checks towards the end of >the >> function, so the caching is only done up to just before that point. >> >> The catpkgsplit change seems to definitely be safe, and I'm pretty >sure >> the use_reduce one is too, since anything that could possibly change >the >> result is hashed. I'm a bit less certain about the match_from_list >one, >> although all tests are passing. >> >> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" >> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. >"emerge >> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec >> (2.5% improvement). 
Since the upgrade case is far more common, this >> would really help in daily use, and it shaves about 30 seconds off >> the time you have to wait to get to the [Yes/No] prompt (from ~90s to >> 60s) on my old Sandy Bridge laptop when performing normal upgrades. >> >> Hopefully, at least some of these patches can be incorporated, and >please >> let me know if any changes are necessary. >> >> Thanks, >> Chun-Yu > >Using global variables for caches like these causes a form of memory >leak for use cases involving long-running processes that need to work >with many different repositories (and perhaps multiple versions of >those >repositories). > >There are at least a couple of different strategies that we can use to >avoid this form of memory leak: > >1) Limit the scope of the caches so that they have some sort of garbage >collection life cycle. For example, it would be natural for the >depgraph >class to have a local cache of use_reduce results, so that the cache >can >be garbage collected along with the depgraph. > >2) Eliminate redundant calls. For example, redundant calls to >catpkgslit >can be avoided by constructing more _pkg_str instances, since >catpkgsplit is able to return early when its argument happens to be a >_pkg_str instance. I think the weak stuff from the standard library might also be helpful. -- Best regards, Michał Górny ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-28 3:12 ` Michał Górny @ 2020-06-28 3:42 ` Zac Medico 2020-06-28 5:30 ` Michał Górny 0 siblings, 1 reply; 30+ messages in thread From: Zac Medico @ 2020-06-28 3:42 UTC (permalink / raw To: Michał Górny, gentoo-portage-dev, Zac Medico, Chun-Yu Shei [-- Attachment #1.1: Type: text/plain, Size: 3095 bytes --] On 6/27/20 8:12 PM, Michał Górny wrote: > Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico <zmedico@gentoo.org> napisał(a): >> On 6/26/20 11:34 PM, Chun-Yu Shei wrote: >>> Hi, >>> >>> I was recently interested in whether portage could be speed up, since >>> dependency resolution can sometimes take a while on slower machines. >>> After generating some flame graphs with cProfile and vmprof, I found >> 3 >>> functions which seem to be called extremely frequently with the same >>> arguments: catpkgsplit, use_reduce, and match_from_list. In the >> first >>> two cases, it was simple to cache the results in dicts, while >>> match_from_list was a bit trickier, since it seems to be a >> requirement >>> that it return actual entries from the input "candidate_list". I >> also >>> ran into some test failures if I did the caching after the >>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of >> the >>> function, so the caching is only done up to just before that point. >>> >>> The catpkgsplit change seems to definitely be safe, and I'm pretty >> sure >>> the use_reduce one is too, since anything that could possibly change >> the >>> result is hashed. I'm a bit less certain about the match_from_list >> one, >>> although all tests are passing. >>> >>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" >>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. >> "emerge >>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 sec >>> (2.5% improvement). 
Since the upgrade case is far more common, this >>> would really help in daily use, and it shaves about 30 seconds off >>> the time you have to wait to get to the [Yes/No] prompt (from ~90s to >>> 60s) on my old Sandy Bridge laptop when performing normal upgrades. >>> >>> Hopefully, at least some of these patches can be incorporated, and >> please >>> let me know if any changes are necessary. >>> >>> Thanks, >>> Chun-Yu >> >> Using global variables for caches like these causes a form of memory >> leak for use cases involving long-running processes that need to work >> with many different repositories (and perhaps multiple versions of >> those >> repositories). >> >> There are at least a couple of different strategies that we can use to >> avoid this form of memory leak: >> >> 1) Limit the scope of the caches so that they have some sort of garbage >> collection life cycle. For example, it would be natural for the >> depgraph >> class to have a local cache of use_reduce results, so that the cache >> can >> be garbage collected along with the depgraph. >> >> 2) Eliminate redundant calls. For example, redundant calls to >> catpkgslit >> can be avoided by constructing more _pkg_str instances, since >> catpkgsplit is able to return early when its argument happens to be a >> _pkg_str instance. > > I think the weak stuff from the standard library might also be helpful. > > -- > Best regards, > Michał Górny > Hmm, maybe weak global caches are an option? -- Thanks, Zac [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 981 bytes --] ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: [gentoo-portage-dev] Add caching to a few commonly used functions 2020-06-28 3:42 ` Zac Medico @ 2020-06-28 5:30 ` Michał Górny 0 siblings, 0 replies; 30+ messages in thread From: Michał Górny @ 2020-06-28 5:30 UTC (permalink / raw To: gentoo-portage-dev, Zac Medico, Chun-Yu Shei Dnia June 28, 2020 3:42:33 AM UTC, Zac Medico <zmedico@gentoo.org> napisał(a): >On 6/27/20 8:12 PM, Michał Górny wrote: >> Dnia June 28, 2020 3:00:00 AM UTC, Zac Medico <zmedico@gentoo.org> >napisał(a): >>> On 6/26/20 11:34 PM, Chun-Yu Shei wrote: >>>> Hi, >>>> >>>> I was recently interested in whether portage could be speed up, >since >>>> dependency resolution can sometimes take a while on slower >machines. >>>> After generating some flame graphs with cProfile and vmprof, I >found >>> 3 >>>> functions which seem to be called extremely frequently with the >same >>>> arguments: catpkgsplit, use_reduce, and match_from_list. In the >>> first >>>> two cases, it was simple to cache the results in dicts, while >>>> match_from_list was a bit trickier, since it seems to be a >>> requirement >>>> that it return actual entries from the input "candidate_list". I >>> also >>>> ran into some test failures if I did the caching after the >>>> mydep.unevaluated_atom.use and mydep.repo checks towards the end of >>> the >>>> function, so the caching is only done up to just before that point. >>>> >>>> The catpkgsplit change seems to definitely be safe, and I'm pretty >>> sure >>>> the use_reduce one is too, since anything that could possibly >change >>> the >>>> result is hashed. I'm a bit less certain about the match_from_list >>> one, >>>> although all tests are passing. >>>> >>>> With all 3 patches together, "emerge -uDvpU --with-bdeps=y @world" >>>> speeds up from 43.53 seconds to 30.96 sec -- a 40.6% speedup. >>> "emerge >>>> -ep @world" is just a tiny bit faster, going from 18.69 to 18.22 >sec >>>> (2.5% improvement). 
Since the upgrade case is far more common, >this >>>> would really help in daily use, and it shaves about 30 seconds off >>>> the time you have to wait to get to the [Yes/No] prompt (from ~90s >to >>>> 60s) on my old Sandy Bridge laptop when performing normal upgrades. >>>> >>>> Hopefully, at least some of these patches can be incorporated, and >>> please >>>> let me know if any changes are necessary. >>>> >>>> Thanks, >>>> Chun-Yu >>> >>> Using global variables for caches like these causes a form of memory >>> leak for use cases involving long-running processes that need to >work >>> with many different repositories (and perhaps multiple versions of >>> those >>> repositories). >>> >>> There are at least a couple of different strategies that we can use >to >>> avoid this form of memory leak: >>> >>> 1) Limit the scope of the caches so that they have some sort of >garbage >>> collection life cycle. For example, it would be natural for the >>> depgraph >>> class to have a local cache of use_reduce results, so that the cache >>> can >>> be garbage collected along with the depgraph. >>> >>> 2) Eliminate redundant calls. For example, redundant calls to >>> catpkgslit >>> can be avoided by constructing more _pkg_str instances, since >>> catpkgsplit is able to return early when its argument happens to be >a >>> _pkg_str instance. >> >> I think the weak stuff from the standard library might also be >helpful. >> >> -- >> Best regards, >> Michał Górny >> > >Hmm, maybe weak global caches are an option? It would probably be necessary to add hit/miss counter and compare results before and after. -- Best regards, Michał Górny ^ permalink raw reply [flat|nested] 30+ messages in thread
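[Editorial note] A weak global cache with the hit/miss counters Michał suggests could look like the sketch below. One caveat worth noting: plain tuples and strings cannot be weakly referenced, so the cached value here is wrapped in a small class; all names (`ParsedDep`, `parse`) are illustrative, and the prompt eviction on the last `del` relies on CPython's reference counting:

```python
import weakref

class ParsedDep:
    """Result wrapper: weakref needs a real class, since plain
    tuples and strings don't support weak references."""
    __slots__ = ("parts", "__weakref__")
    def __init__(self, parts):
        self.parts = parts

_cache = weakref.WeakValueDictionary()
hits = misses = 0

def parse(depstr):
    global hits, misses
    result = _cache.get(depstr)
    if result is not None:
        hits += 1
        return result
    misses += 1
    result = ParsedDep(tuple(depstr.split("/")))
    _cache[depstr] = result      # entry vanishes with its last strong ref
    return result

a = parse("dev-lang/python")
b = parse("dev-lang/python")     # hit while a strong reference exists
del a, b                         # last strong references dropped...
c = parse("dev-lang/python")     # ...entry was collected, so: miss again
```

The counters make Michał's before/after comparison straightforward: if hits collapse once callers stop holding results, the weak cache buys little and a scoped cache (Zac's strategy 1) would fit better.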
end of thread, other threads:[~2020-07-13 19:24 UTC | newest] Thread overview: 30+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2020-06-27 6:34 [gentoo-portage-dev] Add caching to a few commonly used functions Chun-Yu Shei 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Chun-Yu Shei 2020-06-27 11:33 ` Michał Górny 2020-06-29 1:58 ` Sid Spry 2020-07-06 15:26 ` Francesco Riosa [not found] ` <cdb0d821-67c1-edb6-2cbc-f26eaa0d3d70@veremit.xyz> 2020-07-06 16:10 ` Francesco Riosa 2020-07-06 17:30 ` Chun-Yu Shei 2020-07-06 18:03 ` Zac Medico 2020-07-07 3:41 ` Zac Medico 2020-07-09 7:03 ` Chun-Yu Shei 2020-07-09 7:03 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei 2020-07-12 21:46 ` Zac Medico 2020-07-13 6:30 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, Chun-Yu Shei 2020-07-13 6:30 ` [gentoo-portage-dev] [PATCH] Add caching to use_reduce, vercmp, and catpkgsplit Chun-Yu Shei 2020-07-13 17:28 ` Zac Medico 2020-07-13 18:54 ` Ulrich Mueller 2020-07-13 19:04 ` Chun-Yu Shei 2020-07-13 19:24 ` Ulrich Mueller 2020-07-09 21:04 ` [gentoo-portage-dev] [PATCH 1/3] Add caching to catpkgsplit function Alec Warner 2020-07-09 21:06 ` Chun-Yu Shei 2020-07-09 21:13 ` Alec Warner 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 2/3] Add caching to use_reduce function Chun-Yu Shei 2020-06-27 6:34 ` [gentoo-portage-dev] [PATCH 3/3] Add partial caching to match_from_list Chun-Yu Shei 2020-06-27 7:35 ` [gentoo-portage-dev] Add caching to a few commonly used functions Fabian Groffen 2020-06-27 7:43 ` Chun-Yu Shei 2020-06-27 8:31 ` Kent Fredric 2020-06-28 3:00 ` Zac Medico 2020-06-28 3:12 ` Michał Górny 2020-06-28 3:42 ` Zac Medico 2020-06-28 5:30 ` Michał Górny
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox