* [gentoo-portage-dev] Avoid repeated failing os.access
@ 2014-02-18 21:02 Sebastian Luther
2014-02-18 21:02 ` [gentoo-portage-dev] [PATCH] portdbapi: Add cache to avoid repeated failing os.access calls Sebastian Luther
0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Luther @ 2014-02-18 21:02 UTC (permalink / raw
To: gentoo-portage-dev
Hi all,
people have several times complained about useless access calls in
strace logs. The problem is that a successful call in
portdbapi.findname2 is cached by aux_get, but a failed is not.
This patch works around this by adding a cache for findname2.
This gives a speed up of about 5% compared to current master.
The real problem is that the resolver calls aux_get for a given
cpv for each existing repo. That's because portdbapi.cp_list returns
only cpvs without telling the caller to which repo it belongs.
Fixing that properly would require quite some work on the portdbapi
class.
There was the complaint that instead of writing the caching myself
I should be using a memoization decorator. Unfortunately nobody could
suggest a good implementation with proper license to use. If someone
knows one, let me know.
- Sebastian
^ permalink raw reply [flat|nested] 4+ messages in thread
* [gentoo-portage-dev] [PATCH] portdbapi: Add cache to avoid repeated failing os.access calls
2014-02-18 21:02 [gentoo-portage-dev] Avoid repeated failing os.access Sebastian Luther
@ 2014-02-18 21:02 ` Sebastian Luther
2014-02-18 21:35 ` David James
0 siblings, 1 reply; 4+ messages in thread
From: Sebastian Luther @ 2014-02-18 21:02 UTC (permalink / raw
To: gentoo-portage-dev
---
pym/portage/dbapi/porttree.py | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/pym/portage/dbapi/porttree.py b/pym/portage/dbapi/porttree.py
index 590e3c5..2dc406b 100644
--- a/pym/portage/dbapi/porttree.py
+++ b/pym/portage/dbapi/porttree.py
@@ -254,6 +254,7 @@ class portdbapi(dbapi):
self._aux_cache = {}
self._broken_ebuilds = set()
+ self._findname2_cache = {}
@property
def _event_loop(self):
@@ -372,6 +373,14 @@ class portdbapi(dbapi):
the file we wanted.
If myrepo is not None it will find packages from this repository(overlay)
"""
+ cache_key = (mycpv, mytree, myrepo)
+ try:
+ return self._findname2_cache[cache_key]
+ except KeyError:
+ self._findname2_cache[cache_key] = (None, 0)
+ except TypeError:
+ cache_key = None
+
if not mycpv:
return (None, 0)
@@ -383,6 +392,8 @@ class portdbapi(dbapi):
mysplit = mycpv.split("/")
psplit = pkgsplit(mysplit[1])
if psplit is None or len(mysplit) != 2:
+ if cache_key:
+ del self._findname2_cache[cache_key]
raise InvalidPackageName(mycpv)
# For optimal performace in this hot spot, we do manual unicode
@@ -402,6 +413,8 @@ class portdbapi(dbapi):
filename = x + _os.sep + relative_path
if _os.access(_unicode_encode(filename,
encoding=encoding, errors=errors), _os.R_OK):
+ if cache_key:
+ self._findname2_cache[cache_key] = (filename, x)
return (filename, x)
return (None, 0)
--
1.8.3.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] portdbapi: Add cache to avoid repeated failing os.access calls
2014-02-18 21:02 ` [gentoo-portage-dev] [PATCH] portdbapi: Add cache to avoid repeated failing os.access calls Sebastian Luther
@ 2014-02-18 21:35 ` David James
2014-02-19 19:01 ` Sebastian Luther
0 siblings, 1 reply; 4+ messages in thread
From: David James @ 2014-02-18 21:35 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 2512 bytes --]
On Tue, Feb 18, 2014 at 1:02 PM, Sebastian Luther <SebastianLuther@gmx.de>wrote:
> ---
> pym/portage/dbapi/porttree.py | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/pym/portage/dbapi/porttree.py b/pym/portage/dbapi/porttree.py
> index 590e3c5..2dc406b 100644
> --- a/pym/portage/dbapi/porttree.py
> +++ b/pym/portage/dbapi/porttree.py
> @@ -254,6 +254,7 @@ class portdbapi(dbapi):
>
> self._aux_cache = {}
> self._broken_ebuilds = set()
> + self._findname2_cache = {}
>
> @property
> def _event_loop(self):
> @@ -372,6 +373,14 @@ class portdbapi(dbapi):
> the file we wanted.
> If myrepo is not None it will find packages from this
> repository(overlay)
> """
> + cache_key = (mycpv, mytree, myrepo)
> + try:
> + return self._findname2_cache[cache_key]
> + except KeyError:
> + self._findname2_cache[cache_key] = (None, 0)
>
To me, it seems potentially error-prone to cache a (potentially) incorrect
value and then correct it later. Would it be possible to refactor your
patch so that we only cache the value when we know the final answer?
> + except TypeError:
>
In what circumstances does it happen that mytree / myrepo are unhashable
types? Can you add a comment to explain this?
+ cache_key = None
> +
> if not mycpv:
> return (None, 0)
>
> @@ -383,6 +392,8 @@ class portdbapi(dbapi):
> mysplit = mycpv.split("/")
> psplit = pkgsplit(mysplit[1])
> if psplit is None or len(mysplit) != 2:
> + if cache_key:
> + del self._findname2_cache[cache_key]
raise InvalidPackageName(mycpv)
>
> # For optimal performace in this hot spot, we do manual
> unicode
> @@ -402,6 +413,8 @@ class portdbapi(dbapi):
> filename = x + _os.sep + relative_path
> if _os.access(_unicode_encode(filename,
> encoding=encoding, errors=errors),
> _os.R_OK):
> + if cache_key:
> + self._findname2_cache[cache_key] =
> (filename, x)
> return (filename, x)
> return (None, 0)
>
> --
> 1.8.3.2
>
>
>
[-- Attachment #2: Type: text/html, Size: 3811 bytes --]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] portdbapi: Add cache to avoid repeated failing os.access calls
2014-02-18 21:35 ` David James
@ 2014-02-19 19:01 ` Sebastian Luther
0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Luther @ 2014-02-19 19:01 UTC (permalink / raw
To: gentoo-portage-dev
Am 18.02.2014 22:35, schrieb David James:
> """
> + cache_key = (mycpv, mytree, myrepo)
> + try:
> + return self._findname2_cache[cache_key]
> + except KeyError:
> + self._findname2_cache[cache_key] = (None, 0)
>
>
> To me, it seems potentially error-prone to cache a (potentially)
> incorrect value and then correct it later.
It is. The problem are all these returns. The whole thing would need to
be reorganized to fix this. I'd rather go for a memoize decorator and
leave that thing alone. If I just could find a usable one.
Would it be possible to
> refactor your patch so that we only cache the value when we know the
> final answer?
>
>
> + except TypeError:
>
>
> In what circumstances does it happen that mytree / myrepo are unhashable
> types? Can you add a comment to explain this?
That's my mistake. I was under the impression that mytree is actually
mytreeS and would accept a list. I'll remove the "except TypeError:" in
a future version of the patch.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2014-02-19 19:01 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-02-18 21:02 [gentoo-portage-dev] Avoid repeated failing os.access Sebastian Luther
2014-02-18 21:02 ` [gentoo-portage-dev] [PATCH] portdbapi: Add cache to avoid repeated failing os.access calls Sebastian Luther
2014-02-18 21:35 ` David James
2014-02-19 19:01 ` Sebastian Luther
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox