* [gentoo-portage-dev] [PATCH 1/2] cpuinfo: use better available CPU calculation
@ 2019-02-16 6:21 robbat2
2019-02-16 6:21 ` [gentoo-portage-dev] [PATCH 2/2] Replace multiprocessing.cpu_count with portage.util.cpuinfo.get_cpu_count robbat2
0 siblings, 1 reply; 3+ messages in thread
From: robbat2 @ 2019-02-16 6:21 UTC
To: gentoo-portage-dev; +Cc: Robin H. Johnson
From: "Robin H. Johnson" <robbat2@gentoo.org>
The existing portage.util.cpuinfo.get_cpu_count() behavior is wrong when
run in any environment where the cpuset is a subset of online CPUs.
The solution recommended by the 'os.cpu_count()' help is to use:
len(os.sched_getaffinity(0))
This only works on Linux, so keep multiprocessing.cpu_count() as a
fallback. In newer versions of Python, multiprocessing.cpu_count() is a
wrapper for os.cpu_count().
Reported-By: Daniel Robbins <drobbins@funtoo.org>
Fixes: https://bugs.funtoo.org/browse/FL-6227
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
---
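Note (not part of the commit): the mismatch described above can be reproduced with a minimal sketch like the one below. It assumes a Linux host where the process is allowed to change its own affinity mask; the helper names affinity_vs_online() and count_when_pinned_to_one_cpu() are hypothetical and do not exist in Portage.

```python
import multiprocessing
import os


def affinity_vs_online():
    """Return (online, available) CPU counts for the current process.

    `available` is None where os.sched_getaffinity() does not exist
    (for example on non-Linux platforms).
    """
    online = multiprocessing.cpu_count()
    try:
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        available = None
    return online, available


def count_when_pinned_to_one_cpu():
    """Temporarily pin this process to one CPU and report both counts.

    Returns None if the affinity mask cannot be changed here.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    original = os.sched_getaffinity(0)
    try:
        os.sched_setaffinity(0, {min(original)})
    except OSError:
        return None  # not permitted in this environment
    try:
        return affinity_vs_online()
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask


if __name__ == "__main__":
    print("unrestricted:", affinity_vs_online())
    print("pinned to one CPU:", count_when_pinned_to_one_cpu())
```

While pinned, the second value should drop to 1 and the first should keep reporting every online CPU, which is the over-count this patch avoids.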
lib/portage/util/cpuinfo.py | 33 +++++++++++++++++++++++++++++++--
1 file changed, 31 insertions(+), 2 deletions(-)
diff --git a/lib/portage/util/cpuinfo.py b/lib/portage/util/cpuinfo.py
index 669e707b5..9ab1c119d 100644
--- a/lib/portage/util/cpuinfo.py
+++ b/lib/portage/util/cpuinfo.py
@@ -1,15 +1,44 @@
-# Copyright 2015 Gentoo Foundation
+# Copyright 2015-2019 Gentoo Authors
# Distributed under the terms of the GNU General Public License v2
__all__ = ['get_cpu_count']
+# Before you set out to change this function, figure out what you're really
+# asking:
+#
+# - How many CPUs exist in this system (e.g. that the kernel is aware of?)
+# This is 'getconf _NPROCESSORS_CONF' / get_nprocs_conf(3)
+# In modern Linux, implemented by counting CPUs in /sys/devices/system/cpu/
+#
+# - How many CPUs in this system are ONLINE right now?
+# This is 'getconf _NPROCESSORS_ONLN' / get_nprocs(3)
+# In modern Linux, implemented by parsing /sys/devices/system/cpu/online
+#
+# - How many CPUs are available to this program?
+# This is 'nproc' / sched_getaffinity(2), which is implemented in modern
+# Linux kernels by querying the kernel scheduler; This might not be available
+# in some non-Linux systems!
+#
+# - How many CPUs are available to this thread?
+# This is pthread_getaffinity_np(3)
+#
+# As a further warning, the results returned by this function can differ
+# between runs, if altered by the scheduler or other external factors.
def get_cpu_count():
"""
- Try to obtain the number of CPUs available.
+ Try to obtain the number of CPUs available to this process.
@return: Number of CPUs or None if unable to obtain.
"""
+ try:
+ import os
+ # This was introduced in Python 3.3 only, but exists in Linux
+ # all the way back to the 2.5.8 kernel.
+ # This is NOT available in FreeBSD!
+ return len(os.sched_getaffinity(0))
+ except (ImportError, NotImplementedError, AttributeError):
+ pass
try:
import multiprocessing
--
2.18.0
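For readers comparing the questions listed in the comment block of the patch, the first three can be asked from Python roughly as follows. This is a sketch under assumptions, not Portage code: cpu_counts() and _sysconf_or_none() are hypothetical names, and the sysconf names are the ones glibc exposes on Linux, so other platforms may report None for some entries.

```python
import os


def _sysconf_or_none(name):
    """Return os.sysconf(name), or None if the name is unsupported here."""
    if not hasattr(os, "sysconf") or name not in os.sysconf_names:
        return None
    try:
        value = os.sysconf(name)
    except (OSError, ValueError):
        return None
    # sysconf reports -1 for values that are not defined on this system.
    return value if value > 0 else None


def cpu_counts():
    """Answer the first three questions from the comment block."""
    try:
        # CPUs this process may run on; Linux-only in the standard library.
        available = len(os.sched_getaffinity(0))
    except AttributeError:
        available = None
    return {
        "configured": _sysconf_or_none("SC_NPROCESSORS_CONF"),
        "online": _sysconf_or_none("SC_NPROCESSORS_ONLN"),
        "available": available,
    }


if __name__ == "__main__":
    for question, answer in cpu_counts().items():
        print(question, answer)
```

The per-thread question (pthread_getaffinity_np) is left out because the Python standard library has no direct wrapper for it.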
* [gentoo-portage-dev] [PATCH 2/2] Replace multiprocessing.cpu_count with portage.util.cpuinfo.get_cpu_count
2019-02-16 6:21 [gentoo-portage-dev] [PATCH 1/2] cpuinfo: use better available CPU calculation robbat2
@ 2019-02-16 6:21 ` robbat2
2019-02-16 6:58 ` Zac Medico
0 siblings, 1 reply; 3+ messages in thread
From: robbat2 @ 2019-02-16 6:21 UTC
To: gentoo-portage-dev; +Cc: Robin H. Johnson
From: "Robin H. Johnson" <robbat2@gentoo.org>
portage.util.cpuinfo.get_cpu_count was only used in one spot before, and
other call-sites just used multiprocessing.cpu_count() directly.
Replace all multiprocessing.cpu_count() calls with get_cpu_count() in
portage.util.cpuinfo, to ensure consistency in CPU calculation.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
---
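Note (not part of the commit): get_cpu_count() is documented as returning None when no count can be obtained, so a call site written as "max_jobs or get_cpu_count()" can still end up with None. The sketch below shows one possible guard. The get_cpu_count() here is a stand-in that assumes the fallback order described in patch 1/2, and resolve_limits() is a hypothetical helper, not an existing Portage function.

```python
import multiprocessing
import os


def get_cpu_count():
    """Stand-in for portage.util.cpuinfo.get_cpu_count(); may return None."""
    try:
        return len(os.sched_getaffinity(0))
    except (NotImplementedError, AttributeError):
        pass
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        return None


def resolve_limits(max_jobs=None, max_load=None):
    """Apply the call-site defaults, falling back to 1 if the count is unknown."""
    cpus = get_cpu_count() or 1
    return (max_jobs or cpus, max_load or cpus)


if __name__ == "__main__":
    print(resolve_limits())          # defaults derived from the CPU count
    print(resolve_limits(4, 2.5))    # explicit values are kept as given
```

Whether 1 is the right fallback is a judgment call; it only keeps the scheduler from receiving None.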
lib/portage/dbapi/porttree.py | 4 ++--
lib/portage/util/futures/executor/fork.py | 4 ++--
lib/portage/util/futures/iter_completed.py | 18 +++++++++---------
3 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/lib/portage/dbapi/porttree.py b/lib/portage/dbapi/porttree.py
index 2ff3e1b34..64a5f3681 100644
--- a/lib/portage/dbapi/porttree.py
+++ b/lib/portage/dbapi/porttree.py
@@ -1471,11 +1471,11 @@ def _async_manifest_fetchlist(portdb, repo_config, cp, cpv_list=None,
@param cpv_list: list of ebuild cpv values for a Manifest
@type cpv_list: list
@param max_jobs: max number of futures to process concurrently (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_jobs: int
@param max_load: max load allowed when scheduling a new future,
otherwise schedule no more than 1 future at a time (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_load: int or float
@param loop: event loop
@type loop: EventLoop
diff --git a/lib/portage/util/futures/executor/fork.py b/lib/portage/util/futures/executor/fork.py
index 72844403c..add7b3c9e 100644
--- a/lib/portage/util/futures/executor/fork.py
+++ b/lib/portage/util/futures/executor/fork.py
@@ -7,13 +7,13 @@ __all__ = (
import collections
import functools
-import multiprocessing
import os
import sys
import traceback
from portage.util._async.AsyncFunction import AsyncFunction
from portage.util.futures import asyncio
+from portage.util.cpuinfo import get_cpu_count
class ForkExecutor(object):
@@ -24,7 +24,7 @@ class ForkExecutor(object):
This is entirely driven by an event loop.
"""
def __init__(self, max_workers=None, loop=None):
- self._max_workers = max_workers or multiprocessing.cpu_count()
+ self._max_workers = max_workers or get_cpu_count()
self._loop = asyncio._wrap_loop(loop)
self._submit_queue = collections.deque()
self._running_tasks = {}
diff --git a/lib/portage/util/futures/iter_completed.py b/lib/portage/util/futures/iter_completed.py
index 31b5e0c78..4c48ea0fe 100644
--- a/lib/portage/util/futures/iter_completed.py
+++ b/lib/portage/util/futures/iter_completed.py
@@ -2,11 +2,11 @@
# Distributed under the terms of the GNU General Public License v2
import functools
-import multiprocessing
from portage.util._async.AsyncTaskFuture import AsyncTaskFuture
from portage.util._async.TaskScheduler import TaskScheduler
from portage.util.futures import asyncio
+from portage.util.cpuinfo import get_cpu_count
def iter_completed(futures, max_jobs=None, max_load=None, loop=None):
@@ -18,11 +18,11 @@ def iter_completed(futures, max_jobs=None, max_load=None, loop=None):
@param futures: iterator of asyncio.Future (or compatible)
@type futures: iterator
@param max_jobs: max number of futures to process concurrently (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_jobs: int
@param max_load: max load allowed when scheduling a new future,
otherwise schedule no more than 1 future at a time (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_load: int or float
@param loop: event loop
@type loop: EventLoop
@@ -47,11 +47,11 @@ def async_iter_completed(futures, max_jobs=None, max_load=None, loop=None):
@param futures: iterator of asyncio.Future (or compatible)
@type futures: iterator
@param max_jobs: max number of futures to process concurrently (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_jobs: int
@param max_load: max load allowed when scheduling a new future,
otherwise schedule no more than 1 future at a time (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_load: int or float
@param loop: event loop
@type loop: EventLoop
@@ -61,8 +61,8 @@ def async_iter_completed(futures, max_jobs=None, max_load=None, loop=None):
"""
loop = asyncio._wrap_loop(loop)
- max_jobs = max_jobs or multiprocessing.cpu_count()
- max_load = max_load or multiprocessing.cpu_count()
+ max_jobs = max_jobs or get_cpu_count()
+ max_load = max_load or get_cpu_count()
future_map = {}
def task_generator():
@@ -120,11 +120,11 @@ def iter_gather(futures, max_jobs=None, max_load=None, loop=None):
@param futures: iterator of asyncio.Future (or compatible)
@type futures: iterator
@param max_jobs: max number of futures to process concurrently (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_jobs: int
@param max_load: max load allowed when scheduling a new future,
otherwise schedule no more than 1 future at a time (default
- is multiprocessing.cpu_count())
+ is portage.util.cpuinfo.get_cpu_count())
@type max_load: int or float
@param loop: event loop
@type loop: EventLoop
--
2.18.0
* Re: [gentoo-portage-dev] [PATCH 2/2] Replace multiprocessing.cpu_count with portage.util.cpuinfo.get_cpu_count
2019-02-16 6:21 ` [gentoo-portage-dev] [PATCH 2/2] Replace multiprocessing.cpu_count with portage.util.cpuinfo.get_cpu_count robbat2
@ 2019-02-16 6:58 ` Zac Medico
0 siblings, 0 replies; 3+ messages in thread
From: Zac Medico @ 2019-02-16 6:58 UTC
To: gentoo-portage-dev, robbat2
On 2/15/19 10:21 PM, robbat2@gentoo.org wrote:
> From: "Robin H. Johnson" <robbat2@gentoo.org>
>
> portage.util.cpuinfo.get_cpu_count was only used in one spot before, and
> other call-sites just used multiprocessing.cpu_count() directly.
>
> Replace all multiprocessing.cpu_count() calls with get_cpu_count() in
> portage.util.cpuinfo, to ensure consistency in CPU calculation.
>
> Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
> ---
> lib/portage/dbapi/porttree.py | 4 ++--
> lib/portage/util/futures/executor/fork.py | 4 ++--
> lib/portage/util/futures/iter_completed.py | 18 +++++++++---------
> 3 files changed, 13 insertions(+), 13 deletions(-)
Series looks good, please merge.
--
Thanks,
Zac