public inbox for gentoo-portage-dev@lists.gentoo.org
 help / color / mirror / Atom feed
From: Zac Medico <zmedico@gentoo.org>
To: gentoo-portage-dev@lists.gentoo.org
Cc: Zac Medico <zmedico@gentoo.org>
Subject: [gentoo-portage-dev] [PATCH] process.spawn: Add returnproc parameter
Date: Wed, 31 Jan 2024 22:56:26 -0800	[thread overview]
Message-ID: <20240201065626.2769018-1-zmedico@gentoo.org> (raw)

In order to migrate away from unsafe os.fork() usage in threaded
processes (https://github.com/python/cpython/issues/84559), add a
returnproc parameter that is similar to returnpid, which causes
spawn to return Process objects instead of pids. The Process
API is a subset of asyncio.subprocess.Process.

In the future, spawn will return objects of a different type but
with a compatible interface to Process, in order to encapsulate
implementation-dependent objects like multiprocessing.Process which
are designed to manage the process lifecycle and need to persist
until it exits.

Trigger a UserWarning when the returnpid parameter is used, in
order to encourage migration to returnproc (do not use
DeprecationWarning since it is hidden by default). This warning
will be temporarily suppressed for portage internals, until they
finish migrating to returnproc. There are probably very few if
any external consumers of spawn with the returnpid parameter,
so it seems safe to move quickly with this deprecation.

Bug: https://bugs.gentoo.org/916566
Signed-off-by: Zac Medico <zmedico@gentoo.org>
---
 lib/portage/process.py                        | 79 +++++++++++++++++++
 lib/portage/tests/process/meson.build         |  1 +
 .../tests/process/test_spawn_returnproc.py    | 39 +++++++++
 3 files changed, 119 insertions(+)
 create mode 100644 lib/portage/tests/process/test_spawn_returnproc.py

diff --git a/lib/portage/process.py b/lib/portage/process.py
index 71750a715f..74953b3e22 100644
--- a/lib/portage/process.py
+++ b/lib/portage/process.py
@@ -15,6 +15,7 @@ import subprocess
 import sys
 import traceback
 import os as _os
+import warnings
 
 from dataclasses import dataclass
 from functools import lru_cache
@@ -27,6 +28,7 @@ import portage
 
 portage.proxy.lazyimport.lazyimport(
     globals(),
+    "portage.util._eventloop.global_event_loop:global_event_loop",
     "portage.util:dump_traceback,writemsg,writemsg_level",
 )
 
@@ -277,12 +279,78 @@ def calc_env_stats(env) -> EnvStats:
 env_too_large_warnings = 0
 
 
+class Process:
+    """
+    An object that wraps OS processes created by spawn.
+    In the future, spawn will return objects of a different type
+    but with a compatible interface to this class, in order
+    to encapsulate implementation-dependent objects like
+    multiprocessing.Process which are designed to manage
+    the process lifecycle and need to persist until it exits.
+    """
+
+    def __init__(self, pid):
+        self.pid = pid
+        self.returncode = None
+        self._exit_waiters = []
+
+    def __repr__(self):
+        return f"<{self.__class__.__name__} {self.pid}>"
+
+    async def wait(self):
+        """
+        Wait for the child process to terminate.
+
+        Set and return the returncode attribute.
+        """
+        if self.returncode is not None:
+            return self.returncode
+
+        loop = global_event_loop()
+        if not self._exit_waiters:
+            loop._asyncio_child_watcher.add_child_handler(self.pid, self._child_handler)
+        waiter = loop.create_future()
+        self._exit_waiters.append(waiter)
+        return await waiter
+
+    def _child_handler(self, pid, returncode):
+        if pid != self.pid:
+            raise AssertionError(f"expected pid {self.pid}, got {pid}")
+        self.returncode = returncode
+
+        for waiter in self._exit_waiters:
+            if not waiter.cancelled():
+                waiter.set_result(returncode)
+        self._exit_waiters = None
+
+    def send_signal(self, sig):
+        """Send a signal to the process."""
+        if self.returncode is not None:
+            # Skip signalling a process that we know has already died.
+            return
+
+        try:
+            os.kill(self.pid, sig)
+        except ProcessLookupError:
+            # Suppress the race condition error; bpo-40550.
+            pass
+
+    def terminate(self):
+        """Terminate the process with SIGTERM"""
+        self.send_signal(signal.SIGTERM)
+
+    def kill(self):
+        """Kill the process with SIGKILL"""
+        self.send_signal(signal.SIGKILL)
+
+
 def spawn(
     mycommand,
     env=None,
     opt_name=None,
     fd_pipes=None,
     returnpid=False,
+    returnproc=False,
     uid=None,
     gid=None,
     groups=None,
@@ -315,6 +383,9 @@ def spawn(
     @param returnpid: Return the Process IDs for a successful spawn.
     NOTE: This requires the caller clean up all the PIDs, otherwise spawn will clean them.
     @type returnpid: Boolean
+    @param returnproc: Return the Process objects for a successful spawn.
+    NOTE: This requires the caller clean up all the PIDs, otherwise spawn will clean them.
+    @type returnproc: Boolean
     @param uid: User ID to spawn as; useful for dropping privilages
     @type uid: Integer
     @param gid: Group ID to spawn as; useful for dropping privilages
@@ -491,7 +562,15 @@ def spawn(
     # If the caller wants to handle cleaning up the processes, we tell
     # it about all processes that were created.
     if returnpid:
+        if not portage._internal_caller:
+            warnings.warn(
+                "The portage.process.spawn returnpid paramenter is deprecated and replaced by returnproc",
+                UserWarning,
+                stacklevel=1,
+            )
         return mypids
+    if returnproc:
+        return [Process(pid) for pid in mypids]
 
     # Otherwise we clean them up.
     while mypids:
diff --git a/lib/portage/tests/process/meson.build b/lib/portage/tests/process/meson.build
index b86fa10fb1..e2b3c11d3b 100644
--- a/lib/portage/tests/process/meson.build
+++ b/lib/portage/tests/process/meson.build
@@ -8,6 +8,7 @@ py.install_sources(
         'test_pickle.py',
         'test_poll.py',
         'test_spawn_fail_e2big.py',
+        'test_spawn_returnproc.py',
         'test_spawn_warn_large_env.py',
         'test_unshare_net.py',
         '__init__.py',
diff --git a/lib/portage/tests/process/test_spawn_returnproc.py b/lib/portage/tests/process/test_spawn_returnproc.py
new file mode 100644
index 0000000000..b2d64e81ce
--- /dev/null
+++ b/lib/portage/tests/process/test_spawn_returnproc.py
@@ -0,0 +1,39 @@
+# Copyright 2024 Gentoo Authors
+# Distributed under the terms of the GNU General Public License v2
+
+import asyncio
+import os
+import signal
+
+from portage.process import find_binary, spawn
+from portage.tests import TestCase
+from portage.util._eventloop.global_event_loop import global_event_loop
+
+
+class SpawnReturnProcTestCase(TestCase):
+    def testSpawnReturnProcWait(self):
+        true_binary = find_binary("true")
+        self.assertNotEqual(true_binary, None)
+
+        loop = global_event_loop()
+
+        async def watch_pid():
+            procs = spawn([true_binary], returnproc=True)
+            self.assertEqual(await procs[0].wait(), os.EX_OK)
+            # A second wait should also work.
+            self.assertEqual(await procs[0].wait(), os.EX_OK)
+
+        loop.run_until_complete(watch_pid())
+
+    def testSpawnReturnProcTerminate(self):
+        sleep_binary = find_binary("sleep")
+        self.assertNotEqual(sleep_binary, None)
+
+        loop = global_event_loop()
+
+        async def watch_pid():
+            procs = spawn([sleep_binary, 9999], returnproc=True)
+            procs[0].terminate()
+            self.assertEqual(await procs[0].wait(), -signal.SIGTERM)
+
+        loop.run_until_complete(watch_pid())
-- 
2.41.0



             reply	other threads:[~2024-02-01  6:57 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-01  6:56 Zac Medico [this message]
2024-02-01 16:23 ` [gentoo-portage-dev] [PATCH v2] process.spawn: Add returnproc parameter Zac Medico

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240201065626.2769018-1-zmedico@gentoo.org \
    --to=zmedico@gentoo.org \
    --cc=gentoo-portage-dev@lists.gentoo.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox