public inbox for gentoo-portage-dev@lists.gentoo.org
 help / color / mirror / Atom feed
* [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126)
@ 2017-10-23  2:17 Zac Medico
  2017-10-23  7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-23  2:17 UTC (permalink / raw
  To: gentoo-portage-dev; +Cc: Zac Medico

The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.

Bug: https://bugs.gentoo.org/635126
---
 src/portage_util_file_copy_reflink_linux.c | 34 ++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..5bfa1cd8f 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
 
 /**
  * cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
  * @fd_out: output file descriptor
  * @fd_in: input file descriptor
  * @off_out: offset of the output file
@@ -79,6 +79,28 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
 }
 
 /**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+    ssize_t ret;
+    ret = sendfile(fd_out, fd_in, NULL, len);
+    if (ret > 0)
+        (*off_out) += ret;
+    return ret;
+}
+
+
+/**
  * do_lseek_data - Adjust file offsets to the next location containing
  * data, creating sparse empty blocks in the output file as needed.
  * @fd_in: input file descriptor
@@ -250,7 +272,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                          * syscall is not available (less than Linux 4.5).
                          */
                         error = 0;
-                        copyfunc = sendfile;
+                        copyfunc = sf_wrapper;
                         copyfunc_ret = copyfunc(fd_out,
                                                 fd_in,
                                                 &offset_out,
@@ -293,10 +315,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                     error = errno;
                 } else {
                     while (offset_out < stat_in.st_size) {
-                        copyfunc_ret = sendfile(fd_out,
-                                                fd_in,
-                                                &offset_out,
-                                                stat_in.st_size - offset_out);
+                        copyfunc_ret = sf_wrapper(fd_out,
+                                                  fd_in,
+                                                  &offset_out,
+                                                  stat_in.st_size - offset_out);
 
                         if (copyfunc_ret < 0) {
                             error = errno;
-- 
2.13.5



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [gentoo-portage-dev] [PATCH v2] file_copy: use sendfile return value to measure bytes copied (bug 635126)
  2017-10-23  2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
@ 2017-10-23  7:22 ` Zac Medico
  2017-10-24 10:06 ` [gentoo-portage-dev] [PATCH v3] " Zac Medico
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-23  7:22 UTC (permalink / raw
  To: gentoo-portage-dev; +Cc: Zac Medico

The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.

Bug: https://bugs.gentoo.org/635126
---
[PATCH v2] updates sf_wrapper to guarantee that the offset of
fd_in is in sync with *off_out on success

 src/portage_util_file_copy_reflink_linux.c | 41 +++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..3e97730df 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
 
 /**
  * cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
  * @fd_out: output file descriptor
  * @fd_in: input file descriptor
  * @off_out: offset of the output file
@@ -79,6 +79,35 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
 }
 
 /**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+    ssize_t ret;
+    ret = sendfile(fd_out, fd_in, NULL, len);
+    if (ret > 0) {
+        /* The sendfile docs do not guarantee that the offset of fd_in
+         * will be adjusted by the number of bytes written, so use
+         * lseek to guarantee it.
+         */
+        if (lseek(fd_in, (*off_out) + ret, SEEK_SET) < 0)
+            return -1;
+        (*off_out) += ret;
+    }
+    return ret;
+}
+
+
+/**
  * do_lseek_data - Adjust file offsets to the next location containing
  * data, creating sparse empty blocks in the output file as needed.
  * @fd_in: input file descriptor
@@ -250,7 +279,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                          * syscall is not available (less than Linux 4.5).
                          */
                         error = 0;
-                        copyfunc = sendfile;
+                        copyfunc = sf_wrapper;
                         copyfunc_ret = copyfunc(fd_out,
                                                 fd_in,
                                                 &offset_out,
@@ -293,10 +322,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                     error = errno;
                 } else {
                     while (offset_out < stat_in.st_size) {
-                        copyfunc_ret = sendfile(fd_out,
-                                                fd_in,
-                                                &offset_out,
-                                                stat_in.st_size - offset_out);
+                        copyfunc_ret = sf_wrapper(fd_out,
+                                                  fd_in,
+                                                  &offset_out,
+                                                  stat_in.st_size - offset_out);
 
                         if (copyfunc_ret < 0) {
                             error = errno;
-- 
2.13.5



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [gentoo-portage-dev] [PATCH v3] file_copy: use sendfile return value to measure bytes copied (bug 635126)
  2017-10-23  2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
  2017-10-23  7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
@ 2017-10-24 10:06 ` Zac Medico
  2017-10-24 10:48 ` [gentoo-portage-dev] [PATCH v4] " Zac Medico
  2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
  3 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-24 10:06 UTC (permalink / raw
  To: gentoo-portage-dev; +Cc: Zac Medico

The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.

Bug: https://bugs.gentoo.org/635126
---
[PATCH v3] adds an additional lseek call to ensure that the output
file is positioned at the correct offset

 src/portage_util_file_copy_reflink_linux.c | 43 +++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..35a197590 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
 
 /**
  * cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
  * @fd_out: output file descriptor
  * @fd_in: input file descriptor
  * @off_out: offset of the output file
@@ -79,6 +79,37 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
 }
 
 /**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+    ssize_t ret;
+    if (lseek(fd_out, *off_out, SEEK_SET) < 0)
+        return -1;
+    ret = sendfile(fd_out, fd_in, NULL, len);
+    if (ret > 0) {
+        /* The sendfile docs do not guarantee that the offset of fd_in
+         * will be adjusted by the number of bytes written, so use
+         * lseek to guarantee it.
+         */
+        if (lseek(fd_in, (*off_out) + ret, SEEK_SET) < 0)
+            return -1;
+        (*off_out) += ret;
+    }
+    return ret;
+}
+
+
+/**
  * do_lseek_data - Adjust file offsets to the next location containing
  * data, creating sparse empty blocks in the output file as needed.
  * @fd_in: input file descriptor
@@ -250,7 +281,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                          * syscall is not available (less than Linux 4.5).
                          */
                         error = 0;
-                        copyfunc = sendfile;
+                        copyfunc = sf_wrapper;
                         copyfunc_ret = copyfunc(fd_out,
                                                 fd_in,
                                                 &offset_out,
@@ -293,10 +324,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                     error = errno;
                 } else {
                     while (offset_out < stat_in.st_size) {
-                        copyfunc_ret = sendfile(fd_out,
-                                                fd_in,
-                                                &offset_out,
-                                                stat_in.st_size - offset_out);
+                        copyfunc_ret = sf_wrapper(fd_out,
+                                                  fd_in,
+                                                  &offset_out,
+                                                  stat_in.st_size - offset_out);
 
                         if (copyfunc_ret < 0) {
                             error = errno;
-- 
2.13.5



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [gentoo-portage-dev] [PATCH v4] file_copy: use sendfile return value to measure bytes copied (bug 635126)
  2017-10-23  2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
  2017-10-23  7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
  2017-10-24 10:06 ` [gentoo-portage-dev] [PATCH v3] " Zac Medico
@ 2017-10-24 10:48 ` Zac Medico
  2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
  3 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-24 10:48 UTC (permalink / raw
  To: gentoo-portage-dev; +Cc: Zac Medico

The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.

Bug: https://bugs.gentoo.org/635126
---
[PATCH v4] ensures that both input and output file offsets are aligned
prior to the sendfile call

 src/portage_util_file_copy_reflink_linux.c | 43 +++++++++++++++++++++++++-----
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..9d00ae733 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
 
 /**
  * cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
  * @fd_out: output file descriptor
  * @fd_in: input file descriptor
  * @off_out: offset of the output file
@@ -79,6 +79,37 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
 }
 
 /**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+    ssize_t ret;
+    /* The sendfile docs do not guarantee that the offset of fd_in
+     * will be adjusted by the number of bytes written. Therefore,
+     * prior to each sendfile call, always use lseek to ensure that
+     * both file offsets are in alignment.
+     */
+    if (lseek(fd_in, *off_out, SEEK_SET) < 0)
+        return -1;
+    if (lseek(fd_out, *off_out, SEEK_SET) < 0)
+        return -1;
+    ret = sendfile(fd_out, fd_in, NULL, len);
+    if (ret > 0)
+        *off_out += ret;
+    return ret;
+}
+
+
+/**
  * do_lseek_data - Adjust file offsets to the next location containing
  * data, creating sparse empty blocks in the output file as needed.
  * @fd_in: input file descriptor
@@ -250,7 +281,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                          * syscall is not available (less than Linux 4.5).
                          */
                         error = 0;
-                        copyfunc = sendfile;
+                        copyfunc = sf_wrapper;
                         copyfunc_ret = copyfunc(fd_out,
                                                 fd_in,
                                                 &offset_out,
@@ -293,10 +324,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                     error = errno;
                 } else {
                     while (offset_out < stat_in.st_size) {
-                        copyfunc_ret = sendfile(fd_out,
-                                                fd_in,
-                                                &offset_out,
-                                                stat_in.st_size - offset_out);
+                        copyfunc_ret = sf_wrapper(fd_out,
+                                                  fd_in,
+                                                  &offset_out,
+                                                  stat_in.st_size - offset_out);
 
                         if (copyfunc_ret < 0) {
                             error = errno;
-- 
2.13.5



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
  2017-10-23  2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
                   ` (2 preceding siblings ...)
  2017-10-24 10:48 ` [gentoo-portage-dev] [PATCH v4] " Zac Medico
@ 2017-10-24 18:41 ` Zac Medico
  2017-11-01 16:31   ` Brian Dolbec
  3 siblings, 1 reply; 7+ messages in thread
From: Zac Medico @ 2017-10-24 18:41 UTC (permalink / raw
  To: gentoo-portage-dev; +Cc: Zac Medico

The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.

Also update cfr_wrapper so that it does not rely on the fd_in file
offset, and remove corresponding fd_in lseek calls which are no
longer needed.

The file offset of fd_in is now completely unused, except in the
plain read/write loop, where lseek is called prior to entering
the loop.

Bug: https://bugs.gentoo.org/635126
---
[PATCH v5] eliminates all reliance on the file offset of fd_in,
except in the plain read/write loop, where lseek is called prior
to entering the loop.

 src/portage_util_file_copy_reflink_linux.c | 88 ++++++++++++++++++++----------
 1 file changed, 59 insertions(+), 29 deletions(-)

diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..4c3303181 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,12 +56,18 @@ initreflink_linux(void)
 
 /**
  * cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
  * @fd_out: output file descriptor
  * @fd_in: input file descriptor
- * @off_out: offset of the output file
+ * @off_out: must point to a buffer that specifies the starting offset
+ * where bytes will be copied to fd_out, and this buffer is adjusted by
+ * the number of bytes copied.
  * @len: number of bytes to copy between the file descriptors
  *
+ * Bytes are copied from fd_in starting from *off_out, and the file
+ * offset of fd_in is not changed. Effects on the file offset of
+ * fd_out are undefined.
+ *
  * Return: Number of bytes written to out_fd on success, -1 on failure
  * (errno is set appropriately).
  */
@@ -69,7 +75,8 @@ static ssize_t
 cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
 {
 #ifdef __NR_copy_file_range
-    return syscall(__NR_copy_file_range, fd_in, NULL, fd_out,
+    off_t off_in = *off_out;
+    return syscall(__NR_copy_file_range, fd_in, &off_in, fd_out,
                    off_out, len, 0);
 #else
     /* This is how it fails at runtime when the syscall is not supported. */
@@ -79,18 +86,50 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
 }
 
 /**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: must point to a buffer that specifies the starting offset
+ * where bytes will be copied to fd_out, and this buffer is adjusted by
+ * the number of bytes copied.
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Bytes are copied from fd_in starting from *off_out, and the file
+ * offset of fd_in is not changed. Effects on the file offset of
+ * fd_out are undefined.
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+    ssize_t ret;
+    off_t off_in = *off_out;
+    /* The sendfile docs do not specify behavior of the output file
+     * offset, therefore it must be adjusted with lseek.
+     */
+    if (lseek(fd_out, *off_out, SEEK_SET) < 0)
+        return -1;
+    ret = sendfile(fd_out, fd_in, &off_in, len);
+    if (ret > 0)
+        *off_out += ret;
+    return ret;
+}
+
+
+/**
  * do_lseek_data - Adjust file offsets to the next location containing
  * data, creating sparse empty blocks in the output file as needed.
  * @fd_in: input file descriptor
  * @fd_out: output file descriptor
  * @off_out: offset of the output file
  *
- * Use lseek SEEK_DATA to adjust the fd_in file offset to the next
- * location containing data, and adjust the fd_in file offset and
- * off_out to the same location (creating sparse empty blocks as
- * needed). On success, both fd_in and fd_out file offsets are
- * guaranteed to be exactly equal to the value that off_out points to.
- * 
+ * Use lseek SEEK_DATA to adjust off_out to the next location from fd_in
+ * containing data (creates sparse empty blocks when appropriate). Effects
+ * on file offsets are undefined.
+ *
  * Return: On success, the number of bytes to copy before the next hole,
  * and -1 on failure (errno is set appropriately). Returns 0 when fd_in
  * reaches EOF.
@@ -250,7 +289,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
                          * syscall is not available (less than Linux 4.5).
                          */
                         error = 0;
-                        copyfunc = sendfile;
+                        copyfunc = sf_wrapper;
                         copyfunc_ret = copyfunc(fd_out,
                                                 fd_in,
                                                 &offset_out,
@@ -284,27 +323,18 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
             } else {
                 stat_in_acquired = 1;
 
-                /* For the sendfile call, the fd_in file offset must be
-                 * exactly equal to offset_out. Use lseek to ensure
-                 * correct state, in case an EINTR retry caused it to
-                 * get out of sync somewhow.
-                 */
-                if (lseek(fd_in, offset_out, SEEK_SET) < 0) {
-                    error = errno;
-                } else {
-                    while (offset_out < stat_in.st_size) {
-                        copyfunc_ret = sendfile(fd_out,
-                                                fd_in,
-                                                &offset_out,
-                                                stat_in.st_size - offset_out);
+                while (offset_out < stat_in.st_size) {
+                    copyfunc_ret = sf_wrapper(fd_out,
+                                              fd_in,
+                                              &offset_out,
+                                              stat_in.st_size - offset_out);
 
-                        if (copyfunc_ret < 0) {
-                            error = errno;
-                            if (errno == EINVAL && !offset_out) {
-                                sendfile_works = 0;
-                            }
-                            break;
+                    if (copyfunc_ret < 0) {
+                        error = errno;
+                        if (errno == EINVAL && !offset_out) {
+                            sendfile_works = 0;
                         }
+                        break;
                     }
                 }
             }
-- 
2.13.5



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
  2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
@ 2017-11-01 16:31   ` Brian Dolbec
  2017-11-01 16:33     ` Zac Medico
  0 siblings, 1 reply; 7+ messages in thread
From: Brian Dolbec @ 2017-11-01 16:31 UTC (permalink / raw
  To: gentoo-portage-dev

On Tue, 24 Oct 2017 11:41:37 -0700
Zac Medico <zmedico@gentoo.org> wrote:

> The sendfile *offset parameter refers to the input file offest, so
> it cannot be used in the same way as the copy_file_range *off_out
> parameter. Therefore, add sf_wrapper function which implements the
> *off_out behavior for sendfile.
> 
> Also update cfr_wrapper so that it does not rely on the fd_in file
> offset, and remove corresponding fd_in lseek calls which are no
> longer needed.
> 
> The file offset of fd_in is now completely unused, except in the
> plain read/write loop, where lseek is called prior to entering
> the loop.
> 
> Bug: https://bugs.gentoo.org/635126
> ---
> [PATCH v5] eliminates all reliance on the file offset of fd_in,
> except in the plain read/write loop, where lseek is called prior
> to entering the loop.
> 
>  src/portage_util_file_copy_reflink_linux.c | 88
> ++++++++++++++++++++---------- 1 file changed, 59 insertions(+), 29
> deletions(-)
> 
> diff --git a/src/portage_util_file_copy_reflink_linux.c
> b/src/portage_util_file_copy_reflink_linux.c index
> 4be9e0568..4c3303181 100644 ---
> a/src/portage_util_file_copy_reflink_linux.c +++
> b/src/portage_util_file_copy_reflink_linux.c @@ -56,12 +56,18 @@
> initreflink_linux(void) 
>  /**
>   * cfr_wrapper - A copy_file_range syscall wrapper function, having a
> - * function signature that is compatible with sendfile.
> + * function signature that is compatible with sf_wrapper.
>   * @fd_out: output file descriptor
>   * @fd_in: input file descriptor
> - * @off_out: offset of the output file
> + * @off_out: must point to a buffer that specifies the starting
> offset
> + * where bytes will be copied to fd_out, and this buffer is adjusted
> by
> + * the number of bytes copied.
>   * @len: number of bytes to copy between the file descriptors
>   *
> + * Bytes are copied from fd_in starting from *off_out, and the file
> + * offset of fd_in is not changed. Effects on the file offset of
> + * fd_out are undefined.
> + *
>   * Return: Number of bytes written to out_fd on success, -1 on
> failure
>   * (errno is set appropriately).
>   */
> @@ -69,7 +75,8 @@ static ssize_t
>  cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
>  {
>  #ifdef __NR_copy_file_range
> -    return syscall(__NR_copy_file_range, fd_in, NULL, fd_out,
> +    off_t off_in = *off_out;
> +    return syscall(__NR_copy_file_range, fd_in, &off_in, fd_out,
>                     off_out, len, 0);
>  #else
>      /* This is how it fails at runtime when the syscall is not
> supported. */ @@ -79,18 +86,50 @@ cfr_wrapper(int fd_out, int fd_in,
> off_t *off_out, size_t len) }
>  
>  /**
> + * sf_wrapper - A sendfile wrapper function, having a function
> signature
> + * that is compatible with cfr_wrapper.
> + * @fd_out: output file descriptor
> + * @fd_in: input file descriptor
> + * @off_out: must point to a buffer that specifies the starting
> offset
> + * where bytes will be copied to fd_out, and this buffer is adjusted
> by
> + * the number of bytes copied.
> + * @len: number of bytes to copy between the file descriptors
> + *
> + * Bytes are copied from fd_in starting from *off_out, and the file
> + * offset of fd_in is not changed. Effects on the file offset of
> + * fd_out are undefined.
> + *
> + * Return: Number of bytes written to out_fd on success, -1 on
> failure
> + * (errno is set appropriately).
> + */
> +static ssize_t
> +sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
> +{
> +    ssize_t ret;
> +    off_t off_in = *off_out;
> +    /* The sendfile docs do not specify behavior of the output file
> +     * offset, therefore it must be adjusted with lseek.
> +     */
> +    if (lseek(fd_out, *off_out, SEEK_SET) < 0)
> +        return -1;
> +    ret = sendfile(fd_out, fd_in, &off_in, len);
> +    if (ret > 0)
> +        *off_out += ret;
> +    return ret;
> +}
> +
> +
> +/**
>   * do_lseek_data - Adjust file offsets to the next location
> containing
>   * data, creating sparse empty blocks in the output file as needed.
>   * @fd_in: input file descriptor
>   * @fd_out: output file descriptor
>   * @off_out: offset of the output file
>   *
> - * Use lseek SEEK_DATA to adjust the fd_in file offset to the next
> - * location containing data, and adjust the fd_in file offset and
> - * off_out to the same location (creating sparse empty blocks as
> - * needed). On success, both fd_in and fd_out file offsets are
> - * guaranteed to be exactly equal to the value that off_out points
> to.
> - * 
> + * Use lseek SEEK_DATA to adjust off_out to the next location from
> fd_in
> + * containing data (creates sparse empty blocks when appropriate).
> Effects
> + * on file offsets are undefined.
> + *
>   * Return: On success, the number of bytes to copy before the next
> hole,
>   * and -1 on failure (errno is set appropriately). Returns 0 when
> fd_in
>   * reaches EOF.
> @@ -250,7 +289,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject
> *args)
>                           * syscall is not available (less than Linux
> 4.5). */
>                          error = 0;
> -                        copyfunc = sendfile;
> +                        copyfunc = sf_wrapper;
>                          copyfunc_ret = copyfunc(fd_out,
>                                                  fd_in,
>                                                  &offset_out,
> @@ -284,27 +323,18 @@ _reflink_linux_file_copy(PyObject *self,
> PyObject *args) } else {
>                  stat_in_acquired = 1;
>  
> -                /* For the sendfile call, the fd_in file offset must
> be
> -                 * exactly equal to offset_out. Use lseek to ensure
> -                 * correct state, in case an EINTR retry caused it to
> -                 * get out of sync somewhow.
> -                 */
> -                if (lseek(fd_in, offset_out, SEEK_SET) < 0) {
> -                    error = errno;
> -                } else {
> -                    while (offset_out < stat_in.st_size) {
> -                        copyfunc_ret = sendfile(fd_out,
> -                                                fd_in,
> -                                                &offset_out,
> -                                                stat_in.st_size -
> offset_out);
> +                while (offset_out < stat_in.st_size) {
> +                    copyfunc_ret = sf_wrapper(fd_out,
> +                                              fd_in,
> +                                              &offset_out,
> +                                              stat_in.st_size -
> offset_out); 
> -                        if (copyfunc_ret < 0) {
> -                            error = errno;
> -                            if (errno == EINVAL && !offset_out) {
> -                                sendfile_works = 0;
> -                            }
> -                            break;
> +                    if (copyfunc_ret < 0) {
> +                        error = errno;
> +                        if (errno == EINVAL && !offset_out) {
> +                            sendfile_works = 0;
>                          }
> +                        break;
>                      }
>                  }
>              }


Looks good, has been checked by chutzpah

-- 
Brian Dolbec <dolsen>



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
  2017-11-01 16:31   ` Brian Dolbec
@ 2017-11-01 16:33     ` Zac Medico
  0 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-11-01 16:33 UTC (permalink / raw
  To: gentoo-portage-dev, Brian Dolbec


[-- Attachment #1.1: Type: text/plain, Size: 1248 bytes --]

On 11/01/2017 09:31 AM, Brian Dolbec wrote:
> On Tue, 24 Oct 2017 11:41:37 -0700
> Zac Medico <zmedico@gentoo.org> wrote:
> 
>> The sendfile *offset parameter refers to the input file offest, so
>> it cannot be used in the same way as the copy_file_range *off_out
>> parameter. Therefore, add sf_wrapper function which implements the
>> *off_out behavior for sendfile.
>>
>> Also update cfr_wrapper so that it does not rely on the fd_in file
>> offset, and remove corresponding fd_in lseek calls which are no
>> longer needed.
>>
>> The file offset of fd_in is now completely unused, except in the
>> plain read/write loop, where lseek is called prior to entering
>> the loop.
>>
>> Bug: https://bugs.gentoo.org/635126
>> ---
>> [PATCH v5] eliminates all reliance on the file offset of fd_in,
>> except in the plain read/write loop, where lseek is called prior
>> to entering the loop.
>>
>>  src/portage_util_file_copy_reflink_linux.c | 88
>> ++++++++++++++++++++---------- 1 file changed, 59 insertions(+), 29
>> deletions(-)
> 
> 
> Looks good, has been checked by chutzpah
> 

Thanks, merged:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=5b6cf172e378f6da88e9634aa4e89f2f34390659
-- 
Thanks,
Zac


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 224 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-11-01 16:34 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-23  2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
2017-10-23  7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
2017-10-24 10:06 ` [gentoo-portage-dev] [PATCH v3] " Zac Medico
2017-10-24 10:48 ` [gentoo-portage-dev] [PATCH v4] " Zac Medico
2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
2017-11-01 16:31   ` Brian Dolbec
2017-11-01 16:33     ` Zac Medico

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox