* [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126)
@ 2017-10-23 2:17 Zac Medico
2017-10-23 7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
` (3 more replies)
0 siblings, 4 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-23 2:17 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Zac Medico
The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.
Bug: https://bugs.gentoo.org/635126
---
src/portage_util_file_copy_reflink_linux.c | 34 ++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..5bfa1cd8f 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
/**
* cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
* @fd_out: output file descriptor
* @fd_in: input file descriptor
* @off_out: offset of the output file
@@ -79,6 +79,28 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
}
/**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+ ssize_t ret;
+ ret = sendfile(fd_out, fd_in, NULL, len);
+ if (ret > 0)
+ (*off_out) += ret;
+ return ret;
+}
+
+
+/**
* do_lseek_data - Adjust file offsets to the next location containing
* data, creating sparse empty blocks in the output file as needed.
* @fd_in: input file descriptor
@@ -250,7 +272,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
* syscall is not available (less than Linux 4.5).
*/
error = 0;
- copyfunc = sendfile;
+ copyfunc = sf_wrapper;
copyfunc_ret = copyfunc(fd_out,
fd_in,
&offset_out,
@@ -293,10 +315,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
error = errno;
} else {
while (offset_out < stat_in.st_size) {
- copyfunc_ret = sendfile(fd_out,
- fd_in,
- &offset_out,
- stat_in.st_size - offset_out);
+ copyfunc_ret = sf_wrapper(fd_out,
+ fd_in,
+ &offset_out,
+ stat_in.st_size - offset_out);
if (copyfunc_ret < 0) {
error = errno;
--
2.13.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [gentoo-portage-dev] [PATCH v2] file_copy: use sendfile return value to measure bytes copied (bug 635126)
2017-10-23 2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
@ 2017-10-23 7:22 ` Zac Medico
2017-10-24 10:06 ` [gentoo-portage-dev] [PATCH v3] " Zac Medico
` (2 subsequent siblings)
3 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-23 7:22 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Zac Medico
The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.
Bug: https://bugs.gentoo.org/635126
---
[PATCH v2] updates sf_wrapper to guarantee that the offset of
fd_in is in sync with *off_out on success
src/portage_util_file_copy_reflink_linux.c | 41 +++++++++++++++++++++++++-----
1 file changed, 35 insertions(+), 6 deletions(-)
diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..3e97730df 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
/**
* cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
* @fd_out: output file descriptor
* @fd_in: input file descriptor
* @off_out: offset of the output file
@@ -79,6 +79,35 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
}
/**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+ ssize_t ret;
+ ret = sendfile(fd_out, fd_in, NULL, len);
+ if (ret > 0) {
+ /* The sendfile docs do not guarantee that the offset of fd_in
+ * will be adjusted by the number of bytes written, so use
+ * lseek to guarantee it.
+ */
+ if (lseek(fd_in, (*off_out) + ret, SEEK_SET) < 0)
+ return -1;
+ (*off_out) += ret;
+ }
+ return ret;
+}
+
+
+/**
* do_lseek_data - Adjust file offsets to the next location containing
* data, creating sparse empty blocks in the output file as needed.
* @fd_in: input file descriptor
@@ -250,7 +279,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
* syscall is not available (less than Linux 4.5).
*/
error = 0;
- copyfunc = sendfile;
+ copyfunc = sf_wrapper;
copyfunc_ret = copyfunc(fd_out,
fd_in,
&offset_out,
@@ -293,10 +322,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
error = errno;
} else {
while (offset_out < stat_in.st_size) {
- copyfunc_ret = sendfile(fd_out,
- fd_in,
- &offset_out,
- stat_in.st_size - offset_out);
+ copyfunc_ret = sf_wrapper(fd_out,
+ fd_in,
+ &offset_out,
+ stat_in.st_size - offset_out);
if (copyfunc_ret < 0) {
error = errno;
--
2.13.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [gentoo-portage-dev] [PATCH v3] file_copy: use sendfile return value to measure bytes copied (bug 635126)
2017-10-23 2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
2017-10-23 7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
@ 2017-10-24 10:06 ` Zac Medico
2017-10-24 10:48 ` [gentoo-portage-dev] [PATCH v4] " Zac Medico
2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
3 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-24 10:06 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Zac Medico
The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.
Bug: https://bugs.gentoo.org/635126
---
[PATCH v3] adds an additional lseek call to ensure that the output
file is positioned at the correct offset
src/portage_util_file_copy_reflink_linux.c | 43 +++++++++++++++++++++++++-----
1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..35a197590 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
/**
* cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
* @fd_out: output file descriptor
* @fd_in: input file descriptor
* @off_out: offset of the output file
@@ -79,6 +79,37 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
}
/**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+ ssize_t ret;
+ if (lseek(fd_out, *off_out, SEEK_SET) < 0)
+ return -1;
+ ret = sendfile(fd_out, fd_in, NULL, len);
+ if (ret > 0) {
+ /* The sendfile docs do not guarantee that the offset of fd_in
+ * will be adjusted by the number of bytes written, so use
+ * lseek to guarantee it.
+ */
+ if (lseek(fd_in, (*off_out) + ret, SEEK_SET) < 0)
+ return -1;
+ (*off_out) += ret;
+ }
+ return ret;
+}
+
+
+/**
* do_lseek_data - Adjust file offsets to the next location containing
* data, creating sparse empty blocks in the output file as needed.
* @fd_in: input file descriptor
@@ -250,7 +281,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
* syscall is not available (less than Linux 4.5).
*/
error = 0;
- copyfunc = sendfile;
+ copyfunc = sf_wrapper;
copyfunc_ret = copyfunc(fd_out,
fd_in,
&offset_out,
@@ -293,10 +324,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
error = errno;
} else {
while (offset_out < stat_in.st_size) {
- copyfunc_ret = sendfile(fd_out,
- fd_in,
- &offset_out,
- stat_in.st_size - offset_out);
+ copyfunc_ret = sf_wrapper(fd_out,
+ fd_in,
+ &offset_out,
+ stat_in.st_size - offset_out);
if (copyfunc_ret < 0) {
error = errno;
--
2.13.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [gentoo-portage-dev] [PATCH v4] file_copy: use sendfile return value to measure bytes copied (bug 635126)
2017-10-23 2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
2017-10-23 7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
2017-10-24 10:06 ` [gentoo-portage-dev] [PATCH v3] " Zac Medico
@ 2017-10-24 10:48 ` Zac Medico
2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
3 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-10-24 10:48 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Zac Medico
The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.
Bug: https://bugs.gentoo.org/635126
---
[PATCH v4] ensures that both input and output file offsets are aligned
prior to the sendfile call
src/portage_util_file_copy_reflink_linux.c | 43 +++++++++++++++++++++++++-----
1 file changed, 37 insertions(+), 6 deletions(-)
diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..9d00ae733 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,7 +56,7 @@ initreflink_linux(void)
/**
* cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
* @fd_out: output file descriptor
* @fd_in: input file descriptor
* @off_out: offset of the output file
@@ -79,6 +79,37 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
}
/**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: offset of the output file
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+ ssize_t ret;
+ /* The sendfile docs do not guarantee that the offset of fd_in
+ * will be adjusted by the number of bytes written. Therefore,
+ * prior to each sendfile call, always use lseek to ensure that
+ * both file offsets are in alignment.
+ */
+ if (lseek(fd_in, *off_out, SEEK_SET) < 0)
+ return -1;
+ if (lseek(fd_out, *off_out, SEEK_SET) < 0)
+ return -1;
+ ret = sendfile(fd_out, fd_in, NULL, len);
+ if (ret > 0)
+ *off_out += ret;
+ return ret;
+}
+
+
+/**
* do_lseek_data - Adjust file offsets to the next location containing
* data, creating sparse empty blocks in the output file as needed.
* @fd_in: input file descriptor
@@ -250,7 +281,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
* syscall is not available (less than Linux 4.5).
*/
error = 0;
- copyfunc = sendfile;
+ copyfunc = sf_wrapper;
copyfunc_ret = copyfunc(fd_out,
fd_in,
&offset_out,
@@ -293,10 +324,10 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
error = errno;
} else {
while (offset_out < stat_in.st_size) {
- copyfunc_ret = sendfile(fd_out,
- fd_in,
- &offset_out,
- stat_in.st_size - offset_out);
+ copyfunc_ret = sf_wrapper(fd_out,
+ fd_in,
+ &offset_out,
+ stat_in.st_size - offset_out);
if (copyfunc_ret < 0) {
error = errno;
--
2.13.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
2017-10-23 2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
` (2 preceding siblings ...)
2017-10-24 10:48 ` [gentoo-portage-dev] [PATCH v4] " Zac Medico
@ 2017-10-24 18:41 ` Zac Medico
2017-11-01 16:31 ` Brian Dolbec
3 siblings, 1 reply; 7+ messages in thread
From: Zac Medico @ 2017-10-24 18:41 UTC (permalink / raw
To: gentoo-portage-dev; +Cc: Zac Medico
The sendfile *offset parameter refers to the input file offest, so
it cannot be used in the same way as the copy_file_range *off_out
parameter. Therefore, add sf_wrapper function which implements the
*off_out behavior for sendfile.
Also update cfr_wrapper so that it does not rely on the fd_in file
offset, and remove corresponding fd_in lseek calls which are no
longer needed.
The file offset of fd_in is now completely unused, except in the
plain read/write loop, where lseek is called prior to entering
the loop.
Bug: https://bugs.gentoo.org/635126
---
[PATCH v5] eliminates all reliance on the file offset of fd_in,
except in the plain read/write loop, where lseek is called prior
to entering the loop.
src/portage_util_file_copy_reflink_linux.c | 88 ++++++++++++++++++++----------
1 file changed, 59 insertions(+), 29 deletions(-)
diff --git a/src/portage_util_file_copy_reflink_linux.c b/src/portage_util_file_copy_reflink_linux.c
index 4be9e0568..4c3303181 100644
--- a/src/portage_util_file_copy_reflink_linux.c
+++ b/src/portage_util_file_copy_reflink_linux.c
@@ -56,12 +56,18 @@ initreflink_linux(void)
/**
* cfr_wrapper - A copy_file_range syscall wrapper function, having a
- * function signature that is compatible with sendfile.
+ * function signature that is compatible with sf_wrapper.
* @fd_out: output file descriptor
* @fd_in: input file descriptor
- * @off_out: offset of the output file
+ * @off_out: must point to a buffer that specifies the starting offset
+ * where bytes will be copied to fd_out, and this buffer is adjusted by
+ * the number of bytes copied.
* @len: number of bytes to copy between the file descriptors
*
+ * Bytes are copied from fd_in starting from *off_out, and the file
+ * offset of fd_in is not changed. Effects on the file offset of
+ * fd_out are undefined.
+ *
* Return: Number of bytes written to out_fd on success, -1 on failure
* (errno is set appropriately).
*/
@@ -69,7 +75,8 @@ static ssize_t
cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
{
#ifdef __NR_copy_file_range
- return syscall(__NR_copy_file_range, fd_in, NULL, fd_out,
+ off_t off_in = *off_out;
+ return syscall(__NR_copy_file_range, fd_in, &off_in, fd_out,
off_out, len, 0);
#else
/* This is how it fails at runtime when the syscall is not supported. */
@@ -79,18 +86,50 @@ cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
}
/**
+ * sf_wrapper - A sendfile wrapper function, having a function signature
+ * that is compatible with cfr_wrapper.
+ * @fd_out: output file descriptor
+ * @fd_in: input file descriptor
+ * @off_out: must point to a buffer that specifies the starting offset
+ * where bytes will be copied to fd_out, and this buffer is adjusted by
+ * the number of bytes copied.
+ * @len: number of bytes to copy between the file descriptors
+ *
+ * Bytes are copied from fd_in starting from *off_out, and the file
+ * offset of fd_in is not changed. Effects on the file offset of
+ * fd_out are undefined.
+ *
+ * Return: Number of bytes written to out_fd on success, -1 on failure
+ * (errno is set appropriately).
+ */
+static ssize_t
+sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
+{
+ ssize_t ret;
+ off_t off_in = *off_out;
+ /* The sendfile docs do not specify behavior of the output file
+ * offset, therefore it must be adjusted with lseek.
+ */
+ if (lseek(fd_out, *off_out, SEEK_SET) < 0)
+ return -1;
+ ret = sendfile(fd_out, fd_in, &off_in, len);
+ if (ret > 0)
+ *off_out += ret;
+ return ret;
+}
+
+
+/**
* do_lseek_data - Adjust file offsets to the next location containing
* data, creating sparse empty blocks in the output file as needed.
* @fd_in: input file descriptor
* @fd_out: output file descriptor
* @off_out: offset of the output file
*
- * Use lseek SEEK_DATA to adjust the fd_in file offset to the next
- * location containing data, and adjust the fd_in file offset and
- * off_out to the same location (creating sparse empty blocks as
- * needed). On success, both fd_in and fd_out file offsets are
- * guaranteed to be exactly equal to the value that off_out points to.
- *
+ * Use lseek SEEK_DATA to adjust off_out to the next location from fd_in
+ * containing data (creates sparse empty blocks when appropriate). Effects
+ * on file offsets are undefined.
+ *
* Return: On success, the number of bytes to copy before the next hole,
* and -1 on failure (errno is set appropriately). Returns 0 when fd_in
* reaches EOF.
@@ -250,7 +289,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
* syscall is not available (less than Linux 4.5).
*/
error = 0;
- copyfunc = sendfile;
+ copyfunc = sf_wrapper;
copyfunc_ret = copyfunc(fd_out,
fd_in,
&offset_out,
@@ -284,27 +323,18 @@ _reflink_linux_file_copy(PyObject *self, PyObject *args)
} else {
stat_in_acquired = 1;
- /* For the sendfile call, the fd_in file offset must be
- * exactly equal to offset_out. Use lseek to ensure
- * correct state, in case an EINTR retry caused it to
- * get out of sync somewhow.
- */
- if (lseek(fd_in, offset_out, SEEK_SET) < 0) {
- error = errno;
- } else {
- while (offset_out < stat_in.st_size) {
- copyfunc_ret = sendfile(fd_out,
- fd_in,
- &offset_out,
- stat_in.st_size - offset_out);
+ while (offset_out < stat_in.st_size) {
+ copyfunc_ret = sf_wrapper(fd_out,
+ fd_in,
+ &offset_out,
+ stat_in.st_size - offset_out);
- if (copyfunc_ret < 0) {
- error = errno;
- if (errno == EINVAL && !offset_out) {
- sendfile_works = 0;
- }
- break;
+ if (copyfunc_ret < 0) {
+ error = errno;
+ if (errno == EINVAL && !offset_out) {
+ sendfile_works = 0;
}
+ break;
}
}
}
--
2.13.5
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
@ 2017-11-01 16:31 ` Brian Dolbec
2017-11-01 16:33 ` Zac Medico
0 siblings, 1 reply; 7+ messages in thread
From: Brian Dolbec @ 2017-11-01 16:31 UTC (permalink / raw
To: gentoo-portage-dev
On Tue, 24 Oct 2017 11:41:37 -0700
Zac Medico <zmedico@gentoo.org> wrote:
> The sendfile *offset parameter refers to the input file offest, so
> it cannot be used in the same way as the copy_file_range *off_out
> parameter. Therefore, add sf_wrapper function which implements the
> *off_out behavior for sendfile.
>
> Also update cfr_wrapper so that it does not rely on the fd_in file
> offset, and remove corresponding fd_in lseek calls which are no
> longer needed.
>
> The file offset of fd_in is now completely unused, except in the
> plain read/write loop, where lseek is called prior to entering
> the loop.
>
> Bug: https://bugs.gentoo.org/635126
> ---
> [PATCH v5] eliminates all reliance on the file offset of fd_in,
> except in the plain read/write loop, where lseek is called prior
> to entering the loop.
>
> src/portage_util_file_copy_reflink_linux.c | 88
> ++++++++++++++++++++---------- 1 file changed, 59 insertions(+), 29
> deletions(-)
>
> diff --git a/src/portage_util_file_copy_reflink_linux.c
> b/src/portage_util_file_copy_reflink_linux.c index
> 4be9e0568..4c3303181 100644 ---
> a/src/portage_util_file_copy_reflink_linux.c +++
> b/src/portage_util_file_copy_reflink_linux.c @@ -56,12 +56,18 @@
> initreflink_linux(void)
> /**
> * cfr_wrapper - A copy_file_range syscall wrapper function, having a
> - * function signature that is compatible with sendfile.
> + * function signature that is compatible with sf_wrapper.
> * @fd_out: output file descriptor
> * @fd_in: input file descriptor
> - * @off_out: offset of the output file
> + * @off_out: must point to a buffer that specifies the starting
> offset
> + * where bytes will be copied to fd_out, and this buffer is adjusted
> by
> + * the number of bytes copied.
> * @len: number of bytes to copy between the file descriptors
> *
> + * Bytes are copied from fd_in starting from *off_out, and the file
> + * offset of fd_in is not changed. Effects on the file offset of
> + * fd_out are undefined.
> + *
> * Return: Number of bytes written to out_fd on success, -1 on
> failure
> * (errno is set appropriately).
> */
> @@ -69,7 +75,8 @@ static ssize_t
> cfr_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
> {
> #ifdef __NR_copy_file_range
> - return syscall(__NR_copy_file_range, fd_in, NULL, fd_out,
> + off_t off_in = *off_out;
> + return syscall(__NR_copy_file_range, fd_in, &off_in, fd_out,
> off_out, len, 0);
> #else
> /* This is how it fails at runtime when the syscall is not
> supported. */ @@ -79,18 +86,50 @@ cfr_wrapper(int fd_out, int fd_in,
> off_t *off_out, size_t len) }
>
> /**
> + * sf_wrapper - A sendfile wrapper function, having a function
> signature
> + * that is compatible with cfr_wrapper.
> + * @fd_out: output file descriptor
> + * @fd_in: input file descriptor
> + * @off_out: must point to a buffer that specifies the starting
> offset
> + * where bytes will be copied to fd_out, and this buffer is adjusted
> by
> + * the number of bytes copied.
> + * @len: number of bytes to copy between the file descriptors
> + *
> + * Bytes are copied from fd_in starting from *off_out, and the file
> + * offset of fd_in is not changed. Effects on the file offset of
> + * fd_out are undefined.
> + *
> + * Return: Number of bytes written to out_fd on success, -1 on
> failure
> + * (errno is set appropriately).
> + */
> +static ssize_t
> +sf_wrapper(int fd_out, int fd_in, off_t *off_out, size_t len)
> +{
> + ssize_t ret;
> + off_t off_in = *off_out;
> + /* The sendfile docs do not specify behavior of the output file
> + * offset, therefore it must be adjusted with lseek.
> + */
> + if (lseek(fd_out, *off_out, SEEK_SET) < 0)
> + return -1;
> + ret = sendfile(fd_out, fd_in, &off_in, len);
> + if (ret > 0)
> + *off_out += ret;
> + return ret;
> +}
> +
> +
> +/**
> * do_lseek_data - Adjust file offsets to the next location
> containing
> * data, creating sparse empty blocks in the output file as needed.
> * @fd_in: input file descriptor
> * @fd_out: output file descriptor
> * @off_out: offset of the output file
> *
> - * Use lseek SEEK_DATA to adjust the fd_in file offset to the next
> - * location containing data, and adjust the fd_in file offset and
> - * off_out to the same location (creating sparse empty blocks as
> - * needed). On success, both fd_in and fd_out file offsets are
> - * guaranteed to be exactly equal to the value that off_out points
> to.
> - *
> + * Use lseek SEEK_DATA to adjust off_out to the next location from
> fd_in
> + * containing data (creates sparse empty blocks when appropriate).
> Effects
> + * on file offsets are undefined.
> + *
> * Return: On success, the number of bytes to copy before the next
> hole,
> * and -1 on failure (errno is set appropriately). Returns 0 when
> fd_in
> * reaches EOF.
> @@ -250,7 +289,7 @@ _reflink_linux_file_copy(PyObject *self, PyObject
> *args)
> * syscall is not available (less than Linux
> 4.5). */
> error = 0;
> - copyfunc = sendfile;
> + copyfunc = sf_wrapper;
> copyfunc_ret = copyfunc(fd_out,
> fd_in,
> &offset_out,
> @@ -284,27 +323,18 @@ _reflink_linux_file_copy(PyObject *self,
> PyObject *args) } else {
> stat_in_acquired = 1;
>
> - /* For the sendfile call, the fd_in file offset must
> be
> - * exactly equal to offset_out. Use lseek to ensure
> - * correct state, in case an EINTR retry caused it to
> - * get out of sync somewhow.
> - */
> - if (lseek(fd_in, offset_out, SEEK_SET) < 0) {
> - error = errno;
> - } else {
> - while (offset_out < stat_in.st_size) {
> - copyfunc_ret = sendfile(fd_out,
> - fd_in,
> - &offset_out,
> - stat_in.st_size -
> offset_out);
> + while (offset_out < stat_in.st_size) {
> + copyfunc_ret = sf_wrapper(fd_out,
> + fd_in,
> + &offset_out,
> + stat_in.st_size -
> offset_out);
> - if (copyfunc_ret < 0) {
> - error = errno;
> - if (errno == EINVAL && !offset_out) {
> - sendfile_works = 0;
> - }
> - break;
> + if (copyfunc_ret < 0) {
> + error = errno;
> + if (errno == EINVAL && !offset_out) {
> + sendfile_works = 0;
> }
> + break;
> }
> }
> }
Looks good, has been checked by chutzpah
--
Brian Dolbec <dolsen>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] [PATCH v5] file_copy: use sendfile return value to measure bytes copied (bug 635126)
2017-11-01 16:31 ` Brian Dolbec
@ 2017-11-01 16:33 ` Zac Medico
0 siblings, 0 replies; 7+ messages in thread
From: Zac Medico @ 2017-11-01 16:33 UTC (permalink / raw
To: gentoo-portage-dev, Brian Dolbec
[-- Attachment #1.1: Type: text/plain, Size: 1248 bytes --]
On 11/01/2017 09:31 AM, Brian Dolbec wrote:
> On Tue, 24 Oct 2017 11:41:37 -0700
> Zac Medico <zmedico@gentoo.org> wrote:
>
>> The sendfile *offset parameter refers to the input file offest, so
>> it cannot be used in the same way as the copy_file_range *off_out
>> parameter. Therefore, add sf_wrapper function which implements the
>> *off_out behavior for sendfile.
>>
>> Also update cfr_wrapper so that it does not rely on the fd_in file
>> offset, and remove corresponding fd_in lseek calls which are no
>> longer needed.
>>
>> The file offset of fd_in is now completely unused, except in the
>> plain read/write loop, where lseek is called prior to entering
>> the loop.
>>
>> Bug: https://bugs.gentoo.org/635126
>> ---
>> [PATCH v5] eliminates all reliance on the file offset of fd_in,
>> except in the plain read/write loop, where lseek is called prior
>> to entering the loop.
>>
>> src/portage_util_file_copy_reflink_linux.c | 88
>> ++++++++++++++++++++---------- 1 file changed, 59 insertions(+), 29
>> deletions(-)
>
>
> Looks good, has been checked by chutzpah
>
Thanks, merged:
https://gitweb.gentoo.org/proj/portage.git/commit/?id=5b6cf172e378f6da88e9634aa4e89f2f34390659
--
Thanks,
Zac
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 224 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2017-11-01 16:34 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-23 2:17 [gentoo-portage-dev] [PATCH] file_copy: use sendfile return value to measure bytes copied (bug 635126) Zac Medico
2017-10-23 7:22 ` [gentoo-portage-dev] [PATCH v2] " Zac Medico
2017-10-24 10:06 ` [gentoo-portage-dev] [PATCH v3] " Zac Medico
2017-10-24 10:48 ` [gentoo-portage-dev] [PATCH v4] " Zac Medico
2017-10-24 18:41 ` [gentoo-portage-dev] [PATCH v5] " Zac Medico
2017-11-01 16:31 ` Brian Dolbec
2017-11-01 16:33 ` Zac Medico
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox