public inbox for gentoo-dev@lists.gentoo.org
 help / color / mirror / Atom feed
From: Oskari Pirhonen <xxc3ncoredxx@gmail.com>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [PATCH v2 1/1] esed.eclass: new eclass
Date: Fri, 3 Jun 2022 19:15:17 -0500	[thread overview]
Message-ID: <YpqkFR4FlC4BSWGO@dj3ntoo> (raw)
In-Reply-To: <20220603113646.25973-2-ionen@gentoo.org>

[-- Attachment #1: Type: text/plain, Size: 3683 bytes --]

On Fri, Jun 03, 2022 at 07:36:46AM -0400, Ionen Wolkens wrote:
> ... snip ...
>
> +	# Roughly attempt to find files in arguments by checking if it's a
> +	# readable file (aka s/// is not a file) and does not start with -
> +	# (unless after --), then store contents for comparing after sed.
> +	local contents=() endopts files=()
> +	for ((i=1; i<=${#}; i++)); do
> +		if [[ ${!i} == -- && ! -v endopts ]]; then
> +			endopts=1
> +		elif [[ ${!i} =~ ^(-i|--in-place)$ && ! -v endopts ]]; then
> +			# detect rushed sed -i -> esed -i, -i also silently breaks enewsed
> +			die "passing ${!i} to ${FUNCNAME[0]} is invalid"
> +		elif [[ ${!i} =~ ^(-f|--file)$ && ! -v endopts ]]; then
> +			i+=1 # ignore script files
> +		elif [[ ( ${!i} != -* || -v endopts ) && -f ${!i} && -r ${!i} ]]; then
> +			files+=( "${!i}" )
> +
> +			# 2>/dev/null to silence null byte warnings if sed binary files
> +			{ contents+=( "$(<"${!i}")" ); } 2>/dev/null \
> +				|| die "failed to read: ${!i}"
> +		fi
> +	done
> +	(( ${#files[@]} )) || die "no readable files found from '${*}' arguments"
> +
> +	local verbose
> +	[[ ${ESED_VERBOSE} ]] && type diff &>/dev/null && verbose=1
> +
> +	local changed newcontents
> +	if [[ -v _esed_output ]]; then
> +		[[ -v verbose ]] &&
> +			einfo "${FUNCNAME[0]}: sed ${*} > ${_esed_output} ..."
> +
> +		sed "${@}" > "${_esed_output}" \
> +			|| die "failed to run: sed ${*} > ${_esed_output}"
> +
> +		{ newcontents=$(<"${_esed_output}"); } 2>/dev/null \
> +			|| die "failed to read: ${_esed_output}"
> +
> +		local IFS=$'\n' # sed concats with newline even if none at EOF
> +		contents=${contents[*]}
> +		unset IFS
> +
> +		if [[ ${contents} != "${newcontents}" ]]; then
> +			 changed=1
> +
> +			[[ -v verbose ]] &&
> +				diff -u --color --label="${files[*]}" --label="${_esed_output}" \
> +					<(echo "${contents}") <(echo "${newcontents}")
> +		fi
>
> ... snip ...

I'm not 100% convinced that it will give you anything meaningful. The
warning about ignoring NULL is not so much noise as it is bash warning
you that you're probably not doing something correctly. In this case,
you're not pulling _all_ the contents of the file:
    
    [ /tmp ]
    oskari@dj3ntoo λ printf "ab\0cd" >test.dat
    [ /tmp ]
    oskari@dj3ntoo λ hd test.dat
    00000000  61 62 00 63 64                                    |ab.cd|
    00000005
    [ /tmp ]
    oskari@dj3ntoo λ var=$(< test.dat)
    bash: warning: command substitution: ignored null byte in input
    [ /tmp ]
    oskari@dj3ntoo λ printf "$var" | hd
    00000000  61 62 63 64                                       |abcd|
    00000004

If it's a binary file, there's a decent chance the NULL's are
significant. Now, consider the following hypothetical example where we
want to remove the NULL's:
    
    [ /tmp ]
    oskari@dj3ntoo λ printf "ab\0cd" | sed -e 's/\x00//' | hd
    00000000  61 62 63 64                                       |abcd|
    00000004

Testing for (in)equality between pre- and post-sed contents is
reasonable enough in most cases. This time, though, it would fail to
detect anything has changed since the pre-sed contents have their NULL's
unintentionally stripped, whereas the post-sed contents have them
intentionally stripped.

While I personally don't think that running sed on binary files is a
good idea in the first place, it's still relevant since the end result
would be an incorrect answer to the question of "Did sed actually do
anything?".

On the other hand, saving a set of pre- and post-sed hashes like Ulrich
suggested would give the expected result.

- Oskari

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

  reply	other threads:[~2022-06-04  0:15 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-31 11:23 [gentoo-dev] [PATCH 0/2] Add esed.eclass for sed that dies if caused no changes Ionen Wolkens
2022-05-31 11:23 ` [gentoo-dev] [PATCH 1/2] esed.eclass: new eclass Ionen Wolkens
2022-05-31 13:45   ` Ionen Wolkens
2022-06-03  7:14   ` Ulrich Mueller
2022-06-03  9:26     ` Ionen Wolkens
2022-06-03 11:19       ` Ionen Wolkens
2022-05-31 11:23 ` [gentoo-dev] [PATCH 2/2] eclass/tests/esed.sh: basic tests for esed.eclass Ionen Wolkens
2022-05-31 13:54 ` [gentoo-dev] [PATCH 0/2] Add esed.eclass for sed that dies if caused no changes Anna
2022-05-31 14:29   ` Ionen Wolkens
2022-05-31 14:44     ` Jaco Kroon
2022-06-03  4:09 ` Michał Górny
2022-06-03  4:45   ` Ionen Wolkens
2022-06-03  4:47     ` Anna V
2022-06-03  5:01       ` Ionen Wolkens
2022-06-03  5:00     ` Anna V
2022-06-03  5:05   ` Ionen Wolkens
2022-06-03  4:59 ` Sam James
2022-06-03  5:02   ` Sam James
2022-06-03 11:36 ` [gentoo-dev] [PATCH v2 0/1] esed Ionen Wolkens
2022-06-03 11:36   ` [gentoo-dev] [PATCH v2 1/1] esed.eclass: new eclass Ionen Wolkens
2022-06-04  0:15     ` Oskari Pirhonen [this message]
2022-06-04  7:00       ` Ionen Wolkens
2022-06-03 14:13   ` [gentoo-dev] [PATCH v2 0/1] esed Ionen Wolkens
2022-06-04 16:19 ` [gentoo-dev] [PATCH 0/2] Add esed.eclass for sed that dies if caused no changes Alessandro Barbieri
2022-06-04 16:53   ` Ionen Wolkens

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YpqkFR4FlC4BSWGO@dj3ntoo \
    --to=xxc3ncoredxx@gmail.com \
    --cc=gentoo-dev@lists.gentoo.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox