public inbox for gentoo-guru@lists.gentoo.org
* [gentoo-guru] [RFC PATCH 0/3] eclass/R-packages: improvements
@ 2022-07-05  7:15 Robert Greener
  2022-07-06  0:24 ` Alessandro Barbieri
  0 siblings, 1 reply; 4+ messages in thread
From: Robert Greener @ 2022-07-05  7:15 UTC
  To: gentoo-guru; +Cc: Robert Greener

Hello,

This is very much an RFC on some improvements to the R-packages eclass.

There is currently a problem with SRC_URI: it only works for packages
that are at their latest version. For example, dplyr 1.0.9 works, but
dplyr 1.0.8 does not. This significantly increases the maintenance
burden of dev-R/*, as every package must be up to date at all times
just to build.

To fix this, I propose changing SRC_URI to use either src/contrib or
src/contrib/Archive (where old package versions are kept). The drawback
is that we would only use the main CRAN site: there are many mirrors,
and it would be impractical to search them all before searching the
archive.
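
As a rough sketch of what that means for the URLs (the helper name
cran_candidate_uris is made up for illustration and is not part of the
eclass; the host and paths are the main-CRAN ones), the two candidate
locations for a given version would be:

```bash
#!/usr/bin/env bash
# Hypothetical helper, for illustration only: print the two main-CRAN
# locations at which a given package version can be found.
cran_candidate_uris() {
	local pn=$1 pv=$2
	local base="https://cran.r-project.org/src/contrib"
	# Current releases sit directly under src/contrib.
	printf '%s\n' "${base}/${pn}_${pv}.tar.gz"
	# Old releases are kept in a per-package Archive directory.
	printf '%s\n' "${base}/Archive/${pn}/${pn}_${pv}.tar.gz"
}

cran_candidate_uris dplyr 1.0.8
```

Only one of the two will exist for any given version, which is why both
would have to be listed.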

I also suggest two further changes. First, replace _ with . in the PN
used in SRC_URI and HOMEPAGE. This is already done elsewhere in the
eclass, and it would mean that these no longer need to be overridden in
the ebuild.

Finally, I suggest adding a variable CRAN_PV, which defaults to PV, for
the case where the version is something like "1.2-24". It is used in
SRC_URI, so that SRC_URI no longer needs to be overridden in the ebuild.
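
To make the last two changes concrete, here is a small sketch that runs
outside of Portage. The PN and PV values are made up, CRAN_PN is a name
used only for this example, and the actual eclass lines may well look
different:

```bash
#!/usr/bin/env bash
# Example values only; in an ebuild, Portage sets PN and PV.
PN="data_table"
PV="1.2.24"

# An ebuild could set CRAN_PV="1.2-24" before inheriting the eclass;
# when it does not, fall back to PV.
: "${CRAN_PV:=${PV}}"

# CRAN package names may contain '.', which PN cannot, so map '_'
# back to '.' when building upstream URLs.
CRAN_PN="${PN//_/.}"

SRC_URI="mirror://cran/src/contrib/${CRAN_PN}_${CRAN_PV}.tar.gz"
HOMEPAGE="https://cran.r-project.org/package=${CRAN_PN}"

echo "${SRC_URI}"
echo "${HOMEPAGE}"
```

With this, an ebuild whose upstream version contains a '-' would only
need to set CRAN_PV instead of redefining SRC_URI.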

--
Robert

Robert Greener (3):
  eclass/R-packages: Use src/contrib or src/contrib/Archive from main
    CRAN
  eclass/R-packages: substitute _ with . in SRC_URI  and HOMEPAGE
  eclass/R-packages: Add CRAN_PV

 eclass/R-packages.eclass | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)


-- 
2.35.1




* Re: [gentoo-guru] [RFC PATCH 0/3] eclass/R-packages: improvements
  2022-07-05  7:15 [gentoo-guru] [RFC PATCH 0/3] eclass/R-packages: improvements Robert Greener
@ 2022-07-06  0:24 ` Alessandro Barbieri
  2022-07-06  9:08   ` Robert Greener
  0 siblings, 1 reply; 4+ messages in thread
From: Alessandro Barbieri @ 2022-07-06  0:24 UTC
  To: Robert Greener; +Cc: gentoo-guru


On Tue, 5 Jul 2022 at 09:16, Robert Greener <me@r0bert.dev> wrote:

> Hello,
>
> This is very much an RFC on some improvements to the R-packages eclass.
>
> There is currently a problem with SRC_URI: it only works for packages
> that are at their latest version. For example, dplyr 1.0.9 works, but
> dplyr 1.0.8 does not. This significantly increases the maintenance
> burden of dev-R/*, as every package must be up to date at all times
> just to build.
>
> To fix this, I propose changing SRC_URI to use either src/contrib or
> src/contrib/Archive (where old package versions are kept). The drawback
> is that we would only use the main CRAN site: there are many mirrors,
> and it would be impractical to search them all before searching the
> archive.
>
> I also suggest two further changes. First, replace _ with . in the PN
> used in SRC_URI and HOMEPAGE. This is already done elsewhere in the
> eclass, and it would mean that these no longer need to be overridden in
> the ebuild.
>
> Finally, I suggest adding a variable CRAN_PV, which defaults to PV, for
> the case where the version is something like "1.2-24". It is used in
> SRC_URI, so that SRC_URI no longer needs to be overridden in the ebuild.
>
> --
> Robert
>
> Robert Greener (3):
>   eclass/R-packages: Use src/contrib or src/contrib/Archive from main
>     CRAN
>   eclass/R-packages: substitute _ with . in SRC_URI  and HOMEPAGE
>   eclass/R-packages: Add CRAN_PV
>
>  eclass/R-packages.eclass | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
>
> --
> 2.35.1
>
>
>
I think all the changes are fine.

The devmanual says:

There are two valid cases for using thirdpartymirrors:

   1. providing multiple download locations for mirror- or fetch-restricted
   packages,
   2. dealing with upstreams that distribute their distfiles via a network
   of mirrors without a primary download location or a bouncer service.

In any other case, the primary location must be used instead. The distfiles
will be mirrored onto Gentoo infrastructure
<https://devmanual.gentoo.org/general-concepts/mirrors/index.html>; in that
case, the benefit to using third-party mirror list does not outweigh the
burden of maintaining it.


So I think in this case we can drop the mirror list and go with the hack of
listing both the live and archive URLs.



* Re: [gentoo-guru] [RFC PATCH 0/3] eclass/R-packages: improvements
  2022-07-06  0:24 ` Alessandro Barbieri
@ 2022-07-06  9:08   ` Robert Greener
  2022-07-06 11:21     ` Anna “CyberTailor”
  0 siblings, 1 reply; 4+ messages in thread
From: Robert Greener @ 2022-07-06  9:08 UTC
  To: Alessandro Barbieri; +Cc: gentoo-guru, Anna “CyberTailor”


On Wed, 2022-07-06 at 02:24 +0200, Alessandro Barbieri wrote:
> On Tue, 5 Jul 2022 at 09:16, Robert Greener <me@r0bert.dev> wrote:
> > To fix this, I propose changing SRC_URI to use either src/contrib or
> > src/contrib/Archive (where old package versions are kept). The
> > drawback is that we would only use the main CRAN site: there are
> > many mirrors, and it would be impractical to search them all before
> > searching the archive.
>
> I think all the changes are fine.
>
> The devmanual says:
>
> There are two valid cases for using thirdpartymirrors:
>
>    1. providing multiple download locations for mirror- or
>       fetch-restricted packages,
>    2. dealing with upstreams that distribute their distfiles via a
>       network of mirrors without a primary download location or a
>       bouncer service.
>
> In any other case, the primary location must be used instead. The
> distfiles will be mirrored onto Gentoo infrastructure
> <https://devmanual.gentoo.org/general-concepts/mirrors/index.html>;
> in that case, the benefit to using third-party mirror list does not
> outweigh the burden of maintaining it.
>
> So I think in this case we can drop the mirror list and go with the
> hack of listing both the live and archive URLs.

Anna (CyberTailor, cc'd) made the point off-list that using multiple
SRC_URI entries as a fallback is undefined in PMS, so it can't be relied
on.

We could fetch the sources directly in src_unpack (this would probably
be better factored out into its own function, but I've left it as is
for now). See the patch at the end.

However, I don't really like this, as it marks the ebuild as live when
it isn't really a live ebuild. We also lose the Manifest...

The other option would be to require an UPDATE_DATE variable to be set
in the ebuilds. It could then be used like so:
SRC_URI="https://cran.microsoft.com/snapshot/${UPDATE_DATE}/src/contrib/${PN}_${PV}.tar.gz"
where UPDATE_DATE=YYYY-MM-DD.

This is a service provided by Microsoft that snapshots CRAN every day at
midnight UTC. This link will always resolve. I think this might be the
easiest solution, though it would require updating the ebuilds in dev-R/*.
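
For what it's worth, here is a sketch of the snapshot variant with a
format check on the date. The function name cran_snapshot_uri is made
up, and the check only verifies the YYYY-MM-DD shape, not that a
snapshot actually exists for that day:

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the snapshot URL for a package version.
# Returns non-zero if the date is not of the form YYYY-MM-DD.
cran_snapshot_uri() {
	local date=$1 pn=$2 pv=$3
	if [[ ! ${date} =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
		echo "bad UPDATE_DATE: '${date}'" >&2
		return 1
	fi
	echo "https://cran.microsoft.com/snapshot/${date}/src/contrib/${pn}_${pv}.tar.gz"
}

cran_snapshot_uri 2022-07-05 dplyr 1.0.8
```

Because the date is pinned in the ebuild, the URL stays the same after
upstream releases a newer version, so the Manifest is kept.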

What do you think?

diff --git a/eclass/R-packages.eclass b/eclass/R-packages.eclass
index aed8cce84..8a464c325 100644
--- a/eclass/R-packages.eclass
+++ b/eclass/R-packages.eclass
@@ -21,14 +21,14 @@ esac
 
 EXPORT_FUNCTIONS src_unpack src_prepare src_configure src_compile src_install pkg_postinst
 
-SRC_URI="mirror://cran/src/contrib/${PN}_${PV}.tar.gz"
 HOMEPAGE="https://cran.r-project.org/package=${PN}"
 
 SLOT="0"
 
 DEPEND="dev-lang/R"
 RDEPEND="${DEPEND}"
-BDEPEND="sys-apps/pkgcore"
+BDEPEND="sys-apps/pkgcore net-misc/wget"
+PROPERTIES="live"
 
 # @FUNCTION: _movelink
 # @INTERNAL
@@ -45,9 +45,16 @@ _movelink() {
 
 # @FUNCTION: R-packages_src_unpack
 # @DESCRIPTION:
-# function to unpack R packages into the right folder
+# function to download and unpack R packages into the right folder
 R-packages_src_unpack() {
-       unpack ${A}
+       einfo "Trying live CRAN"
+       wget "https://cran.r-project.org/src/contrib/${PN}_${PV}.tar.gz" || {
+               einfo "Trying CRAN Archive"
+               wget "https://cran.r-project.org/src/contrib/Archive/${PN}/${PN}_${PV}.tar.gz" || die
+       }
+
+       unpack "${WORKDIR}/${PN}_${PV}.tar.gz"
+
        if [[ -d "${PN//_/.}" ]] && [[ ! -d "${P}" ]]; then
                mv "${PN//_/.}" "${P}" || die
        fi
-- 
2.35.1

--
Robert




* Re: [gentoo-guru] [RFC PATCH 0/3] eclass/R-packages: improvements
  2022-07-06  9:08   ` Robert Greener
@ 2022-07-06 11:21     ` Anna “CyberTailor”
  0 siblings, 0 replies; 4+ messages in thread
From: Anna “CyberTailor” @ 2022-07-06 11:21 UTC
  To: gentoo-guru

On 2022-07-06 10:08, Robert Greener wrote:
> On Wed, 2022-07-06 at 02:24 +0200, Alessandro Barbieri wrote:
> > On Tue, 5 Jul 2022 at 09:16, Robert Greener <me@r0bert.dev> wrote:
> > > To fix this, I propose changing SRC_URI to use either src/contrib
> > > or src/contrib/Archive (where old package versions are kept). The
> > > drawback is that we would only use the main CRAN site: there are
> > > many mirrors, and it would be impractical to search them all
> > > before searching the archive.
> >
> > I think all the changes are fine.
> >
> > The devmanual says:
> >
> > There are two valid cases for using thirdpartymirrors:
> >
> >    1. providing multiple download locations for mirror- or
> >       fetch-restricted packages,
> >    2. dealing with upstreams that distribute their distfiles via a
> >       network of mirrors without a primary download location or a
> >       bouncer service.
> >
> > In any other case, the primary location must be used instead. The
> > distfiles will be mirrored onto Gentoo infrastructure
> > <https://devmanual.gentoo.org/general-concepts/mirrors/index.html>;
> > in that case, the benefit to using third-party mirror list does not
> > outweigh the burden of maintaining it.
> >
> > So I think in this case we can drop the mirror list and go with the
> > hack of listing both the live and archive URLs.
> 
> Anna (CyberTailor, cc'd) made the point off-list that using multiple
> SRC_URI entries as a fallback is undefined in PMS, so it can't be
> relied on.
>
> We could fetch the sources directly in src_unpack (this would probably
> be better factored out into its own function, but I've left it as is
> for now). See the patch at the end.
>
> However, I don't really like this, as it marks the ebuild as live when
> it isn't really a live ebuild. We also lose the Manifest...
>
> The other option would be to require an UPDATE_DATE variable to be set
> in the ebuilds. It could then be used like so:
> SRC_URI="https://cran.microsoft.com/snapshot/${UPDATE_DATE}/src/contrib/${PN}_${PV}.tar.gz"
> where UPDATE_DATE=YYYY-MM-DD.

I like this solution (though it means we need to trust Microsoft). I'd
prefix the variable with "CRAN_", though.
 
> This is a service provided by Microsoft that snapshots CRAN every day at
> midnight UTC. This link will always resolve. I think this might be the
> easiest solution, though it would require updating the ebuilds in dev-R/*.

Just make it conditional for backwards compatibility.

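# Use the dated snapshot when the ebuild sets CRAN_UPD_DATE,
# otherwise keep the current mirror list.
if [[ ${CRAN_UPD_DATE} ]]; then
	SRC_URI="https://cran.microsoft.com/snapshot/${CRAN_UPD_DATE}"
else
	SRC_URI="mirror://cran"
fi
SRC_URI+="/src/contrib/${PN}_${PV}.tar.gz"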

You can make it required after all ebuilds have been edited.
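
Something like this could do the "required" step later. It is only a
sketch: the function name is made up, and inside the eclass you would
call die instead of returning 1.

```bash
#!/usr/bin/env bash
# Hypothetical check, for illustration: fail when the ebuild did not
# set CRAN_UPD_DATE. A plain 'return 1' is used here so that the
# snippet runs outside of Portage.
cran_require_upd_date() {
	if [[ -z ${CRAN_UPD_DATE} ]]; then
		echo "CRAN_UPD_DATE must be set by the ebuild" >&2
		return 1
	fi
}

CRAN_UPD_DATE="2022-07-05"
cran_require_upd_date && echo "ok: ${CRAN_UPD_DATE}"
```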

> What do you think?
> 
> diff --git a/eclass/R-packages.eclass b/eclass/R-packages.eclass
> index aed8cce84..8a464c325 100644
> --- a/eclass/R-packages.eclass
> +++ b/eclass/R-packages.eclass
> @@ -21,14 +21,14 @@ esac
>  
>  EXPORT_FUNCTIONS src_unpack src_prepare src_configure src_compile src_install pkg_postinst
>  
> -SRC_URI="mirror://cran/src/contrib/${PN}_${PV}.tar.gz"
>  HOMEPAGE="https://cran.r-project.org/package=${PN}"
>  
>  SLOT="0"
>  
>  DEPEND="dev-lang/R"
>  RDEPEND="${DEPEND}"
> -BDEPEND="sys-apps/pkgcore"
> +BDEPEND="sys-apps/pkgcore net-misc/wget"

wget is in @system, no need to depend on it.

> +PROPERTIES="live"
>  
>  # @FUNCTION: _movelink
>  # @INTERNAL
> @@ -45,9 +45,16 @@ _movelink() {
>  
>  # @FUNCTION: R-packages_src_unpack
>  # @DESCRIPTION:
> -# function to unpack R packages into the right folder
> +# function to download and unpack R packages into the right folder
>  R-packages_src_unpack() {
> -       unpack ${A}
> +       einfo "Trying live CRAN"
> +       wget "https://cran.r-project.org/src/contrib/${PN}_${PV}.tar.gz" || {
> +               einfo "Trying CRAN Archive"
> +               wget "https://cran.r-project.org/src/contrib/Archive/${PN}/${PN}_${PV}.tar.gz" || die
> +       }
> +
> +       unpack "${WORKDIR}/${PN}_${PV}.tar.gz"
> +
>         if [[ -d "${PN//_/.}" ]] && [[ ! -d "${P}" ]]; then
>                 mv "${PN//_/.}" "${P}" || die
>         fi
> -- 
> 2.35.1
> 
> --
> Robert
> 



