public inbox for gentoo-dev@lists.gentoo.org
From: "Kevin F. Quinn" <kevquinn@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] Textrels in packages policy
Date: Wed, 14 Dec 2005 08:44:33 +0100
Message-ID: <20051214084433.0429c1ea@c1358217.cas.dsae.finmeccanica.it>
In-Reply-To: <20051213205903.GA27045@aerie.halcy0n.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 13 Dec 2005 15:59:03 -0500
Mark Loeser <halcy0n@gentoo.org> wrote:

> Basically what I'm looking for here is an easy to understand
> explanation of what textrels are, why they are bad, and why they
> should hold back marking a package stable.  The only information I've
> been able to find states that they could cause a performance hit, but
> this doesn't seem to warrant banning them completely in my eyes.

The seriousness of the textrel issue is different for Hardened Gentoo
and normal Gentoo.  For Hardened Gentoo they cause real problems and
must be fixed to avoid compromising the overall strategy.  For
non-hardened Gentoo it's not so serious.  I'll focus on non-hardened
Gentoo here.

Textrels (short for "text relocations") are function or data pointers
within the code of a shared library that need to be adjusted to take
account of the address at which a shared library is loaded.

Shared libraries can be loaded at varying addresses depending on how the
loader brings shared libraries into a process' memory image.  This can
be affected by factors including the size of the main executable, which
other shared libraries are loaded for the same process and in which
order.

When a shared library is loaded, any absolute address references within
the code have to be changed to reflect the address at which the shared
library is actually loaded, before that code is executed.  Since the
loader can't tell what will be executed and when, it has to resolve all
such references before the process can be permitted to start execution
(this is the performance overhead incurred at load time).  It also means
that the image of the shared library is specific to that process - other
processes can only use the same image if they load the library at the
same address, which is unlikely under normal circumstances.  Thus
libraries that are truly shared between applications consume separate
memory for each application that is using them (which is therefore a
resource overhead).

One workaround for this is to use prelink.  This examines your whole
system, and marks all shared libraries with hints for their load
addresses, which the loader follows if it can.  The addresses are chosen
to minimise the number of collisions.
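
To make the load-time patching concrete, here is a toy sketch in C of
what "adjusting absolute references" amounts to.  It is not how any real
loader is written; the function name, the flat byte-array "image" and
the list of offsets are all made up for illustration.  The point is only
that every listed location in the code image gets written to, which is
why the patched pages can no longer be shared as-is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model (hypothetical, not a real loader API): each entry in
 * reloc_offsets names a pointer-sized word inside the image that holds a
 * link-time address and must have the load base added to it.
 * Returns the number of words actually patched. */
size_t apply_load_base(unsigned char *image, size_t image_len,
                       const size_t *reloc_offsets, size_t reloc_count,
                       uintptr_t load_base)
{
    size_t patched = 0;

    for (size_t i = 0; i < reloc_count; i++) {
        uintptr_t word;
        size_t off = reloc_offsets[i];

        /* Skip entries that would fall outside the image. */
        if (image_len < sizeof word || off > image_len - sizeof word)
            continue;

        memcpy(&word, image + off, sizeof word);
        word += load_base;                       /* the "relocation" */
        memcpy(image + off, &word, sizeof word); /* writes into the image */
        patched++;
    }
    return patched;
}
```

A word holding the link-time address 0x100 becomes 0x4100 when the
image is "loaded" at base 0x4000.  With many such words scattered
through the code, every process ends up with its own modified copy.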

If a shared library has no text relocations, it is properly Position
Independent Code (PIC).  Instead of absolute addresses, the code always
uses relative addresses which work no matter at which address the code
is loaded.  Typically this is achieved via an indirection table (the
Global Offset Table) which holds the address of each symbol and is
filled in by the loader.  The table lives in writable data, so the code
itself never needs patching.  Thus such code can be mapped without write
permissions on the code, and you only ever have one copy of the code in
memory, no matter how many applications are using that code.
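
As a rough illustration of the indirection, the following C sketch
mimics the idea with an ordinary array standing in for the Global Offset
Table.  The names (toy_got, toy_loader_fill_got, increment_counter) are
invented for this example; a real GOT is laid out by the linker and
filled in by the dynamic loader, not by program code.

```c
#include <stddef.h>

/* One slot per symbol the "code" wants to reach. */
enum { SLOT_COUNTER, SLOT_COUNT };

/* Stand-in for the GOT: writable data, separate from the code. */
static void *toy_got[SLOT_COUNT];

static int counter = 41;

/* Stand-in for the loader: the only place an absolute address is stored. */
void toy_loader_fill_got(void)
{
    toy_got[SLOT_COUNTER] = &counter;
}

/* The "library code": it never embeds the address of counter, it only
 * looks it up through the table, so the code needs no patching. */
int increment_counter(void)
{
    int *p = toy_got[SLOT_COUNTER];
    return ++*p;
}
```

The extra load through the table on each access is the indirection
cost; the benefit is that only the small table differs per process.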

The downside of position-independent code is that it does have a
performance impact as currently implemented (i.e. it generates slower
code) due to the indirection introduced.  This is made worse by the
poor use of shared libraries, and by the fact that symbol visibility
support in the GNU tools is so recent (shockingly) that libraries have
far too many symbols unnecessarily in their global offset tables.
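
For what it's worth, here is a small sketch of how the GCC visibility
attribute can be used to keep internal helpers out of a library's
exported interface.  The macro and function names are made up for the
example; only the attribute itself is the real GCC feature.

```c
/* Hypothetical convenience macros around the GCC visibility attribute. */
#define API_EXPORT __attribute__((visibility("default")))
#define API_LOCAL  __attribute__((visibility("hidden")))

/* Internal helper: hidden, so it is not exported from the shared library
 * and calls to it can be bound within the library at link time. */
API_LOCAL int clamp_percent(int v)
{
    if (v < 0)
        return 0;
    if (v > 100)
        return 100;
    return v;
}

/* Public entry point: the only symbol other objects need to see. */
API_EXPORT int scale_percent(int value, int percent)
{
    return value * clamp_percent(percent) / 100;
}
```

Building the whole library with -fvisibility=hidden and marking only the
public entry points as "default" gives the same result with less
annotation.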

Shared libraries (.so's) are used for various things:

1) To share code between multiple applications, the original intention
of shared libraries. Best example is libc which if it were to be
duplicated for each running application would have a significant impact
on memory usage. For such libraries, TEXTRELs are a problem; the
seriousness of the problem depends on how many applications use them.

2) To modularise an application; good examples here are X and
web-browser plugins.  Here, code is loaded if it is used, but not
otherwise.  Loading all the device drivers into X would be wasteful,
for example.  This is a secondary use of shared libraries that arises
from the way shared libraries are implemented, making them easy to use
for this purpose.  TEXTRELs here cause a penalty on load, but 

3) To break an application up into bits.  This is the worst reason to
create a shared library, as it gains nothing but loses much.  It means a
developer abstraction has leaked unnecessarily into the implementation.
These simply shouldn't be shared libraries; instead the code should
have been linked into the executable.


Textrels can be caused by:

1) asm code that isn't PIC

2) incorrect makefiles etc (if a build uses libtool properly, it'll
create shared libraries with no textrels)
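
Whatever the cause, the linker records the fact in the object's dynamic
section (a DT_TEXTREL entry, or the DF_TEXTREL bit in DT_FLAGS), so a
build can be checked after the fact.  The following is a minimal sketch
of such a check using the structures from <elf.h>.  The function name is
made up, and it assumes a Linux system, a file in the machine's native
word size and byte order, and no malformed input.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Use the ELF structures matching the native word size; files of the
 * other class are rejected to keep the sketch short. */
#if UINTPTR_MAX > 0xffffffffu
typedef Elf64_Ehdr Ehdr;
typedef Elf64_Phdr Phdr;
typedef Elf64_Dyn  Dyn;
#define NATIVE_CLASS ELFCLASS64
#else
typedef Elf32_Ehdr Ehdr;
typedef Elf32_Phdr Phdr;
typedef Elf32_Dyn  Dyn;
#define NATIVE_CLASS ELFCLASS32
#endif

/* Returns 1 if the file is marked as containing text relocations, 0 if
 * it is not, and -1 if it cannot be read or is not a native-class ELF. */
int file_has_textrel(const char *path)
{
    FILE *f = fopen(path, "rb");
    int result = -1;
    Ehdr eh;

    if (!f)
        return -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != NATIVE_CLASS)
        goto out;

    result = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Phdr ph;
        long pos = (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize);

        if (fseek(f, pos, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            result = -1;
            goto out;
        }
        if (ph.p_type != PT_DYNAMIC)
            continue;

        /* Walk the dynamic section entries stored in the file. */
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0) {
            result = -1;
            goto out;
        }
        for (unsigned long n = ph.p_filesz / sizeof(Dyn); n > 0; n--) {
            Dyn d;
            if (fread(&d, sizeof d, 1, f) != 1) {
                result = -1;
                goto out;
            }
            if (d.d_tag == DT_NULL)
                break;
            if (d.d_tag == DT_TEXTREL ||
                (d.d_tag == DT_FLAGS && (d.d_un.d_val & DF_TEXTREL))) {
                result = 1;
                goto out;
            }
        }
        break; /* an object has at most one PT_DYNAMIC segment */
    }
out:
    fclose(f);
    return result;
}
```

Calling file_has_textrel() on a freshly built .so gives a quick yes/no
answer; it says nothing about which object file introduced the textrel.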


Frequently these are easy to fix, and are simply oversights upstream.
There's no reason not to fix these.

Occasionally they're a pain to fix, especially non-PIC asm, and if
upstream are non-PIC on purpose (usually because they want to
extract every last cycle from the processor) we end up having to
maintain large patches which amongst other things is risky.


> Getting a clear cut policy on exactly what issues should hold a
> package back from being marked stable is what I'm looking for.
> Issues like textrels, executable stacks, etc is what I'm looking for
> to be defined and explained why we are to always avoid them.

Executable stack is a separate issue, and nothing to do with textrels.
For the moment it's not really relevant outside of Hardened Gentoo,
however it's quite possible that standard Linux may support some kind
of executable stack protection in the future once the venerable x86
architecture gives way to NX-enabled systems.

However, typically executable stack is caused by one or more of:

1) objects that aren't marked as needing or not needing executable
stack.  Typically this occurs with assembler code, but can also occur
with code built with compilers other than our gcc. Here, the linker
decides that the code might need executable stack, so cannot mark it as
not needing executable stack.  This is trivially fixed by adding the
appropriate markings to the asm code.

2) Nested functions.  These are a GCC extension to C, although the
underlying concept exists in other languages.  In GCC they are
implemented using a trick called a 'trampoline', which is a little
piece of code placed on the stack at runtime.  Note that in C, it's
easy to pass around a pointer to a nested function (i.e. a pointer to
the trampoline) outside of scope; outside of scope a call through such
a pointer has unpredictable behaviour and will probably segfault.
Frequently, the nested functions do not really need to be nested and
the code can trivially be adjusted to avoid them.
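
As an example of such an adjustment, here is a sketch of the usual
rewrite: the variables the nested function would have captured are moved
into a small struct and passed explicitly.  All names are invented for
the illustration, and the nested form is shown only in the comment.

```c
#include <stddef.h>

/* Nested-function form (GCC extension; taking the address of visit()
 * is what requires a trampoline on the stack):
 *
 *   int sum_above(const int *v, size_t n, int threshold) {
 *       int total = 0;
 *       void visit(int x) { if (x > threshold) total += x; }
 *       for_each(v, n, visit);
 *       return total;
 *   }
 */

/* The state the nested function captured, made explicit. */
struct sum_ctx {
    int threshold;
    int total;
};

/* Ordinary file-scope callback; no trampoline is needed. */
static void visit(int x, void *opaque)
{
    struct sum_ctx *ctx = opaque;
    if (x > ctx->threshold)
        ctx->total += x;
}

/* Iterator that forwards an opaque context pointer to the callback. */
static void for_each(const int *v, size_t n,
                     void (*fn)(int, void *), void *opaque)
{
    for (size_t i = 0; i < n; i++)
        fn(v[i], opaque);
}

int sum_above(const int *v, size_t n, int threshold)
{
    struct sum_ctx ctx = { threshold, 0 };
    for_each(v, n, visit, &ctx);
    return ctx.total;
}
```

The cost is that the callback interface has to carry the extra context
pointer, which is why this needs upstream's agreement when the
interface is public.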

3) Custom software that builds code dynamically in data local to a
function (i.e. data on the stack).  This is rare, and unlikely to occur
on anything other than x86.  Dynamically-generated code should really be
created on the heap, with the appropriate mprotect() calls made to make
sure that malloc()'ed data is executable (by default it is not
executable in theory, however in practice on x86 it usually is because
x86 has a limited memory protection scheme).
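
A minimal sketch of the off-stack approach follows.  Note it uses
mmap() rather than malloc(), purely so that the region is page-aligned
and can be handed to mprotect() without disturbing neighbouring heap
data; that substitution and the function name are my own choices for the
example, and it assumes a system that permits switching a mapping from
writable to executable.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Copies len bytes of generated code into a fresh private mapping and
 * flips it from read/write to read/execute.  Returns NULL on failure.
 * On architectures with separate instruction caches the caller would
 * also need to flush them before jumping into the buffer. */
void *make_executable_copy(const void *code, size_t len)
{
    void *buf;

    if (len == 0)
        return NULL;

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    memcpy(buf, code, len);

    /* The region is never writable and executable at the same time. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}
```

The returned pointer would then be cast to a suitable function pointer
type and called; the stack itself never needs to be executable.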

Where the changes to eliminate executable stack are trivial, there's no
reason not to make them as they have no performance impact and
upstream maintainers are usually in favour, and in the long term it's
good to avoid executable stack where possible.

>  This
> should be added to existing documentation policy so it is somewhere
> for new devs to know about, and existing devs to have for a reference.

Agreed.

As far as normal Gentoo is concerned, I think policy should be to fix
textrels at least where it is simple to do so and upstream are happy to
have the issues fixed, and we should be most insistent for shared
libraries that are actually used as libraries shared between
applications. For profiles that must refuse textrels, unfixed packages
can be masked. Similarly with executable stack.

- -- 
Kevin F. Quinn
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFDn81l9G2S8dekcG0RAlz4AKDB8V2j9yJUaWmWRpn6vOXbHK34+wCeNLup
NgGh1BZemplmgQW3/2050Kw=
=XjZ8
-----END PGP SIGNATURE-----

-- 
gentoo-dev@gentoo.org mailing list


