From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from pigeon.gentoo.org ([208.92.234.80] helo=lists.gentoo.org)
	by finch.gentoo.org with esmtp (Exim 4.60)
	(envelope-from <gentoo-dev+bounces-41566-garchives=archives.gentoo.org@lists.gentoo.org>)
	id 1OSt97-0006TZ-Us
	for garchives@archives.gentoo.org; Sun, 27 Jun 2010 14:47:46 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 91093E0B6A;
	Sun, 27 Jun 2010 14:47:36 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
	by pigeon.gentoo.org (Postfix) with ESMTP id 3C671E0943
	for <gentoo-dev@lists.gentoo.org>; Sun, 27 Jun 2010 14:47:02 +0000 (UTC)
Received: from localhost (localhost [127.0.0.1])
	by smtp.gentoo.org (Postfix) with ESMTP id D92C21B4054
	for <gentoo-dev@lists.gentoo.org>; Sun, 27 Jun 2010 14:47:01 +0000 (UTC)
X-Virus-Scanned: amavisd-new at gentoo.org
X-Spam-Score: -0.228
X-Spam-Level: 
X-Spam-Status: No, score=-0.228 required=5.5 tests=[AWL=-3.089,
	BAYES_00=-2.599, RCVD_IN_BL_SPAMCOP_NET=1.96, RCVD_IN_NIX_SPAM=3.5]
Received: from smtp.gentoo.org ([127.0.0.1])
	by localhost (smtp.gentoo.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id K4N0JfwkMlhs for <gentoo-dev@lists.gentoo.org>;
	Sun, 27 Jun 2010 14:46:54 +0000 (UTC)
Received: from lo.gmane.org (lo.gmane.org [80.91.229.12])
	by smtp.gentoo.org (Postfix) with ESMTP id 558811B4102
	for <gentoo-dev@gentoo.org>; Sun, 27 Jun 2010 14:46:50 +0000 (UTC)
Received: from list by lo.gmane.org with local (Exim 4.69)
	(envelope-from <lnx-gentoo-dev@m.gmane.org>)
	id 1OSt8C-0003F1-UK
	for gentoo-dev@gentoo.org; Sun, 27 Jun 2010 16:46:48 +0200
Received: from athedsl-376750.home.otenet.gr ([79.131.23.172])
        by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
        id 1AlnuQ-0007hv-00
        for <gentoo-dev@gentoo.org>; Sun, 27 Jun 2010 16:46:48 +0200
Received: from realnc by athedsl-376750.home.otenet.gr with local (Gmexim 0.1 (Debian))
        id 1AlnuQ-0007hv-00
        for <gentoo-dev@gentoo.org>; Sun, 27 Jun 2010 16:46:48 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: gentoo-dev@lists.gentoo.org
From: Nikos Chantziaras <realnc@arcor.de>
Subject: [gentoo-dev] Re: FYI:  Rules for distro-friendly packages
Date: Sun, 27 Jun 2010 17:46:52 +0300
Organization: Lucas Barks
Message-ID: <i07o8g$ug0$1@dough.gmane.org>
References: <20100625201738.GA4789@nibiru.local> <4C252C8E.8020408@gentoo.org> <20100626193915.GD4789@nibiru.local> <20100626205001.365b51cb@snowcone> <20100626195733.GF4789@nibiru.local> <20100626211254.002784d4@snowcone> <i06ouu$eos$1@dough.gmane.org> <20100627104724.GC23460@nibiru.local> <i07e9t$356$1@dough.gmane.org> <20100627122258.GA8754@boostbox>
Precedence: bulk
List-Post: <mailto:gentoo-dev@lists.gentoo.org>
List-Help: <mailto:gentoo-dev+help@lists.gentoo.org>
List-Unsubscribe: <mailto:gentoo-dev+unsubscribe@lists.gentoo.org>
List-Subscribe: <mailto:gentoo-dev+subscribe@lists.gentoo.org>
List-Id: Gentoo Linux mail <gentoo-dev.gentoo.org>
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
X-Complaints-To: usenet@dough.gmane.org
X-Gmane-NNTP-Posting-Host: athedsl-376750.home.otenet.gr
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.4) Gecko/20100626 Thunderbird/3.1
In-Reply-To: <20100627122258.GA8754@boostbox>
Content-Transfer-Encoding: quoted-printable
X-Archives-Salt: 3b0e1740-ce6e-4ef5-9173-e568bc779f44
X-Archives-Hash: e8177efb178b6a73729cbc3c4b0291b1

On 06/27/2010 03:23 PM, Harald van Dĳk wrote:
> On Sun, Jun 27, 2010 at 02:56:33PM +0300, Nikos Chantziaras wrote:
>> On 06/27/2010 01:47 PM, Enrico Weigelt wrote:
>>> * Nikos Chantziaras<realnc@arcor.de>   schrieb:
>>>
>>>> Did it actually occur to anyone that warnings are not errors?
>>>> You can have them for correct code.  A warning means you might
>>>> want to look at the code to check whether there's some real
>>>> error there.  It doesn't mean the code is broken.
>>>
>>> In my personal experience, most times a warning comes up, the
>>> code *is* broken (but *might* work in most situations).
>>
>> That's the key to it: most times.  Granted, without -Wall (or any
>> other option that tweaks the default warning level) we can be very
>> sure that the warning is the result of a mistake by the developer.
>> But with -Wall, many warnings are totally not interesting ("unused
>> parameter") and some even try to outsmart the programmer even
>> though he/she knows better ("taking address of variable declared
>> register").  In that last example, fixing it would even be wrong
>> when you consider the optimizer and the fuzzy meaning of "register"
>> which the compiler is totally free to ignore.
>
> The compiler is not totally free to ignore the register keyword.
> Both the C and the C++ standards require that the compiler complain
> when taking the address of a register variable. Other compilers will
> issue a hard error for it. Fixing the code to not declare the
> variable as register would be the correct thing to do.

No, it would not be the correct thing to do, because of the following.
(This is part of a discussion between me and someone considerably smarter
than me, who explained the issue in detail.)

The basic issue is that the code takes the address of the variable in
question in expressions passed as parameters to certain function calls.
These function calls all happen to be in-linable functions, and it
happens that in each function, the address operator is always canceled
out by a '*' dereference operator - in other words, we have '*&p', which
the compiler can turn into just plain 'p' when the calls are in-lined,
eliminating the need to actually take the address of 'p'.

A compiler is always free to ignore 'register' declarations *anyway*,
even if enregistration is possible.  Therefore a warning that it's not
possible to obey 'register' is unnecessary, because it's explicit in the
language definition that 'register' is not binding.  It simply is not
possible for an ignored 'register' attribute to cause unexpected
behavior.  Warnings really should only be generated for situations where
it is likely that the programmer expects different behavior than the
compiler will deliver; in the case of an ignored 'register' attribute,
the programmer is *required* to expect that the attribute might be
ignored, so a warning to this effect is superfluous.

Now, I understand why they generate the warning - it's because the
compiler believes that the program code itself makes enregistration
impossible, not because the compiler has chosen for optimization
purposes to ignore the 'register' request.  However, as we'll see
shortly, the program code doesn't truly make enregistration impossible;
it is merely impossible in some interpretations of the code.  Therefore
we really are back to the compiler choosing to ignore the 'register'
request due to its own optimization decisions; the 'register' request is
made impossible far downstream of the actual decisions that the compiler
makes (which have to do with in-line vs out-of-line calls), but it
really is compiler decisions that make it impossible, not the inherent
structure of the code.

When a function is in-lined, the compiler is not required to generate
the same code it would generate for the most general case of the same
function call, as long as the meaning is the same.

For example, suppose we have some code that contains a call to a
function like so:

    a = myFunc(a + 7, 3);

In the general out-of-line case, the compiler must generate some
machine-code instructions like this:

    push #3
    mov [a], d0
    add #7, d0
    push d0
    call #myFunc
    mov d0, [a]

The compiler doesn't have access to the inner workings of myFunc, so it
must generate the appropriate code for the generic interface to an
external function.

Now, suppose the function is defined like so:

   int myFunc(int a, int b) { return a - 6; }

and further suppose that the compiler decides to in-line this function.
In-lining means the compiler will generate the code that implements the
function directly in the caller; there will be no call to an external
linkage point.  This means the compiler can implement the linkage to the
function with a custom one-off interface for this particular invocation
- every in-line invocation can be customized to the exact context where
it appears.  So, for example, if we call myFunc right now and registers
d1 and d2 happen to be available, we can put the parameters in d1 and
d2, and the generated function will refer to those registers for the
parameters rather than having to look in the stack.  Later on, if we
generate a separate call to the same function, but registers d3 and d7
are the ones available, we can use those instead.  Each generated copy
of the function can fit its exact context.

Furthermore, looking at this function and at the arguments passed, we
can see that the formal parameter 'b' has no effect on the function's
results, and the actual parameter '3' passed for 'b' has no side
effects.  Therefore, the compiler is free to completely ignore this
parameter - there's no need to generate any code for it at all, since we
have sufficient knowledge to see that it has no effect on the meaning of
the code.

Further still, we can globally optimize the entire function.  So, we can
see that myFunc(a+7, 3) is going to turn into the expression (a+7-6).
We can fold constants to arrive at (a+1) as the result of the function.
We can therefore generate the entire code for the function's invocation
like so:

    inc [a]
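
The folding described above can be checked at the source level.  The
following is only a sketch: 'folded' and 'formsAgree' are hypothetical
names introduced for illustration, and 'constexpr' is added to myFunc
purely so the equivalence can be verified at compile time.  It says
nothing about which instructions a particular compiler emits.

```cpp
#include <cassert>

// The function from the example above; 'b' is deliberately unused.
// (constexpr is an addition for this sketch, not part of the original.)
constexpr int myFunc(int a, int /*b*/) { return a - 6; }

// Hypothetical hand-inlined form: (a + 7) - 6 folds to a + 1.
constexpr int folded(int a) { return a + 1; }

// Compile-time check that the call and the folded form agree.
static_assert(myFunc(10 + 7, 3) == folded(10), "inlined form must match");

// Run-time check over a small range of inputs.
inline bool formsAgree()
{
    for (int a = -100; a <= 100; ++a) {
        if (myFunc(a + 7, 3) != folded(a)) {
            return false;
        }
    }
    return true;
}

// Convenience wrapper so the check can be called from anywhere.
inline void selfCheckFolding()
{
    assert(formsAgree());
}
```

Calling selfCheckFolding() from any function exercises the run-time
check; the static_assert already fires at compile time if the two forms
ever diverge.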

Okay, now let's look at the &p case.  In the specific examples in
vmrun.cpp, we have a bunch of function invocations like this:

   register const char *p;
   int x = myfunc(&p);

In the most general case, we have to generate code like this:

   lea [p], d0        ; load effective address
   push d0
   call #myfunc
   mov d0, [x]

So, in the most general case of a call with external linkage, we need
'p' to have a main memory address so that we can push it on the stack as
the parameter to this call.  Registers don't have main memory addresses,
so 'p' can't go in a register.
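
To illustrate the out-of-line case, here is a small sketch in which the
callee really does receive the location of 'p'.  Everything named here
('seenAddress', 'opaqueCallee', 'calleeSeesAddressOfP') is hypothetical,
and calling through a volatile function pointer is just one assumed way
of discouraging the compiler from in-lining the call.  The 'register'
keyword is left out because C++17 removed it as a storage class
specifier.

```cpp
#include <cassert>

// Hypothetical probe: remembers the address the callee was handed.
static const void *seenAddress = nullptr;

// A stand-in callee that the caller cannot see through.
char opaqueCallee(const char **p)
{
    seenAddress = p;   // record where the caller's pointer lives
    return **p;        // read the character it points to
}

// Returns true if the callee observed exactly the address of the local 'p'.
bool calleeSeesAddressOfP()
{
    // Calling through a volatile pointer keeps the target opaque here.
    char (*volatile fn)(const char **) = opaqueCallee;

    const char text[] = "hi";
    const char *p = text;
    char c = fn(&p);

    return c == 'h' && seenAddress == static_cast<const void *>(&p);
}
```

With a call like this, 'p' has to live somewhere addressable, which is
the situation the generic machine code above describes.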

However, we know what myfunc() looks like:

   char myfunc(const char **p)
   {
       char c = **p;
       *p += 1;
       return c;
   }

If the compiler chooses to in-line this function, it can globally
optimize its linkage and implementation as we saw earlier.  So, the
compiler can rewrite the code like so:

   register const char *p;
   int x = **(&p);
   *(&p) += 1;

which can be further rewritten to:

   register const char *p;
   int x = *p;
   p += 1;

Now we can generate the machine code for the final optimized form:

   mov [p], a0         ; get the *value* of p into index register 0
   mov.byte [a0+0], d0 ; get the value index register 0 points to
   mov.byte d0, [x]    ; store it in x
   inc [p]             ; inc the value of p

Nowhere in this code do we need a main memory address for p.  This means
the compiler can keep p in a register, say d5:

   mov d5, a0
   mov.byte [a0+0], d0
   mov.byte d0, [x]
   inc d5

And this is indeed exactly what the code that comes out of most
compilers looks like (changed from my abstract machine to 32-bit x86, of
course).
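
The same equivalence can be sketched in C++ source.  In this
illustration 'Step', 'viaCall', 'handInlined' and 'selfCheckInlining'
are hypothetical names, and 'register' is omitted because C++17 removed
it as a storage class specifier.  The only point is that both forms read
the same character and leave 'p' advanced by one, so the second form
never needs '&p'.

```cpp
#include <cassert>

// The helper from the example above, marked inline as a hint only.
inline char myfunc(const char **p)
{
    char c = **p;   // read the character p points to
    *p += 1;        // advance the caller's pointer
    return c;
}

// Result of one step: the character read and the advanced pointer.
struct Step {
    int x;
    const char *next;
};

// Form 1: pass the address of the local pointer to the helper.
inline Step viaCall(const char *start)
{
    const char *p = start;
    int x = myfunc(&p);
    return Step{x, p};
}

// Form 2: the hand-inlined version, where '*&p' has collapsed to 'p'.
inline Step handInlined(const char *start)
{
    const char *p = start;
    int x = *p;
    p += 1;
    return Step{x, p};
}

// Both forms must agree on the value read and on where p ends up.
inline void selfCheckInlining()
{
    const char text[] = "ab";
    Step a = viaCall(text);
    Step b = handInlined(text);
    assert(a.x == 'a' && b.x == 'a');
    assert(a.next == text + 1 && b.next == text + 1);
}
```

Whether a given compiler actually keeps 'p' in a register for either
form depends on its in-lining and optimization decisions; the sketch
only shows that the two forms mean the same thing.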

So: if the compiler chooses to in-line the functions that are called
with '&p' as a parameter, and the compiler performs the available
optimizations on those calls once they're in-lined, then a memory
address for 'p' is never needed.  Thus there is a valid interpretation
of the code where 'register p' can be obeyed.  If the compiler doesn't
choose to in-line the functions or make those optimizations, then the
compiler will be unable to satisfy the 'register p' request and will be
forced to put 'p' in addressable main memory.  But it really is entirely
up to the compiler whether to obey the 'register p' request; the
program's structure does not make the request impossible to satisfy.
Therefore there is no reason for the compiler to warn about this, any
more than there would be if the compiler chose not to obey the 'register
p' simply because it thought it could make more optimal use of the
available registers.  That GCC warns is understandable, in that a
superficial reading of the code would not reveal the optimization
opportunity; but the warning is nonetheless unnecessary, and the
'register' does provide useful optimization hinting.


OK, long read, but the conclusion is that "fixing the code to not
declare the variable as register" is *not* the correct thing to do.
The correct thing to do is to ignore the warning, which is not possible
if warnings are turned into errors.

You also mentioned that "other compilers will issue a hard error for
it."  That sounds rather strange, and I wonder which compilers that
might be; someone should file a bug report against them ;)