* [gentoo-user] CFLAGs for kernel compilation
@ 2015-04-29 11:31 Ralf
2015-04-29 12:33 ` Neil Bothwick
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Ralf @ 2015-04-29 11:31 UTC
To: gentoo-user
Hi,
just a short question: I don't like genkernel, I always compile my
kernel manually using menuconfig.
So the CFLAGs of my make.conf won't get applied.
What is the best way to (persistently) set the CFLAGs for the kernel
compilation?
- I don't like invoking 'CFLAGS="-O2 -march=foo" make'
- I don't want to set CFLAGS as a persistent environment variable.
- I don't want to modify the kernel Makefile
Does it actually make sense to set an optimization level and -march?
Cheers
Ralf
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 11:31 [gentoo-user] CFLAGs for kernel compilation Ralf
@ 2015-04-29 12:33 ` Neil Bothwick
2015-04-29 12:41 ` Emanuele Rusconi
2015-04-30 9:38 ` Andrew Savchenko
2015-04-30 16:26 ` [gentoo-user] " Volker Armin Hemmann
2 siblings, 1 reply; 26+ messages in thread
From: Neil Bothwick @ 2015-04-29 12:33 UTC
To: gentoo-user
On Wed, 29 Apr 2015 13:31:13 +0200, Ralf wrote:
> just a short question: I don't like genkernel, I always compile my
> kernel manually using menuconfig.
> So the CFLAGs of my make.conf won't get applied.
>
> What is the best way to (persistently) set the CFLAGs for the kernel
> compilation?
>
> - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> - I don't want to set CFLAGS as a persistent environment variable.
> - I don't want to modify the kernel Makefile
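Use a script:
#!/bin/sh
# Read the variables from make.conf ("source" is a bashism; POSIX sh uses ".")
. /etc/portage/make.conf
cd /usr/src/linux || exit 1
# Build, then install the modules and the kernel image
make && make modules_install && make install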
--
Neil Bothwick
... if (pot.coffee == EMPTY) { programmer->brain = OFF };
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 12:33 ` Neil Bothwick
@ 2015-04-29 12:41 ` Emanuele Rusconi
2015-04-29 13:18 ` Ralf
2015-04-30 7:11 ` Adam Carter
0 siblings, 2 replies; 26+ messages in thread
From: Emanuele Rusconi @ 2015-04-29 12:41 UTC
To: gentoo-user
> - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> - I don't want to set CFLAGS as a persistent environment variable.
Does the kernel building use the CFLAGS at all?
The arch is set during the configuration step ("Processor type and
features" / "Processor family"),
and there's an "optimize for size" option under "General setup", which I
suppose corresponds to -Os.
-- Emanuele Rusconi
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 12:41 ` Emanuele Rusconi
@ 2015-04-29 13:18 ` Ralf
2015-04-29 13:35 ` [gentoo-user] " Holger Hoffstätte
2015-04-30 16:27 ` [gentoo-user] " Volker Armin Hemmann
2015-04-30 7:11 ` Adam Carter
1 sibling, 2 replies; 26+ messages in thread
From: Ralf @ 2015-04-29 13:18 UTC
To: gentoo-user
Damn, you're absolutely right.
I just tested it using make V=1.
kernel make does override CFLAGs from the outside.
But that's interesting: my processor supports -march=core-avx2 and none
of the Linux kernel's processor family options uses this flag...
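The make V=1 check mentioned above can be turned into a small filter. This is a minimal sketch, not part of the kernel build system: summarize_flags is a hypothetical helper name, and it only reads text from standard input and counts the -march=, -mtune= and -O flags that actually reached the compiler.

```shell
#!/bin/sh
# summarize_flags: hypothetical helper. Reads compiler command lines on stdin
# (for example the output of `make V=1`) and counts every -march=, -mtune=
# and -O flag that appears in them.
summarize_flags() {
    # One token per line, keep only the flags of interest, then count them.
    tr ' \t' '\n\n' |
        grep -E -- '^(-march=|-mtune=|-O)' |
        sort | uniq -c
}
```

Assuming the kernel tree lives in /usr/src/linux, it could be used as: cd /usr/src/linux && make V=1 2>&1 | summarize_flags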
Thx
Ralf
On 04/29/2015 02:41 PM, Emanuele Rusconi wrote:
> Does the kernel building use the CFLAGS at all?
> The arch is set during the configuration step ("Processor type and
> features" / "Processor family"),
> and there's an "optimize for size" option under "General setup", which
> I suppose corresponds to -Os.
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-04-29 13:18 ` Ralf
@ 2015-04-29 13:35 ` Holger Hoffstätte
2015-04-29 23:52 ` Nikos Chantziaras
2015-04-30 16:27 ` [gentoo-user] " Volker Armin Hemmann
1 sibling, 1 reply; 26+ messages in thread
From: Holger Hoffstätte @ 2015-04-29 13:35 UTC
To: gentoo-user
On Wed, 29 Apr 2015 15:18:23 +0200, Ralf wrote:
> Damn, you're absolutely right.
>
> I just tested it using make V=1.
> kernel make does override CFLAGs from the outside.
>
> But that's interesting: my processor supports -march=core-avx2 and none
> of the linux kernel processor family uses this flag...
https://github.com/graysky2/kernel_gcc_patch
-h
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-04-29 13:35 ` [gentoo-user] " Holger Hoffstätte
@ 2015-04-29 23:52 ` Nikos Chantziaras
2015-04-30 0:01 ` Nikos Chantziaras
0 siblings, 1 reply; 26+ messages in thread
From: Nikos Chantziaras @ 2015-04-29 23:52 UTC
To: gentoo-user
On 29/04/15 16:35, Holger Hoffstätte wrote:
> On Wed, 29 Apr 2015 15:18:23 +0200, Ralf wrote:
>
>> Damn, you're absolutely right.
>>
>> I just tested it using make V=1.
>> kernel make does override CFLAGs from the outside.
>>
>> But that's interesting: my processor supports -march=core-avx2 and none
>> of the linux kernel processor family uses this flag...
>
> https://github.com/graysky2/kernel_gcc_patch
This is already applied when enabling the "experimental" USE flag. At
least, that's what the docs claim:
$ equery uses gentoo-sources
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-04-29 23:52 ` Nikos Chantziaras
@ 2015-04-30 0:01 ` Nikos Chantziaras
0 siblings, 0 replies; 26+ messages in thread
From: Nikos Chantziaras @ 2015-04-30 0:01 UTC
To: gentoo-user
On 30/04/15 02:52, Nikos Chantziaras wrote:
> On 29/04/15 16:35, Holger Hoffstätte wrote:
>> On Wed, 29 Apr 2015 15:18:23 +0200, Ralf wrote:
>>
>>> Damn, you're absolutely right.
>>>
>>> I just tested it using make V=1.
>>> kernel make does override CFLAGs from the outside.
>>>
>>> But that's interesting: my processor supports -march=core-avx2 and none
>>> of the linux kernel processor family uses this flag...
>>
>> https://github.com/graysky2/kernel_gcc_patch
>
> This is already applied when enabling the "experimental" USE flag. At
> least, that's what the docs claim:
>
> $ equery uses gentoo-sources
However, I just checked and it's not being applied. So either the
documentation is wrong, or the ebuild/eclass has a bug.
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 12:41 ` Emanuele Rusconi
2015-04-29 13:18 ` Ralf
@ 2015-04-30 7:11 ` Adam Carter
1 sibling, 0 replies; 26+ messages in thread
From: Adam Carter @ 2015-04-30 7:11 UTC
To: gentoo-user@lists.gentoo.org
On Wed, Apr 29, 2015 at 10:41 PM, Emanuele Rusconi <emarsk@gmail.com> wrote:
>
> > - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> > - I don't want to set CFLAGS as a persistent environment variable.
>
> Does the kernel building use the CFLAGS at all?
>
You probably want CFLAGS_KERNEL and maybe CFLAGS_MODULE
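One way to keep such flags persistent without exporting them or editing the Makefile is a small wrapper that reads them from a file and passes them as make command-line variables. The sketch below is an assumption-laden illustration: the path /etc/kernel-cflags and the function name kmake_args are hypothetical, and it only prints the arguments so they can be reviewed before anything is built. CFLAGS_KERNEL and CFLAGS_MODULE are the variables named above.

```shell
#!/bin/sh
# Hypothetical wrapper sketch. KFLAGS_FILE is an assumed path, not a kbuild
# or Portage convention; it holds one line of extra compiler flags.
KFLAGS_FILE=${KFLAGS_FILE:-/etc/kernel-cflags}

# Print the make invocation, one argument per line, so the result can be
# inspected before building anything.
kmake_args() {
    flags=
    if [ -r "$KFLAGS_FILE" ]; then
        # read fails on a file without a trailing newline but still sets flags
        read -r flags < "$KFLAGS_FILE" || true
    fi
    printf '%s\n' make
    if [ -n "$flags" ]; then
        printf '%s\n' "CFLAGS_KERNEL=$flags" "CFLAGS_MODULE=$flags"
    fi
    for arg in "$@"; do
        printf '%s\n' "$arg"
    done
}
```

For example, kmake_args modules_install lists what would be passed. The equivalent manual call would be make CFLAGS_KERNEL="..." CFLAGS_MODULE="..." with the contents of the file.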
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 11:31 [gentoo-user] CFLAGs for kernel compilation Ralf
2015-04-29 12:33 ` Neil Bothwick
@ 2015-04-30 9:38 ` Andrew Savchenko
2015-05-01 5:09 ` [gentoo-user] " Martin Vaeth
2015-04-30 16:26 ` [gentoo-user] " Volker Armin Hemmann
2 siblings, 1 reply; 26+ messages in thread
From: Andrew Savchenko @ 2015-04-30 9:38 UTC
To: gentoo-user; +Cc: Ralf
On Wed, 29 Apr 2015 13:31:13 +0200 Ralf wrote:
> Hi,
>
> just a short question: I don't like genkernel, I always compile my
> kernel manually using menuconfig.
> So the CFLAGs of my make.conf won't get applied.
>
> What is the best way to (persistently) set the CFLAGs for the kernel
> compilation?
>
> - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> - I don't want to set CFLAGS as a persistent environment variable.
> - I don't want to modify the kernel Makefile
>
> Does it actually make sense to set an optimization level and -march?
Short answer: don't even try to use general CFLAGS for a kernel;
you'll badly damage its performance.
Long answer: context switching between integer and floating point
is very expensive, which is why the kernel is integer-only: any
non-integer calculations are implemented using fixed point (integer
numbers from the CPU's point of view). That's also why the kernel
uses its CFLAGS to make sure that no floating point instructions
sneak in; you may see a lot of -mno-${instruction_set} flags when
running make V=1. Furthermore, the kernel needs several memory
alignment flags, which should not be removed either.
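To see which instruction sets a given build disables, the verbose build output (for example from make V=1, as used earlier in the thread) can be filtered for -mno- flags. A minimal sketch, where list_disabled_isa is a hypothetical helper name and nothing here is part of kbuild:

```shell
#!/bin/sh
# list_disabled_isa: hypothetical helper. Reads compiler command lines on
# stdin and prints each distinct -mno-<feature> flag once, sorted.
list_disabled_isa() {
    tr ' \t' '\n\n' | grep -E -- '^-mno-' | sort -u
}
```

Typical use, assuming the kernel tree is the current directory: make V=1 2>&1 | list_disabled_isa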
The proper way to fine-tune CFLAGS for local CPU support is to use
kernel_gcc_patch [1], as was already pointed out in another reply.
This code ensures that proper CPU support is enabled while
keeping all floating point instructions disabled. Just apply the
patch and select the native arch in the CPU arch menu.
[1] https://github.com/graysky2/kernel_gcc_patch
Best regards,
Andrew Savchenko
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 11:31 [gentoo-user] CFLAGs for kernel compilation Ralf
2015-04-29 12:33 ` Neil Bothwick
2015-04-30 9:38 ` Andrew Savchenko
@ 2015-04-30 16:26 ` Volker Armin Hemmann
2015-04-30 17:45 ` Andrew Savchenko
2 siblings, 1 reply; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-04-30 16:26 UTC
To: gentoo-user
Am 29.04.2015 um 13:31 schrieb Ralf:
> Hi,
>
> just a short question: I don't like genkernel, I always compile my
> kernel manually using menuconfig.
> So the CFLAGs of my make.conf won't get applied.
as it should be.
>
> What is the best way to (persistently) set the CFLAGs for the kernel
> compilation?
you don't touch kernel cflags.
That simple. The kernel is too important and the people programming it
know what they are doing. Don't set anything. It is retarded.
>
> - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> - I don't want to set CFLAGS as a persistent environment variable.
> - I don't want to modify the kernel Makefile
>
> Does it actually make sense to set an optimization level and -march?
no
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-29 13:18 ` Ralf
2015-04-29 13:35 ` [gentoo-user] " Holger Hoffstätte
@ 2015-04-30 16:27 ` Volker Armin Hemmann
1 sibling, 0 replies; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-04-30 16:27 UTC
To: gentoo-user
Am 29.04.2015 um 15:18 schrieb Ralf:
> Damn, you're absolutely right.
>
> I just tested it using make V=1.
> kernel make does override CFLAGs from the outside.
>
> But that's interesting: my processor supports -march=core-avx2 and none
> of the linux kernel processor family uses this flag...
that does not matter.
Just say no to stupid flags and unnecessary 'optimizations'.
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-30 16:26 ` [gentoo-user] " Volker Armin Hemmann
@ 2015-04-30 17:45 ` Andrew Savchenko
2015-04-30 18:11 ` Volker Armin Hemmann
0 siblings, 1 reply; 26+ messages in thread
From: Andrew Savchenko @ 2015-04-30 17:45 UTC
To: gentoo-user
Hi,
On Thu, 30 Apr 2015 18:26:22 +0200 Volker Armin Hemmann wrote:
> That simple. The kernel is too important and the people programming it
> know what they are doing. Don't set anything. It is retarded.
> >
> > - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> > - I don't want to set CFLAGS as a persistent environment variable.
> > - I don't want to modify the kernel Makefile
> >
> > Does it actually make sense to set an optimization level and -march?
>
> no
While I completely agree with you that kernel CFLAGS should not be
randomly tampered with, I can't agree that -march itself is useless.
Tests and results are available here:
https://github.com/graysky2/kernel_gcc_patch
Optimization is a very powerful tool if taken with care. Of course
it may lead to a disastrous result if mindlessly used.
Best regards,
Andrew Savchenko
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-30 17:45 ` Andrew Savchenko
@ 2015-04-30 18:11 ` Volker Armin Hemmann
2015-04-30 18:28 ` Andrew Savchenko
0 siblings, 1 reply; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-04-30 18:11 UTC
To: gentoo-user
Am 30.04.2015 um 19:45 schrieb Andrew Savchenko:
> Hi,
>
> On Thu, 30 Apr 2015 18:26:22 +0200 Volker Armin Hemmann wrote:
>> That simple. The kernel is too important and the people programming it
>> know what they are doing. Don't set anything. It is retarded.
>>> - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
>>> - I don't want to set CFLAGS as a persistent environment variable.
>>> - I don't want to modify the kernel Makefile
>>>
>>> Does it actually make sense to set an optimization level and -march?
>> no
> While I completely agree with you that kernel CFLAGS should not be
> randomly tampered with, I can't agree that -march itself is useless.
> Tests and results are available here:
> https://github.com/graysky2/kernel_gcc_patch
>
> Optimization is a very powerful tool if taken with care. Of course
> it may lead to a disastrous result if mindlessly used.
>
> Best regards,
> Andrew Savchenko
if your mail client or browser is miscompiled, it is crashy, but worst
case, a bunch of emails or bookmarks are lost.
If the kernel fucks up, it might write across partition boundaries and
destroy ALL your data. Or writes garbage instead of data.
Don't f* with the kernel.
* Re: [gentoo-user] CFLAGs for kernel compilation
2015-04-30 18:11 ` Volker Armin Hemmann
@ 2015-04-30 18:28 ` Andrew Savchenko
0 siblings, 0 replies; 26+ messages in thread
From: Andrew Savchenko @ 2015-04-30 18:28 UTC
To: gentoo-user
On Thu, 30 Apr 2015 20:11:52 +0200 Volker Armin Hemmann wrote:
> Am 30.04.2015 um 19:45 schrieb Andrew Savchenko:
> > Hi,
> >
> > On Thu, 30 Apr 2015 18:26:22 +0200 Volker Armin Hemmann wrote:
> >> That simple. The kernel is too important and the people programming it
> >> know what they are doing. Don't set anything. It is retarded.
> >>> - I don't like invoking 'CFLAGS="-O2 -march=foo" make'
> >>> - I don't want to set CFLAGS as a persistent environment variable.
> >>> - I don't want to modify the kernel Makefile
> >>>
> >>> Does it actually make sense to set an optimization level and -march?
> >> no
> > While I completely agree with you that kernel CFLAGS should not be
> > randomly tampered with, I can't agree that -march itself is useless.
> > Tests and results are available here:
> > https://github.com/graysky2/kernel_gcc_patch
> >
> > Optimization is a very powerful tool if taken with care. Of course
> > it may lead to a disastrous result if mindlessly used.
> >
> > Best regards,
> > Andrew Savchenko
>
> if your mail client or browser is miscompiled, it is crashy, but worst
> case, a bunch of emails or bookmarks are lost.
>
> If the kernel fucks up, it might write across partition boundaries and
> destroy ALL your data. Or writes garbage instead of data.
>
> Don't f* with the kernel.
That's why we have tests. Follow the link above. As for personal
experience: we have had kernels with this patch and gcc native
optimization in production for several years. The results are fine
(no kernel-related issues).
In order not to crash the kernel, do not add -ffast-math there. You
need to have some understanding before touching such stuff.
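A wrapper script could refuse such flags before they ever reach make. The following is only a sketch: check_kflags is a hypothetical name and the deny list is illustrative rather than complete. -ffast-math comes from the advice above; -Ofast is included because GCC documents it as enabling -ffast-math.

```shell
#!/bin/sh
# check_kflags: hypothetical pre-flight check. Returns non-zero and names the
# offending flag if any argument is on the (illustrative) deny list.
check_kflags() {
    for flag in "$@"; do
        case $flag in
            -ffast-math|-Ofast)
                printf 'refusing unsafe flag: %s\n' "$flag" >&2
                return 1
                ;;
        esac
    done
    return 0
}
```

For example, check_kflags -O2 -march=native succeeds, while check_kflags -O2 -ffast-math fails and reports the flag on standard error.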
Best regards,
Andrew Savchenko
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-04-30 9:38 ` Andrew Savchenko
@ 2015-05-01 5:09 ` Martin Vaeth
2015-05-01 7:44 ` Andrew Savchenko
0 siblings, 1 reply; 26+ messages in thread
From: Martin Vaeth @ 2015-05-01 5:09 UTC
To: gentoo-user
Andrew Savchenko <bircoph@gentoo.org> wrote:
>
> That's why kernel makes sure that no floating point instructions
> sneaks in using CFLAGS, you may see a lot of -mno-${instruction_set}
> flags when running make V=1.
So it should be sufficient that the kernel does not use "float"
or "double", shouldn't it?
I can hardly imagine that otherwise the compiler converts integer
or pointer arithmetic into floating point arithmetics, or is
this really the case for certain flags? If yes, why should these
flags *ever* be useful?
I mean: The context switching happens for non-kernel code as well,
doesn't it?
* Re: [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-01 5:09 ` [gentoo-user] " Martin Vaeth
@ 2015-05-01 7:44 ` Andrew Savchenko
2015-05-01 17:21 ` James
2015-05-02 5:04 ` Nikos Chantziaras
0 siblings, 2 replies; 26+ messages in thread
From: Andrew Savchenko @ 2015-05-01 7:44 UTC
To: gentoo-user
On Fri, 1 May 2015 05:09:51 +0000 (UTC) Martin Vaeth wrote:
> Andrew Savchenko <bircoph@gentoo.org> wrote:
> >
> > That's why kernel makes sure that no floating point instructions
> > sneaks in using CFLAGS, you may see a lot of -mno-${instruction_set}
> > flags when running make V=1.
>
> So it should be sufficient that the kernel does not use "float"
> or "double", shouldn't it?
No. Optimizer paths can be far from obvious; e.g. I would not be
surprised if, under some conditions, the vectorizer used floating
point instructions for integer code.
> I can hardly imagine that otherwise the compiler converts integer
> or pointer arithmetic into floating point arithmetics, or is
> this really the case for certain flags? If yes, why should these
> flags *ever* be useful?
> I mean: The context switching happens for non-kernel code as well,
> doesn't it?
Yes, context switching happens for all code and has its costs. But
for userspace code context switching happens for many other
reasons, e.g. on each syscall (userspace <-> kernelspace switching).
Also, some user applications may need high precision, or the context
switching pays off due to massively parallel data processing, e.g.
SIMD instructions in scientific or multimedia applications. But
outside the special conditions mentioned above, fixed point is still
faster in userspace; some ffmpeg codecs have both fixed and floating
point implementations, so you can compare them. Programming in fixed
point is much harder, so most people avoid it unless they have a very
good reason to use it. And don't forget that the kernel is
performance-critical, unlike most userspace applications.
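To make "fixed point" concrete: a fractional value is stored as an integer scaled by a constant, and all arithmetic stays in integer registers. Shell arithmetic is integer-only, so it serves as a convenient stand-in here. This is an illustrative sketch of the Q16.16 format (16 integer bits, 16 fraction bits), not code taken from the kernel; all names are made up for the example and 64-bit shell arithmetic is assumed.

```shell
#!/bin/sh
# Q16.16 fixed point: a value x is stored as the integer x * 65536.
FX_ONE=65536

# Multiply two Q16.16 values. The raw product carries the scale factor
# twice, so one factor is divided out again.
fx_mul() {
    echo $(( ($1 * $2) / FX_ONE ))
}

# Print a non-negative Q16.16 value with three decimals (truncated).
fx_show() {
    printf '%d.%03d\n' $(( $1 / FX_ONE )) $(( ($1 % FX_ONE) * 1000 / FX_ONE ))
}

a=$(( 3 * FX_ONE / 2 ))   # 1.5
b=$(( 9 * FX_ONE / 4 ))   # 2.25
fx_show "$(fx_mul "$a" "$b")"   # prints 3.375
```

No floating point operation is involved at any step; the price is that the programmer has to track the scale factor and the available range by hand.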
Best regards,
Andrew Savchenko
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-01 7:44 ` Andrew Savchenko
@ 2015-05-01 17:21 ` James
2015-05-02 5:04 ` Nikos Chantziaras
1 sibling, 0 replies; 26+ messages in thread
From: James @ 2015-05-01 17:21 UTC
To: gentoo-user
Andrew Savchenko <bircoph <at> gentoo.org> writes:
> > I can hardly imagine that otherwise the compiler converts integer
> > or pointer arithmetic into floating point arithmetics, or is
> > this really the case for certain flags? If yes, why should these
> > flags *ever* be useful?
> > I mean: The context switching happens for non-kernel code as well,
> > doesn't it?
First off, reading this thread, I cannot really tell what the intended use
of the "highly tuned" kernels is to be. For almost all workstation
and server purposes, what has been previously stated is mostly correct. If
you really want to test these waters, do it on a system that is not in your
critical path. If you tune and experiment, you are going to bork your box.
Water coolers on the CPUs are a good idea when taxing the FPU and other SIMD
hardware on the CPU, imho. sys-power/powertop is your friend.
> Yes, context switching happens for all code and have its costs. But
> for userspace code context switching happens for many other
> reasons, e.g. on each syscall (userspace <-> kernelspace switching).
> Also some user applications may need high precision or context
> switching pays off due to mass parallel data processing, e.g. SIMD
> instructions in scientific or multimedia applications.
Hear, hear, I knew we had an LU expert in the crowd. Most scientific
or highly parallelized number crunching does benefit from experimenting
with settings and *profiling* the results; trace-cmd and kernelshark
are in portage and are very useful for analysis of hardware timings,
context switching and a myriad of other issues. Be careful: you can
sink a lifetime into such efforts with little to show for them.
The best thing is to read up on specific optimizations for specific
codes as vetted by the specific hardware in your processors. Tuning for
one need will most likely retard other types of performance; that is
why, before you delve into these waters, you really need to learn about
profiling both target (application) and kernel codes, *BEFORE* randomly
tuning the advanced numerical intricacies of your hardware resources.
Start with memory and cgroups before worrying about the hardware inside
your processors (CPU and GPU).
> But unless special conditions mentioned above, fixed point is still
> faster in userspace, some ffmpeg codecs have both fixed and floating
> point implementations, you may compare them. Programming in fixed point
> is much harder, so most people avoid it unless they have a very
> good reason to use it. And don't forget that kernel is
> performance critical unlike most of userspace applications.
Video (MPEG, H.264 and such) massively benefits from the enhanced matrix
abilities of the SIMD hardware in your video card's GPU. These bare metal
resources are being integrated into gcc-5.1+ for experimentation. But
it is likely going to take a year or so before ordinary users of linux
resources see these performance gains. I would encourage you
to experiment, but *never on your main workstation*. I'm purchasing
a new nvidia video card just to benchmark and tune some numerically
intensive codes that use sci-libs/magma. Although this will be my
currently fastest video card, it will sit in a box that is not used
for visual eye candy (gaming, anime, ray_traces etc).
The mesos clustering codes (shark, storm, tachyon etc) and MP(I) codes are
going to fundamentally change the numerical processing landscape for even
small linux clusters. An excellent bit of code to get your feet wet is
sys-apps/hwloc. More than the FPU, MP(I) {sys-cluster/openmpi} and other
clustering codes are going to allow you to use the DDR(4|5) memory found in
many video cards (GPU) via *RDMA*. The world is rapidly changing and many
old "fixed point integer" folks do not see the tsunami that is just
offshore. Many computationally expensive codes have development projects to
move to an "in-memory" [1] environment where HD resources are avoided as
much as possible in a cluster environment. Clustered resources "tuned" for
such things as a video rendering farm will have very differently optimized
kernels than your KDE(G*) workstation or web server. media-gfx/blender is
another excellent collection of codes that benefits from all sorts of tuning
on a special-purpose system.
So do you really have a valid need to tune the FPU performance due to a
numerically demanding application? YMMV
> Best regards,
> Andrew Savchenko
hth,
James
[1] https://amplab.cs.berkeley.edu/
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-01 7:44 ` Andrew Savchenko
2015-05-01 17:21 ` James
@ 2015-05-02 5:04 ` Nikos Chantziaras
2015-05-02 11:19 ` Volker Armin Hemmann
1 sibling, 1 reply; 26+ messages in thread
From: Nikos Chantziaras @ 2015-05-02 5:04 UTC
To: gentoo-user
On 01/05/15 10:44, Andrew Savchenko wrote:
> On Fri, 1 May 2015 05:09:51 +0000 (UTC) Martin Vaeth wrote:
>> Andrew Savchenko <bircoph@gentoo.org> wrote:
>>>
>>> That's why kernel makes sure that no floating point instructions
>>> sneaks in using CFLAGS, you may see a lot of -mno-${instruction_set}
>>> flags when running make V=1.
>>
>> So it should be sufficient that the kernel does not use "float"
>> or "double", shouldn't it?
>
> No. Optimizer paths may be very unobvious, i.e. I'll not be
> surprised if under some conditions vectorizer may use float
> instructions for int code.
The kernel uses -O2 and several -march variants (e.g. -march=core2).
Several other options are used to prevent GCC from generating unsuitable
code.
Specifying another -march variant does not affect the optimizer though.
It only affects the code generator. If you don't modify the other CFLAGS
and only change -march, you will not get FP instructions unless you use
FP in the code.
Also, I'd be very interested to see *any* optimization that would
somehow transform integer code to FP code (note that SIMD is not FP and
is perfectly fine in the kernel.) In fact, optimizers tend to transform
FP into SIMD, at least on x86 (and other architectures that have fast
SIMD instructions.) If I inspect the generated assembly from GCC or
Clang, I cannot find FP anywhere, even for code using "float" and
"double" operations. They get converted to SIMD on modern CPUs (unless
you specify a compiler flag that tells it to use the FPU, for example if
you need 80-bit extended precision, which is supported by the x86 FPU.)
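The assembly inspection described above can be roughly automated by counting mnemonics in the compiler's assembly output. This is a crude sketch under stated assumptions: count_x87 and count_sse_scalar are hypothetical helper names, the input is expected in GNU assembler (AT&T) syntax, and the mnemonic lists are small illustrative subsets rather than complete opcode tables.

```shell
#!/bin/sh
# count_x87: print how many lines on stdin use one of a few common x87
# mnemonics (with an optional one- or two-letter size suffix).
count_x87() {
    grep -E -c '^[[:space:]]+(fld|fst|fstp|fild|fistp|fadd|faddp|fsub|fsubp|fmul|fmulp|fdiv|fdivp)[a-z]{0,2}[[:space:]]' || true
}

# count_sse_scalar: print how many lines use scalar SSE/AVX arithmetic
# such as addss, mulsd or vaddsd.
count_sse_scalar() {
    grep -E -c '^[[:space:]]+v?(add|sub|mul|div)s[sd][[:space:]]' || true
}
```

Given a hypothetical source file example.c that uses float or double, gcc -O2 -S -o - example.c | count_x87 and the same pipeline with count_sse_scalar show which unit the compiler targeted for that file.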
* Re: [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 5:04 ` Nikos Chantziaras
@ 2015-05-02 11:19 ` Volker Armin Hemmann
2015-05-02 11:25 ` Nikos Chantziaras
0 siblings, 1 reply; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-05-02 11:19 UTC
To: gentoo-user
Am 02.05.2015 um 07:04 schrieb Nikos Chantziaras:
> On 01/05/15 10:44, Andrew Savchenko wrote:
>> On Fri, 1 May 2015 05:09:51 +0000 (UTC) Martin Vaeth wrote:
>>> Andrew Savchenko <bircoph@gentoo.org> wrote:
>>>>
>>>> That's why kernel makes sure that no floating point instructions
>>>> sneaks in using CFLAGS, you may see a lot of -mno-${instruction_set}
>>>> flags when running make V=1.
>>>
>>> So it should be sufficient that the kernel does not use "float"
>>> or "double", shouldn't it?
>>
>> No. Optimizer paths may be very unobvious, i.e. I'll not be
>> surprised if under some conditions vectorizer may use float
>> instructions for int code.
>
> The kernel uses -O2 and several -march variants (e.g. -march=core2).
> Several other options are used to prevent GCC from generating
> unsuitable code.
>
> Specifying another -march variant does not affect the optimizer
> though. It only affects the code generator. If you don't modify the
> other CFLAGS and only change -march, you will not get FP instructions
> unless you use FP in the code.
>
> Also, I'd be very interested to see *any* optimization that would
> somehow transform integer code to FP code (note that SIMD is not FP
> and is perfectly fine in the kernel.) In fact, optimizers tend to
> transform FP into SIMD, at least on x86 (and other architectures that
> have fast SIMD instructions.) If I inspect the generated assembly from
> GCC or Clang, I cannot find FP anywhere, even for code using "float"
> and "double" operations. They get converted to SIMD on modern CPUs
> (unless you specify a compiler flag that tells it to use the FPU, for
> example if you need 80-bit extended precision, which is supported by
> the x86 FPU.)
>
>
>
http://www.agner.org/optimize/calling_conventions.pdf
Device drivers under Linux
Linux systems use lazy saving of floating point registers and vector
registers. This means that these registers are not saved and restored on
every task switch. Instead they are saved/restored on the first access
after a task switch. This method saves time in case no more than one
thread uses these registers. The lazy saving scheme is not supported in
kernel mode. Any device driver that attempts to use these registers
improperly will cause an exception that will probably make the system
crash. A device driver that needs to use vector registers must first save
these registers by calling the function kernel_fpu_begin() and restore
the registers by calling kernel_fpu_end() before returning or sleeping.
These functions also prevent pre-emptive interruption of the device
driver which could otherwise mess up the registers. kernel_fpu_begin()
saves all floating point registers and vector registers if available.
There is no red zone in 64-bit Linux kernel mode.
The programmer should be aware of these restrictions if calling any
other library than the system kernel libraries from a device driver.
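To see where existing kernel code brackets FPU or vector use with the two functions named in the excerpt, the source tree can simply be searched. A minimal sketch: fpu_sections is a hypothetical helper name, and the search is purely textual, so it shows where the guards appear but proves nothing about whether they are paired correctly on every code path.

```shell
#!/bin/sh
# fpu_sections: hypothetical helper. Prints file:line:text for every call to
# kernel_fpu_begin() or kernel_fpu_end() in the files given as arguments.
# /dev/null is appended so grep always prefixes matches with the file name.
fpu_sections() {
    grep -n -E 'kernel_fpu_(begin|end)[[:space:]]*\(' "$@" /dev/null
}
```

For example, fpu_sections /usr/src/linux/arch/x86/crypto/*.c would list the guarded regions in that directory, assuming the kernel sources are installed there.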
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 11:19 ` Volker Armin Hemmann
@ 2015-05-02 11:25 ` Nikos Chantziaras
2015-05-02 11:37 ` Volker Armin Hemmann
0 siblings, 1 reply; 26+ messages in thread
From: Nikos Chantziaras @ 2015-05-02 11:25 UTC
To: gentoo-user
On 02/05/15 14:19, Volker Armin Hemmann wrote:
> Am 02.05.2015 um 07:04 schrieb Nikos Chantziaras:
>> On 01/05/15 10:44, Andrew Savchenko wrote:
>>> On Fri, 1 May 2015 05:09:51 +0000 (UTC) Martin Vaeth wrote:
>>>> Andrew Savchenko <bircoph@gentoo.org> wrote:
>>>>>
>>>>> That's why kernel makes sure that no floating point instructions
>>>>> sneaks in using CFLAGS, you may see a lot of -mno-${instruction_set}
>>>>> flags when running make V=1.
>>>>
>>>> So it should be sufficient that the kernel does not use "float"
>>>> or "double", shouldn't it?
>>>
>>> No. Optimizer paths may be very unobvious, i.e. I'll not be
>>> surprised if under some conditions vectorizer may use float
>>> instructions for int code.
>>
>> The kernel uses -O2 and several -march variants (e.g. -march=core2).
>> Several other options are used to prevent GCC from generating
>> unsuitable code.
>>
>> Specifying another -march variant does not affect the optimizer
>> though. It only affects the code generator. If you don't modify the
>> other CFLAGS and only change -march, you will not get FP instructions
>> unless you use FP in the code.
>>
>> Also, I'd be very interested to see *any* optimization that would
>> somehow transform integer code to FP code (note that SIMD is not FP
>> and is perfectly fine in the kernel.) In fact, optimizers tend to
>> transform FP into SIMD, at least on x86 (and other architectures that
>> have fast SIMD instructions.) If I inspect the generated assembly from
>> GCC or Clang, I cannot find FP anywhere, even for code using "float"
>> and "double" operations. They get converted to SIMD on modern CPUs
>> (unless you specify a compiler flag that tells it to use the FPU, for
>> example if you need 80-bit extended precision, which is supported by
>> the x86 FPU.)
>>
>>
>>
>
> http://www.agner.org/optimize/calling_conventions.pdf
Not sure what you're trying to say.
* Re: [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 11:25 ` Nikos Chantziaras
@ 2015-05-02 11:37 ` Volker Armin Hemmann
2015-05-02 12:06 ` Nikos Chantziaras
0 siblings, 1 reply; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-05-02 11:37 UTC
To: gentoo-user
Am 02.05.2015 um 13:25 schrieb Nikos Chantziaras:
> On 02/05/15 14:19, Volker Armin Hemmann wrote:
>> Am 02.05.2015 um 07:04 schrieb Nikos Chantziaras:
>>> On 01/05/15 10:44, Andrew Savchenko wrote:
>>>> On Fri, 1 May 2015 05:09:51 +0000 (UTC) Martin Vaeth wrote:
>>>>> Andrew Savchenko <bircoph@gentoo.org> wrote:
>>>>>>
>>>>>> That's why kernel makes sure that no floating point instructions
>>>>>> sneaks in using CFLAGS, you may see a lot of -mno-${instruction_set}
>>>>>> flags when running make -V.
>>>>>
>>>>> So it should be sufficient that the kernel does not use "float"
>>>>> or "double", shouldn't it?
>>>>
>>>> No. Optimizer paths may be very unobvious, i.e. I'll not be
>>>> surprised if under some conditions vectorizer may use float
>>>> instructions for int code.
>>>
>>> The kernel uses -O2 and several -march variants (e.g. -march=core2).
>>> Several other options are used to prevent GCC from generating
>>> unsuitable code.
>>>
>>> Specifying another -march variant does not affect the optimizer
>>> though. It only affects the code generator. If you don't modify the
>>> other CFLAGS and only change -march, you will not get FP instructions
>>> unless you use FP in the code.
>>>
>>> Also, I'd be very interested to see *any* optimization that would
>>> somehow transform integer code to FP code (note that SIMD is not FP
>>> and is perfectly fine in the kernel.) In fact, optimizers tend to
>>> transform FP into SIMD, at least on x86 (and other architectures that
>>> have fast SIMD instructions.) If I inspect the generated assembly from
>>> GCC or Clang, I cannot find FP anywhere, even for code using "float"
>>> and "double" operations. They get converted to SIMD on modern CPUs
>>> (unless you specify a compiler flag that tells it to use the FPU, for
>>> example if you need 80-bit extended precision, which is supported by
>>> the x86 FPU.)
>>>
>>>
>>>
>>
>> http://www.agner.org/optimize/calling_conventions.pdf
>
> Not sure what you're trying to say.
>
>
>
that SIMD is not safe in the kernel if not carefully guarded.
Really people, just don't fuck around with the cflags.
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 11:37 ` Volker Armin Hemmann
@ 2015-05-02 12:06 ` Nikos Chantziaras
2015-05-02 12:10 ` Volker Armin Hemmann
0 siblings, 1 reply; 26+ messages in thread
From: Nikos Chantziaras @ 2015-05-02 12:06 UTC (permalink / raw
To: gentoo-user
On 02/05/15 14:37, Volker Armin Hemmann wrote:
> Am 02.05.2015 um 13:25 schrieb Nikos Chantziaras:
>>>>
>>>> The kernel uses -O2 and several -march variants (e.g. -march=core2).
>>>> Several other options are used to prevent GCC from generating
>>>> unsuitable code.
>>>>
>>>> Specifying another -march variant does not affect the optimizer
>>>> though. It only affects the code generator. If you don't modify the
>>>> other CFLAGS and only change -march, you will not get FP instructions
>>>> unless you use FP in the code.
>>>
>>> http://www.agner.org/optimize/calling_conventions.pdf
>>
>> Not sure what you're trying to say.
>>
>
> that SIMD is not safe in the kernel if not carefully guarded.
>
> Really people, just don't fuck around with the cflags.
I still fail to see the relevance. Unless you mean using a different -O
level. In that case, yes. You shouldn't. But I was talking about -march.
* Re: [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 12:06 ` Nikos Chantziaras
@ 2015-05-02 12:10 ` Volker Armin Hemmann
2015-05-02 12:38 ` Nikos Chantziaras
0 siblings, 1 reply; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-05-02 12:10 UTC (permalink / raw
To: gentoo-user
Am 02.05.2015 um 14:06 schrieb Nikos Chantziaras:
> On 02/05/15 14:37, Volker Armin Hemmann wrote:
>> Am 02.05.2015 um 13:25 schrieb Nikos Chantziaras:
>>>>>
>>>>> The kernel uses -O2 and several -march variants (e.g. -march=core2).
>>>>> Several other options are used to prevent GCC from generating
>>>>> unsuitable code.
>>>>>
>>>>> Specifying another -march variant does not affect the optimizer
>>>>> though. It only affects the code generator. If you don't modify the
>>>>> other CFLAGS and only change -march, you will not get FP instructions
>>>>> unless you use FP in the code.
>>>>
>>>> http://www.agner.org/optimize/calling_conventions.pdf
>>>
>>> Not sure what you're trying to say.
>>>
>>
>> that SIMD is not safe in the kernel if not carefully guarded.
>>
>> Really people, just don't fuck around with the cflags.
>
> I still fail to see the relevance. Unless you mean using a different
> -O level. In that case, yes. You shouldn't. But I was talking about
> -march.
>
you said this
>
> (note that SIMD is not FP and is perfectly fine in the kernel.)
and I have shown you that you are wrong.
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 12:10 ` Volker Armin Hemmann
@ 2015-05-02 12:38 ` Nikos Chantziaras
2015-05-02 17:04 ` Volker Armin Hemmann
0 siblings, 1 reply; 26+ messages in thread
From: Nikos Chantziaras @ 2015-05-02 12:38 UTC (permalink / raw
To: gentoo-user
On 02/05/15 15:10, Volker Armin Hemmann wrote:
> Am 02.05.2015 um 14:06 schrieb Nikos Chantziaras:
>> On 02/05/15 14:37, Volker Armin Hemmann wrote:
>>> Am 02.05.2015 um 13:25 schrieb Nikos Chantziaras:
>>>>>>
>>>>>> The kernel uses -O2 and several -march variants (e.g. -march=core2).
>>>>>> Several other options are used to prevent GCC from generating
>>>>>> unsuitable code.
>>>>>>
>>>>>> Specifying another -march variant does not affect the optimizer
>>>>>> though. It only affects the code generator. If you don't modify the
>>>>>> other CFLAGS and only change -march, you will not get FP instructions
>>>>>> unless you use FP in the code.
>>>>>
>>>>> http://www.agner.org/optimize/calling_conventions.pdf
>>>>
>>>> Not sure what you're trying to say.
>>>>
>>>
>>> that SIMD is not safe in the kernel if not carefully guarded.
>>>
>>> Really people, just don't fuck around with the cflags.
>>
>> I still fail to see the relevance. Unless you mean using a different
>> -O level. In that case, yes. You shouldn't. But I was talking about
>> -march.
>>
>
> you said this
>
>>
>> (note that SIMD is not FP and is perfectly fine in the kernel.)
>
> and I have shown you that you are wrong.
Not sure why you think that. The kernel crypto routines are full of SIMD
code (like SSE and AVX.) Automatic vectorization wouldn't work. But
-march is not going to introduce that.
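
For anyone who wants to check this on their own tree, here is a minimal
sketch that lists the -mno-* flags an arch Makefile passes to the compiler.
The function name list_mno_flags is made up for illustration, and the
default path assumes an x86 tree under /usr/src/linux; `grep -o` is a
GNU/BSD extension, not strict POSIX.

```sh
#!/bin/sh
# Hypothetical helper: print the distinct -mno-* flags found in a
# kernel arch Makefile, one per line.
# Usage: list_mno_flags [path/to/Makefile]
list_mno_flags() {
    # Assumed default: x86 Makefile of the tree in /usr/src/linux.
    makefile="${1:-/usr/src/linux/arch/x86/Makefile}"
    # -o prints only the matching text; -e keeps the leading dash of
    # the pattern from being parsed as a grep option.
    grep -o -e '-mno-[A-Za-z0-9_.-]*' "$makefile" | LC_ALL=C sort -u
}
```

Running `list_mno_flags` with no argument reads the default Makefile; pass
another path to inspect a different architecture or tree. Overriding -march
does not remove any of the flags this prints.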
* Re: [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 12:38 ` Nikos Chantziaras
@ 2015-05-02 17:04 ` Volker Armin Hemmann
2015-05-02 18:34 ` James
0 siblings, 1 reply; 26+ messages in thread
From: Volker Armin Hemmann @ 2015-05-02 17:04 UTC (permalink / raw
To: gentoo-user
Am 02.05.2015 um 14:38 schrieb Nikos Chantziaras:
> On 02/05/15 15:10, Volker Armin Hemmann wrote:
>> Am 02.05.2015 um 14:06 schrieb Nikos Chantziaras:
>>> On 02/05/15 14:37, Volker Armin Hemmann wrote:
>>>> Am 02.05.2015 um 13:25 schrieb Nikos Chantziaras:
>>>>>>>
>>>>>>> The kernel uses -O2 and several -march variants (e.g.
>>>>>>> -march=core2).
>>>>>>> Several other options are used to prevent GCC from generating
>>>>>>> unsuitable code.
>>>>>>>
>>>>>>> Specifying another -march variant does not affect the optimizer
>>>>>>> though. It only affects the code generator. If you don't modify the
>>>>>>> other CFLAGS and only change -march, you will not get FP
>>>>>>> instructions
>>>>>>> unless you use FP in the code.
>>>>>>
>>>>>> http://www.agner.org/optimize/calling_conventions.pdf
>>>>>
>>>>> Not sure what you're trying to say.
>>>>>
>>>>
>>>> that SIMD is not safe in the kernel if not carefully guarded.
>>>>
>>>> Really people, just don't fuck around with the cflags.
>>>
>>> I still fail to see the relevance. Unless you mean using a different
>>> -O level. In that case, yes. You shouldn't. But I was talking about
>>> -march.
>>>
>>
>> you said this
>>
>>>
>>> (note that SIMD is not FP and is perfectly fine in the kernel.)
>>
>> and I have shown you that you are wrong.
>
> Not sure why you think that. The kernel crypto routines are full of
> SIMD code (like SSE and AVX.) Automatic vectorization wouldn't work.
> But -march is not going to introduce that
and it is never used in interrupt context, and it is carefully guarded.
You act like 'oh, you can use SIMD instructions without any consideration'
and that is just not true.
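
To see that guarding in a real tree: on x86 the kernel brackets its
hand-written SIMD code with kernel_fpu_begin() and kernel_fpu_end(), which
save and restore the FPU/SIMD register state. A minimal sketch for listing
the files that do so follows. The function name list_fpu_guarded is made up
for illustration, and the default directory assumes an x86 tree under
/usr/src/linux.

```sh
#!/bin/sh
# Hypothetical helper: print the files under a directory that call
# kernel_fpu_begin(), sorted by path.
# Usage: list_fpu_guarded [dir]
list_fpu_guarded() {
    # Assumed default: the x86 crypto glue code of the tree in /usr/src/linux.
    dir="${1:-/usr/src/linux/arch/x86/crypto}"
    # -r recurses into the directory, -l prints only the names of matching files.
    grep -rl 'kernel_fpu_begin' "$dir" | LC_ALL=C sort
}
```

Opening any of the listed files shows the SSE/AVX routines being called
only between the begin/end pair.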
* [gentoo-user] Re: CFLAGs for kernel compilation
2015-05-02 17:04 ` Volker Armin Hemmann
@ 2015-05-02 18:34 ` James
0 siblings, 0 replies; 26+ messages in thread
From: James @ 2015-05-02 18:34 UTC (permalink / raw
To: gentoo-user
Volker Armin Hemmann <volkerarmin <at> googlemail.com> writes:
> >>>>>> http://www.agner.org/optimize/calling_conventions.pdf
> >>>>>
> >>>>> Not sure what you're trying to say.
> >>>>>
> >>>>
> >>>> that SIMD is not safe in the kernel if not carefully guarded.
> >>>>
> >>>> Really people, just don't fuck around with the cflags.
> >>>
> >>> I still fail to see the relevance. Unless you mean using a different
> >>> -O level. In that case, yes. You shouldn't. But I was talking about
> >>> -march.
> >>>
> >>
> >> you said this
> >>
> >>>
> >>> (note that SIMD is not FP and is perfectly fine in the kernel.)
> >>
> >> and I have shown you that you are wrong.
> >
> > Not sure why you think that. The kernel crypto routines are full of
> > SIMD code (like SSE and AVX.) Automatic vectorization wouldn't work.
> > But -march is not going to introduce that
>
> and never used in interrupt context and carefully guarded. You act like
> 'oh, you can use simd instructions without any consideration' and that
> is just not true.
Volker,
Historically, you are correct. Looking forward, GCC-5.x will (can?) change
this as the SIMD and other hardware, including (DDR_5) memory, all become
available for (compiler) usage. For the longest time, we, the FOSS
communities, have at best been given access to low-level APIs for access to
some of these hardware resources. All processor architectures are at war.
Intel (the bastards) have had FPGAs and tools to reconfigure the amount and
types of hardware in some of their processors for quite some time.
The Arm64 cores have SIMD-centric (GPU, if you like) cores on the same SoC
as the licensed 64-bit Arm CPU cores. The new GPU has already been
integrated into the processor cores (same substrate), just as the i387 FPU
was some decades ago. So Arm is providing 'bare metal' access to various
customers and compilers. Since there are thousands of vendors building
custom arm64 SoCs, there is no way for Arm to constrict access like Intel,
Nvidia and AMD have historically done. Game_set_match.
Even though those GPU cores available via arm64 are very weak compared to
Nvidia's and AMD's, bare metal access to those (GPU) resources is far
superior to what Intel (dragging their feet), Nvidia or AMD are offering.
Just look at how AMD's Mantle has stalled for the FOSS communities. AMD, via
competition from a myriad of Arm SoC vendors, is being forced to roll out
64-bit Arm server chips just to stay relevant. Both of you guys are looking
at this issue through historically color-coded sunglasses. Change is here;
get on board with helping the masses help themselves to the feeding (coding)
frenzy.
What a pair of really smart guys like you two should be doing is setting up
a Gentoo wiki page listing and demonstrating for others how to "profile"
low-level code, both kernel and system level, so these other Gentoo folks
*can learn* by example what you are saying, running tools such as
kernelshark and other performance/profiling types of analysis. Providing
seamless and generic access to the GPU resources will go a long way towards
revitalizing FOSS cryptographic dominance, and that is a very good thing.
YMMV.
For the record, most SIMD hardware really sucks for dense_matrix
requirements. Most SIMD hardware only really works for sparse matrix
apps, like x264, because the overlying (embedded) algorithms used are poorly
documented by intention from the hardware vendors. I do not have direct
proof, but I strongly suspect this is the case because the SIMD pipelined
memory that these low-level APIs give to the FOSS community is memory
constricted by design.
peace,
James
end of thread
Thread overview: 26+ messages
2015-04-29 11:31 [gentoo-user] CFLAGs for kernel compilation Ralf
2015-04-29 12:33 ` Neil Bothwick
2015-04-29 12:41 ` Emanuele Rusconi
2015-04-29 13:18 ` Ralf
2015-04-29 13:35 ` [gentoo-user] " Holger Hoffstätte
2015-04-29 23:52 ` Nikos Chantziaras
2015-04-30 0:01 ` Nikos Chantziaras
2015-04-30 16:27 ` [gentoo-user] " Volker Armin Hemmann
2015-04-30 7:11 ` Adam Carter
2015-04-30 9:38 ` Andrew Savchenko
2015-05-01 5:09 ` [gentoo-user] " Martin Vaeth
2015-05-01 7:44 ` Andrew Savchenko
2015-05-01 17:21 ` James
2015-05-02 5:04 ` Nikos Chantziaras
2015-05-02 11:19 ` Volker Armin Hemmann
2015-05-02 11:25 ` Nikos Chantziaras
2015-05-02 11:37 ` Volker Armin Hemmann
2015-05-02 12:06 ` Nikos Chantziaras
2015-05-02 12:10 ` Volker Armin Hemmann
2015-05-02 12:38 ` Nikos Chantziaras
2015-05-02 17:04 ` Volker Armin Hemmann
2015-05-02 18:34 ` James
2015-04-30 16:26 ` [gentoo-user] " Volker Armin Hemmann
2015-04-30 17:45 ` Andrew Savchenko
2015-04-30 18:11 ` Volker Armin Hemmann
2015-04-30 18:28 ` Andrew Savchenko