On 07/01/2013 09:36 PM, Richard Yao wrote:
> On 07/01/2013 03:23 PM, Greg KH wrote:
>> On Mon, Jul 01, 2013 at 08:45:16PM +0200, Tom Wijsman wrote:
>>>>> Q: What about my stable server? I really don't want to run this
>>>>> stuff!
>>>>>
>>>>> A: These options would depend on !CONFIG_VANILLA or
>>>>> CONFIG_EXPERIMENTAL
>>>>
>>>> What is CONFIG_VANILLA? I don't see that in the upstream kernel tree
>>>> at all.
>>>>
>>>> CONFIG_EXPERIMENTAL is now gone from upstream, so you are going to
>>>> have a problem with this.
>>>
>>> Earlier I mentioned "2) These feature should depend on a non-vanilla /
>>> experimental option." which is an option we would introduce under the
>>> Gentoo distribution menu section.
>>
>> Distro-specific config options, great :(
>>
>>>>> which would be disabled by default, therefore if you keep this
>>>>> option the way it is on your stable server; it won't affect you.
>>>>
>>>> Not always true. Look at aufs as an example. It patches the core
>>>> kernel code in ways that are _not_ accepted upstream yet. Now you all
>>>> are running that modified code, even if you don't want aufs.
>>>
>>> Earlier I mentioned "3) The patch should not affect the build by
>>> default."; if it does, we have to adjust it to not do that, this is
>>> something that can be easily scripted. It's just a matter of embedding
>>> each + block in the diff with a config check and updating the counts.
>>
>> Look at aufs as a specific example of why you can't do that, otherwise,
>> don't you think that the aufs developer(s) wouldn't have done so?
>
> I am accquainted with the developer of a stackable filesystem developer.

I should probably proofread multiple times before I send emails. Anyway,
that should have been:

> I am acquainted with the developer of a stackable filesystem.
> According to what he has told me in person offline, the developers on
> the LKML cannot decide on how a stackable filesystem should be
> implemented.
> I was told three different variations on the design that
> some people liked and others didn't, which ultimately kept the upstream
> kernel from adopting anything. I specifically recall two variations,
> which were doing it as part of the VFS and doing it as part of ext4. If
> you want to criticize stackable filesystems, would you lay out a
> groundwork for getting one implemented upon which people will agree?
>
>> The goal of "don't touch any other kernel code" is a very good one, but
>> not always true for these huge out-of-tree kernel patches. Usually that
>> is the main reason why these patches aren't merged upstream, because
>> those changes are not acceptable.
>
> I was under the impression that there were several reasons for patches
> not being merged upstream:
>
> 1. Lack of signed-off
> 2. Code drop that no one will maintain
> 3. Subsystem maintainers saying no simply because they do not like
> .
> 4. Risk of patent trolls
> 5. Actual technical reasons
>
>> So be very careful here, you are messing with things that are rejected
>> by upstream.
>>
>> greg k-h
>>
>
> Only some of the patches were rejected. Others were never submitted. The
> PaX/GrSecurity developers prefer their code to stay out-of-tree. As one
> of the people hacking on ZFSOnLinux, I prefer that the code be
> out-of-tree. That is because fixes for other filesystems are either held
> back by a lack of system kernel updates or held hostage by regressions
> in newer kernels on certain hardware.
>
> With that said, being in Linus' tree does not make code fall under some
> golden standard for quality. There are many significant issues in code
> committed to Linus' the kernel, some of which have been problems for
> years. Just to name a few:
>
> 1. Doing `rm -r /dir` on a directory tree containing millions of inodes
> (e.g. ccache) on an ext4 filesystem mounted with discard with the CFQ IO
> elevator will cause a system to hang for hours on pre-SATA 3.1 hardware.
> This is because TRIM is a non-queued command and is being interleaved
> with writes for "fairness". Incidentally, using noop turns a multiple
> hour hang into a laggy experience of a few minutes.
>
> 2. aio_sync() is unimplemented, which means that there is no sane way
> for userland software like QEMU and TGT to be both fast and guarantee
> data integrity. A single crash and your guest is corrupted. It would
> have been better had AIO never been implemented.
>
> 3. dm-crypt will reorder write requests across flushes. That is because
> upon seeing a write, it sends it to a work queue to be processed
> asynchronously and upon seeing a flush, it immediately processes it. A
> single kernel panic or sudden power loss can damage filesystems stored
> on it.
>
> 4. Under low memory conditions with hundreds of concurrent threads (e.g.
> package builds), every thread will enter direct reclaim and there will
> be a remarkable drop in system throughput, assuming that the system does
> not lockup. There is a fairly substantial amount of time wasted after
> one thread finishes direct reclaim in other threads because they will
> still be performing direct reclaim afterward.
>
> 5. The Linux 3.7 nouveau rewrite broke kexec support. The graphics
> hardware will not reinitialize properly.
>
> 6. A throttle mechanism introduced for memory cgroups can cause the
> system to deadlock whenever it is holding a lock needed for swap and
> enters direct reclaim with a significant number of dirty pages.
>
> 7. Code has been accepted on multiple occasions that does not compile
> and the build failures persist for weeks if not months after Linus' tag.
> I sent a patch to fix one failure. It was rejected because I had fixed
> code to compile with -Werror, people thought that -Werror should be
> removed (and therefore was no reason to fix the warnings) and we went 2
> months until someone wrote a patch that people liked to fix it.
> For a current example of accepted code failing to build, look here:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=38052
>
> Note that I have not checked Linus' tree to see if that bug is still
> current, but the bug itself appears to be open as of this writing.
>
> There are plenty more technical issues, but these are just my pet
> peeves. If you want more examples, you could look at the patches people
> send you each day and ask yourself how many are things that could have
> been caught had people been more careful during review. For instance,
> look at the barrier patches that were done around Linux 2.6.30. What
> prevented those from being caught by review years earlier?
>
> Being outside Linus' tree is not synonymous with being bad and being bad
> is not synonymous with being rejected. It is perfectly reasonable to
> think that there are examples of good code outside Linus' tree.
> Furthermore, should the kernel kernel choose to engage that out-of-tree

That should have been:

> Furthermore, should the kernel team choose to engage that out-of-tree
> code, my expectation is that its quality will improve as they do testing
> and write patches.
>
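For readers who want the write-reordering point (item 3 in the quoted list) spelled out, here is a toy model in Python. It is only a sketch of the behaviour described in the mail, not dm-crypt's actual code: the classes Disk and DeferredWriteLayer, the drain_before_flush switch and the function durable_after_crash are all hypothetical names made up for illustration. It assumes a disk with a volatile cache, and a layer that hands writes to a work queue but forwards flushes immediately.

```python
from collections import deque


class Disk:
    """Toy disk: writes sit in a volatile cache until flush() is called."""

    def __init__(self):
        self.cache = []
        self.durable = []

    def write(self, data):
        self.cache.append(data)

    def flush(self):
        # Everything currently in the cache becomes durable.
        self.durable.extend(self.cache)
        self.cache.clear()


class DeferredWriteLayer:
    """Hypothetical layer that queues writes and forwards flushes at once."""

    def __init__(self, disk, drain_before_flush):
        self.disk = disk
        self.pending = deque()
        self.drain_before_flush = drain_before_flush

    def write(self, data):
        # Deferred, as if handed to an asynchronous work queue.
        self.pending.append(data)

    def run_worker(self):
        # Push every queued write down to the disk, in submission order.
        while self.pending:
            self.disk.write(self.pending.popleft())

    def flush(self):
        if self.drain_before_flush:
            self.run_worker()  # keep earlier writes ahead of the flush
        self.disk.flush()


def durable_after_crash(drain_before_flush):
    """Return what survives a crash right after write, flush, write."""
    disk = Disk()
    layer = DeferredWriteLayer(disk, drain_before_flush)
    layer.write("journal entry")
    layer.flush()  # the caller now believes "journal entry" is durable
    layer.write("commit record")
    # Crash here: the pending queue and the disk cache are both lost.
    return list(disk.durable)


if __name__ == "__main__":
    print("flush overtakes writes:", durable_after_crash(False))
    print("writes drained first:  ", durable_after_crash(True))
```

With drain_before_flush=False the flush overtakes the queued write, so nothing the caller flushed is durable after the crash. With drain_before_flush=True the earlier write reaches the disk before the flush does. A filesystem journal depends on the second behaviour.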