From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from lists.gentoo.org ([140.105.134.102] helo=robin.gentoo.org)
    by nuthatch.gentoo.org with esmtp (Exim 4.50) id 1ERCKM-0000qd-TN
    for garchives@archives.gentoo.org; Sun, 16 Oct 2005 17:29:43 +0000
Received: from robin.gentoo.org (localhost [127.0.0.1])
    by robin.gentoo.org (8.13.5/8.13.5) with SMTP id j9GHQfO9013918;
    Sun, 16 Oct 2005 17:26:41 GMT
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
    by robin.gentoo.org (8.13.5/8.13.5) with ESMTP id j9GHQfB2019320
    for ; Sun, 16 Oct 2005 17:26:41 GMT
Received: from list by ciao.gmane.org with local (Exim 4.43)
    id 1ERCHs-0003Cw-6n
    for gentoo-amd64@lists.gentoo.org; Sun, 16 Oct 2005 19:27:08 +0200
Received: from ip68-230-97-182.ph.ph.cox.net ([68.230.97.182])
    by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
    for ; Sun, 16 Oct 2005 19:27:08 +0200
Received: from 1i5t5.duncan by ip68-230-97-182.ph.ph.cox.net with local
    (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00
    for ; Sun, 16 Oct 2005 19:27:08 +0200
X-Injected-Via-Gmane: http://gmane.org/
To: gentoo-amd64@lists.gentoo.org
From: Duncan <1i5t5.duncan@cox.net>
Subject: [gentoo-amd64] Re: mtrr: base is not aligned
Date: Sun, 16 Oct 2005 10:26:08 -0700
Organization: Sometimes
Message-ID:
References: <5bdc1c8b0510151118p447b72a9la959017a0de1dd08@mail.gmail.com>
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-amd64@gentoo.org
Reply-to: gentoo-amd64@lists.gentoo.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Complaints-To: usenet@sea.gmane.org
X-Gmane-NNTP-Posting-Host: ip68-230-97-182.ph.ph.cox.net
User-Agent: Pan/0.14.2.91 (As She Crawled Across the Table)
Sender: news
X-Archives-Salt: 964beb4e-0d24-41b8-8c2e-e345104551d3
X-Archives-Hash: 1a99cf25c274099fcc25141307f9c2ce

Mark Knecht posted <5bdc1c8b0510151118p447b72a9la959017a0de1dd08@mail.gmail.com>, excerpted below, on Sat, 15 Oct
2005 11:18:12 -0700:

> Is anyone else seeing these troubling messages from time to time?
>
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
> mtrr: base(0xe8020000) is not aligned on a size(0x400000) boundary
> mtrr: type mismatch for e8000000,8000000 old: write-back new: write-combining
>
> Is there something I should be doing to fix this?

You may well know more about this than I do, but on the off chance this may be new to you, and for others... (and because I'm googling and learning a bit in the process myself... =8^)

mtrr=memory type range (or region) register. This is definitely kernel domain we are talking about here, often video drivers (so xorg related as well).

Here's a link to an old but decently "English" explanation of what MTRRs are that doesn't get too technical, but gives you some idea of not only what the theory does, but the effects on real-world performance. (It's talking PentiumPro/PII CPUs and kernel 2.2.0 or later... I /said/ it was old! =8^)

http://www.meduna.org/txt_mtrr_en.html

Paraphrasing from the above link... Basically, the MTRRs determine the behavior of cache vs regular memory on memory-write, for various memory regions/ranges.

* Write-thru means the write is to cache and main memory together, slower than Write-back or Write-combining below but most reliable, as any DMAs from main memory are guaranteed to be up-to-date.
* Write-back is far faster, allowing the CPU to write to cache only, then go about its business, with the update to main memory happening as memory bandwidth permits. The catch is that cache (play on words there!) and main memory can be out of sync momentarily, which could mean either longer waits for the update to happen when it /has/ to happen (if everything is working right and the out-of-date data is caught and a wait forced until it's updated to valid data when needed), or in the worst-case scenario, if something's not quite working right, glitches, instability and crashes because old and invalid data was used where updated data was expected.
* Write-combining is sort of in-between the two above choices. The data is allowed to sit in cache without updating main memory for only a comparatively short period, in order to allow the possibility of combining several smaller writes into a larger, single, more efficient write. (More efficient because each write has a set amount of overhead in setup and take-down time and data. Thus, combining 8 128-byte transactions into a single 1KB transaction means 1/8 the overhead, thus more effective payload bandwidth, at a cost of more latency, due to waiting for several transactions to accumulate, provided of course they come in before the expiry time forces what's there to be written even if it's not yet a full size transfer.)
* There's also uncachable, which turns off caching for reads as well as writes. This will be /very/ slow.

Where graphics gets involved is that these MTRRs are often used to program access to video memory over the AGP or whatever bus. The link above lists some of the (then) xfree86 operations that MTRR settings affect and by how much.

Here's another link, with some interesting info on the kernel config option (CONFIG_MTRR) and the userland interface to MTRRs (/proc/mtrr) once enabled.
The two paragraphs below are excerpted:

http://developer.osdl.org/dev/robustmutexes/src/fusyn.hg/Documentation/mtrr.txt

The CONFIG_MTRR option creates a /proc/mtrr file which may be used to manipulate your MTRRs. Typically the X server should use this. This should have a reasonably generic interface so that similar control registers on other processors can be easily supported.

There are two interfaces to /proc/mtrr: one is an ASCII interface which allows you to read and write. The other is an ioctl() interface. The ASCII interface is meant for administration. The ioctl() interface is meant for C programs (i.e. the X server). The interfaces are described below, with sample commands and C code.

Finally, this, on the mtrr_add function, from kernelnewbies:

http://kernelnewbies.org/documents/kdoc/kernel-api/r7666.html

Memory type region registers control the caching on newer Intel and non Intel processors. This function allows drivers to request an MTRR is added. The details and hardware specifics of each processor's implementation are hidden from the caller, but nevertheless the caller should expect to need to provide a power of two size on an equivalent power of two boundary.

If the region cannot be added either because all regions are in use or the CPU cannot support it a negative value is returned. On success the register number for this entry is returned, but should be treated as a cookie only.

On a multiprocessor machine the changes are made to all processors. This is required on x86 by the Intel processors.

The available types are

MTRR_TYPE_UNCACHABLE - No caching
MTRR_TYPE_WRBACK - Write data back in bursts whenever
MTRR_TYPE_WRCOMB - Write data back soon but allow bursts
MTRR_TYPE_WRTHROUGH - Cache reads but not writes

That last contains a mention of boundaries ("power of two size on an equivalent power of two boundary") that appears to pertain to your problem.

So...
now we know a bit about what MTRRs actually do (control the interaction between cache and a specified portion of memory for write transactions), what they are most often adjusted for (to increase graphics performance, by changing the way writes to graphics memory are cached), and can make a bit of sense out of the messages (the size doesn't match the required base address for the MTRR; something's trying to change the caching method, but using the wrong address, and the adjustment is therefore getting a type mismatch).

How does that translate into something you can do to fix it? Well, first, you can actually go take a look at /proc/mtrr (don't try to write anything to it unless you are sure you know what you are doing, but reading it should be fine), and see if you can figure out which entry it's supposed to be changing, if there's one close to that address at all, or if not, what needs to be created.

Beyond that, it depends on what is actually using the MTRR. It's probably your video card driver, but it could be something else. You don't say which card you have, but from the googling I did to find the above, I see that ATI's proprietary drivers, at least, have posts about changing MTRRs to increase performance. Whatever your video card driver is, that's probably (but not for certain) what's causing the log messages. Therefore, take a look at /var/log/Xorg.0.log, and see if you can match up any possible MTRR messages listed there. Next, take a look at the driver documentation and your xorg.conf file and boot loader config, and see what sort of adjustments you might need to make.

Of course, if it's /not/ video driver related, you'll likely have to figure out what else is accessing the MTRRs and how to reconfigure it correctly. Taking a wild guess, I'd say check anything that's likely to be using DMA, thus, stuff like NICs or storage devices, and their drivers.
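
For the "go take a look at /proc/mtrr" step, here is a minimal read-only sketch in Python, not anything from the kernel docs. It assumes the line layout shown in the kernel's mtrr.txt sample output (see the comment); the function names are made up for illustration, and kernels may order the trailing fields differently, which is why each field is searched for separately.

```python
import re

# Assumed line layout (an assumption, based on mtrr.txt sample output), e.g.
#   reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
# Each field is searched for independently, since the order of the trailing
# fields is not guaranteed to be the same on every kernel.
_UNITS = {"KB": 1 << 10, "MB": 1 << 20, "GB": 1 << 30}
_TYPES = ("uncachable", "write-combining", "write-through",
          "write-protect", "write-back")


def parse_mtrr_line(line):
    """Return a dict for one /proc/mtrr line, or None if it does not parse."""
    reg = re.match(r"\s*reg(\d+):", line)
    base = re.search(r"base=0x([0-9a-fA-F]+)", line)
    size = re.search(r"size=\s*(\d+)\s*([KMG]B)", line)
    mtype = next((t for t in _TYPES if t in line), None)
    if not (reg and base and size and mtype):
        return None
    return {
        "reg": int(reg.group(1)),
        "base": int(base.group(1), 16),
        "size": int(size.group(1)) * _UNITS[size.group(2)],
        "type": mtype,
    }


def find_covering(entries, addr):
    """List the entries whose [base, base + size) range contains addr."""
    return [e for e in entries if e["base"] <= addr < e["base"] + e["size"]]


def read_mtrrs(path="/proc/mtrr"):
    """Read-only: parse every line of the file that matches the layout."""
    with open(path) as fh:
        parsed = (parse_mtrr_line(line) for line in fh)
        return [e for e in parsed if e is not None]
```

Calling find_covering(read_mtrrs(), 0xe8020000) would list whichever existing ranges already contain the base address from the log messages; an empty list would suggest nothing is mapped there yet.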
Oh, MTRRs are also used in connection with mapping around the PCI device memory hole just below 4GB, if you have 4G or more of memory.

Hmm... Now I'm beginning to integrate what I just learned here, with some other stuff I just read, with stuff I knew before, and a real-time look at /proc/mtrr on my system... and things are beginning to "click". You get to see my understanding developing as I write this!

From an Opteron BIOS integrator's pdf @ amd... they recommend one of the variable MTRRs (there are some fixed ones covering the memory space from 640k to 1 MB as well; that must be what those common settings in BIOS are for) be set to cover the entire physical memory range...

And so I see... I have a gig of memory and see a 1024 MB MTRR set @ base-address 0x 0000 0000 (0 MB), type write-back (so read/write cacheable). That makes sense, as it's telling the CPU(s, 2 in my case) that all of my main memory is fully cacheable, no special restrictions needed!

Apparently, some CPUs only have two variable MTRRs, and if one is used to cover all of physical main-memory, that leaves only one available, which would be used by the video driver, so that's how Linux is normally set up. Again apparently, modern x86 (and presumably x86_64 as well) CPUs from both AMD and Intel have eight such MTRRs, so more ranges can be programmed as needed.

Now this is just a guess based on the fixed MTRs (memory type ranges; I'm using MTRR to reference the register, MTR to reference the range in memory it controls, here referring to the previously mentioned fixed range MTRRs/MTRs between 640k and 1M) being included in the main-mem variable range, and variable MTRs within other variable MTRs might not work quite the same (tho I expect they do), but going on that... smaller ones in larger ones can act as exceptions.
That would explain the BIOS MTRR setting in many AMD64 systems for > 3.5 GB physical memory -- continuous or explicit hole for the 3.5 - 4 GB (0x e000 0000 - 0x ffff ffff) top-of-32-bit-memory-space PCI hole. The continuous option will map the entire X GB of memory as a single MTR, presumably (presumably since I've never run > 4 GB myself, and I'm making connections faster than I could look them up to verify them) with additional MTRs such as the video memory one overlaid on top of it, in the 3.5-4.0 GB PCI device space. Non-contiguous or explicit-hole would map an explicit hole to the 3.5-4 GB area. Note that this would map additional memory up above the 4 GB 32-bit barrier, out of reach of most 32-bit systems, explaining some comments on forums.amd.com I was reading. If this isn't handled correctly, even 64-bit systems won't see all their memory if it's more than 3.5 GB, since part of it will be hidden by the PCI device overlay 3.5 - 4.0 GB.

Note that at least here (Tyan s2885 dual Opteron board), the BIOS actually has two related settings controlling the way > 3.5 GB of physical memory is mapped. One apparently controls the actual memory addresses, whether they skip the 3.5-4 GB range or not; the other controls the MTRRs, continuous or not. If the two don't match, it could mean all of memory is seen but not all of it is marked as cacheable, slowing access to the uncached memory range.

... Another leap of understanding...

Remember those ranges need to be in power-of-2 sizes and on matching power-of-2 boundaries? I happen to have a gig of memory, an even power of two, so one MTR covers it exactly. That wouldn't work for those with 3/4 gig or 1.5 or 3 gig or some such. OTOH, I noticed a count=1 at the end of the two ranges I have mapped in my /proc/mtrr, so it would appear that could be remedied by using say 3 half-gig MTRs (count=3) stacked end-to-end, or a 1 gig and a half gig (thus two), to map that 1.5 gig area.
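
To make the "1 gig and a half gig" idea concrete, here is a small Python sketch of one way to split a memory range into power-of-two blocks that each sit on a matching boundary. It is only an illustration of the arithmetic: the function name is made up, and nothing here says this is how any BIOS or the kernel actually lays out its MTRRs.

```python
MB = 1 << 20
GB = 1 << 30


def split_into_aligned_blocks(base, size):
    """Greedily split [base, base + size) into power-of-two sized blocks,
    each starting on a multiple of its own size."""
    blocks = []
    while size > 0:
        # Largest power of two that still fits in what is left.
        block = 1 << (size.bit_length() - 1)
        # Shrink it until the current base is aligned on the block size.
        while base % block:
            block >>= 1
        blocks.append((base, block))
        base += block
        size -= block
    return blocks


if __name__ == "__main__":
    # 1.5 GB starting at address 0: a 1 GB block followed by a 512 MB block.
    for start, length in split_into_aligned_blocks(0, 1536 * MB):
        print(f"base=0x{start:08x} size={length // MB}MB")
```

Each block in the result would presumably need its own register, so 1.5 gig takes two and an exact power of two takes one.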
I'm not sure if the count= would mean it's using additional MTRRs or not, tho I'd expect it would indeed mean that. Thus, non-power-of-two memory layouts will probably mean more MTRRs used to map the full memory MTR. (If there's anyone with an odd amount of memory that could verify, it'd be nice...)

My second MTRR is set to cover the 128 MB range dedicated to my video card (it's got 256 MB but only 128 MB is being used, it would appear, but it's no big deal since I don't do much 3D anyway because I'm running dual head video @ 2048x1536 each). Here, it's the 128 MB beginning at exactly 3.5 GB base address (0x e000 0000), write combining.

OK... 0x e000 0000 is the 3.5 GB boundary, so your 0x e802 0000 is indeed in the PCI device area... 0x 800 0000 is 128 MB, 0x 2 0000 is 128 KB, so the base address it's attempting to use is 3.5 GB + 128 MB + 128 KB. The requested size is 0x 40 0000 or 4 MB, so the closest boundary would be the 0x e800 0000 you see it trying for and getting the type mismatch.

Note also that the AMD docs say to disable caching (which would mean flush all pending writes) before making other MTRR changes.

What I'd guess is happening here, then, is that these errors are occurring when you start X, and the graphics video driver tries to overlay a write-combining MTRR over top of (part of) the previously mapped write-back MTRR covering all of main-memory (which would appear to be 4GB or better, thus overlapping the 3.5-4.0 GB PCI device area). They will mean problems with the video rendering, either glitches or not as fast as it could be. (Write-back being less strict than write-combining, I'm thinking it could mean glitches, whenever the video card tries to draw memory that's not current with that in cache. However, it may not appear except under heavy 3D use, such as in games.)
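
The address arithmetic for the logged values can be double-checked with a few lines of Python. This is just a sketch of the "power of two size on an equivalent power of two boundary" rule quoted from the mtrr_add description; the helper names are made up, and it is not the kernel's actual check.

```python
def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def check_alignment(base, size):
    """Return (aligned, nearest_boundary_at_or_below_base) for a request.

    The size must be a power of two; the base is aligned when it is a
    multiple of the size.
    """
    if not is_power_of_two(size):
        raise ValueError("size must be a power of two")
    boundary = base & ~(size - 1)
    return base == boundary, boundary


if __name__ == "__main__":
    base, size = 0xE8020000, 0x400000   # values from the log messages
    aligned, boundary = check_alignment(base, size)
    offset = base - 0xE0000000          # distance above the 3.5 GB mark
    print(f"size      = {size >> 20} MB")
    print(f"offset    = {offset >> 20} MB + {(offset & 0xFFFFF) >> 10} KB")
    print(f"aligned   = {aligned}")
    print(f"boundary  = 0x{boundary:08x}")
```

For the logged request the base is not a multiple of the 4 MB size, and the nearest boundary below it is 0xe8000000, the address that shows up in the type-mismatch lines. The 0x20000 left over works out to 128 KB.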
If that's the case, in addition to the MTRR remapping you may do either from your boot scripts or by reconfiguring your graphics card driver in xorg.conf, you may have to reset your BIOS' MTRR settings to specifically map out the pci-area hole.

That's about all I have ATM... Hope this is as informative for others as it just was for me! (And, if anyone's an expert on this stuff and I got it wrong, please inform me where! Better to correct any mistakes I have now while it's new than after I build a bunch more suppositions on a faulty foundation!)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

--
gentoo-amd64@gentoo.org mailing list