- 09 May, 2012 1 commit
-
-
Tejun Heo authored
pcpu_embed_first_chunk() allocates memory for each node, copies percpu data and frees unused portions of it before proceeding to the next group. This assumes that allocations for different nodes doesn't overlap; however, depending on memory topology, the bootmem allocator may end up allocating memory from a different node than the requested one which may overlap with the portion freed from one of the previous percpu areas. This leads to percpu groups for different nodes overlapping which is a serious bug. This patch separates out copy & partial free from the allocation loop such that all allocations are complete before partial frees happen. This also fixes overlapping frees which could happen on allocation failure path - out_free_areas path frees whole groups but the groups could have portions freed at that point. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org Reported-by: "Pavel V. Panteleev" <pp_84@mail.ru> Tested-by: "Pavel V. Panteleev" <pp_84@mail.ru> LKML-Reference: <E1SNhwY-0007ui-V7.pp_84-mail-ru@f220.mail.ru>
-
- 08 May, 2012 5 commits
-
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Two fixes from Intel, one a regression, one because I merged an early version of a fix. Also the nouveau revert of the i2c code that was tested on the list." * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/nouveau/i2c: resume use of i2c-algo-bit, rather than custom stack drm/i915: Do no set Stencil Cache eviction LRA w/a on gen7+ drm/i915: disable sdvo hotplug on i945g/gm
-
Linus Torvalds authored
Merge tag 'stable/for-linus-3.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: - fix to Kconfig to make it fit within 80 line characters, - two bootup fixes (AMD 8-core and with PCI BIOS), - cleanup code in a Xen PV fb driver, - and a crash fix when trying to see non-existent PTE's * tag 'stable/for-linus-3.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/Kconfig: fix Kconfig layout xen/pci: don't use PCI BIOS service for configuration space accesses xen/pte: Fix crashes when trying to see non-existent PGD/PMD/PUD/PTEs xen/apic: Return the APIC ID (and version) for CPU 0. drivers/video/xen-fbfront.c: add missing cleanup code
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpuLinus Torvalds authored
Pull two percpu fixes from Tejun Heo: "One adds missing KERN_CONT on split printk()s and the other makes the percpu allocator avoid using PMD_SIZE as atom_size on x86_32. Using PMD_SIZE led to vmalloc area exhaustion on certain configurations (x86_32 android) and the only cost of using PAGE_SIZE instead is static percpu area not being aligned to large page mapping." * 'for-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu, x86: don't use PMD_SIZE as embedded atom_size on 32bit percpu: use KERN_CONT in pcpu_dump_alloc_info()
-
git://git.linaro.org/people/rmk/linux-armLinus Torvalds authored
Pull ARM fixes from Russell King: "This is mainly audit fixes, found by folks who happened to enable this feature and then found it broke their user applications." * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7414/1: SMP: prevent use of the console when using idmap_pgd ARM: 7412/1: audit: use only AUDIT_ARCH_ARM regardless of endianness ARM: 7411/1: audit: fix treatment of saved ip register during syscall tracing ARM: 7410/1: Add extra clobber registers for assembly in kernel_execve
-
Tejun Heo authored
With the embed percpu first chunk allocator, x86 uses either PAGE_SIZE or PMD_SIZE for atom_size. PMD_SIZE is used when CPU supports PSE so that percpu areas are aligned to PMD mappings and possibly allow using PMD mappings in vmalloc areas in the future. Using larger atom_size doesn't waste actual memory; however, it does require larger vmalloc space allocation later on for !first chunks. With reasonably sized vmalloc area, PMD_SIZE shouldn't be a problem but x86_32 at this point is anything but reasonable in terms of address space and using larger atom_size reportedly leads to frequent percpu allocation failures on certain setups. As there is no reason to not use PMD_SIZE on x86_64 as vmalloc space is aplenty and most x86_64 configurations support PSE, fix the issue by always using PMD_SIZE on x86_64 and PAGE_SIZE on x86_32. v2: drop cpu_has_pse test and make x86_64 always use PMD_SIZE and x86_32 PAGE_SIZE as suggested by hpa. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Yanmin Zhang <yanmin.zhang@intel.com> Reported-by: ShuoX Liu <shuox.liu@intel.com> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <4F97BA98.6010001@intel.com> Cc: stable@vger.kernel.org
-
- 07 May, 2012 10 commits
-
-
Andrew Morton authored
Fit it into 80 columns so that it is readable in menuconfig. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
David Vrabel authored
The accessing PCI configuration space with the PCI BIOS32 service does not work in PV guests. On systems without MMCONFIG or where the BIOS hasn't marked the MMCONFIG region as reserved in the e820 map, the BIOS service is probed (even though direct access is preferred) and this hangs. CC: stable@kernel.org Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> [v1: Fixed compile error when CONFIG_PCI is not set] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Konrad Rzeszutek Wilk authored
If I try to do "cat /sys/kernel/debug/kernel_page_tables" I end up with: BUG: unable to handle kernel paging request at ffffc7fffffff000 IP: [<ffffffff8106aa51>] ptdump_show+0x221/0x480 PGD 0 Oops: 0000 [#1] SMP CPU 0 .. snip.. RAX: 0000000000000000 RBX: ffffc00000000fff RCX: 0000000000000000 RDX: 0000800000000000 RSI: 0000000000000000 RDI: ffffc7fffffff000 which is due to the fact we are trying to access a PFN that is not accessible to us. The reason (at least in this case) was that PGD[256] is set to __HYPERVISOR_VIRT_START which was setup (by the hypervisor) to point to a read-only linear map of the MFN->PFN array. During our parsing we would get the MFN (a valid one), try to look it up in the MFN->PFN tree and find it invalid and return ~0 as PFN. Then pte_mfn_to_pfn would happilly feed that in, attach the flags and return it back to the caller. 'ptdump_show' bitshifts it and gets and invalid value that it tries to dereference. Instead of doing all of that, we detect the ~0 case and just return !_PAGE_PRESENT. This bug has been in existence .. at least until 2.6.37 (yikes!) CC: stable@kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Konrad Rzeszutek Wilk authored
On x86_64 on AMD machines where the first APIC_ID is not zero, we get: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x10] enabled) BIOS bug: APIC version is 0 for CPU 1/0x10, fixing up to 0x10 BIOS bug: APIC version mismatch, boot CPU: 0, CPU 1: version 10 which means that when the ACPI processor driver loads and tries to parse the _Pxx states it fails to do as, as it ends up calling acpi_get_cpuid which does this: for_each_possible_cpu(i) { if (cpu_physical_id(i) == apic_id) return i; } And the bootup CPU, has not been found so it fails and returns -1 for the first CPU - which then subsequently in the loop that "acpi_processor_get_info" does results in returning an error, which means that "acpi_processor_add" failing and per_cpu(processor) is never set (and is NULL). That means that when xen-acpi-processor tries to load (much much later on) and parse the P-states it gets -ENODEV from acpi_processor_register_performance() (which tries to read the per_cpu(processor)) and fails to parse the data. Reported-by-and-Tested-by: Stefan Bader <stefan.bader@canonical.com> Suggested-by: Boris Ostrovsky <boris.ostrovsky@amd.com> [v2: Bit-shift APIC ID by 24 bits] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Julia Lawall authored
The operations in the subsequent error-handling code appear to be also useful here. Acked-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> [v1: Collapse some of the error handling functions] [v2: Fix compile warning] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-
Ben Skeggs authored
Previous issues with i2c-algo-bit have now been resolved. This is a revert of f553b79c mostly, due to fixes in the i2c core repairing the original issue, this code isn't required and was causing regressions. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reported-by: Nick Bowler <nbowler@elliptictech.com> Tested-by: Nick Bowler <nbowler@elliptictech.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
-
git://people.freedesktop.org/~danvet/drm-intelDave Airlie authored
Daniel wrote: 2 little patches: - One regression fix to disable sdvo hotplug on broken hw. - One patch to upconvert the snb hang workaround from patch v1 to patch v2. * 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: drm/i915: Do no set Stencil Cache eviction LRA w/a on gen7+ drm/i915: disable sdvo hotplug on i945g/gm
-
Daniel Vetter authored
I've flagged this while reviewing the first version and Ken Graunke fixed it up in v2, but unfortunately Dave Airlie picked up the wrong version. Cc: Dave Airlie <airlied@redhat.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: stable@kernel.org Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Daniel Vetter authored
Chris Wilson dug out a hw erratum saying that there's noise on the interrupt line on i945G chips. We also have a bug report from a i945GM chip with an sdvo hotplug interrupt storm (and no apparent cause). Play it safe and disable sdvo hotplug on all i945 variants. Note that this is a regression that has been introduced in 3.1, when we've enabled sdvo hotplug support with commit cc68c81a Author: Simon Farnsworth <simon.farnsworth@onelan.co.uk> Date: Wed Sep 21 17:13:30 2011 +0100 drm/i915: Enable SDVO hotplug interrupts for HDMI and DVI Cc: stable@kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=38442Reported-and-tested-by: Dominik Köppl <dominik@devwork.org> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Larry Finger authored
Commit ce7e5d2d ("x86: fix broken TASK_SIZE for ia32_aout") breaks kernel builds when "CONFIG_IA32_AOUT=m" with ERROR: "set_personality_ia32" [arch/x86/ia32/ia32_aout.ko] undefined! make[1]: *** [__modpost] Error 1 The entry point needs to be exported. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: Al Viro <viro@zeniv.linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 06 May, 2012 6 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull x86 fixes form Peter Anvin * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: intel_mid_powerbtn: mark irq as IRQF_NO_SUSPEND arch/x86/platform/geode/net5501.c: change active_low to 0 for LED driver x86, relocs: Remove an unused variable asm-generic: Use __BITS_PER_LONG in statfs.h x86/amd: Re-enable CPU topology extensions in case BIOS has disabled it
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull btrfs fixes from Chris Mason: "The big ones here are a memory leak we introduced in rc1, and a scheduling while atomic if the transid on disk doesn't match the transid we expected. This happens for corrupt blocks, or out of date disks. It also fixes up the ioctl definition for our ioctl to resolve logical inode numbers. The __u32 was a merging error and doesn't match what we ship in the progs." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: avoid sleeping in verify_parent_transid while atomic Btrfs: fix crash in scrub repair code when device is missing btrfs: Fix mismatching struct members in ioctl.h Btrfs: fix page leak when allocing extent buffers Btrfs: Add properly locking around add_root_to_dirty_list
-
Al Viro authored
Setting TIF_IA32 in load_aout_binary() used to be enough; these days TASK_SIZE is controlled by TIF_ADDR32 and that one doesn't get set there. Switch to use of set_personality_ia32()... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Chris Mason authored
verify_parent_transid needs to lock the extent range to make sure no IO is underway, and so it can safely clear the uptodate bits if our checks fail. But, a few callers are using it with spinlocks held. Most of the time, the generation numbers are going to match, and we don't want to switch to a blocking lock just for the error case. This adds an atomic flag to verify_parent_transid, and changes it to return EAGAIN if it needs to block to properly verifiy things. Signed-off-by: Chris Mason <chris.mason@oracle.com>
-
Colin Cross authored
Commit 4e8ee7de (ARM: SMP: use idmap_pgd for mapping MMU enable during secondary booting) switched secondary boot to use idmap_pgd, which is initialized during early_initcall, instead of a page table initialized during __cpu_up. This causes idmap_pgd to contain the static mappings but be missing all dynamic mappings. If a console is registered that creates a dynamic mapping, the printk in secondary_start_kernel will trigger a data abort on the missing mapping before the exception handlers have been initialized, leading to a hang. Initial boot is not affected because no consoles have been registered, and resume is usually not affected because the offending console is suspended. Onlining a cpu with hotplug triggers the problem. A workaround is to the printk in secondary_start_kernel until after the page tables have been switched back to init_mm. Cc: <stable@vger.kernel.org> Signed-off-by: Colin Cross <ccross@android.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
- 05 May, 2012 13 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alphaLinus Torvalds authored
Pull alpha fixes from Matt Turner: "My alpha tree is back up (after taking quite some time to get my GPG key signed). It contains just some simple fixes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha: alpha: silence 'const' warning in sys_marvel.c alpha: include module.h to fix modpost on Tsunami alpha: properly define get/set_rtc_time on Marvel/SMP alpha: VGA_HOSE depends on VGA_CONSOLE
-
Jiri Slaby authored
The test in pdc_console_tty_close '!tty->count' was always wrong because tty->count is decremented after tty->ops->close is called and thus can never be zero. Hence the 'then' branch was never executed and the timer never deleted. This did not matter until commit 5dd5bc40 ("TTY: pdc_cons, use tty_port"). There we needed to set TTY in tty_port to NULL, but this never happened due to the bug above. So change the test to really trigger at the last close by changing the condition to 'tty->count == 1'. Well, the driver should not touch tty->count at all. It should use tty_port->count and count open count there itself. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reported-and-tested-by: Mikulas Patocka <mpatocka@redhat.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Helge Deller <deller@gmx.de> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
Pull sound sound fixes from Takashi Iwai: "As good as nothing exciting here; just a few trivial fixes for various ASoC stuff." * tag 'sound-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ASoC: omap-pcm: Free dma buffers in case of error. ASoC: s3c2412-i2s: Fix dai registration ASoC: wm8350: Don't use locally allocated codec struct ASoC: tlv312aic23: unbreak resume ASoC: bf5xx-ssm2602: Set DAI format ASoC: core: check of_property_count_strings failure ASoC: dt: sgtl5000.txt: Add description for 'reg' field ASoC: wm_hubs: Make sure we don't disable differential line outputs
-
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linuxLinus Torvalds authored
Pull an ACPI patch from Len Brown: "It fixes a D3 issue new in 3.4-rc1." By Lin Ming via Len Brown: * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: ACPI: Fix D3hot v D3cold confusion
-
Sasha Levin authored
Currently, we'll try mounting any device who's major device number is UNNAMED_MAJOR as NFS root. This would happen for non-NFS devices as well (such as 9p devices) but it wouldn't cause any issues since mounting the device as NFS would fail quickly and the code proceeded to doing the proper mount: [ 101.522716] VFS: Unable to mount root fs via NFS, trying floppy. [ 101.534499] VFS: Mounted root (9p filesystem) on device 0:18. Commit 6829a048102a ("NFS: Retry mounting NFSROOT") introduced retries when mounting NFS root, which means that now we don't immediately fail and instead it takes an additional 90+ seconds until we stop retrying, which has revealed the issue this patch fixes. This meant that it would take an additional 90 seconds to boot when we're not using a device type which gets detected in order before NFS. This patch modifies the NFS type check to require device type to be 'Root_NFS' instead of requiring the device to have an UNNAMED_MAJOR major. This makes boot process cleaner since we now won't go through the NFS mounting code at all when the device isn't an NFS root ("/dev/nfs"). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Will Deacon authored
The machine endianness has no direct correspondence to the syscall ABI, so use only AUDIT_ARCH_ARM when identifying the ABI to the audit tools in userspace. Cc: stable@vger.kernel.org Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Will Deacon authored
The ARM audit code incorrectly uses the saved application ip register value to infer syscall entry or exit. Additionally, the saved value will be clobbered if the current task is not being traced, which can lead to libc corruption if ip is live (apparently glibc uses it for the TLS pointer). This patch fixes the syscall tracing code so that the why parameter is used to infer the syscall direction and the saved ip is only updated if we know that we will be signalling a ptrace trap. Reported-and-Tested-by: Jon Masters <jcm@jonmasters.org> Cc: stable@vger.kernel.org Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Tim Bird authored
The inline assembly in kernel_execve() uses r8 and r9. Since this code sequence does not return, it usually doesn't matter if the register clobber list is accurate. However, I saw a case where a particular version of gcc used r8 as an intermediate for the value eventually passed to r9. Because r8 is used in the inline assembly, and not mentioned in the clobber list, r9 was set to an incorrect value. This resulted in a kernel panic on execution of the first user-space program in the system. r9 is used in ret_to_user as the thread_info pointer, and if it's wrong, bad things happen. Cc: <stable@vger.kernel.org> Signed-off-by: Tim Bird <tim.bird@am.sony.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
-
Takashi Iwai authored
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/soundTakashi Iwai authored
ASoC: Updates for 3.4 Nothing terribly exciting here, a bunch of small and simple fixes scattered around the place.
-
Lin Ming authored
Before this patch, ACPI_STATE_D3 incorrectly referenced D3hot in some places, but D3cold in other places. After this patch, ACPI_STATE_D3 always means ACPI_STATE_D3_COLD; and all references to D3hot use ACPI_STATE_D3_HOT. ACPI's _PR3 method is used to enter both D3hot and D3cold states. What distinguishes D3hot from D3cold is the presence _PR3 (Power Resources for D3hot) If these resources are all ON, then the state is D3hot. If _PR3 is not present, or all _PR0 resources for the devices are OFF, then the state is D3cold. This patch applies after Linux-3.4-rc1. A future syntax cleanup may remove ACPI_STATE_D3 to emphasize that it always means ACPI_STATE_D3_COLD. Signed-off-by: Lin Ming <ming.m.lin@intel.com> Acked-by: Rafael J. Wysocki <rjw@sisk.pl> Reviewed-by: Aaron Lu <aaron.lu@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>
-
Greg Kroah-Hartman authored
Commit ec81aecb ("hfs: fix a potential buffer overflow") fixed a few potential buffer overflows in the hfs filesystem. But as Timo Warns pointed out, these changes also need to be made on the hfsplus filesystem as well. Reported-by: Timo Warns <warns@pre-sense.de> Acked-by: WANG Cong <amwang@redhat.com> Cc: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Miklos Szeredi <mszeredi@suse.cz> Cc: Sage Weil <sage@newdream.net> Cc: Eugene Teo <eteo@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Dave Anderson <anderson@redhat.com> Cc: stable <stable@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 04 May, 2012 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull timer fix from Thomas Gleixner. * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtc: Fix possible null pointer dereference in rtc-mpc5121.c
-
git://git.samba.org/sfrench/cifs-2.6Linus Torvalds authored
Pull CIFS fixes from Steve French. * git://git.samba.org/sfrench/cifs-2.6: fs/cifs: fix parsing of dfs referrals cifs: make sure we ignore the credentials= and cred= options [CIFS] Update cifs version to 1.78 cifs - check S_AUTOMOUNT in revalidate cifs: add missing initialization of server->req_lock cifs: don't cap ra_pages at the same level as default_backing_dev_info CIFS: Fix indentation in cifs_show_options
-
Dave Jones authored
Remove myself as cpufreq maintainer. x86 driver changes can go through the regular x86/ACPI trees. ARM driver changes through the ARM trees. cpufreq core changes are rare these days, and can just go to lkml/direct. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
The normal read_seqcount_begin() function will wait for any current writers to exit their critical region by looping until the sequence count is even. That "wait for sequence count to stabilize" is the right thing to do if the read-locker will just retry the whole operation on contention: no point in doing a potentially expensive reader sequence if we know at the beginning that we'll just end up re-doing it all. HOWEVER. Some users don't actually retry the operation, but instead will abort and do the operation with proper locking. So the sequence count case may be the optimistic quick case, but in the presense of writers you may want to do full locking in order to guarantee forward progress. The prime example of this would be the RCU name lookup. And in that case, you may well be better off without the "retry early", and are in a rush to instead get to the failure handling. Thus this "raw" interface that just returns the sequence number without testing it - it just forces the low bit to zero so that read_seqcount_retry() will always fail such a "active concurrent writer" scenario. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
We really need to use a ACCESS_ONCE() on the sequence value read in __read_seqcount_begin(), because otherwise the compiler might end up reloading the value in between the test and the return of it. As a result, it might end up returning an odd value (which means that a write is in progress). If the reader is then fast enough that that odd value is still the current one when the read_seqcount_retry() is done, we might end up with a "successful" read sequence, even despite the concurrent write being active. In practice this probably never really happens - there just isn't anything else going on around the read of the sequence count, and the common case is that we end up having a read barrier immediately afterwards. So the code sequence in which gcc might decide to reaload from memory is small, and there's no reason to believe it would ever actually do the reload. But if the compiler ever were to decide to do so, it would be incredibly annoying to debug. Let's just make sure. Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-