1. 09 Feb, 2009 1 commit
  2. 05 Feb, 2009 8 commits
  3. 04 Feb, 2009 1 commit
  4. 03 Feb, 2009 1 commit
    • x86, percpu: fix kexec with vmlinux · ef3892bd
      Yinghai Lu authored
      Impact: fix regression with kexec with vmlinux
      
      Split data.init into data.init, percpu, and data.init2 sections
      instead of letting data.init wrap the percpu section.

      kexec can then load vmlinux, because the resulting segments no
      longer overlap.
      
      Before the patch we have:
      
      Elf file type is EXEC (Executable file)
      Entry point 0x200000
      There are 6 program headers, starting at offset 64
      
      Program Headers:
        Type           Offset             VirtAddr           PhysAddr
                       FileSiz            MemSiz              Flags  Align
        LOAD           0x0000000000200000 0xffffffff80200000 0x0000000000200000
                       0x0000000000ca6000 0x0000000000ca6000  R E    200000
        LOAD           0x0000000000ea6000 0xffffffff80ea6000 0x0000000000ea6000
                       0x000000000014dfe0 0x000000000014dfe0  RWE    200000
        LOAD           0x0000000001000000 0xffffffffff600000 0x0000000000ff4000
                       0x0000000000000888 0x0000000000000888  RWE    200000
        LOAD           0x00000000011f6000 0xffffffff80ff6000 0x0000000000ff6000
                       0x0000000000073086 0x0000000000a2d938  RWE    200000
        LOAD           0x0000000001400000 0x0000000000000000 0x000000000106a000
                       0x00000000001d2ce0 0x00000000001d2ce0  RWE    200000
        NOTE           0x00000000009e2c1c 0xffffffff809e2c1c 0x00000000009e2c1c
                       0x0000000000000024 0x0000000000000024         4
      
       Section to Segment mapping:
        Segment Sections...
         00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
         01     .data .init.rodata .data.cacheline_aligned .data.read_mostly
         02     .vsyscall_0 .vsyscall_fn .vsyscall_gtod_data .vsyscall_1 .vsyscall_2 .vgetcpu_mode .jiffies
         03     .data.init_task .smp_locks .init.text .init.data .init.setup .initcall.init .con_initcall.init .x86_cpu_dev.init .altinstructions .altinstr_replacement .exit.text .init.ramfs .bss
         04     .data.percpu
         05     .notes
      
      After the patch we have:
      
      Elf file type is EXEC (Executable file)
      Entry point 0x200000
      There are 7 program headers, starting at offset 64
      
      Program Headers:
        Type           Offset             VirtAddr           PhysAddr
                       FileSiz            MemSiz              Flags  Align
        LOAD           0x0000000000200000 0xffffffff80200000 0x0000000000200000
                       0x0000000000ca6000 0x0000000000ca6000  R E    200000
        LOAD           0x0000000000ea6000 0xffffffff80ea6000 0x0000000000ea6000
                       0x000000000014dfe0 0x000000000014dfe0  RWE    200000
        LOAD           0x0000000001000000 0xffffffffff600000 0x0000000000ff4000
                       0x0000000000000888 0x0000000000000888  RWE    200000
        LOAD           0x00000000011f6000 0xffffffff80ff6000 0x0000000000ff6000
                       0x0000000000073086 0x0000000000073086  RWE    200000
        LOAD           0x0000000001400000 0x0000000000000000 0x000000000106a000
                       0x00000000001d2ce0 0x00000000001d2ce0  RWE    200000
        LOAD           0x000000000163d000 0xffffffff8123d000 0x000000000123d000
                       0x0000000000000000 0x00000000007e6938  RWE    200000
        NOTE           0x00000000009e2c1c 0xffffffff809e2c1c 0x00000000009e2c1c
                       0x0000000000000024 0x0000000000000024         4
      
       Section to Segment mapping:
        Segment Sections...
         00     .text .notes __ex_table .rodata __bug_table .pci_fixup .builtin_fw __ksymtab __ksymtab_gpl __ksymtab_strings __init_rodata __param
         01     .data .init.rodata .data.cacheline_aligned .data.read_mostly
         02     .vsyscall_0 .vsyscall_fn .vsyscall_gtod_data .vsyscall_1 .vsyscall_2 .vgetcpu_mode .jiffies
         03     .data.init_task .smp_locks .init.text .init.data .init.setup .initcall.init .con_initcall.init .x86_cpu_dev.init .altinstructions .altinstr_replacement .exit.text .init.ramfs
         04     .data.percpu
         05     .bss
         06     .notes
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
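The overlap that the commit above describes can be checked directly from the two program header dumps. The following is a minimal sketch, not part of the patch: it copies the PhysAddr and MemSiz values of the LOAD segments from the dumps and reports which physical ranges intersect. The function name `overlapping_segments` and the tuple layout are illustrative choices.

```python
# LOAD segments as (segment number, PhysAddr, MemSiz), copied from the
# "before" and "after" readelf dumps in the commit message.
BEFORE = [
    ("00", 0x200000, 0xCA6000),
    ("01", 0xEA6000, 0x14DFE0),
    ("02", 0xFF4000, 0x888),
    ("03", 0xFF6000, 0xA2D938),   # .init.* sections plus .bss
    ("04", 0x106A000, 0x1D2CE0),  # .data.percpu
]
AFTER = [
    ("00", 0x200000, 0xCA6000),
    ("01", 0xEA6000, 0x14DFE0),
    ("02", 0xFF4000, 0x888),
    ("03", 0xFF6000, 0x73086),    # .init.* sections only
    ("04", 0x106A000, 0x1D2CE0),  # .data.percpu
    ("05", 0x123D000, 0x7E6938),  # .bss, now in its own segment
]


def overlapping_segments(segments):
    """Return pairs of segment numbers whose physical ranges intersect."""
    pairs = []
    for i, (name_a, start_a, size_a) in enumerate(segments):
        for name_b, start_b, size_b in segments[i + 1:]:
            # Half-open ranges [start, start + size) intersect when each
            # one starts before the other ends.
            if start_a < start_b + size_b and start_b < start_a + size_a:
                pairs.append((name_a, name_b))
    return pairs


print(overlapping_segments(BEFORE))
print(overlapping_segments(AFTER))
```

With the "before" values, segment 03 spans 0xff6000 up to 0x1a23938 in memory, which fully contains segment 04 (.data.percpu); with the "after" values the ranges are disjoint.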
  5. 02 Feb, 2009 2 commits
  6. 31 Jan, 2009 6 commits
  7. 30 Jan, 2009 21 commits
    • x86/paravirt: fix missing callee-save call on pud_val · 4767afbf
      Jeremy Fitzhardinge authored
      Impact: Fix build when CONFIG_PARAVIRT_DEBUG is enabled
      
      Fix missed conversion to using callee-saved calls for pud_val, which
      causes a compile error when CONFIG_PARAVIRT_DEBUG is enabled.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
    • x86/paravirt: use callee-saved convention for pte_val/make_pte/etc · da5de7c2
      Jeremy Fitzhardinge authored
      Impact: Optimization
      
      In the native case, pte_val, make_pte, etc are all just identity
      functions, so there's no need to clobber a lot of registers over them.
      
      (This changes the 32-bit callee-save calling convention to return both
      EAX and EDX so functions can return 64-bit values.)
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86/paravirt: implement PVOP_CALL macros for callee-save functions · 791bad9d
      Jeremy Fitzhardinge authored
      Impact: Optimization
      
      Functions with the callee save calling convention clobber many fewer
      registers than the normal C calling convention.  Implement variants of
      PVOP_V?CALL* accordingly.  This only bothers with functions up to 3
      args, since functions with more args may as well use the normal
      calling convention.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86/paravirt: add register-saving thunks to reduce caller register pressure · ecb93d1c
      Jeremy Fitzhardinge authored
      Impact: Optimization
      
      One of the problems with inserting a pile of C calls where previously
      there were none is that the register pressure is greatly increased.
      The C calling convention says that the caller must expect a certain
      set of registers may be trashed by the callee, and that the callee can
      use those registers without restriction.  This includes the function
      argument registers, and several others.
      
      This patch seeks to alleviate this pressure by introducing wrapper
      thunks that will do the register saving/restoring, so that the
      callsite doesn't need to worry about it, but the callee function can
      be conventional compiler-generated code.  In many cases (particularly
      performance-sensitive cases) the callee will be in assembler anyway,
      and need not use the compiler's calling convention.
      
      Standard calling convention is:
      	 arguments	    return	scratch
      x86-32	 eax edx ecx	    eax		?
      x86-64	 rdi rsi rdx rcx    rax		r8 r9 r10 r11
      
      The thunk preserves all argument and scratch registers.  The return
      register is not preserved, and is available as a scratch register for
      unwrapped callee code (and of course the return value).
      
      Wrapped function pointers are themselves wrapped in a struct
      paravirt_callee_save structure, in order to get some warning from the
      compiler when functions with mismatched calling conventions are used.
      
      The most common paravirt ops, both statically and dynamically, are
      interrupt enable/disable/save/restore, so handle them first.  This is
      particularly easy since their calls are handled specially anyway.
      
      XXX Deal with VMI.  What's their calling convention?
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
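The register-saving thunk described in the commit above can be modeled in a few lines. This is only a toy simulation under stated assumptions: registers are a Python dict, the register lists come from the x86-64 row of the table in the commit message, and `callee_save_thunk` and `sloppy_identity` are hypothetical names, not kernel symbols. The real thunks are assembler stubs.

```python
# Registers from the x86-64 row of the calling convention table.
ARG_REGS = ["rdi", "rsi", "rdx", "rcx"]
SCRATCH_REGS = ["r8", "r9", "r10", "r11"]
RETURN_REG = "rax"


def callee_save_thunk(func):
    """Wrap func so that only the return register changes across the call."""
    def thunk(regs):
        # Save every argument and scratch register before the call.
        saved = {name: regs[name] for name in ARG_REGS + SCRATCH_REGS}
        func(regs)          # the callee may trash any of them
        regs.update(saved)  # restore everything except RETURN_REG
    return thunk


def sloppy_identity(regs):
    """Hypothetical callee: returns its first argument and trashes registers."""
    regs[RETURN_REG] = regs["rdi"]
    for name in SCRATCH_REGS + ["rdi"]:
        regs[name] = 0xDEAD


regs = {name: i for i, name in enumerate([RETURN_REG] + ARG_REGS + SCRATCH_REGS, 1)}
callee_save_thunk(sloppy_identity)(regs)
print(regs)
```

From the call site's point of view only `rax` changed, so the caller does not have to spill the argument or scratch registers itself, which is the reduction in register pressure the commit is after.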
    • x86/paravirt: selectively save/restore regs around pvops calls · 9104a18d
      Jeremy Fitzhardinge authored
      Impact: Optimization
      
      Each asm paravirt-ops call says what registers are available for
      clobbering.  This patch makes use of this to selectively save/restore
      registers around each pvops call.  In many cases this significantly
      shrinks code size.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86: fix paravirt clobber in entry_64.S · b8aa287f
      Jeremy Fitzhardinge authored
      Impact: Fix latent bug
      
      The clobber is trying to say that anything except RDI is available for
      clobbering, but actually clobbers everything.  This hasn't mattered
      because the clobbers were basically ignored, but subsequent patches
      will rely on them.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • x86/pvops: add a paravirt_ident functions to allow special patching · 41edafdb
      Jeremy Fitzhardinge authored
      Impact: Optimization
      
      Several paravirt ops implementations simply return their arguments,
      the most obvious being the make_pte/pte_val class of operations on
      native.
      
      On 32-bit, the identity function is literally a no-op, as the calling
      convention uses the same registers for the first argument and return.
      On 64-bit, it can be implemented with a single "mov".
      
      This patch adds special identity functions for 32 and 64 bit argument,
      and machinery to recognize them and replace them with either nops or a
      mov as appropriate.
      
      At the moment, the only users for the identity functions are the
      pagetable entry conversion functions.
      
      The result is a measurable improvement on pagetable-heavy benchmarks
      (2-3%, reducing the pvops overhead from 5 to 2%).
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
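The patching idea in the commit above (replace a call to an identity function with nothing on 32-bit, or with a single "mov" on 64-bit) can be sketched as a function that produces replacement bytes for a call site. This is an illustration only: `patch_ident` and the 5-byte call site length are assumptions, the sketch pads with single-byte nops for simplicity, and it does not reproduce the kernel's actual patching machinery.

```python
NOP = b"\x90"                  # single-byte x86 nop
MOV_RDI_RAX = b"\x48\x89\xf8"  # x86-64 encoding of: mov %rdi, %rax


def patch_ident(call_site_len, is_64bit):
    """Return replacement bytes for a call to an identity function.

    On 32-bit the first argument and the return value share a register,
    so the call becomes nops only. On 64-bit the argument arrives in rdi
    and must be copied to rax, which takes one mov.
    """
    body = MOV_RDI_RAX if is_64bit else b""
    if len(body) > call_site_len:
        raise ValueError("call site too small to patch in place")
    # Pad with nops so the patched site keeps its original length.
    return body + NOP * (call_site_len - len(body))


print(patch_ident(5, is_64bit=True).hex())
print(patch_ident(5, is_64bit=False).hex())
```

Keeping the replacement the same length as the original call site is what lets the patch be applied in place without moving surrounding code.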
    • xen: move remaining mmu-related stuff into mmu.c · 319f3ba5
      Jeremy Fitzhardinge authored
      Impact: Cleanup
      
      Move remaining mmu-related stuff into mmu.c.
      A general cleanup, and lay the groundwork for later patches.
      Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
    • H. Peter Anvin's avatar
      9b7ed8fa
    • linker script: use separate simpler definition for PERCPU() · 3ac6cffe
      Tejun Heo authored
      Impact: fix linker screwup on x86_32
      
      Recent x86_64 zerobased patches introduced PERCPU_VADDR() to put
      .data.percpu to a predefined address and re-defined PERCPU() in terms
      of it.  The new macro defined one extra symbol, __per_cpu_load, for
      LMA of the section so that the init data could be accessed.  This new
      symbol introduced the following problems to x86_32.
      
      1. If __per_cpu_load is defined outside of .data.percpu as an absolute
         symbol, relocation generation for relocatable kernel fails due to
         absolute relocation.
      
      2. If __per_cpu_load is put inside .data.percpu with absolute address
         assignment to work around #1, linker gets confused and under
         certain configurations ends up relocating the symbol against
         .data.percpu such that the load address gets added on top of
         already set load address.
      
      As x86_32 doesn't use predefined address for .data.percpu, there's no
      need for it to care about the possibility of __per_cpu_load being
      different from __per_cpu_start.
      
      This patch defines PERCPU() separately so that __per_cpu_load is
      defined inside .data.percpu so that everything is ordinary
      linking-wise.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • Allow opportunistic merging of VM_CAN_NONLINEAR areas · 33bfad54
      Linus Torvalds authored
      Commit de33c8db ("Fix OOPS in
      mmap_region() when merging adjacent VM_LOCKED file segments") unified
      the vma merging of anonymous and file maps to just one place, which
      simplified the code and fixed a use-after-free bug that could cause an
      oops.
      
      But by doing the merge opportunistically before even having called
      ->mmap() on the file method, it now compares two different 'vm_flags'
      values: the pre-mmap() value of the new not-yet-formed vma, and previous
      mappings of the same file around it.
      
      And in doing so, it refused to merge the common file case, which adds a
      marker to say "I can be made non-linear".
      
      This fixes it by just adding a set of flags that don't have to match,
      because we know they are ok to merge.  Currently it's only that single
      VM_CAN_NONLINEAR flag, but at least conceptually there could be others
      in the future.
      Reported-and-acked-by: Hugh Dickins <hugh@veritas.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg KH <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
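The "set of flags that don't have to match" from the commit above amounts to masking those bits out before comparing two `vm_flags` values. A minimal sketch follows; the numeric flag values and the names `VM_MERGEABLE_FLAGS` and `flags_allow_merge` are illustrative assumptions here, not quoted from the patch.

```python
# Illustrative flag bits; treat the exact values as assumptions.
VM_READ = 0x1
VM_WRITE = 0x2
VM_LOCKED = 0x2000
VM_CAN_NONLINEAR = 0x08000000

# Flags that may differ between two vmas without preventing a merge.
VM_MERGEABLE_FLAGS = VM_CAN_NONLINEAR


def flags_allow_merge(existing_flags, new_flags):
    """True if the two flag words differ only in mergeable bits."""
    return ((existing_flags ^ new_flags) & ~VM_MERGEABLE_FLAGS) == 0


# An existing file mapping already carries VM_CAN_NONLINEAR (set by ->mmap()),
# while the not-yet-formed vma still has its pre-mmap() flags.
existing = VM_READ | VM_WRITE | VM_CAN_NONLINEAR
print(flags_allow_merge(existing, VM_READ | VM_WRITE))
print(flags_allow_merge(existing, VM_READ | VM_WRITE | VM_LOCKED))
```

The XOR isolates the bits that differ, and the mask discards the differences that are known to be harmless, so a mismatch in any other flag (such as VM_LOCKED) still blocks the merge.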
    • Merge branch 'linus' into core/percpu · c43e0e46
      Ingo Molnar authored
      Conflicts:
      	kernel/irq/handle.c
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · c01a25e7
      Linus Torvalds authored
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: Remove bogus BUG() check in ext4_bmap()
        ext4: Fix building with EXT4FS_DEBUG
        ext4: Initialize the new group descriptor when resizing the filesystem
        ext4: Fix ext4_free_blocks() w/o a journal when files have indirect blocks
        jbd2: On a __journal_expect() assertion failure printk "JBD2", not "EXT3-fs"
        ext3: Add sanity check to make_indexed_dir
        ext4: Add sanity check to make_indexed_dir
        ext4: only use i_size_high for regular files
        ext4: fix wrong use of do_div
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block · ae704e9f
      Linus Torvalds authored
      * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
        cfq-iosched: Allow RT requests to pre-empt ongoing BE timeslice
        block: add sysfs file for controlling io stats accounting
        Mark mandatory elevator functions in the biodoc.txt
        include/linux: Add bsg.h to the Kernel exported headers
        block: silently error an unsupported barrier bio
        block: Fix documentation for blkdev_issue_flush()
        block: add bio_rw_flagged() for testing bio->bi_rw
        block: seperate bio/request unplug and sync bits
        block: export SSD/non-rotational queue flag through sysfs
        Fix small typo in bio.h's documentation
        block: get rid of the manual directory counting in blktrace
        block: Allow empty integrity profile
        block: Remove obsolete BUG_ON
        block: Don't verify integrity metadata on read error
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 · dbeb1701
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (29 commits)
        tulip: fix 21142 with 10Mbps without negotiation
        drivers/net/skfp: if !capable(CAP_NET_ADMIN): inverted logic
        gianfar: Fix Wake-on-LAN support
        smsc911x: timeout reaches -1
        smsc9420: fix interrupt signalling test failures
        ucc_geth: Change uec phy id to the same format as gianfar's
        wimax: fix build issue when debugfs is disabled
        netxen: fix memory leak in drivers/net/netxen_nic_init.c
        tun: Add some missing TUN compat ioctl translations.
        ipv4: fix infinite retry loop in IP-Config
        net: update documentation ip aliases
        net: Fix OOPS in skb_seq_read().
        net: Fix frag_list handling in skb_seq_read
        netxen: revert jumbo ringsize
        ath5k: fix locking in ath5k_config
        cfg80211: print correct intersected regulatory domain
        cfg80211: Fix sanity check on 5 GHz when processing country IE
        iwlwifi: fix kernel oops when ucode DMA memory allocation failure
        rtl8187: Fix error in setting OFDM power settings for RTL8187L
        mac80211: remove Michael Wu as maintainer
        ...
    • Add enable_ms to jsm driver · 0461ec5b
      Paul Larson authored
      This fixes a crash observed when the nonexistent enable_ms function is
      called for the jsm driver.
      Signed-off-by: Scott Kilau <Scott.Kilau@digi.com>
      Signed-off-by: Paul Larson <pl@linux.vnet.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • cfq-iosched: Allow RT requests to pre-empt ongoing BE timeslice · 3a9a3f6c
      Divyesh Shah authored
      This patch adds the ability to pre-empt an ongoing BE timeslice when a RT
      request is waiting for the current timeslice to complete. This reduces the
      wait time to disk for RT requests from an upper bound of 4 (current value
      of cfq_quantum) to 1 disk request.
      
      Applied Jens' suggested changes to avoid the rb lookup and use
      !cfq_class_rt(), and retested.
      
      Latency(secs) for the RT task when doing sequential reads from 10G file.
                             | only RT | RT + BE | RT + BE + this patch
      small (512 byte) reads | 143     | 163     | 145
      large (1Mb) reads      | 142     | 158     | 146
      Signed-off-by: Divyesh Shah <dpshah@google.com>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
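The preemption rule in the commit above reduces to a small predicate: expire the active timeslice early when it does not belong to an RT queue and an RT request is waiting. The sketch below is a simplified model, not the cfq-iosched code; the class constants and `should_preempt` are hypothetical names.

```python
# I/O scheduling classes, modeled as plain strings for illustration.
RT, BE, IDLE = "rt", "be", "idle"


def should_preempt(active_class, waiting_classes):
    """True if the active timeslice should be cut short for a waiting RT request.

    Mirrors the idea of testing "not RT" on the active queue: an RT slice
    is never preempted by this rule, and non-RT slices yield only to RT.
    """
    return active_class != RT and RT in waiting_classes


print(should_preempt(BE, [BE, RT]))
print(should_preempt(RT, [RT]))
print(should_preempt(BE, [BE, IDLE]))
```

Under this rule an RT request waits for at most the request already dispatched rather than for the rest of the BE timeslice, which is consistent with the latency figures in the table above moving back toward the "only RT" column.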
    • block: add sysfs file for controlling io stats accounting · bc58ba94
      Jens Axboe authored
      This allows us to turn off disk stat accounting completely, for the cases
      where the 0.5-1% reduction in system time is important.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • Mark mandatory elevator functions in the biodoc.txt · 7598909e
      Nikanth Karthikesan authored
      biodoc.txt mentions that elevator functions marked with * are mandatory, but
      no function is marked with *. Mark the 3 functions which should be
      implemented by any io scheduler.
      Signed-off-by: Nikanth Karthikesan <knikanth@suse.de>
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • include/linux: Add bsg.h to the Kernel exported headers · a229fc61
      Boaz Harrosh authored
      bsg.h in current form is perfectly suitable for user-mode
      consumption. It is needed together with scsi/sg.h for applications
      that want to interface with the bsg driver.
      
      Currently the few projects that use it copy it into their own
      trees, but that is not acceptable for projects that need to
      provide source and devel packages for distros.

      This should also be submitted to stable 2.6.28 and 2.6.27, since bsg
      has had a stable API since those kernels and distro users will need
      the header for those kernels as well.
      Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
      Acked-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
      CC: stable@kernel.org
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
    • block: silently error an unsupported barrier bio · cec0707e
      Jens Axboe authored
      This fixes a "regression" from 2.6.28, where the barrier probes that file
      systems may do would trigger additional end request warnings in dmesg.
      Signed-off-by: Jens Axboe <jens.axboe@oracle.com>