1. 03 Dec, 2010 2 commits
    • Merge branch 'drm-radeon-next' of ../drm-radeon-next into drm-core-next · 7e76c5cf
      Dave Airlie authored
      * 'drm-radeon-next' of ../drm-radeon-next:
        drm/radeon/kms: improve pflip precision on r1xx-r4xx
        drm/kms/radeon: Use high precision timestamps for pageflip completion events.
        drm/kms/radeon: Reorder vblank and pageflip interrupt handling.
        drm/radeon/kms: add pageflip ioctl support (v3)
        drm/kms/radeon: Add support for precise vblank timestamping.
    • Merge branch 'drm-ttm-next' into drm-core-next · a9979d60
      Dave Airlie authored
      * drm-ttm-next:
        drm/radeon: Use the ttm execbuf utilities
        drm/ttm: Fix up io_mem_reserve / io_mem_free calling
        drm/ttm/vmwgfx: Have TTM manage the validation sequence.
        drm/ttm: Improved fencing of buffer object lists
        drm/ttm/radeon/nouveau: Kill the bo lock in favour of a bo device fence_lock
        drm/ttm: Don't deadlock on recursive multi-bo reservations
        drm/ttm: Optimize ttm_eu_backoff_reservation
        drm/ttm: Use kref_sub instead of repeatedly calling kref_put
        kref: Add a kref_sub function
        drm/ttm: Add a bo list reserve fastpath (v2)
  2. 26 Nov, 2010 1 commit
  3. 22 Nov, 2010 15 commits
    • drm/radeon: Use the ttm execbuf utilities · 147666fb
      Thomas Hellstrom authored
      Rather than re-implementing them in the Radeon driver,
      use the execbuf / cs / pushbuf utilities that come with TTM.
      This comes with an even greater benefit now that many spinlocks have been
      optimized away...
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm: Fix up io_mem_reserve / io_mem_free calling · eba67093
      Thomas Hellstrom authored
      This patch attempts to fix up shortcomings with the current calling
      sequences.
      
      1) There's a fastpath where no locking occurs and only io_mem_reserve is
         called to obtain needed info for mapping. The fastpath is set per
         memory type manager.
      2) If the fastpath is disabled, io_mem_reserve and io_mem_free will be exactly
         balanced and not called recursively for the same struct ttm_mem_reg.
      3) Optionally the driver can choose to enable a per memory type manager LRU
         eviction mechanism that, when io_mem_reserve returns -EAGAIN, will attempt
         to kill user-space mappings of memory in that manager to free up needed
         resources.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm/vmwgfx: Have TTM manage the validation sequence. · 65705962
      Thomas Hellstrom authored
      Rather than having the driver supply the validation sequence, leave that
      responsibility to TTM. This saves some confusion and a function argument.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm: Improved fencing of buffer object lists · 95762c2b
      Thomas Hellstrom authored
      Drastically reduce the number of spin lock / unlock operations by performing
      unreserving and fencing under global locks.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Jerome Glisse <j.glisse@redhat.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm/radeon/nouveau: Kill the bo lock in favour of a bo device fence_lock · 702adba2
      Thomas Hellstrom authored
      The bo lock was used only to protect the bo sync object members, and since it
      is a per-bo lock, fencing a buffer list will see a lot of locks and unlocks.
      Replace it with a per-device lock that protects the sync object members on
      *all* bos. Reading and setting these members will always be very quick, so
      the risk of heavy lock contention is microscopic. Note that waiting for
      sync objects will always take place outside of this lock.
      
      The bo device fence lock will eventually be replaced with a seqlock /
      rcu mechanism so we can determine that a bo is idle under an
      rcu / read seqlock.
      
      However this change will allow us to batch fencing and unreserving of
      buffers with a minimal amount of locking.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm: Don't deadlock on recursive multi-bo reservations · 96726fe5
      Thomas Hellstrom authored
      Add an aid for the driver to detect deadlocks on multi-bo reservations.
      Update documentation.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm: Optimize ttm_eu_backoff_reservation · 68c4fa31
      Thomas Hellstrom authored
      Avoid the ttm_bo_unreserve() spinlocks by calling
      ttm_eu_backoff_reservation_locked under the lru spinlock.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • kref: Add a kref_sub function · ecf7ace9
      Thomas Hellstrom authored
      Makes it possible to optimize batched multiple unrefs.
      Initial user will be drivers/gpu/ttm which accumulates unrefs to be
      processed outside of atomic code.
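
      The idea of dropping several references in one step can be sketched in
      user space with C11 atomics. This is only an analogue under stated
      assumptions, not the kernel's kref code: struct ref, ref_init(),
      ref_get(), ref_sub() and the counting release callback are hypothetical
      names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel reference counter. */
struct ref {
    atomic_uint count;
};

/* Number of times the demo release callback has run. */
int released_count;

/* Demo release callback: only records that the object was released. */
void count_release(struct ref *r)
{
    (void)r;
    released_count++;
}

/* A new object starts with one reference held by its creator. */
void ref_init(struct ref *r)
{
    atomic_init(&r->count, 1);
}

/* Take one additional reference. */
void ref_get(struct ref *r)
{
    atomic_fetch_add(&r->count, 1);
}

/*
 * Drop n references with a single atomic operation instead of n separate
 * puts. Calls release() and returns 1 if the count reached zero,
 * otherwise returns 0.
 */
int ref_sub(struct ref *r, unsigned int n, void (*release)(struct ref *))
{
    assert(release != NULL);
    unsigned int old = atomic_fetch_sub(&r->count, n);
    assert(old >= n); /* dropping more references than are held is a bug */
    if (old == n) {
        release(r);
        return 1;
    }
    return 0;
}
```

      A caller that has accumulated several pending unrefs while in atomic
      context can then apply them later with one ref_sub() call rather than
      looping over single puts.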
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/ttm: Add a bo list reserve fastpath (v2) · d6ea8886
      Dave Airlie authored
      Makes it possible to reserve a list of buffer objects with a single
      spin lock / unlock if there is no contention.
      Should improve cpu usage on SMP kernels.
      
      v2: Initialize private list members on reserve and don't call
      ttm_bo_list_ref_sub() with zero put_count.
      Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/kms/radeon: Reorder vblank and pageflip interrupt handling. · 3e4ea742
      Mario Kleiner authored
      In the vblank irq handler, calls to actual vblank handling,
      or at least drm_handle_vblank(), need to happen before
      calls to radeon_crtc_handle_flip().
      
      Reason: The high precision pageflip timestamping
      and some other pageflip optimizations will need the updated
      vblank count and timestamps for the current vblank interval.
      
      These are calculated in drm_handle_vblank(), therefore it
      must go first.
      Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/radeon/kms: add pageflip ioctl support (v3) · 6f34be50
      Alex Deucher authored
      This adds support for dri2 pageflipping.
      
      v2: precision updates from Mario Kleiner.
      v3: Multihead fixes from Mario Kleiner; missing crtc offset
          add note about update pending bit on pre-avivo chips
      Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
      Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/kms/radeon: Add support for precise vblank timestamping. · f5a80209
      Mario Kleiner authored
      This patch adds new functions for use by the drm core:
      
      .get_vblank_timestamp() provides a precise timestamp
      for the end of the most recent (or current) vblank
      interval of a given crtc, as needed for the DRI2
      implementation of the OML_sync_control extension.
      
      It is a thin wrapper around the drm function
      drm_calc_vbltimestamp_from_scanoutpos() which does
      almost all the work and is shared across drivers.
      
      .get_scanout_position() provides the current horizontal
      and vertical video scanout position and "in vblank"
      status of a given crtc, as needed by the drm for use by
      drm_calc_vbltimestamp_from_scanoutpos().
      
      The function is also used by the dynamic gpu reclocking
      code to determine when it is safe to reclock inside vblank.
      
      For that purpose radeon_pm_in_vbl() is modified to
      accommodate a small change in the function prototype of
      radeon_get_crtc_scanoutpos(), which is hooked up to
      .get_scanout_position().
      
      This code has been tested on AVIVO hardware, an RV530
      (ATI Mobility Radeon X1600) in an Intel Core-2 Duo MacBookPro
      and some R600 variant (FireGL V7600) in a single cpu
      AMD Athlon 64 PC.
      Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Reviewed-by: Alex Deucher <alexdeucher@gmail.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
    • drm/vblank: Add support for precise vblank timestamping. · 27641c3f
      Mario Kleiner authored
      The DRI2 swap & sync implementation needs precise
      vblank counts and precise timestamps corresponding
      to those vblank counts. For conformance to the OpenML
      OML_sync_control extension specification the DRM
      timestamp associated with a vblank count should
      correspond to the start of video scanout of the first
      scanline of the video frame following the vblank
      interval for that vblank count.
      
      Therefore we need to carry around precise timestamps
      for vblanks. Currently the DRM and KMS drivers generate
      timestamps ad-hoc via do_gettimeofday() in some
      places. The resulting timestamps are sometimes not
      very precise due to interrupt handling delays, they
      don't conform to OML_sync_control and some are wrong,
      as they aren't taken synchronized to the vblank.
      
      This patch implements support inside the drm core
      for precise and robust timestamping. It consists
      of the following interrelated pieces.
      
      1. Vblank timestamp caching:
      
      A per-crtc ringbuffer stores the most recent vblank
      timestamps corresponding to vblank counts.
      
      The ringbuffer can be read out lock-free via the
      accessor function:
      
      struct timeval timestamp;
      vblankcount = drm_vblank_count_and_time(dev, crtcid, &timestamp);
      
      The function returns the current vblank count and
      the corresponding timestamp for start of video
      scanout following the vblank interval. It can be
      used anywhere between enclosing drm_vblank_get(dev, crtcid)
      and drm_vblank_put(dev,crtcid) statements. It is used
      inside the drmWaitVblank ioctl and in the vblank event
      queueing and handling. It should be used by kms drivers for
      timestamping of bufferswap completion.
      
      The timestamp ringbuffer is reinitialized each time
      vblank irq's get reenabled in drm_vblank_get()/
      drm_update_vblank_count(). It is invalidated when
      vblank irq's get disabled.
      
      The ringbuffer is updated inside drm_handle_vblank()
      at each vblank irq.
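
      The timestamp caching described in piece 1 can be sketched in user
      space as follows. This is a minimal illustration with C11 atomics,
      assuming a single writer (standing in for the vblank irq handler) and a
      ring of two slots; struct vblank_ring, ring_init(), ring_store() and
      ring_read() are hypothetical names and not the drm core's functions,
      and timestamps are plain nanosecond counts rather than struct timeval.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 2 /* assumed size; must be at least 2 */

/* Hypothetical per-crtc cache of recent vblank timestamps. */
struct vblank_ring {
    atomic_uint count;                   /* current vblank count */
    _Atomic int64_t stamp_ns[RING_SIZE]; /* timestamp for count % RING_SIZE */
};

void ring_init(struct vblank_ring *r)
{
    atomic_init(&r->count, 0);
    for (size_t i = 0; i < RING_SIZE; i++)
        atomic_init(&r->stamp_ns[i], 0);
}

/*
 * Writer side (single writer only): store the timestamp for the next
 * vblank count first, then publish the new count.
 */
void ring_store(struct vblank_ring *r, int64_t ts_ns)
{
    unsigned int next = atomic_load(&r->count) + 1;
    atomic_store(&r->stamp_ns[next % RING_SIZE], ts_ns);
    atomic_store(&r->count, next);
}

/*
 * Reader side, no lock taken: read count, read the matching slot, then
 * re-read count and retry if a vblank arrived in between.
 * Returns the vblank count and stores its timestamp in *ts_ns.
 */
unsigned int ring_read(struct vblank_ring *r, int64_t *ts_ns)
{
    unsigned int before, after;

    assert(ts_ns != NULL);
    do {
        before = atomic_load(&r->count);
        *ts_ns = atomic_load(&r->stamp_ns[before % RING_SIZE]);
        after = atomic_load(&r->count);
    } while (before != after);
    return after;
}
```

      Because the writer fills the slot for the next count before publishing
      that count, a reader that sees the same count before and after its slot
      read is holding a matching count and timestamp pair.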
      
      2. Calculation of precise vblank timestamps:
      
      drm_get_last_vbltimestamp() is used to compute the
      timestamp for the end of the most recent vblank (if
      inside active scanout), or the expected end of the
      current vblank interval (if called inside a vblank
      interval). The function calls into a new optional kms
      driver entry point dev->driver->get_vblank_timestamp()
      which is supposed to provide the precise timestamp.
      If a kms driver doesn't implement the entry point or
      if the call fails, a simple do_gettimeofday() timestamp
      is returned as crude approximation of the true vblank time.
      
      A new drm module parameter drm.timestamp_precision_usec
      allows disabling high precision timestamps (if set to
      zero) or specifying the maximum acceptable error in
      the timestamps in microseconds.
      
      Kms drivers could implement their get_vblank_timestamp()
      function in a gpu specific way, as long as returned
      timestamps conform to OML_sync_control, e.g., by use
      of gpu specific hardware timestamps.
      
      Optionally, kms drivers can simply wrap and use the new
      utility function drm_calc_vbltimestamp_from_scanoutpos().
      This function calls a new optional kms driver function
      dev->driver->get_scanout_position() which returns the
      current horizontal and vertical video scanout position
      of the crtc. The scanout position together with the
      drm_display_timing of the current video mode is used
      to calculate elapsed time relative to start of active scanout
      for the current video frame. This elapsed time is subtracted
      from the current do_gettimeofday() time to get the timestamp
      corresponding to start of video scanout. Currently
      non-interlaced, non-doublescan video modes, with or
      without panel scaling are handled correctly. Interlaced/
      doublescan modes are tbd in a future patch.
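
      The arithmetic behind the scanout-position approach can be sketched as
      below for a progressive (non-interlaced, non-doublescan) mode. It is an
      illustration under assumptions, not the code of
      drm_calc_vbltimestamp_from_scanoutpos(): struct mode_timing and both
      function names are hypothetical, times are nanosecond counts, and vpos
      is assumed to be negative inside vblank (lines remaining until active
      scanout starts), so the result then lies in the future.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of a progressive video mode. */
struct mode_timing {
    int32_t htotal;          /* pixels per scanline, including blanking */
    int32_t pixel_clock_khz; /* pixel clock in kHz */
};

/*
 * Nanoseconds elapsed since the start of active scanout of the current
 * frame, given the current scanout position. A negative vpos (assumed
 * convention for "inside vblank") yields a negative result.
 */
int64_t scanout_elapsed_ns(const struct mode_timing *m,
                           int32_t vpos, int32_t hpos)
{
    assert(m != NULL && m->htotal > 0 && m->pixel_clock_khz > 0);

    /* Pixels scanned out since the first pixel of the first scanline. */
    int64_t pixels = (int64_t)vpos * m->htotal + hpos;

    /* One pixel lasts 1e6 / pixel_clock_khz nanoseconds. */
    return pixels * 1000000 / m->pixel_clock_khz;
}

/* Timestamp of the start of scanout: current time minus elapsed time. */
int64_t vblank_timestamp_ns(int64_t now_ns, const struct mode_timing *m,
                            int32_t vpos, int32_t hpos)
{
    return now_ns - scanout_elapsed_ns(m, vpos, hpos);
}
```

      For example, with htotal = 2000 and a 100 MHz pixel clock, one scanline
      lasts 20 microseconds, so a position ten lines into active scanout
      places the start of scanout 200 microseconds before the current time.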
      
      3. Filtering of redundant vblank irq's and removal of
      some race-conditions in the vblank irq enable/disable path:
      
      Some gpu's (e.g., Radeon R500/R600) send spurious vblank
      irq's outside the vblank if vblank irq's get reenabled.
      These get detected by use of the vblank timestamps and
      filtered out to avoid miscounting of vblanks.
      
      Some race-conditions between the vblank irq enable/disable
      functions, the vblank irq handler and the gpu itself (updating
      its hardware vblank counter in the "wrong" moment) are
      fixed inside vblank_disable_and_save() and
      drm_update_vblank_count() by use of the vblank timestamps and
      a new spinlock dev->vblank_time_lock.
      
      The time until vblank irq disable is now configurable via
      a new drm module parameter drm.vblankoffdelay to allow
      experimentation with timeouts that are much shorter than
      the current 5 seconds and should allow longer vblank off
      periods for better power savings.
      
      Followup patches will use these new functions to
      implement precise timestamping for the intel and radeon
      kms drivers.
      Signed-off-by: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
  4. 21 Nov, 2010 1 commit
  5. 20 Nov, 2010 3 commits
    • Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · b86db474
      Linus Torvalds authored
      * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard
        fs: Do not dispatch FITRIM through separate super_operation
        ext4: ext4_fill_super shouldn't return 0 on corruption
        jbd2: fix /proc/fs/jbd2/<dev> when using an external journal
        ext4: missing unlock in ext4_clear_request_list()
        ext4: fix setting random pages PageUptodate
    • ext4: Add EXT4_IOC_TRIM ioctl to handle batched discard · e681c047
      Lukas Czerner authored
      A filesystem-independent ioctl was rejected as not common enough to be in
      core vfs ioctl. Since we still need access to this functionality, this
      commit adds an ext4-specific ioctl, EXT4_IOC_TRIM, to dispatch
      ext4_trim_fs().
      
      It takes an fstrim_range structure as an argument. fstrim_range is defined in
      include/linux/fs.h and its definition is as follows.
      
      struct fstrim_range {
      	__u64 start;
      	__u64 len;
      	__u64 minlen;
      };
      
      start	- first Byte to trim
      len	- number of Bytes to trim from start
      minlen	- minimum extent length to trim, free extents shorter than this
        number of Bytes will be ignored. This will be rounded up to fs
        block size.
      
      After the FITRIM is done, the number of actually discarded Bytes is stored
      in fstrim_range.len to give the user better insight on how much storage
      space has been really released for wear-leveling.
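
      A user-space caller might fill in and submit the structure roughly as
      sketched below. This assumes a Linux system whose <linux/fs.h> provides
      struct fstrim_range and the FITRIM request number, and it issues FITRIM
      rather than the EXT4_IOC_TRIM name used in this message; the helpers
      make_trim_range() and trim_mounted_fs() are hypothetical names for
      illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h> /* struct fstrim_range, FITRIM */

/* Build a trim request: start offset, length and minimum extent, in bytes. */
struct fstrim_range make_trim_range(uint64_t start, uint64_t len,
                                    uint64_t minlen)
{
    struct fstrim_range range;

    memset(&range, 0, sizeof(range));
    range.start = start;
    range.len = len;
    range.minlen = minlen;
    return range;
}

/*
 * Ask the filesystem mounted at mount_point to discard free space inside
 * the given range. Returns 0 on success and -1 on failure with errno set.
 * On success, range->len holds the number of bytes actually discarded.
 */
int trim_mounted_fs(const char *mount_point, struct fstrim_range *range)
{
    assert(mount_point != NULL && range != NULL);

    int fd = open(mount_point, O_RDONLY);
    if (fd < 0)
        return -1;

    int ret = ioctl(fd, FITRIM, range);
    int saved_errno = errno; /* keep the ioctl error across close() */

    close(fd);
    errno = saved_errno;
    return ret < 0 ? -1 : 0;
}
```

      Passing start = 0 and len = UINT64_MAX requests a trim of the whole
      filesystem; the call usually needs elevated privileges and fails on
      filesystems or devices that do not support discard.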
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    • fs: Do not dispatch FITRIM through separate super_operation · 93bb41f4
      Lukas Czerner authored
      There was concern that the FITRIM ioctl is not common enough to be included
      in core vfs ioctl. As Christoph Hellwig pointed out, there's no real point
      in dispatching this out to a separate vector instead of just through
      ->ioctl.
      
      So this commit removes ioctl_fstrim() from vfs ioctl and trim_fs
      from super_operation structure.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  6. 19 Nov, 2010 15 commits
  7. 18 Nov, 2010 3 commits