Merge tag 'vmwgfx-next-2014-01-17' of git://people.freedesktop.org/~thomash/linux into drm-next

Pull request of 2014-01-17 Pull request for 3.14. One not so urgent fix, One huge device update. The pull request corresponds to the patches sent out on dri-devel, except: [PATCH 02/33], review tag typo pointed out by Matt Turner. [PATCH 04/33], dropped. The new surface formats are never used. The upcoming vmware svga2 hardware version 11 will introduce the concept of "guest backed objects" or -resources. The device will in principle get all of its memory from the guest, which has big advantages from the device point of view. This means that vmwgfx contexts, shaders and surfaces need to be backed by guest memory in the form of buffer objects called MOBs, presumably short for MemoryOBjects, which are bound to the device in a special way. This patch series introduces guest backed object support. Some new IOCTLs are added to allocate these new guest backed object, and to optionally provide them with a backing MOB. There is an update to the gallium driver that comes with this update, and it will be pushed in the near timeframe presumably to a separate mesa branch before merged to master. * tag 'vmwgfx-next-2014-01-17' of git://people.freedesktop.org/~thomash/linux: (33 commits) drm/vmwgfx: Invalidate surface on non-readback unbind drm/vmwgfx: Silence the device command verifier drm/vmwgfx: Implement 64-bit Otable- and MOB binding v2 drm/vmwgfx: Fix surface framebuffer check for guest-backed surfaces drm/vmwgfx: Update otable definitions drm/vmwgfx: Use the linux DMA api also for MOBs drm/vmwgfx: Ditch the vmw_dummy_query_bo_prepare function drm/vmwgfx: Persistent tracking of context bindings drm/vmwgfx: Track context bindings and scrub them upon exiting execbuf drm/vmwgfx: Block the BIND_SHADERCONSTS command drm/vmwgfx: Add a parameter to get max MOB memory size drm/vmwgfx: Implement a buffer object synccpu ioctl. drm/vmwgfx: Make sure that the multisampling is off drm/vmwgfx: Extend the command verifier to handle guest-backed on / off drm/vmwgfx: Fix up the vmwgfx_drv.h header for new files drm/vmwgfx: Enable 3D for new hardware version drm/vmwgfx: Add new unused (by user-space) commands to the verifier drm/vmwgfx: Validate guest-backed shader const commands drm/vmwgfx: Add guest-backed shaders drm/vmwgfx: Hook up guest-backed surfaces ...

Merge tag 'vmwgfx-next-2014-01-17' of git://people.freedesktop.org/~thomash/linux into drm-next
Pull request of 2014-01-17 Pull request for 3.14. One not so urgent fix, One huge device update. The pull request corresponds to the patches sent out on dri-devel, except: [PATCH 02/33], review tag typo pointed out by Matt Turner. [PATCH 04/33], dropped. The new surface formats are never used. The upcoming vmware svga2 hardware version 11 will introduce the concept of "guest backed objects" or -resources. The device will in principle get all of its memory from the guest, which has big advantages from the device point of view. This means that vmwgfx contexts, shaders and surfaces need to be backed by guest memory in the form of buffer objects called MOBs, presumably short for MemoryOBjects, which are bound to the device in a special way. This patch series introduces guest backed object support. Some new IOCTLs are added to allocate these new guest backed object, and to optionally provide them with a backing MOB. There is an update to the gallium driver that comes with this update, and it will be pushed in the near timeframe presumably to a separate mesa branch before merged to master. * tag 'vmwgfx-next-2014-01-17' of git://people.freedesktop.org/~thomash/linux: (33 commits) drm/vmwgfx: Invalidate surface on non-readback unbind drm/vmwgfx: Silence the device command verifier drm/vmwgfx: Implement 64-bit Otable- and MOB binding v2 drm/vmwgfx: Fix surface framebuffer check for guest-backed surfaces drm/vmwgfx: Update otable definitions drm/vmwgfx: Use the linux DMA api also for MOBs drm/vmwgfx: Ditch the vmw_dummy_query_bo_prepare function drm/vmwgfx: Persistent tracking of context bindings drm/vmwgfx: Track context bindings and scrub them upon exiting execbuf drm/vmwgfx: Block the BIND_SHADERCONSTS command drm/vmwgfx: Add a parameter to get max MOB memory size drm/vmwgfx: Implement a buffer object synccpu ioctl. drm/vmwgfx: Make sure that the multisampling is off drm/vmwgfx: Extend the command verifier to handle guest-backed on / off drm/vmwgfx: Fix up the vmwgfx_drv.h header for new files drm/vmwgfx: Enable 3D for new hardware version drm/vmwgfx: Add new unused (by user-space) commands to the verifier drm/vmwgfx: Validate guest-backed shader const commands drm/vmwgfx: Add guest-backed shaders drm/vmwgfx: Hook up guest-backed surfaces ...
9354eafd · Dave Airlie · 53dac830 · 1985f999 · 9354eafd · 9354eafd
Commit 9354eafd authored Jan 20, 2014 by Dave Airlie
19 changed files
--- a/drivers/gpu/drm/vmwgfx/Makefile
+++ b/drivers/gpu/drm/vmwgfx/Makefile
@@ -6,6 +6,6 @@ vmwgfx-y := vmwgfx_execbuf.o vmwgfx_gmr.o vmwgfx_kms.o vmwgfx_drv.o \
 	    vmwgfx_fifo.o vmwgfx_irq.o vmwgfx_ldu.o vmwgfx_ttm_glue.o \
 	    vmwgfx_overlay.o vmwgfx_marker.o vmwgfx_gmrid_manager.o \
 	    vmwgfx_fence.o vmwgfx_dmabuf.o vmwgfx_scrn.o vmwgfx_context.o \
-	    vmwgfx_surface.o vmwgfx_prime.o
+	    vmwgfx_surface.o vmwgfx_prime.o vmwgfx_mob.o vmwgfx_shader.o

 obj-$(CONFIG_DRM_VMWGFX) := vmwgfx.o
--- a/drivers/gpu/drm/vmwgfx/svga3d_reg.h
+++ b/drivers/gpu/drm/vmwgfx/svga3d_reg.h
--- a/drivers/gpu/drm/vmwgfx/svga_reg.h
+++ b/drivers/gpu/drm/vmwgfx/svga_reg.h
@@ -169,7 +169,10 @@ enum {
   SVGA_REG_TRACES = 45,            /* Enable trace-based updates even when FIFO is on */
   SVGA_REG_GMRS_MAX_PAGES = 46,    /* Maximum number of 4KB pages for all GMRs */
   SVGA_REG_MEMORY_SIZE = 47,       /* Total dedicated device memory excluding FIFO */
-   SVGA_REG_TOP = 48,               /* Must be 1 more than the last register */
+   SVGA_REG_MAX_PRIMARY_BOUNDING_BOX_MEM = 50,   /* Max primary memory */
+   SVGA_REG_SUGGESTED_GBOBJECT_MEM_SIZE_KB = 51, /* Suggested limit on mob mem */
+   SVGA_REG_DEV_CAP = 52,           /* Write dev cap index, read value */
+   SVGA_REG_TOP = 53,               /* Must be 1 more than the last register */

   SVGA_PALETTE_BASE = 1024,        /* Base of SVGA color map */
   /* Next 768 (== 256*3) registers exist for colormap */
@@ -431,7 +434,10 @@ struct SVGASignedPoint {
 #define SVGA_CAP_TRACES             0x00200000
 #define SVGA_CAP_GMR2               0x00400000
 #define SVGA_CAP_SCREEN_OBJECT_2    0x00800000
-
+#define SVGA_CAP_COMMAND_BUFFERS    0x01000000
+#define SVGA_CAP_DEAD1              0x02000000
+#define SVGA_CAP_CMD_BUFFERS_2      0x04000000
+#define SVGA_CAP_GBOBJECTS          0x08000000

 /*
 * FIFO register indices.

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c
@@ -40,6 +40,10 @@ static uint32_t vram_ne_placement_flags = TTM_PL_FLAG_VRAM |
 static uint32_t sys_placement_flags = TTM_PL_FLAG_SYSTEM |
 	TTM_PL_FLAG_CACHED;

+static uint32_t sys_ne_placement_flags = TTM_PL_FLAG_SYSTEM |
+	TTM_PL_FLAG_CACHED |
+	TTM_PL_FLAG_NO_EVICT;
+
 static uint32_t gmr_placement_flags = VMW_PL_FLAG_GMR |
 	TTM_PL_FLAG_CACHED;

@@ -47,6 +51,9 @@ static uint32_t gmr_ne_placement_flags = VMW_PL_FLAG_GMR |
 	TTM_PL_FLAG_CACHED |
 	TTM_PL_FLAG_NO_EVICT;

+static uint32_t mob_placement_flags = VMW_PL_FLAG_MOB |
+	TTM_PL_FLAG_CACHED;
+
 struct ttm_placement vmw_vram_placement = {
 	.fpfn = 0,
 	.lpfn = 0,
@@ -116,16 +123,26 @@ struct ttm_placement vmw_sys_placement = {
 	.busy_placement = &sys_placement_flags
 };

+struct ttm_placement vmw_sys_ne_placement = {
+	.fpfn = 0,
+	.lpfn = 0,
+	.num_placement = 1,
+	.placement = &sys_ne_placement_flags,
+	.num_busy_placement = 1,
+	.busy_placement = &sys_ne_placement_flags
+};
+
 static uint32_t evictable_placement_flags[] = {
 	TTM_PL_FLAG_SYSTEM | TTM_PL_FLAG_CACHED,
 	TTM_PL_FLAG_VRAM | TTM_PL_FLAG_CACHED,
-	VMW_PL_FLAG_GMR | TTM_PL_FLAG_CACHED
+	VMW_PL_FLAG_GMR | TTM_PL_FLAG_CACHED,
+	VMW_PL_FLAG_MOB | TTM_PL_FLAG_CACHED
 };

 struct ttm_placement vmw_evictable_placement = {
 	.fpfn = 0,
 	.lpfn = 0,
-	.num_placement = 3,
+	.num_placement = 4,
 	.placement = evictable_placement_flags,
 	.num_busy_placement = 1,
 	.busy_placement = &sys_placement_flags
@@ -140,10 +157,21 @@ struct ttm_placement vmw_srf_placement = {
 	.busy_placement = gmr_vram_placement_flags
 };

+struct ttm_placement vmw_mob_placement = {
+	.fpfn = 0,
+	.lpfn = 0,
+	.num_placement = 1,
+	.num_busy_placement = 1,
+	.placement = &mob_placement_flags,
+	.busy_placement = &mob_placement_flags
+};
+
 struct vmw_ttm_tt {
 	struct ttm_dma_tt dma_ttm;
 	struct vmw_private *dev_priv;
 	int gmr_id;
+	struct vmw_mob *mob;
+	int mem_type;
 	struct sg_table sgt;
 	struct vmw_sg_table vsgt;
 	uint64_t sg_alloc_size;
@@ -244,6 +272,7 @@ void vmw_piter_start(struct vmw_piter *viter, const struct vmw_sg_table *vsgt,
 		viter->dma_address = &__vmw_piter_dma_addr;
 		viter->page = &__vmw_piter_non_sg_page;
 		viter->addrs = vsgt->addrs;
+		viter->pages = vsgt->pages;
 		break;
 	case vmw_dma_map_populate:
 	case vmw_dma_map_bind:
@@ -424,6 +453,63 @@ static void vmw_ttm_unmap_dma(struct vmw_ttm_tt *vmw_tt)
 	vmw_tt->mapped = false;
 }

+
+/**
+ * vmw_bo_map_dma - Make sure buffer object pages are visible to the device
+ *
+ * @bo: Pointer to a struct ttm_buffer_object
+ *
+ * Wrapper around vmw_ttm_map_dma, that takes a TTM buffer object pointer
+ * instead of a pointer to a struct vmw_ttm_backend as argument.
+ * Note that the buffer object must be either pinned or reserved before
+ * calling this function.
+ */
+int vmw_bo_map_dma(struct ttm_buffer_object *bo)
+{
+	struct vmw_ttm_tt *vmw_tt =
+		container_of(bo->ttm, struct vmw_ttm_tt, dma_ttm.ttm);
+
+	return vmw_ttm_map_dma(vmw_tt);
+}
+
+
+/**
+ * vmw_bo_unmap_dma - Make sure buffer object pages are visible to the device
+ *
+ * @bo: Pointer to a struct ttm_buffer_object
+ *
+ * Wrapper around vmw_ttm_unmap_dma, that takes a TTM buffer object pointer
+ * instead of a pointer to a struct vmw_ttm_backend as argument.
+ */
+void vmw_bo_unmap_dma(struct ttm_buffer_object *bo)
+{
+	struct vmw_ttm_tt *vmw_tt =
+		container_of(bo->ttm, struct vmw_ttm_tt, dma_ttm.ttm);
+
+	vmw_ttm_unmap_dma(vmw_tt);
+}
+
+
+/**
+ * vmw_bo_sg_table - Return a struct vmw_sg_table object for a
+ * TTM buffer object
+ *
+ * @bo: Pointer to a struct ttm_buffer_object
+ *
+ * Returns a pointer to a struct vmw_sg_table object. The object should
+ * not be freed after use.
+ * Note that for the device addresses to be valid, the buffer object must
+ * either be reserved or pinned.
+ */
+const struct vmw_sg_table *vmw_bo_sg_table(struct ttm_buffer_object *bo)
+{
+	struct vmw_ttm_tt *vmw_tt =
+		container_of(bo->ttm, struct vmw_ttm_tt, dma_ttm.ttm);
+
+	return &vmw_tt->vsgt;
+}
+
+
 static int vmw_ttm_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem)
 {
 	struct vmw_ttm_tt *vmw_be =
@@ -435,9 +521,27 @@ static int vmw_ttm_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem)
 		return ret;

 	vmw_be->gmr_id = bo_mem->start;
+	vmw_be->mem_type = bo_mem->mem_type;
+
+	switch (bo_mem->mem_type) {
+	case VMW_PL_GMR:
+		return vmw_gmr_bind(vmw_be->dev_priv, &vmw_be->vsgt,
+				    ttm->num_pages, vmw_be->gmr_id);
+	case VMW_PL_MOB:
+		if (unlikely(vmw_be->mob == NULL)) {
+			vmw_be->mob =
+				vmw_mob_create(ttm->num_pages);
+			if (unlikely(vmw_be->mob == NULL))
+				return -ENOMEM;
+		}

-	return vmw_gmr_bind(vmw_be->dev_priv, &vmw_be->vsgt,
-			    ttm->num_pages, vmw_be->gmr_id);
+		return vmw_mob_bind(vmw_be->dev_priv, vmw_be->mob,
+				    &vmw_be->vsgt, ttm->num_pages,
+				    vmw_be->gmr_id);
+	default:
+		BUG();
+	}
+	return 0;
 }

 static int vmw_ttm_unbind(struct ttm_tt *ttm)
@@ -445,7 +549,16 @@ static int vmw_ttm_unbind(struct ttm_tt *ttm)
 	struct vmw_ttm_tt *vmw_be =
 		container_of(ttm, struct vmw_ttm_tt, dma_ttm.ttm);

-	vmw_gmr_unbind(vmw_be->dev_priv, vmw_be->gmr_id);
+	switch (vmw_be->mem_type) {
+	case VMW_PL_GMR:
+		vmw_gmr_unbind(vmw_be->dev_priv, vmw_be->gmr_id);
+		break;
+	case VMW_PL_MOB:
+		vmw_mob_unbind(vmw_be->dev_priv, vmw_be->mob);
+		break;
+	default:
+		BUG();
+	}

 	if (vmw_be->dev_priv->map_mode == vmw_dma_map_bind)
 		vmw_ttm_unmap_dma(vmw_be);
@@ -453,6 +566,7 @@ static int vmw_ttm_unbind(struct ttm_tt *ttm)
 	return 0;
 }

+
 static void vmw_ttm_destroy(struct ttm_tt *ttm)
 {
 	struct vmw_ttm_tt *vmw_be =
@@ -463,9 +577,14 @@ static void vmw_ttm_destroy(struct ttm_tt *ttm)
 		ttm_dma_tt_fini(&vmw_be->dma_ttm);
 	else
 		ttm_tt_fini(ttm);
+
+	if (vmw_be->mob)
+		vmw_mob_destroy(vmw_be->mob);
+
 	kfree(vmw_be);
 }

+
 static int vmw_ttm_populate(struct ttm_tt *ttm)
 {
 	struct vmw_ttm_tt *vmw_tt =
@@ -500,6 +619,12 @@ static void vmw_ttm_unpopulate(struct ttm_tt *ttm)
 	struct vmw_private *dev_priv = vmw_tt->dev_priv;
 	struct ttm_mem_global *glob = vmw_mem_glob(dev_priv);

+
+	if (vmw_tt->mob) {
+		vmw_mob_destroy(vmw_tt->mob);
+		vmw_tt->mob = NULL;
+	}
+
 	vmw_ttm_unmap_dma(vmw_tt);
 	if (dev_priv->map_mode == vmw_dma_alloc_coherent) {
 		size_t size =
@@ -530,6 +655,7 @@ static struct ttm_tt *vmw_ttm_tt_create(struct ttm_bo_device *bdev,

 	vmw_be->dma_ttm.ttm.func = &vmw_ttm_func;
 	vmw_be->dev_priv = container_of(bdev, struct vmw_private, bdev);
+	vmw_be->mob = NULL;

 	if (vmw_be->dev_priv->map_mode == vmw_dma_alloc_coherent)
 		ret = ttm_dma_tt_init(&vmw_be->dma_ttm, bdev, size, page_flags,
@@ -571,6 +697,7 @@ static int vmw_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 		man->default_caching = TTM_PL_FLAG_CACHED;
 		break;
 	case VMW_PL_GMR:
+	case VMW_PL_MOB:
 		/*
 		 * "Guest Memory Regions" is an aperture like feature with
 		 *  one slot per bo. There is an upper limit of the number of
@@ -618,6 +745,7 @@ static int vmw_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_mem_reg
 	switch (mem->mem_type) {
 	case TTM_PL_SYSTEM:
 	case VMW_PL_GMR:
+	case VMW_PL_MOB:
 		return 0;
 	case TTM_PL_VRAM:
 		mem->bus.offset = mem->start << PAGE_SHIFT;
@@ -677,6 +805,38 @@ static int vmw_sync_obj_wait(void *sync_obj, bool lazy, bool interruptible)
 				  VMW_FENCE_WAIT_TIMEOUT);
 }

+/**
+ * vmw_move_notify - TTM move_notify_callback
+ *
+ * @bo:             The TTM buffer object about to move.
+ * @mem:            The truct ttm_mem_reg indicating to what memory
+ *                  region the move is taking place.
+ *
+ * Calls move_notify for all subsystems needing it.
+ * (currently only resources).
+ */
+static void vmw_move_notify(struct ttm_buffer_object *bo,
+			    struct ttm_mem_reg *mem)
+{
+	vmw_resource_move_notify(bo, mem);
+}
+
+
+/**
+ * vmw_swap_notify - TTM move_notify_callback
+ *
+ * @bo:             The TTM buffer object about to be swapped out.
+ */
+static void vmw_swap_notify(struct ttm_buffer_object *bo)
+{
+	struct ttm_bo_device *bdev = bo->bdev;
+
+	spin_lock(&bdev->fence_lock);
+	ttm_bo_wait(bo, false, false, false);
+	spin_unlock(&bdev->fence_lock);
+}
+
+
 struct ttm_bo_driver vmw_bo_driver = {
 	.ttm_tt_create = &vmw_ttm_tt_create,
 	.ttm_tt_populate = &vmw_ttm_populate,
@@ -691,8 +851,8 @@ struct ttm_bo_driver vmw_bo_driver = {
 	.sync_obj_flush = vmw_sync_obj_flush,
 	.sync_obj_unref = vmw_sync_obj_unref,
 	.sync_obj_ref = vmw_sync_obj_ref,
-	.move_notify = NULL,
-	.swap_notify = NULL,
+	.move_notify = vmw_move_notify,
+	.swap_notify = vmw_swap_notify,
 	.fault_reserve_notify = &vmw_ttm_fault_reserve_notify,
 	.io_mem_reserve = &vmw_ttm_io_mem_reserve,
 	.io_mem_free = &vmw_ttm_io_mem_free,

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_context.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_context.c
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_dmabuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_dmabuf.c
@@ -290,8 +290,7 @@ void vmw_bo_get_guest_ptr(const struct ttm_buffer_object *bo,
 /**
 * vmw_bo_pin - Pin or unpin a buffer object without moving it.
 *
- * @bo: The buffer object. Must be reserved, and present either in VRAM
- * or GMR memory.
+ * @bo: The buffer object. Must be reserved.
 * @pin: Whether to pin or unpin.
 *
 */
@@ -303,10 +302,9 @@ void vmw_bo_pin(struct ttm_buffer_object *bo, bool pin)
 	int ret;

 	lockdep_assert_held(&bo->resv->lock.base);
-	BUG_ON(old_mem_type != TTM_PL_VRAM &&
-	       old_mem_type != VMW_PL_GMR);

-	pl_flags = TTM_PL_FLAG_VRAM | VMW_PL_FLAG_GMR | TTM_PL_FLAG_CACHED;
+	pl_flags = TTM_PL_FLAG_VRAM | VMW_PL_FLAG_GMR | VMW_PL_FLAG_MOB
+		| TTM_PL_FLAG_SYSTEM | TTM_PL_FLAG_CACHED;
 	if (pin)
 		pl_flags |= TTM_PL_FLAG_NO_EVICT;


--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c
@@ -35,6 +35,23 @@ bool vmw_fifo_have_3d(struct vmw_private *dev_priv)
 	uint32_t fifo_min, hwversion;
 	const struct vmw_fifo_state *fifo = &dev_priv->fifo;

+	if (!(dev_priv->capabilities & SVGA_CAP_3D))
+		return false;
+
+	if (dev_priv->capabilities & SVGA_CAP_GBOBJECTS) {
+		uint32_t result;
+
+		if (!dev_priv->has_mob)
+			return false;
+
+		mutex_lock(&dev_priv->hw_mutex);
+		vmw_write(dev_priv, SVGA_REG_DEV_CAP, SVGA3D_DEVCAP_3D);
+		result = vmw_read(dev_priv, SVGA_REG_DEV_CAP);
+		mutex_unlock(&dev_priv->hw_mutex);
+
+		return (result != 0);
+	}
+
 	if (!(dev_priv->capabilities & SVGA_CAP_EXTENDED_FIFO))
 		return false;

@@ -511,24 +528,16 @@ int vmw_fifo_send_fence(struct vmw_private *dev_priv, uint32_t *seqno)
 }

 /**
- * vmw_fifo_emit_dummy_query - emits a dummy query to the fifo.
+ * vmw_fifo_emit_dummy_legacy_query - emits a dummy query to the fifo using
+ * legacy query commands.
 *
 * @dev_priv: The device private structure.
 * @cid: The hardware context id used for the query.
 *
- * This function is used to emit a dummy occlusion query with
- * no primitives rendered between query begin and query end.
- * It's used to provide a query barrier, in order to know that when
- * this query is finished, all preceding queries are also finished.
- *
- * A Query results structure should have been initialized at the start
- * of the dev_priv->dummy_query_bo buffer object. And that buffer object
- * must also be either reserved or pinned when this function is called.
- *
- * Returns -ENOMEM on failure to reserve fifo space.
+ * See the vmw_fifo_emit_dummy_query documentation.
 */
-int vmw_fifo_emit_dummy_query(struct vmw_private *dev_priv,
-			      uint32_t cid)
+static int vmw_fifo_emit_dummy_legacy_query(struct vmw_private *dev_priv,
+					    uint32_t cid)
 {
 	/*
 	 * A query wait without a preceding query end will
@@ -566,3 +575,75 @@ int vmw_fifo_emit_dummy_query(struct vmw_private *dev_priv,

 	return 0;
 }
+
+/**
+ * vmw_fifo_emit_dummy_gb_query - emits a dummy query to the fifo using
+ * guest-backed resource query commands.
+ *
+ * @dev_priv: The device private structure.
+ * @cid: The hardware context id used for the query.
+ *
+ * See the vmw_fifo_emit_dummy_query documentation.
+ */
+static int vmw_fifo_emit_dummy_gb_query(struct vmw_private *dev_priv,
+					uint32_t cid)
+{
+	/*
+	 * A query wait without a preceding query end will
+	 * actually finish all queries for this cid
+	 * without writing to the query result structure.
+	 */
+
+	struct ttm_buffer_object *bo = dev_priv->dummy_query_bo;
+	struct {
+		SVGA3dCmdHeader header;
+		SVGA3dCmdWaitForGBQuery body;
+	} *cmd;
+
+	cmd = vmw_fifo_reserve(dev_priv, sizeof(*cmd));
+
+	if (unlikely(cmd == NULL)) {
+		DRM_ERROR("Out of fifo space for dummy query.\n");
+		return -ENOMEM;
+	}
+
+	cmd->header.id = SVGA_3D_CMD_WAIT_FOR_GB_QUERY;
+	cmd->header.size = sizeof(cmd->body);
+	cmd->body.cid = cid;
+	cmd->body.type = SVGA3D_QUERYTYPE_OCCLUSION;
+	BUG_ON(bo->mem.mem_type != VMW_PL_MOB);
+	cmd->body.mobid = bo->mem.start;
+	cmd->body.offset = 0;
+
+	vmw_fifo_commit(dev_priv, sizeof(*cmd));
+
+	return 0;
+}
+
+
+/**
+ * vmw_fifo_emit_dummy_gb_query - emits a dummy query to the fifo using
+ * appropriate resource query commands.
+ *
+ * @dev_priv: The device private structure.
+ * @cid: The hardware context id used for the query.
+ *
+ * This function is used to emit a dummy occlusion query with
+ * no primitives rendered between query begin and query end.
+ * It's used to provide a query barrier, in order to know that when
+ * this query is finished, all preceding queries are also finished.
+ *
+ * A Query results structure should have been initialized at the start
+ * of the dev_priv->dummy_query_bo buffer object. And that buffer object
+ * must also be either reserved or pinned when this function is called.
+ *
+ * Returns -ENOMEM on failure to reserve fifo space.
+ */
+int vmw_fifo_emit_dummy_query(struct vmw_private *dev_priv,
+			      uint32_t cid)
+{
+	if (dev_priv->has_mob)
+		return vmw_fifo_emit_dummy_gb_query(dev_priv, cid);
+
+	return vmw_fifo_emit_dummy_legacy_query(dev_priv, cid);
+}
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmr.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmr.c
@@ -125,181 +125,27 @@ static void vmw_gmr2_unbind(struct vmw_private *dev_priv,
 }


-static void vmw_gmr_free_descriptors(struct device *dev, dma_addr_t desc_dma,
-				     struct list_head *desc_pages)
-{
-	struct page *page, *next;
-	struct svga_guest_mem_descriptor *page_virtual;
-	unsigned int desc_per_page = PAGE_SIZE /
-		sizeof(struct svga_guest_mem_descriptor) - 1;
-
-	if (list_empty(desc_pages))
-		return;
-
-	list_for_each_entry_safe(page, next, desc_pages, lru) {
-		list_del_init(&page->lru);
-
-		if (likely(desc_dma != DMA_ADDR_INVALID)) {
-			dma_unmap_page(dev, desc_dma, PAGE_SIZE,
-				       DMA_TO_DEVICE);
-		}
-
-		page_virtual = kmap_atomic(page);
-		desc_dma = (dma_addr_t)
-			le32_to_cpu(page_virtual[desc_per_page].ppn) <<
-			PAGE_SHIFT;
-		kunmap_atomic(page_virtual);
-
-		__free_page(page);
-	}
-}
-
-/**
- * FIXME: Adjust to the ttm lowmem / highmem storage to minimize
- * the number of used descriptors.
- *
- */
-
-static int vmw_gmr_build_descriptors(struct device *dev,
-				     struct list_head *desc_pages,
-				     struct vmw_piter *iter,
-				     unsigned long num_pages,
-				     dma_addr_t *first_dma)
-{
-	struct page *page;
-	struct svga_guest_mem_descriptor *page_virtual = NULL;
-	struct svga_guest_mem_descriptor *desc_virtual = NULL;
-	unsigned int desc_per_page;
-	unsigned long prev_pfn;
-	unsigned long pfn;
-	int ret;
-	dma_addr_t desc_dma;
-
-	desc_per_page = PAGE_SIZE /
-	    sizeof(struct svga_guest_mem_descriptor) - 1;
-
-	while (likely(num_pages != 0)) {
-		page = alloc_page(__GFP_HIGHMEM);
-		if (unlikely(page == NULL)) {
-			ret = -ENOMEM;
-			goto out_err;
-		}
-
-		list_add_tail(&page->lru, desc_pages);
-		page_virtual = kmap_atomic(page);
-		desc_virtual = page_virtual - 1;
-		prev_pfn = ~(0UL);
-
-		while (likely(num_pages != 0)) {
-			pfn = vmw_piter_dma_addr(iter) >> PAGE_SHIFT;
-
-			if (pfn != prev_pfn + 1) {
-
-				if (desc_virtual - page_virtual ==
-				    desc_per_page - 1)
-					break;
-
-				(++desc_virtual)->ppn = cpu_to_le32(pfn);
-				desc_virtual->num_pages = cpu_to_le32(1);
-			} else {
-				uint32_t tmp =
-				    le32_to_cpu(desc_virtual->num_pages);
-				desc_virtual->num_pages = cpu_to_le32(tmp + 1);
-			}
-			prev_pfn = pfn;
-			--num_pages;
-			vmw_piter_next(iter);
-		}
-
-		(++desc_virtual)->ppn = DMA_PAGE_INVALID;
-		desc_virtual->num_pages = cpu_to_le32(0);
-		kunmap_atomic(page_virtual);
-	}
-
-	desc_dma = 0;
-	list_for_each_entry_reverse(page, desc_pages, lru) {
-		page_virtual = kmap_atomic(page);
-		page_virtual[desc_per_page].ppn = cpu_to_le32
-			(desc_dma >> PAGE_SHIFT);
-		kunmap_atomic(page_virtual);
-		desc_dma = dma_map_page(dev, page, 0, PAGE_SIZE,
-					DMA_TO_DEVICE);
-
-		if (unlikely(dma_mapping_error(dev, desc_dma)))
-			goto out_err;
-	}
-	*first_dma = desc_dma;
-
-	return 0;
-out_err:
-	vmw_gmr_free_descriptors(dev, DMA_ADDR_INVALID, desc_pages);
-	return ret;
-}
-
-static void vmw_gmr_fire_descriptors(struct vmw_private *dev_priv,
-				     int gmr_id, dma_addr_t desc_dma)
-{
-	mutex_lock(&dev_priv->hw_mutex);
-
-	vmw_write(dev_priv, SVGA_REG_GMR_ID, gmr_id);
-	wmb();
-	vmw_write(dev_priv, SVGA_REG_GMR_DESCRIPTOR, desc_dma >> PAGE_SHIFT);
-	mb();
-
-	mutex_unlock(&dev_priv->hw_mutex);
-
-}
-
 int vmw_gmr_bind(struct vmw_private *dev_priv,
 		 const struct vmw_sg_table *vsgt,
 		 unsigned long num_pages,
 		 int gmr_id)
 {
-	struct list_head desc_pages;
-	dma_addr_t desc_dma = 0;
-	struct device *dev = dev_priv->dev->dev;
 	struct vmw_piter data_iter;
-	int ret;

 	vmw_piter_start(&data_iter, vsgt, 0);

 	if (unlikely(!vmw_piter_next(&data_iter)))
 		return 0;

-	if (likely(dev_priv->capabilities & SVGA_CAP_GMR2))
-		return vmw_gmr2_bind(dev_priv, &data_iter, num_pages, gmr_id);
-
-	if (unlikely(!(dev_priv->capabilities & SVGA_CAP_GMR)))
-		return -EINVAL;
-
-	if (vsgt->num_regions > dev_priv->max_gmr_descriptors)
+	if (unlikely(!(dev_priv->capabilities & SVGA_CAP_GMR2)))
 		return -EINVAL;

-	INIT_LIST_HEAD(&desc_pages);
-
-	ret = vmw_gmr_build_descriptors(dev, &desc_pages, &data_iter,
-					num_pages, &desc_dma);
-	if (unlikely(ret != 0))
-		return ret;
-
-	vmw_gmr_fire_descriptors(dev_priv, gmr_id, desc_dma);
-	vmw_gmr_free_descriptors(dev, desc_dma, &desc_pages);
-
-	return 0;
+	return vmw_gmr2_bind(dev_priv, &data_iter, num_pages, gmr_id);
 }


 void vmw_gmr_unbind(struct vmw_private *dev_priv, int gmr_id)
 {
-	if (likely(dev_priv->capabilities & SVGA_CAP_GMR2)) {
+	if (likely(dev_priv->capabilities & SVGA_CAP_GMR2))
 		vmw_gmr2_unbind(dev_priv, gmr_id);
-		return;
-	}
-
-	mutex_lock(&dev_priv->hw_mutex);
-	vmw_write(dev_priv, SVGA_REG_GMR_ID, gmr_id);
-	wmb();
-	vmw_write(dev_priv, SVGA_REG_GMR_DESCRIPTOR, 0);
-	mb();
-	mutex_unlock(&dev_priv->hw_mutex);
 }
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
@@ -125,10 +125,21 @@ static int vmw_gmrid_man_init(struct ttm_mem_type_manager *man,
 		return -ENOMEM;

 	spin_lock_init(&gman->lock);
-	gman->max_gmr_pages = dev_priv->max_gmr_pages;
 	gman->used_gmr_pages = 0;
 	ida_init(&gman->gmr_ida);
-	gman->max_gmr_ids = p_size;
+
+	switch (p_size) {
+	case VMW_PL_GMR:
+		gman->max_gmr_ids = dev_priv->max_gmr_ids;
+		gman->max_gmr_pages = dev_priv->max_gmr_pages;
+		break;
+	case VMW_PL_MOB:
+		gman->max_gmr_ids = VMWGFX_NUM_MOB;
+		gman->max_gmr_pages = dev_priv->max_mob_pages;
+		break;
+	default:
+		BUG();
+	}
 	man->priv = (void *) gman;
 	return 0;
 }

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ioctl.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ioctl.c
@@ -53,7 +53,7 @@ int vmw_getparam_ioctl(struct drm_device *dev, void *data,
 		param->value = dev_priv->fifo.capabilities;
 		break;
 	case DRM_VMW_PARAM_MAX_FB_SIZE:
-		param->value = dev_priv->vram_size;
+		param->value = dev_priv->prim_bb_mem;
 		break;
 	case DRM_VMW_PARAM_FIFO_HW_VERSION:
 	{
@@ -68,6 +68,20 @@ int vmw_getparam_ioctl(struct drm_device *dev, void *data,
 				  SVGA_FIFO_3D_HWVERSION));
 		break;
 	}
+	case DRM_VMW_PARAM_MAX_SURF_MEMORY:
+		param->value = dev_priv->memory_size;
+		break;
+	case DRM_VMW_PARAM_3D_CAPS_SIZE:
+		if (dev_priv->capabilities & SVGA_CAP_GBOBJECTS)
+			param->value = SVGA3D_DEVCAP_MAX;
+		else
+			param->value = (SVGA_FIFO_3D_CAPS_LAST -
+					SVGA_FIFO_3D_CAPS + 1);
+		param->value *= sizeof(uint32_t);
+		break;
+	case DRM_VMW_PARAM_MAX_MOB_MEMORY:
+		param->value = dev_priv->max_mob_pages * PAGE_SIZE;
+		break;
 	default:
 		DRM_ERROR("Illegal vmwgfx get param request: %d\n",
 			  param->param);
@@ -89,13 +103,19 @@ int vmw_get_cap_3d_ioctl(struct drm_device *dev, void *data,
 	void __user *buffer = (void __user *)((unsigned long)(arg->buffer));
 	void *bounce;
 	int ret;
+	bool gb_objects = !!(dev_priv->capabilities & SVGA_CAP_GBOBJECTS);

 	if (unlikely(arg->pad64 != 0)) {
 		DRM_ERROR("Illegal GET_3D_CAP argument.\n");
 		return -EINVAL;
 	}

-	size = (SVGA_FIFO_3D_CAPS_LAST - SVGA_FIFO_3D_CAPS + 1) << 2;
+	if (gb_objects)
+		size = SVGA3D_DEVCAP_MAX;
+	else
+		size = (SVGA_FIFO_3D_CAPS_LAST - SVGA_FIFO_3D_CAPS + 1);
+
+	size *= sizeof(uint32_t);

 	if (arg->max_size < size)
 		size = arg->max_size;
@@ -106,8 +126,22 @@ int vmw_get_cap_3d_ioctl(struct drm_device *dev, void *data,
 		return -ENOMEM;
 	}

-	fifo_mem = dev_priv->mmio_virt;
-	memcpy_fromio(bounce, &fifo_mem[SVGA_FIFO_3D_CAPS], size);
+	if (gb_objects) {
+		int i;
+		uint32_t *bounce32 = (uint32_t *) bounce;
+
+		mutex_lock(&dev_priv->hw_mutex);
+		for (i = 0; i < SVGA3D_DEVCAP_MAX; ++i) {
+			vmw_write(dev_priv, SVGA_REG_DEV_CAP, i);
+			*bounce32++ = vmw_read(dev_priv, SVGA_REG_DEV_CAP);
+		}
+		mutex_unlock(&dev_priv->hw_mutex);
+
+	} else {
+
+		fifo_mem = dev_priv->mmio_virt;
+		memcpy_fromio(bounce, &fifo_mem[SVGA_FIFO_3D_CAPS], size);
+	}

 	ret = copy_to_user(buffer, bounce, size);
 	if (ret)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -672,9 +672,9 @@ static int vmw_kms_new_framebuffer_surface(struct vmw_private *dev_priv,

 	if (unlikely(surface->mip_levels[0] != 1 ||
 		     surface->num_sizes != 1 ||
-		     surface->sizes[0].width < mode_cmd->width ||
-		     surface->sizes[0].height < mode_cmd->height ||
-		     surface->sizes[0].depth != 1)) {
+		     surface->base_size.width < mode_cmd->width ||
+		     surface->base_size.height < mode_cmd->height ||
+		     surface->base_size.depth != 1)) {
 		DRM_ERROR("Incompatible surface dimensions "
 			  "for requested mode.\n");
 		return -EINVAL;
@@ -1645,7 +1645,7 @@ bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv,
 				uint32_t pitch,
 				uint32_t height)
 {
-	return ((u64) pitch * (u64) height) < (u64) dev_priv->vram_size;
+	return ((u64) pitch * (u64) height) < (u64) dev_priv->prim_bb_mem;
 }



--- a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -215,6 +215,7 @@ int vmw_resource_init(struct vmw_private *dev_priv, struct vmw_resource *res,
 	res->func = func;
 	INIT_LIST_HEAD(&res->lru_head);
 	INIT_LIST_HEAD(&res->mob_head);
+	INIT_LIST_HEAD(&res->binding_head);
 	res->id = -1;
 	res->backup = NULL;
 	res->backup_offset = 0;
@@ -441,6 +442,21 @@ static void vmw_user_dmabuf_release(struct ttm_base_object **p_base)
 	ttm_bo_unref(&bo);
 }

+static void vmw_user_dmabuf_ref_obj_release(struct ttm_base_object *base,
+					    enum ttm_ref_type ref_type)
+{
+	struct vmw_user_dma_buffer *user_bo;
+	user_bo = container_of(base, struct vmw_user_dma_buffer, prime.base);
+
+	switch (ref_type) {
+	case TTM_REF_SYNCCPU_WRITE:
+		ttm_bo_synccpu_write_release(&user_bo->dma.base);
+		break;
+	default:
+		BUG();
+	}
+}
+
 /**
 * vmw_user_dmabuf_alloc - Allocate a user dma buffer
 *
@@ -471,6 +487,8 @@ int vmw_user_dmabuf_alloc(struct vmw_private *dev_priv,
 	}

 	ret = vmw_dmabuf_init(dev_priv, &user_bo->dma, size,
+			      (dev_priv->has_mob) ?
+			      &vmw_sys_placement :
 			      &vmw_vram_sys_placement, true,
 			      &vmw_user_dmabuf_destroy);
 	if (unlikely(ret != 0))
@@ -482,7 +500,8 @@ int vmw_user_dmabuf_alloc(struct vmw_private *dev_priv,
 				    &user_bo->prime,
 				    shareable,
 				    ttm_buffer_type,
-				    &vmw_user_dmabuf_release, NULL);
+				    &vmw_user_dmabuf_release,
+				    &vmw_user_dmabuf_ref_obj_release);
 	if (unlikely(ret != 0)) {
 		ttm_bo_unref(&tmp);
 		goto out_no_base_object;
@@ -515,6 +534,130 @@ int vmw_user_dmabuf_verify_access(struct ttm_buffer_object *bo,
 		vmw_user_bo->prime.base.shareable) ? 0 : -EPERM;
 }

+/**
+ * vmw_user_dmabuf_synccpu_grab - Grab a struct vmw_user_dma_buffer for cpu
+ * access, idling previous GPU operations on the buffer and optionally
+ * blocking it for further command submissions.
+ *
+ * @user_bo: Pointer to the buffer object being grabbed for CPU access
+ * @tfile: Identifying the caller.
+ * @flags: Flags indicating how the grab should be performed.
+ *
+ * A blocking grab will be automatically released when @tfile is closed.
+ */
+static int vmw_user_dmabuf_synccpu_grab(struct vmw_user_dma_buffer *user_bo,
+					struct ttm_object_file *tfile,
+					uint32_t flags)
+{
+	struct ttm_buffer_object *bo = &user_bo->dma.base;
+	bool existed;
+	int ret;
+
+	if (flags & drm_vmw_synccpu_allow_cs) {
+		struct ttm_bo_device *bdev = bo->bdev;
+
+		spin_lock(&bdev->fence_lock);
+		ret = ttm_bo_wait(bo, false, true,
+				  !!(flags & drm_vmw_synccpu_dontblock));
+		spin_unlock(&bdev->fence_lock);
+		return ret;
+	}
+
+	ret = ttm_bo_synccpu_write_grab
+		(bo, !!(flags & drm_vmw_synccpu_dontblock));
+	if (unlikely(ret != 0))
+		return ret;
+
+	ret = ttm_ref_object_add(tfile, &user_bo->prime.base,
+				 TTM_REF_SYNCCPU_WRITE, &existed);
+	if (ret != 0 || existed)
+		ttm_bo_synccpu_write_release(&user_bo->dma.base);
+
+	return ret;
+}
+
+/**
+ * vmw_user_dmabuf_synccpu_release - Release a previous grab for CPU access,
+ * and unblock command submission on the buffer if blocked.
+ *
+ * @handle: Handle identifying the buffer object.
+ * @tfile: Identifying the caller.
+ * @flags: Flags indicating the type of release.
+ */
+static int vmw_user_dmabuf_synccpu_release(uint32_t handle,
+					   struct ttm_object_file *tfile,
+					   uint32_t flags)
+{
+	if (!(flags & drm_vmw_synccpu_allow_cs))
+		return ttm_ref_object_base_unref(tfile, handle,
+						 TTM_REF_SYNCCPU_WRITE);
+
+	return 0;
+}
+
+/**
+ * vmw_user_dmabuf_synccpu_release - ioctl function implementing the synccpu
+ * functionality.
+ *
+ * @dev: Identifies the drm device.
+ * @data: Pointer to the ioctl argument.
+ * @file_priv: Identifies the caller.
+ *
+ * This function checks the ioctl arguments for validity and calls the
+ * relevant synccpu functions.
+ */
+int vmw_user_dmabuf_synccpu_ioctl(struct drm_device *dev, void *data,
+				  struct drm_file *file_priv)
+{
+	struct drm_vmw_synccpu_arg *arg =
+		(struct drm_vmw_synccpu_arg *) data;
+	struct vmw_dma_buffer *dma_buf;
+	struct vmw_user_dma_buffer *user_bo;
+	struct ttm_object_file *tfile = vmw_fpriv(file_priv)->tfile;
+	int ret;
+
+	if ((arg->flags & (drm_vmw_synccpu_read | drm_vmw_synccpu_write)) == 0
+	    || (arg->flags & ~(drm_vmw_synccpu_read | drm_vmw_synccpu_write |
+			       drm_vmw_synccpu_dontblock |
+			       drm_vmw_synccpu_allow_cs)) != 0) {
+		DRM_ERROR("Illegal synccpu flags.\n");
+		return -EINVAL;
+	}
+
+	switch (arg->op) {
+	case drm_vmw_synccpu_grab:
+		ret = vmw_user_dmabuf_lookup(tfile, arg->handle, &dma_buf);
+		if (unlikely(ret != 0))
+			return ret;
+
+		user_bo = container_of(dma_buf, struct vmw_user_dma_buffer,
+				       dma);
+		ret = vmw_user_dmabuf_synccpu_grab(user_bo, tfile, arg->flags);
+		vmw_dmabuf_unreference(&dma_buf);
+		if (unlikely(ret != 0 && ret != -ERESTARTSYS &&
+			     ret != -EBUSY)) {
+			DRM_ERROR("Failed synccpu grab on handle 0x%08x.\n",
+				  (unsigned int) arg->handle);
+			return ret;
+		}
+		break;
+	case drm_vmw_synccpu_release:
+		ret = vmw_user_dmabuf_synccpu_release(arg->handle, tfile,
+						      arg->flags);
+		if (unlikely(ret != 0)) {
+			DRM_ERROR("Failed synccpu release on handle 0x%08x.\n",
+				  (unsigned int) arg->handle);
+			return ret;
+		}
+		break;
+	default:
+		DRM_ERROR("Invalid synccpu operation.\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 int vmw_dmabuf_alloc_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv)
 {
@@ -591,7 +734,8 @@ int vmw_user_dmabuf_lookup(struct ttm_object_file *tfile,
 }

 int vmw_user_dmabuf_reference(struct ttm_object_file *tfile,
-			      struct vmw_dma_buffer *dma_buf)
+			      struct vmw_dma_buffer *dma_buf,
+			      uint32_t *handle)
 {
 	struct vmw_user_dma_buffer *user_bo;

@@ -599,6 +743,8 @@ int vmw_user_dmabuf_reference(struct ttm_object_file *tfile,
 		return -EINVAL;

 	user_bo = container_of(dma_buf, struct vmw_user_dma_buffer, dma);
+
+	*handle = user_bo->prime.base.hash.key;
 	return ttm_ref_object_add(tfile, &user_bo->prime.base,
 				  TTM_REF_USAGE, NULL);
 }
@@ -1291,11 +1437,54 @@ void vmw_fence_single_bo(struct ttm_buffer_object *bo,
 * @mem:            The truct ttm_mem_reg indicating to what memory
 *                  region the move is taking place.
 *
- * For now does nothing.
+ * Evicts the Guest Backed hardware resource if the backup
+ * buffer is being moved out of MOB memory.
+ * Note that this function should not race with the resource
+ * validation code as long as it accesses only members of struct
+ * resource that remain static while bo::res is !NULL and
+ * while we have @bo reserved. struct resource::backup is *not* a
+ * static member. The resource validation code will take care
+ * to set @bo::res to NULL, while having @bo reserved when the
+ * buffer is no longer bound to the resource, so @bo:res can be
+ * used to determine whether there is a need to unbind and whether
+ * it is safe to unbind.
 */
 void vmw_resource_move_notify(struct ttm_buffer_object *bo,
 			      struct ttm_mem_reg *mem)
 {
+	struct vmw_dma_buffer *dma_buf;
+
+	if (mem == NULL)
+		return;
+
+	if (bo->destroy != vmw_dmabuf_bo_free &&
+	    bo->destroy != vmw_user_dmabuf_destroy)
+		return;
+
+	dma_buf = container_of(bo, struct vmw_dma_buffer, base);
+
+	if (mem->mem_type != VMW_PL_MOB) {
+		struct vmw_resource *res, *n;
+		struct ttm_bo_device *bdev = bo->bdev;
+		struct ttm_validate_buffer val_buf;
+
+		val_buf.bo = bo;
+
+		list_for_each_entry_safe(res, n, &dma_buf->res_list, mob_head) {
+
+			if (unlikely(res->func->unbind == NULL))
+				continue;
+
+			(void) res->func->unbind(res, true, &val_buf);
+			res->backup_dirty = true;
+			res->res_dirty = false;
+			list_del_init(&res->mob_head);
+		}
+
+		spin_lock(&bdev->fence_lock);
+		(void) ttm_bo_wait(bo, false, false, false);
+		spin_unlock(&bdev->fence_lock);
+	}
 }

 /**

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_shader.c
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
--- a/include/uapi/drm/vmwgfx_drm.h
+++ b/include/uapi/drm/vmwgfx_drm.h