- 27 Sep, 2014 8 commits
-
-
Martin K. Petersen authored
Add a BLK_ prefix to the integrity profile flags. Also rename the flags to be more consistent with the generate/verify terminology in the rest of the integrity code. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
Instead of the "operate" parameter we pass in a seed value and a pointer to a function that can be used to process the integrity metadata. The generation function is changed to have a return value to fit into this scheme. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
Now that the protection interval has been detached from the sector size we need to be able to handle sizes that are different from 4K and 512. Make the interval calculation generic. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
The protection interval is not necessarily tied to the logical block size of a block device. Stop using the terms "sector" and "sectors". Going forward we will use the term "seed" to describe the initial reference tag value for a given I/O. "Interval" will be used to describe the portion of the data buffer that a given piece of protection information is associated with. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
bip_buf is not really needed so we can remove it. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
None of the filesystems appear interested in using the integrity tagging feature. Potentially because very few storage devices actually permit using the application tag space. Remove the tagging functions. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
For commands like REQ_COPY we need a way to pass extra information along with each bio. Like integrity metadata this information must be available at the bottom of the stack so bi_private does not suffice. Rename the existing bi_integrity field to bi_special and make it a union so we can have different bio extensions for each class of command. We previously used bi_integrity != NULL as a way to identify whether a bio had integrity metadata or not. Introduce a REQ_INTEGRITY to be the indicator now that bi_special can contain different things. In addition, bio_integrity(bio) will now return a pointer to the integrity payload (when applicable). Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Martin K. Petersen authored
bdev_integrity_enabled() is only used by bio_integrity_enabled(). Combine these two functions. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 25 Sep, 2014 10 commits
-
-
Ming Lei authored
This patch supports to run one single flush machinery for each blk-mq dispatch queue, so that: - current init_request and exit_request callbacks can cover flush request too, then the buggy copying way of initializing flush request's pdu can be fixed - flushing performance gets improved in case of multi hw-queue In fio sync write test over virtio-blk(4 hw queues, ioengine=sync, iodepth=64, numjobs=4, bs=4K), it is observed that througput gets increased a lot over my test environment: - throughput: +70% in case of virtio-blk over null_blk - throughput: +30% in case of virtio-blk over SSD image The multi virtqueue feature isn't merged to QEMU yet, and patches for the feature can be found in below tree: git://kernel.ubuntu.com/ming/qemu.git v2.1.0-mq.4 And simply passing 'num_queues=4 vectors=5' should be enough to enable multi queue(quad queue) feature for QEMU virtio-blk. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
This patch adds 'blk_mq_ctx' parameter to blk_get_flush_queue(), so that this function can find the corresponding blk_flush_queue bound with current mq context since the flush queue will become per hw-queue. For legacy queue, the parameter can be simply 'NULL'. For multiqueue case, the parameter should be set as the context from which the related request is originated. With this context info, the hw queue and related flush queue can be found easily. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Just figuring out flush queue at the entry of kicking off flush machinery and request's completion handler, then pass it through. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Now mission of the two helpers is over, and just call blk_alloc_flush_queue() and blk_free_flush_queue() directly. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
This patch introduces 'struct blk_flush_queue' and puts all flush machinery related fields into this structure, so that - flush implementation details aren't exposed to driver - it is easy to convert to per dispatch-queue flush machinery This patch is basically a mechanical replacement. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
This patch trys to use local variable to access flush request, so that we can convert to per-queue flush machinery a bit easier. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
These fields are always used with the flush request, so initialize them together. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
These two temporary functions are introduced for holding flush initialization and de-initialization, so that we can introduce 'flush queue' easier in the following patch. And once 'flush queue' and its allocation/free functions are ready, they will be removed for sake of code readability. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
It is reasonable to allocate flush req in blk_mq_init_flush(). Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Failure of initializing one hctx isn't handled, so this patch introduces blk_mq_init_hctx() and its pair to handle it explicitly. Also this patch makes code cleaner. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 22 Sep, 2014 17 commits
-
-
Christoph Hellwig authored
Some ATA drivers need the dma drain size workaround, and thus need to call blk_mq_start_request before the S/G mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Signed-off-by: Christoph Hellwig <hch@lst.de> Moved blk_mq_rq_timed_out() definition to the private blk-mq.h header. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
Commit 8cb34819cdd5d(blk-mq: unshared timeout handler) introduces blk-mq's own timeout handler, and removes following line: blk_queue_rq_timed_out(q, blk_mq_rq_timed_out); which then causes blk_add_timer() to bypass adding the timer, since blk-mq no longer has q->rq_timed_out_fn defined. This patch fixes the problem by bypassing the check for blk-mq, so that both request deadlines are still set and the rolling timer updated. Signed-off-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
It's not uncommon for crash dump kernels to be limited to 128MB or something low in that area. This is normally not a problem for devices as we don't use that much memory, but for some shared SCSI setups with huge queue depths, it can potentially fill most of memory with tons of request allocations. blk-mq does scale back when it fails to allocate memory, but it scales back just enough so that blk-mq succeeds. This could still leave the system with not enough memory to make any real progress. Check if we are in a kdump environment and limit the hardware queues and tag depth. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Ming Lei authored
This patch removes two unnecessary blk_clear_rq_complete(), the REQ_ATOM_COMPLETE flag is cleared inside blk_mq_start_request(), so: - The blk_clear_rq_complete() in blk_flush_restore_request() needn't because the request will be freed later, and clearing it here may open a small race window with timeout. - The blk_clear_rq_complete() in blk_mq_requeue_request() isn't necessary too, even though REQ_ATOM_STARTED is cleared in __blk_mq_requeue_request(), in theory it still may cause a small race window with timeout since the two clear_bit() may be reordered. Signed-off-by: Ming Lei <ming.lei@canoical.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Allow blk-mq to pass an argument to the timeout handler to indicate if we're timing out a reserved or regular command. For many drivers those need to be handled different. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Duplicate the (small) timeout handler in blk-mq so that we can pass arguments more easily to the driver timeout handler. This enables the next patch. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Don't do a kmalloc from timer to handle timeouts, chances are we could be under heavy load or similar and thus just miss out on the timeouts. Fortunately it is very easy to just iterate over all in use tags, and doing this properly actually cleans up the blk_mq_busy_iter API as well, and prepares us for the next patch by passing a reserved argument to the iterator. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Now that we've changed the driver API on the submission side use the opportunity to fix up the name on the completion side to fit into the general scheme. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
When we call blk_mq_start_request from the core blk-mq code before calling into ->queue_rq there is a racy window where the timeout handler can hit before we've fully set up the driver specific part of the command. Move the call to blk_mq_start_request into the driver so the driver can start the request only once it is fully set up. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
Pass an explicit parameter for the last request in a batch to ->queue_rq instead of using a request flag. Besides being a cleaner and non-stateful interface this is also required for the next patch, which fixes the blk-mq I/O submission code to not start a time too early. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
Moving patches from for-linus to 3.18 instead, pull in this changes that will go to Linus today.
-
Jens Axboe authored
When requests are retried due to hw or sw resource shortages, we often stop the associated hardware queue. So ensure that we restart the queues when running the requeue work, otherwise the queue run will be a no-op. Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
__blk_mq_alloc_rq_maps() can be invoked multiple times, if we scale back the queue depth if we are low on memory. So don't clear set->tags when we fail, this is handled directly in the parent function, blk_mq_alloc_tag_set(). Reported-by: Robert Elliott <Elliott@hp.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Christoph Hellwig authored
We should not insert requests into the flush state machine from blk_mq_insert_request. All incoming flush requests come through blk_{m,s}q_make_request and are handled there, while blk_execute_rq_nowait should only be called for BLOCK_PC requests. All other callers deal with requests that already went through the flush statemchine and shouldn't be reinserted into it. Reported-by: Robert Elliott <Elliott@hp.com> Debugged-by: Ming Lei <ming.lei@canonical.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com>
-
David Hildenbrand authored
This patch should fix the bug reported in https://lkml.org/lkml/2014/9/11/249. We have to initialize at least the atomic_flags and the cmd_flags when allocating storage for the requests. Otherwise blk_mq_timeout_check() might dereference uninitialized pointers when racing with the creation of a request. Also move the reset of cmd_flags for the initializing code to the point where a request is freed. So we will never end up with pending flush request indicators that might trigger dereferences of invalid pointers in blk_mq_timeout_check(). Cc: stable@vger.kernel.org Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reported-by: Paulo De Rezende Pinatti <ppinatti@linux.vnet.ibm.com> Tested-by: Paulo De Rezende Pinatti <ppinatti@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
Jens Axboe authored
When we start the request, we set the deadline and flip the bits marking the request as started and non-complete. However, it's important that the deadline store is ordered before flipping the bits, otherwise we could have a small window where the request is marked started but with an invalid deadline. This can confuse the timeout handling. Suggested-by: Ming Lei <tom.leiming@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com>
-
- 16 Sep, 2014 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixesLinus Torvalds authored
Pull gfs2 fixes from Steven Whitehouse: "Here are a number of small fixes for GFS2. There is a fix for FIEMAP on large sparse files, a negative dentry hashing fix, a fix for flock, and a bug fix relating to d_splice_alias usage. There are also (patches 1 and 5) a couple of updates which are less critical, but small and low risk" * tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: fix d_splice_alias() misuses GFS2: Don't use MAXQUOTAS value GFS2: Hash the negative dentry during inode lookup GFS2: Request demote when a "try" flock fails GFS2: Change maxlen variables to size_t GFS2: fs/gfs2/super.c: replace seq_printf by seq_puts
-
James Hogan authored
Commit d6bb3e90 ("vfs: simplify and shrink stack frame of link_path_walk()") introduced build problems with GCC versions older than 4.6 due to the initialisation of a member of an anonymous union in struct qstr without enclosing braces. This hits GCC bug 10676 [1] (which was fixed in GCC 4.6 by [2]), and causes the following build error: fs/namei.c: In function 'link_path_walk': fs/namei.c:1778: error: unknown field 'hash_len' specified in initializer This is worked around by adding explicit braces. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676 [2] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=159206 Fixes: d6bb3e90 (vfs: simplify and shrink stack frame of link_path_walk()) Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-metag@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linuxLinus Torvalds authored
Pull virtio fixes from Rusty Russell: "virtio-rng corner case fixes, with cc:stable. Survived a few days in linux-next" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: virtio-rng: skip reading when we start to remove the device virtio-rng: fix stuck of hot-unplugging busy device
-
- 15 Sep, 2014 2 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmapLinus Torvalds authored
Pull regmap fix from Mark Brown: "Fix registers file in debugfs Ensure that the mode reported for the registers file in debugfs is accurate by marking it as read only when the define to enable writes has not been set. This is on the edge of being a bug fix but it's debugfs and it makes it much easier for users to spot what's going wrong when they forget to enable writeability" * tag 'regmap-v3.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: Fix debugfs-file 'registers' mode
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull input updates from Dmitry Torokhov: "A few quirks for i8042/AT keyboards and a small device tree doc fix for Atmel Touchscreens" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: atmel_mxt_ts - fix merge in DT documentation Input: i8042 - also set the firmware id for MUXed ports Input: i8042 - add nomux quirk for Avatar AVIU-145A6 Input: i8042 - add Fujitsu U574 to no_timeout dmi table Input: atkbd - do not try 'deactivate' keyboard on any LG laptops
-