- 29 Feb, 2016 40 commits
-
-
Andy Lutomirski authored
If a process gets access to a mount from a different user namespace, that process should not be able to take advantage of setuid files or selinux entrypoints from that filesystem. Prevent this by treating mounts from other mount namespaces and those not owned by current_user_ns() or an ancestor as nosuid. This will make it safer to allow more complex filesystems to be mounted in non-root user namespaces. This does not remove the need for MNT_LOCK_NOSUID. The setuid, setgid, and file capability bits can no longer be abused if code in a user namespace were to clear nosuid on an untrusted filesystem, but this patch, by itself, is insufficient to protect the system from abuse of files that, when execed, would increase MAC privilege. As a more concrete explanation, any task that can manipulate a vfsmount associated with a given user namespace already has capabilities in that namespace and all of its descendants. If they can cause a malicious setuid, setgid, or file-caps executable to appear in that mount, then that executable will only allow them to elevate privileges in exactly the set of namespaces in which they are already privileged. On the other hand, if they can cause a malicious executable to appear with a dangerous MAC label, running it could change the caller's security context in a way that should not have been possible, even inside the namespace in which the task is confined. As a hardening measure, this would have made CVE-2014-5207 much more difficult to exploit. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
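The nosuid rule described in this commit message can be pictured with a small user-space model. This is only a sketch under stated assumptions: the structs and the names `is_self_or_ancestor` and `mount_may_suid` are hypothetical stand-ins, not the kernel's types or the patch's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for kernel objects. */
struct user_ns { const struct user_ns *parent; };   /* parent is NULL for the initial namespace */
struct mnt_ns  { const struct user_ns *owner; };    /* user namespace that owns the mount namespace */
struct mount   { const struct mnt_ns *ns; bool nosuid; };
struct task    { const struct mnt_ns *mnt_ns; const struct user_ns *user_ns; };

/* True if candidate is ns itself or one of ns's ancestors. */
static bool is_self_or_ancestor(const struct user_ns *candidate,
                                const struct user_ns *ns)
{
    for (; ns != NULL; ns = ns->parent) {
        if (ns == candidate)
            return true;
    }
    return false;
}

/*
 * Setuid bits and similar are honoured only when the mount is not nosuid,
 * lives in the caller's mount namespace, and is owned by the caller's
 * user namespace or one of its ancestors.
 */
static bool mount_may_suid(const struct task *t, const struct mount *m)
{
    assert(t != NULL && m != NULL && m->ns != NULL);
    if (m->nosuid)
        return false;
    if (m->ns != t->mnt_ns)
        return false;
    return is_self_or_ancestor(m->ns->owner, t->user_ns);
}
```

In this model a task inside a container cannot benefit from setuid files on a mount that belongs to the host's mount namespace, and a task in the initial user namespace does not trust setuid files on a mount owned by a child namespace.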
-
Seth Forshee authored
Unprivileged users should not be able to mount block devices when they lack sufficient privileges towards the block device inode. Update blkdev_get_by_path() to validate that the user has the required access to the inode at the specified path. The check will be skipped for CAP_SYS_ADMIN, so privileged mounts will continue working as before. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
When looking up a block device by path no permission check is done to verify that the user has access to the block device inode at the specified path. In some cases it may be necessary to check permissions towards the inode, such as allowing unprivileged users to mount block devices in user namespaces. Add an argument to lookup_bdev() to optionally perform this permission check. A value of 0 skips the permission check and behaves the same as before. A non-zero value specifies the mask of access rights required towards the inode at the specified path. The check is always skipped if the user has CAP_SYS_ADMIN. All callers of lookup_bdev() currently pass a mask of 0, so this patch results in no functional change. Subsequent patches will add permission checks where appropriate. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
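The mask semantics described above (0 skips the check, CAP_SYS_ADMIN always skips it, otherwise every requested access bit must be granted) can be sketched in user space as follows. `struct caller` and `bdev_access_check` are hypothetical names for illustration; the real change lives in lookup_bdev() and uses the kernel's inode permission machinery. The MAY_* values match the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Access bits with the same values the kernel uses. */
#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Hypothetical model of the caller: its privilege and what the inode grants it. */
struct caller {
    bool cap_sys_admin;  /* caller holds CAP_SYS_ADMIN */
    int  granted;        /* MAY_* bits an inode permission check would grant */
};

/* Returns 0 when access is allowed, -EACCES otherwise. */
static int bdev_access_check(const struct caller *c, int mask)
{
    assert(c != NULL);
    if (mask == 0)
        return 0;            /* no check requested: old behaviour */
    if (c->cap_sys_admin)
        return 0;            /* privileged callers skip the check */
    if ((c->granted & mask) != mask)
        return -EACCES;      /* some required access right is missing */
    return 0;
}
```

Because every existing caller passes a mask of 0, the first branch is always taken, which is why the patch on its own changes no behaviour.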
-
Seth Forshee authored
Security labels from unprivileged mounts cannot be trusted. Ideally for these mounts we would assign the objects in the filesystem the same label as the inode for the backing device passed to mount. Unfortunately it's currently impossible to determine which inode this is from the LSM mount hooks, so we settle for the label of the process doing the mount. This label is assigned to s_root, and also to smk_default to ensure that new inodes receive this label. The transmute property is also set on s_root to make this behavior more explicit, even though it is technically not necessary. If a filesystem has existing security labels, access to inodes is permitted if the label is the same as smk_root, otherwise access is denied. The SMACK64EXEC xattr is completely ignored. Explicit setting of security labels continues to require CAP_MAC_ADMIN in init_user_ns. Altogether, this ensures that filesystem objects are not accessible to subjects which cannot already access the backing store, that MAC is not violated for any objects in the filesystem which are already labeled, and that a user cannot use an unprivileged mount to gain elevated MAC privileges. sysfs, tmpfs, and ramfs are already mountable from user namespaces and support security labels. We can't rule out the possibility that these filesystems may already be used in mounts from user namespaces with security labels set from the init namespace, so failing to trust labels in these filesystems may introduce regressions. It is safe to trust labels from these filesystems, since the unprivileged user does not control the backing store and thus cannot supply security labels, so an explicit exception is made to trust labels from these filesystems. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Capability sets attached to files must be ignored except in the user namespaces where the mounter is privileged, i.e. s_user_ns and its descendants. Otherwise a vector exists for gaining privileges in namespaces where a user is not already privileged. Add a new helper function, in_user_ns(), to test whether a user namespace is the same as or a descendant of another namespace. Use this helper to determine whether a file's capability set should be applied to the caps constructed during exec. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
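The helper described above amounts to a walk up the namespace parent chain. A minimal user-space sketch, assuming a simplified `struct user_namespace` that carries only a `parent` pointer (the kernel struct has many more fields); `file_caps_apply` is a hypothetical wrapper added only to show how the result would be used:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct user_namespace. */
struct user_namespace {
    struct user_namespace *parent;   /* NULL for the initial namespace */
};

/* True if ns is target_ns itself or a descendant of target_ns. */
static bool in_user_ns(const struct user_namespace *ns,
                       const struct user_namespace *target_ns)
{
    assert(target_ns != NULL);
    for (; ns != NULL; ns = ns->parent) {
        if (ns == target_ns)
            return true;
    }
    return false;
}

/*
 * Hypothetical wrapper: a file's capability set is honoured at exec only
 * when the task's namespace is s_user_ns or one of its descendants.
 */
static bool file_caps_apply(const struct user_namespace *task_ns,
                            const struct user_namespace *s_user_ns)
{
    return in_user_ns(task_ns, s_user_ns);
}
```

A task in a sibling or parent namespace of s_user_ns fails the test, so file capabilities from that superblock are ignored for it.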
-
Seth Forshee authored
Initially this will be used to eliminate the implicit MNT_NODEV flag for mounts from user namespaces. In the future it will also be used for translating ids and checking capabilities for filesystems mounted from user namespaces. s_user_ns is initialized in alloc_super() and is generally set to current_user_ns(). To avoid security and corruption issues, two additional mount checks are also added: - do_new_mount() gains a check that the user has CAP_SYS_ADMIN in current_user_ns(). - sget() will fail with EBUSY when the filesystem it's looking for is already mounted from another user namespace. proc requires some special handling. The user namespace of current isn't appropriate when forking as a result of clone(2) with CLONE_NEWPID|CLONE_NEWUSER, as it will set s_user_ns to the namespace of the parent and make proc unmountable in the new user namespace. Instead, the user namespace which owns the new pid namespace is used. sget_userns() allows passing in a namespace other than that of current, and sget becomes a wrapper around sget_userns() which passes current_user_ns(). Changes to original version of this patch * Documented @user_ns in sget_userns, alloc_super and fs.h * Kept a blank line in fs.h * Removed unnecessary include of user_namespace.h from fs.h * Tweaked the location of get_user_ns and put_user_ns so the security modules can (if they wish) depend on it. -- EWB Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Xiangliang Yu authored
BugLink: http://bugs.launchpad.net/bugs/1542071 This adds support for AMD's PCI-Express Non-Transparent Bridge (NTB) device on the Zeppelin platform. The driver connects to the standard NTB sub-system interface, with modification to add hooks for power management in a separate patch. The AMD NTB device has 3 memory windows, 16 doorbells, 16 scratch-pad registers, and supports up to 16 PCIe lanes running at Gen3 speeds. Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> (cherry picked from commit a1b36958) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Tim Gardner authored
BugLink: http://bugs.launchpad.net/bugs/1542071 Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Joseph Salisbury authored
BugLink: http://bugs.launchpad.net/bugs/1495983 OriginalAuthor: Olaf Hering <olaf@aepfle.de> Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Brad Figg <brad.figg@canonical.com>
-
Tim Gardner authored
BugLink: http://bugs.launchpad.net/bugs/1545542 Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Suman Tripathi authored
ahci_xgene: Implement a workaround for the missed edge interrupt on HOST_IRQ_STAT. Due to a hardware erratum, the HOST_IRQ_STAT register misses the edge interrupt when the clearing of HOST_IRQ_STAT and the hardware's reporting of PORT_IRQ_STAT happen in the same clock cycle. Signed-off-by: Suman Tripathi <stripathi@apm.com> Signed-off-by: Tejun Heo <tj@kernel.org> (cherry picked from linux-next commit 32aea268) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Suman Tripathi authored
The flexibility to override the irq handlers in the LLDs is already present, so controllers implementing an edge-triggered latch can implement their own interrupt handler inside the driver. This patch removes the AHCI_HFLAG_EDGE_IRQ support from libahci and moves edge irq handling to ahci_xgene. tj: Minor update to description. Signed-off-by: Suman Tripathi <stripathi@apm.com> Signed-off-by: Tejun Heo <tj@kenrel.org> (cherry picked from linux-next commit d867b95f) [ dannf: offset adjustments ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Suman Tripathi authored
This patch implements the capability to override the generic AHCI interrupt handler so that specific ahci drivers can implement their own custom interrupt handler routines. It also exports ahci_handle_port_intr so that custom irq_handler implementations can use it. tj: s/ahci_irq_handler/irq_handler/ and updated description. Signed-off-by: Suman Tripathi <stripathi@apm.com> Signed-off-by: Tejun Heo <tj@kernel.org> (cherry picked from linux-next commit f070d671) [ dannf: backported to v4.4 ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Nicholas Krause authored
This adds the needed check after the call to mraid_mm_alloc_kioc to make sure that it has not returned NULL, so that we do not dereference a NULL pointer if one is returned. Furthermore, add comments explaining that this function call can return NULL if the list head is empty for the pointer passed, so that future users understand why this pointer check is required. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Acked-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 7296f62f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
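The pattern this commit describes, an allocator that hands out packets from a free list and returns NULL when the list is empty, can be sketched as below. This is a user-space illustration only: `struct kioc_pool`, `kioc_alloc`, `submit_request`, and the choice of -ENOMEM as the error code are assumptions, not the megaraid driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical packet kept on a singly linked free list. */
struct kioc {
    struct kioc *next;
};

struct kioc_pool {
    struct kioc *free_list;   /* NULL when no packets are available */
};

/* Takes one packet off the free list; returns NULL if the list is empty. */
static struct kioc *kioc_alloc(struct kioc_pool *pool)
{
    assert(pool != NULL);
    struct kioc *k = pool->free_list;
    if (k == NULL)
        return NULL;
    pool->free_list = k->next;
    k->next = NULL;
    return k;
}

/* Caller side: check the result before touching it. */
static int submit_request(struct kioc_pool *pool, struct kioc **out)
{
    struct kioc *k = kioc_alloc(pool);
    if (k == NULL)
        return -ENOMEM;   /* assumed error code for this sketch */
    *out = k;
    return 0;
}
```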
-
Tim Gardner authored
Ignore: yes Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Tim Gardner authored
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Sebastian Ott authored
BugLink: http://bugs.launchpad.net/bugs/1541534 Per channel path measurement characteristics are obtained during channel path registration. However if some properties of a channel path change we don't update the measurement characteristics. Make sure to update the characteristics when we change the properties of a channel path or receive a notification from FW about such a change. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> (cherry picked from commit 9f3d6d7a) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Sebastian Ott authored
BugLink: http://bugs.launchpad.net/bugs/1541534 Make sure that in all cases where we could not obtain measurement characteristics the associated fields are set to invalid values. Note: without this change the "shared" capability of a channel path for which we could not obtain the measurement characteristics was incorrectly displayed as 0 (not shared). We will now correctly report "unknown" in this case. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> (cherry picked from commit 61f0bfcf) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Sebastian Ott authored
BugLink: http://bugs.launchpad.net/bugs/1541534 Measurement characteristics are allocated during channel path registration but not freed during deregistration. Fix this by embedding these characteristics inside struct channel_path. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> (cherry picked from commit 0d9bfe91) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Ursula Braun authored
BugLink: http://bugs.launchpad.net/bugs/1541907 /sys/class/net/<interface>/operstate for an active qeth network interface often shows "unknown", which translates to "state UNKNOWN" in the output of "ip link show". It is caused by a missing initialization of the __LINK_STATE_NOCARRIER bit in the net_device state field. This patch adds a netif_carrier_off() invocation when creating the net_device for a qeth device. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reference-ID: Bugzilla 133209 Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit e5ebe632) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Tim Gardner authored
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Dan Williams authored
BugLink: http://bugs.launchpad.net/bugs/1534647 ZONE_DEVICE (merged in 4.3) and ZONE_CMA (proposed) are examples of new mm zones that are bumping up against the current maximum limit of 4 zones, i.e. 2 bits in page->flags. When adding a zone this equation still needs to be satisfied: SECTIONS_WIDTH + ZONES_WIDTH + NODES_SHIFT + LAST_CPUPID_SHIFT <= BITS_PER_LONG - NR_PAGEFLAGS ZONE_DEVICE currently tries to satisfy this equation by requiring that ZONE_DMA be disabled, but this is untenable given generic kernels want to support ZONE_DEVICE and ZONE_DMA simultaneously. ZONE_CMA would like to increase the amount of memory covered per section, but that limits the minimum granularity at which consecutive memory ranges can be added via devm_memremap_pages(). The trade-off of what is acceptable to sacrifice depends heavily on the platform. For example, ZONE_CMA is targeted for 32-bit platforms where page->flags is constrained, but those platforms likely do not care about the minimum granularity of memory hotplug. A big iron machine with 1024 numa nodes can likely sacrifice ZONE_DMA where a general purpose distribution kernel cannot. CONFIG_NR_ZONES_EXTENDED is a configuration symbol that gets selected when the number of configured zones exceeds 4. It documents the configuration symbols and definitions that get modified when ZONES_WIDTH is greater than 2. For now, it steals a bit from NODES_SHIFT. Later on it can be used to document the definitions that get modified when a 32-bit configuration wants more zone bits. Note that GFP_ZONE_TABLE poses an interesting constraint since include/linux/gfp.h gets included by the 32-bit portion of a 64-bit build. We need to be careful to only build the table for zones that have a corresponding gfp_t flag. GFP_ZONES_SHIFT is introduced for this purpose. This patch does not attempt to solve the problem of adding a new zone that also has a corresponding GFP_ flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=110931 Fixes: 033fbae9 ("mm: ZONE_DEVICE for "device memory"") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reported-by: Mark <markk@clara.co.uk> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> (cherry picked from linux-next commit 27ffb3827ac71a46e8d52fc7ed7302d33a619d6c) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
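The bit budget in the commit message above can be checked with a few lines of arithmetic. The sketch below is a user-space illustration; `struct page_flags_layout`, `zones_width`, and `layout_fits` are hypothetical names, and any numbers fed to them are made up for the example rather than taken from a real kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of how the bits of page->flags are spent. */
struct page_flags_layout {
    int bits_per_long;       /* BITS_PER_LONG */
    int nr_pageflags;        /* NR_PAGEFLAGS */
    int sections_width;      /* SECTIONS_WIDTH */
    int nr_zones;            /* number of configured zones */
    int nodes_shift;         /* NODES_SHIFT */
    int last_cpupid_shift;   /* LAST_CPUPID_SHIFT */
};

/* Bits needed to encode nr_zones distinct zones: 2 bits for up to 4, 3 for up to 8. */
static int zones_width(int nr_zones)
{
    assert(nr_zones >= 1);
    int bits = 0;
    while ((1 << bits) < nr_zones)
        bits++;
    return bits;
}

/* SECTIONS_WIDTH + ZONES_WIDTH + NODES_SHIFT + LAST_CPUPID_SHIFT <= BITS_PER_LONG - NR_PAGEFLAGS */
static bool layout_fits(const struct page_flags_layout *l)
{
    int used = l->sections_width + zones_width(l->nr_zones)
             + l->nodes_shift + l->last_cpupid_shift;
    return used <= l->bits_per_long - l->nr_pageflags;
}
```

With an illustrative layout that exactly fills the budget at 4 zones, going to 5 zones needs a third zone bit and no longer fits; lowering `nodes_shift` by one restores the balance, which is the trade the commit describes as stealing a bit from NODES_SHIFT.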
-
Tim Gardner authored
BugLink: http://bugs.launchpad.net/bugs/1534647 Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Hemant Kumar authored
BugLink: http://bugs.launchpad.net/bugs/1521678 Powerpc provides hcall events that also provide insights into guest behaviour. Enhance perf kvm stat to record and analyze hcall events. - To trace hcall events : perf kvm stat record - To show the results : perf kvm stat report --event=hcall The result shows the number of hypervisor calls from the guest grouped by their respective reasons displayed with the frequency. This patch makes use of two additional tracepoints "kvm_hv:kvm_hcall_enter" and "kvm_hv:kvm_hcall_exit". To map the hcall codes to their respective names, it needs a mapping. Such mapping is added in this patch in book3s_hcalls.h. A sample output :

    # pgrep qemu
    19378
    60515

    2 VMs running.

    # perf kvm stat record -a
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 4.153 MB perf.data.guest (39624 samples) ]

    # perf kvm stat report -p 60515 --event=hcall

    Analyze events for all VMs, all VCPUs:

    HCALL-EVENT      Samples  Samples%   Time%  MinTime  MaxTime  AvgTime
    H_IPI                822    66.08%  88.10%   0.63us  11.38us   2.05us (+- 1.42%)
    H_SEND_CRQ           144    11.58%   3.77%   0.41us   0.88us   0.50us (+- 1.47%)
    H_VIO_SIGNAL         118     9.49%   2.86%   0.37us   0.83us   0.47us (+- 1.43%)
    H_PUT_TERM_CHAR       76     6.11%   2.07%   0.37us   0.90us   0.52us (+- 2.43%)
    H_GET_TERM_CHAR       74     5.95%   2.23%   0.37us   1.70us   0.58us (+- 4.77%)
    H_RTAS                 6     0.48%   0.85%   1.10us   9.25us   2.70us (+-48.57%)
    H_PERFMON              4     0.32%   0.12%   0.41us   0.96us   0.59us (+-20.92%)

    Total Samples:1244, Total events handled time:1916.69us.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Scott Wood <scottwood@freescale.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1453962787-15376-4-git-send-email-hemant@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> (cherry picked from linux-next commit 78e6c39b) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Hemant Kumar authored
BugLink: http://bugs.launchpad.net/bugs/1521678 perf kvm can be used to analyze guest exit reasons. This support already exists in x86. Hence, porting it to powerpc. - To trace KVM events : perf kvm stat record If many guests are running, we can track for a specific guest by using --pid as in : perf kvm stat record --pid <pid> - To see the results : perf kvm stat report The result shows the number of exits (from the guest context to host/hypervisor context) grouped by their respective exit reasons with their frequency. Since different powerpc machines have different KVM tracepoints, this patch discovers the available tracepoints dynamically and accordingly looks for them. If any single tracepoint is not present, this support won't be enabled for reporting. To record, this will fail if any of the events we are looking to record isn't available. Right now, it's only supported on PowerPC Book3S_HV architectures. To analyze the different exits, group them and present them (in a slightly descriptive way) to the user, we need a mapping between the "exit code" (dumped in the kvm_guest_exit tracepoint data) and its related Interrupt vector description (exit reason). This patch adds this mapping in book3s_hv_exits.h. It records on two available KVM tracepoints for book3s_hv: "kvm_hv:kvm_guest_exit" and "kvm_hv:kvm_guest_enter". Here is a sample output:

    # pgrep qemu
    19378
    60515

    2 Guests are running on the host.

    # perf kvm stat record -a
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 4.153 MB perf.data.guest (39624 samples) ]

    # perf kvm stat report -p 60515

    Analyze events for pid(s) 60515, all VCPUs:

    VM-EXIT          Samples  Samples%   Time%  MinTime      MaxTime   Avg time
    SYSCALL             9141    63.67%   7.49%   1.26us    5782.39us     9.87us (+- 6.46%)
    H_DATA_STORAGE      4114    28.66%   5.07%   1.72us    4597.68us    14.84us (+-20.06%)
    HV_DECREMENTER       418     2.91%   4.26%   0.70us   30002.22us   122.58us (+-70.29%)
    EXTERNAL             392     2.73%   0.06%   0.64us     104.10us     1.94us (+-18.83%)
    RETURN_TO_HOST       287     2.00%  83.11%   1.53us  124240.15us  3486.52us (+-16.81%)
    H_INST_STORAGE         5     0.03%   0.00%   1.88us       3.73us     2.39us (+-14.20%)

    Total Samples:14357, Total events handled time:1203918.42us.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Scott Wood <scottwood@freescale.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1453962787-15376-3-git-send-email-hemant@linux.vnet.ibm.com Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> (cherry picked from linux-next commit 066d3593) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Hemant Kumar authored
BugLink: http://bugs.launchpad.net/bugs/1521678 This patch removes the "const" qualifier from kvm_events_tp declaration to account for the fact that some architectures may need to update this variable dynamically. For instance, powerpc will need to update this variable dynamically depending on the machine type. Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Scott Wood <scottwood@freescale.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1453962787-15376-2-git-send-email-hemant@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> (cherry picked from linux-next commit 48deaa74) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Hemant Kumar authored
BugLink: http://bugs.launchpad.net/bugs/1521678 It's better to remove the dependency on uapi/kvm_perf.h to allow dynamic discovery of kvm events (if it's needed). To do this, some extern variables have been introduced with which we can keep the generic functions generic. Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com> Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com> Acked-by: David Ahern <dsahern@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Scott Wood <scottwood@freescale.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1453962787-15376-1-git-send-email-hemant@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> (cherry picked from linux-next commit 162607ea) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Dan Streetman authored
BugLink: http://bugs.launchpad.net/bugs/1505564 Make the "Attempted send on closed socket" error messages generated in nbd_request_handler() ratelimited. When the nbd socket is shutdown, the nbd_request_handler() function emits an error message for every request remaining in its queue. If the queue is large, this will spam a large amount of messages to the log. There's no need for a separate error message for each request, so this patch ratelimits it. In the specific case this was found, the system was virtual and the error messages were logged to the serial port, which overwhelmed it. Fixes: 4d48a542 ("nbd: fix I/O hang on disconnected nbds") Signed-off-by: Dan Streetman <dan.streetman@canonical.com> Signed-off-by: Markus Pargmann <mpa@pengutronix.de> (cherry-picked from commit da6ccaaa git://git.pengutronix.de/git/mpa/linux-nbd.git) Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
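The effect of rate limiting a per-request error message can be modelled in user space with a simple window-and-burst counter. This is a sketch of the general technique, not the nbd driver's code: `struct ratelimit` and `ratelimit_allow` are hypothetical names, and time is passed in explicitly as abstract ticks so the behaviour is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limiter: at most `burst` messages per `interval` ticks. */
struct ratelimit {
    long interval;   /* window length in ticks */
    int  burst;      /* messages allowed per window */
    long begin;      /* start of the current window */
    int  printed;    /* messages emitted in the current window */
    int  missed;     /* messages suppressed so far */
};

/* Returns true if a message may be emitted at time `now`. */
static bool ratelimit_allow(struct ratelimit *rl, long now)
{
    assert(rl != NULL && rl->interval > 0 && rl->burst > 0);
    if (now - rl->begin >= rl->interval) {
        /* A new window starts: reset the per-window counter. */
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++;
    return false;
}
```

If a queue of 1000 requests is failed in one go with a burst of 10, only the first 10 messages are emitted and the other 990 are counted as suppressed, which is the kind of reduction that keeps a slow serial console from being overwhelmed.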
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Under SRIOV there might be a case where VFs are loaded without a pre-assigned MAC address. In this case, the VF will randomize its own MAC. This addresses the case of an administrator not assigning a MAC to the VF through the PF OS APIs, and keeps udev happy. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 108805fc) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
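A randomized MAC address is conventionally made unicast and locally administered by adjusting the two low bits of the first octet. The sketch below illustrates that convention in user space; the function names are hypothetical, `rand()` stands in for a proper random source, and treating an all-zero address as "not pre-assigned" is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAC_LEN 6

/* True if every octet of the address is zero. */
static bool mac_is_zero(const uint8_t mac[MAC_LEN])
{
    for (int i = 0; i < MAC_LEN; i++) {
        if (mac[i] != 0)
            return false;
    }
    return true;
}

/* Fills mac with random octets, then forces unicast + locally administered. */
static void mac_randomize(uint8_t mac[MAC_LEN])
{
    for (int i = 0; i < MAC_LEN; i++)
        mac[i] = (uint8_t)(rand() & 0xff);
    mac[0] &= 0xfe;   /* clear the multicast (group) bit */
    mac[0] |= 0x02;   /* set the locally administered bit */
}

/* Randomizes only when no address was pre-assigned; returns true if it did. */
static bool vf_ensure_mac(uint8_t mac[MAC_LEN])
{
    assert(mac != NULL);
    if (!mac_is_zero(mac))
        return false;
    mac_randomize(mac);
    return true;
}
```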
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 E-Switch capabilities should be queried only if the E-Switch flow table is supported, and not merely when the function is the vport group manager. Fixes: d6666753 ("net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 9bd0a185) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Implement and enable SR-IOV ndos to manage SR-IOV configuration via netdev netlink API. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 66e49ded) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Add support to get VF statistics using query vport counter command. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 3b751a2a) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Add query and modify functions to control client vlan and qos stripping or insertion, in E-Switch vport contexts. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 9e7ea352) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 The E-Switch vport context, unlike the NIC vport context, is managed by the E-Switch manager or vport_group_manager and not by the NIC (VF) driver. The E-Switch manager can access (read/modify) the E-Switch context of any of its vports. Currently the E-Switch vport context includes only client and server vlan insertion and stripping data (for later support of VST mode). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit d6666753) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Implement set VF mac/link state and query VF config to be used later in netdev VF ndos or any other management API. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 77256579) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Enabling E-Switch SRIOV for nvfs+1 vports. Create E-Switch FDB for L2 UC/MC mac steering between VFs/PF and external vport (Uplink). FDB contains forwarding rules such as: UC MAC0 -> vport0(PF). UC MAC1 -> vport1. UC MAC2 -> vport2. MC MACX -> vport0, vport2, Uplink. MC MACY -> vport1, Uplink. For unmatched traffic FDB has the following default rules: Unmatched Traffic (src vport != Uplink) -> Uplink. Unmatched Traffic (src vport == Uplink) -> vport0(PF). FDB rules population: Each NIC vport (VF) will notify E-Switch manager of its UC/MC vport context changes via modify vport context command, which will be translated to an event that will be handled by E-Switch manager (PF) which will update FDB table accordingly. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 81848731) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
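The unicast part of the forwarding rules listed above, together with the two default rules for unmatched traffic, can be sketched as a table lookup. This is a user-space illustration only: `struct fdb_entry`, `fdb_forward`, and the `VPORT_UPLINK` value are hypothetical, the real FDB is a hardware flow table, and multicast fan-out is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VPORT_PF     0        /* vport0 is the PF */
#define VPORT_UPLINK 0xFFFF   /* hypothetical id for the uplink vport */

/* One unicast rule: destination MAC -> vport. */
struct fdb_entry {
    uint8_t mac[6];
    int     vport;
};

/* Returns the destination vport for a unicast frame. */
static int fdb_forward(const struct fdb_entry *tbl, size_t n,
                       const uint8_t dmac[6], int src_vport)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (memcmp(tbl[i].mac, dmac, 6) == 0)
            return tbl[i].vport;
    }
    /* Default rules: traffic from the uplink goes to the PF, everything else to the uplink. */
    return src_vport == VPORT_UPLINK ? VPORT_PF : VPORT_UPLINK;
}
```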
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Define the hardware structures and capabilities needed for E-Switch FDB flow tables and read them on driver load. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 495716b1) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 E-Switch is the software entity that represents and manages ConnectX4 inter-HCA ethernet l2 switching. E-Switch has its own Virtual Ports, each Vport/vNIC/VF can be connected to the device through a vport of an e-switch. Each e-switch is managed by one vNIC identified by HCA_CAP.vport_group_manager (usually it is the PF/vport[0]), and its main responsibility is to forward each packet to the right vport. e-Switch needs to manage its own l2-table and FDB tables. L2 table is a flow table that is managed by FW, it is needed for Multi-host (Multi PF) configuration for inter HCA switching between PFs. FDB table is a flow table that is totally managed by e-Switch driver, its main responsibility is to switch packets between e-Switch internal vports and uplink vport that belong to the same. This patch introduces only e-Switch l2 table management, FDB management will come later when ethernet SRIOV/VFs will be enabled. Preparation for ethernet sriov and l2 table management. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 073bb189) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Each Vport/vNIC must notify the underlying e-Switch layer of vlan table changes in order to update SR-IOV FDB tables. We do that at vlan_rx_add_vid and vlan_rx_kill_vid ndos. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit aad9e6e4) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Saeed Mahameed authored
BugLink: http://bugs.launchpad.net/bugs/1540435 Each Vport/vNIC must notify the underlying e-Switch layer of UC/MC list and promisc mode updates, in order to update l2 tables and SR-IOV FDB tables. We do that at set_rx_mode ndo. Preparation for ethernet-SRIOV and l2 table management. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit 5e55da1d) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-