- 06 Apr, 2016 19 commits
-
-
Tim Gardner authored
Ignore: yes Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Tim Gardner authored
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
This is still an experimental feature, so disable it by default and allow it only when the system administrator supplies the userns_mounts=true module parameter. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
This is still an experimental feature, so disable it by default and allow it only when the system administrator supplies the userns_mounts=true module parameter. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Support unprivileged mounting of ext4 volumes from user namespaces. This requires the following changes: - Perform all uid and gid conversions to/from disk relative to s_user_ns. In many cases this will already be handled by the vfs helper functions. This also requires updates to handle cases where ids may not map into s_user_ns. - Update most capability checks to check for capabilities in s_user_ns rather than init_user_ns. These mostly reflect changes to the filesystem that a user in s_user_ns could already make externally by virtue of having write access to the backing device. - Restrict unsafe options in either the mount options or the ext4 superblock. Currently the only concerning option is errors=panic, and this is made to require CAP_SYS_ADMIN in init_user_ns. - Verify that unprivileged users have the required access to the journal device at the path passed via the journal_path mount option. Note that for the journal_path and the journal_dev mount options, and for external journal devices specified in the ext4 superblock, devcgroup restrictions will be enforced by __blkdev_get(), (via blkdev_get_by_dev()), ensuring that the user has been granted appropriate access to the block device. - Set the FS_USERNS_MOUNT flag on the filesystem types supported by ext4. sysfs attributes for ext4 mounts remain writable only by real root. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
For unprivileged mounts to be safe the user must not be able to make changes to the backing store while it is mounted. This patch takes a step towards preventing this by refusing to mount in a user namepspace if the block device is open for writing and refusing attempts to open the block device for writing by non- root while it is mounted in a user namespace. To prevent this from happening we use i_writecount in the inodes of the bdev filesystem similarly to how it is used for regular files. Whenever the device is opened for writing i_writecount is checked; if it is negative the open returns -EBUSY, otherwise i_writecount is incremented. On mount, a positive i_writecount results in mount_bdev returning -EBUSY, otherwise i_writecount is decremented. Opens by root and mounts from init_user_ns do not check nor modify i_writecount. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
The user in control of a super block should be allowed to freeze and thaw it. Relax the restrictions on the FIFREEZE and FITHAW ioctls to require CAP_SYS_ADMIN in s_user_ns. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
The EVM HMAC should be calculated using the on disk user and group ids, so the k[ug]ids in the inode must be translated relative to the s_user_ns of the inode's super block. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
When dealing with mounts from user namespaces quota user ids must be translated relative to s_user_ns, especially when they will be stored on disk. For purposes of the in-kernel hash table using init_user_ns is still okay and may help reduce hash collisions, so continue using init_user_ns there. These changes introduce the possibility that the conversion operations in struct qtree_fmt_operations could fail if an id cannot be legitimately converted. Change these operations to return int rather than void, and update the callers to handle failures. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
For filesystems mounted from a user namespace on-disk ids should be translated relative to s_users_ns rather than init_user_ns. When an id in the filesystem doesn't exist in s_user_ns the associated id in the inode will be set to INVALID_[UG]ID, which turns these into de facto "nobody" ids. This actually maps pretty well into the way most code already works, and those places where it didn't were fixed in previous patches. Moving forward vfs code needs to be careful to handle instances where ids in inodes may be invalid. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Unprivileged users should not be able to mount mtd block devices when they lack sufficient privileges towards the block device inode. Update mount_mtd() to validate that the user has the required access to the inode at the specified path. The check will be skipped for CAP_SYS_ADMIN, so privileged mounts will continue working as before. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Unprivileged users are normally restricted from mounting with the allow_other option by system policy, but this could be bypassed for a mount done with user namespace root permissions. In such cases allow_other should not allow users outside the userns to access the mount as doing so would give the unprivileged user the ability to manipulate processes it would otherwise be unable to manipulate. Restrict allow_other to apply to users in the same userns used at mount or a descendant of that namespace. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
In order to support mounts from namespaces other than init_user_ns, fuse must translate uids and gids to/from the userns of the process servicing requests on /dev/fuse. This patch does that, with a couple of restrictions on the namespace: - The userns for the fuse connection is fixed to the namespace from which /dev/fuse is opened. - The namespace must be the same as s_user_ns. These restrictions simplify the implementation by avoiding the need to pass around userns references and by allowing fuse to rely on the checks in inode_change_ok for ownership changes. Either restriction could be relaxed in the future if needed. For cuse the namespace used for the connection is also simply current_user_ns() at the time /dev/cuse is opened. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
If the userspace process servicing fuse requests is running in a pid namespace then pids passed via the fuse fd need to be translated relative to that namespace. Capture the pid namespace in use when the filesystem is mounted and use this for pid translation. Since no use case currently exists for changing namespaces all translations are done relative to the pid namespace in use when /dev/fuse is opened. Mounting or /dev/fuse IO from another namespace will return errors. Requests from processes whose pid cannot be translated into the target namespace are not permitted, except for requests allocated via fuse_get_req_nofail_nopages. For no-fail requests in.h.pid will be 0 if the pid translation fails. File locking changes based on previous work done by Eric Biederman. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
A privileged user in s_user_ns will generally have the ability to manipulate the backing store and insert security.* xattrs into the filesystem directly. Therefore the kernel must be prepared to handle these xattrs from unprivileged mounts, and it makes little sense for commoncap to prevent writing these xattrs to the filesystem. The capability and LSM code have already been updated to appropriately handle xattrs from unprivileged mounts, so it is safe to loosen this restriction on setting xattrs. The exception to this logic is that writing xattrs to a mounted filesystem may also cause the LSM inode_post_setxattr or inode_setsecurity callbacks to be invoked. SELinux will deny the xattr update by virtue of applying mountpoint labeling to unprivileged userns mounts, and Smack will deny the writes for any user without global CAP_MAC_ADMIN, so loosening the capability check in commoncap is safe in this respect as well. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Superblock level remounts are currently restricted to global CAP_SYS_ADMIN, as is the path for changing the root mount to read only on umount. Loosen both of these permission checks to also allow CAP_SYS_ADMIN in any namespace which is privileged towards the userns which originally mounted the filesystem. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Expand the check in should_remove_suid() to keep privileges for CAP_FSETID in s_user_ns rather than init_user_ns. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
- 29 Feb, 2016 21 commits
-
-
Seth Forshee authored
ids in on-disk ACLs should be converted to s_user_ns instead of init_user_ns as is done now. This introduces the possibility for id mappings to fail, and when this happens syscalls will return EOVERFLOW. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Add checks to inode_change_ok to verify that uid and gid changes will map into the superblock's user namespace. If they do not fail with -EOVERFLOW. This cannot be overriden with ATTR_FORCE. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Using INVALID_[UG]ID for the LSM file creation context doesn't make sense, so return an error if the inode passed to set_create_file_as() has an invalid id. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Filesystem uids which don't map into a user namespace may result in inode->i_uid being INVALID_UID. A symlink and its parent could have different owners in the filesystem can both get mapped to INVALID_UID, which may result in following a symlink when this would not have otherwise been permitted when protected symlinks are enabled. Add a new helper function, uid_valid_eq(), and use this to validate that the ids in may_follow_link() are both equal and valid. Also add an equivalent helper for gids, which is currently unused. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
The SMACK64, SMACK64EXEC, and SMACK64MMAP labels are all handled differently in untrusted mounts. This is confusing and potentically problematic. Change this to handle them all the same way that SMACK64 is currently handled; that is, read the label from disk and check it at use time. For SMACK64 and SMACK64MMAP access is denied if the label does not match smk_root. To be consistent with suid, a SMACK64EXEC label which does not match smk_root will still allow execution of the file but will not run with the label supplied in the xattr. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
All current callers of in_userns pass current_user_ns as the first argument. Simplify by replacing in_userns with current_in_userns which checks whether current_user_ns is in the namespace supplied as an argument. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Security labels from unprivileged mounts in user namespaces must be ignored. Force superblocks from user namespaces whose labeling behavior is to use xattrs to use mountpoint labeling instead. For the mountpoint label, default to converting the current task context into a form suitable for file objects, but also allow the policy writer to specify a different label through policy transition rules. Pieced together from code snippets provided by Stephen Smalley. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Andy Lutomirski authored
If a process gets access to a mount from a different user namespace, that process should not be able to take advantage of setuid files or selinux entrypoints from that filesystem. Prevent this by treating mounts from other mount namespaces and those not owned by current_user_ns() or an ancestor as nosuid. This will make it safer to allow more complex filesystems to be mounted in non-root user namespaces. This does not remove the need for MNT_LOCK_NOSUID. The setuid, setgid, and file capability bits can no longer be abused if code in a user namespace were to clear nosuid on an untrusted filesystem, but this patch, by itself, is insufficient to protect the system from abuse of files that, when execed, would increase MAC privilege. As a more concrete explanation, any task that can manipulate a vfsmount associated with a given user namespace already has capabilities in that namespace and all of its descendents. If they can cause a malicious setuid, setgid, or file-caps executable to appear in that mount, then that executable will only allow them to elevate privileges in exactly the set of namespaces in which they are already privileges. On the other hand, if they can cause a malicious executable to appear with a dangerous MAC label, running it could change the caller's security context in a way that should not have been possible, even inside the namespace in which the task is confined. As a hardening measure, this would have made CVE-2014-5207 much more difficult to exploit. Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Unprivileged users should not be able to mount block devices when they lack sufficient privileges towards the block device inode. Update blkdev_get_by_path() to validate that the user has the required access to the inode at the specified path. The check will be skipped for CAP_SYS_ADMIN, so privileged mounts will continue working as before. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
When looking up a block device by path no permission check is done to verify that the user has access to the block device inode at the specified path. In some cases it may be necessary to check permissions towards the inode, such as allowing unprivileged users to mount block devices in user namespaces. Add an argument to lookup_bdev() to optionally perform this permission check. A value of 0 skips the permission check and behaves the same as before. A non-zero value specifies the mask of access rights required towards the inode at the specified path. The check is always skipped if the user has CAP_SYS_ADMIN. All callers of lookup_bdev() currently pass a mask of 0, so this patch results in no functional change. Subsequent patches will add permission checks where appropriate. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Security labels from unprivileged mounts cannot be trusted. Ideally for these mounts we would assign the objects in the filesystem the same label as the inode for the backing device passed to mount. Unfortunately it's currently impossible to determine which inode this is from the LSM mount hooks, so we settle for the label of the process doing the mount. This label is assigned to s_root, and also to smk_default to ensure that new inodes receive this label. The transmute property is also set on s_root to make this behavior more explicit, even though it is technically not necessary. If a filesystem has existing security labels, access to inodes is permitted if the label is the same as smk_root, otherwise access is denied. The SMACK64EXEC xattr is completely ignored. Explicit setting of security labels continues to require CAP_MAC_ADMIN in init_user_ns. Altogether, this ensures that filesystem objects are not accessible to subjects which cannot already access the backing store, that MAC is not violated for any objects in the fileystem which are already labeled, and that a user cannot use an unprivileged mount to gain elevated MAC privileges. sysfs, tmpfs, and ramfs are already mountable from user namespaces and support security labels. We can't rule out the possibility that these filesystems may already be used in mounts from user namespaces with security lables set from the init namespace, so failing to trust lables in these filesystems may introduce regressions. It is safe to trust labels from these filesystems, since the unprivileged user does not control the backing store and thus cannot supply security labels, so an explicit exception is made to trust labels from these filesystems. Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Capability sets attached to files must be ignored except in the user namespaces where the mounter is privileged, i.e. s_user_ns and its descendants. Otherwise a vector exists for gaining privileges in namespaces where a user is not already privileged. Add a new helper function, in_user_ns(), to test whether a user namespace is the same as or a descendant of another namespace. Use this helper to determine whether a file's capability set should be applied to the caps constructed during exec. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Seth Forshee authored
Initially this will be used to eliminate the implicit MNT_NODEV flag for mounts from user namespaces. In the future it will also be used for translating ids and checking capabilities for filesystems mounted from user namespaces. s_user_ns is initialized in alloc_super() and is generally set to current_user_ns(). To avoid security and corruption issues, two additional mount checks are also added: - do_new_mount() gains a check that the user has CAP_SYS_ADMIN in current_user_ns(). - sget() will fail with EBUSY when the filesystem it's looking for is already mounted from another user namespace. proc requires some special handling. The user namespace of current isn't appropriate when forking as a result of clone (2) with CLONE_NEWPID|CLONE_NEWUSER, as it will set s_user_ns to the namespace of the parent and make proc unmountable in the new user namespace. Instead, the user namespace which owns the new pid namespace is used. sget_userns() is allowed to allow passing in a namespace other than that of current, and sget becomes a wrapper around sget_userns() which passes current_user_ns(). Changes to original version of this patch * Documented @user_ns in sget_userns, alloc_super and fs.h * Kept an blank line in fs.h * Removed unncessary include of user_namespace.h from fs.h * Tweaked the location of get_user_ns and put_user_ns so the security modules can (if they wish) depend on it. -- EWB Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Xiangliang Yu authored
BugLink: http://bugs.launchpad.net/bugs/1542071 This adds support for AMD's PCI-Express Non-Transparent Bridge (NTB) device on the Zeppelin platform. The driver connnects to the standard NTB sub-system interface, with modification to add hooks for power management in a separate patch. The AMD NTB device has 3 memory windows, 16 doorbell, 16 scratch-pad registers, and supports up to 16 PCIe lanes running a Gen3 speeds. Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Reviewed-by: Allen Hubbe <Allen.Hubbe@emc.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> (cherry picked from commit a1b36958) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Tim Gardner authored
BugLink: http://bugs.launchpad.net/bugs/1542071Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Joseph Salisbury authored
BugLink: http://bugs.launchpad.net/bugs/1495983 OriginalAuthor: Olaf Hering <olaf@aepfle.de> Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com> Acked-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Brad Figg <brad.figg@canonical.com> Acked-by: Tim Gardner <tim.gardner@canonical.com> Signed-off-by: Brad Figg <brad.figg@canonical.com>
-
Tim Gardner authored
BugLink: http://bugs.launchpad.net/bugs/1545542Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Suman Tripathi authored
ahci_xgene: Implement the workaround to fix the missing of the edge interrupt for the HOST_IRQ_STAT. Due to H/W errata, the HOST_IRQ_STAT register misses the edge interrupt when clearing the HOST_IRQ_STAT register and hardware reporting the PORT_IRQ_STAT register happens to be at the same clock cycle. Signed-off-by: Suman Tripathi <stripathi@apm.com> Signed-off-by: Tejun Heo <tj@kernel.org> (cherry picked from linux-next commit 32aea268) Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Suman Tripathi authored
The flexibility to override the irq handles in the LLD's are already present, so controllers implementing a edge trigger latch can implement their own interrupt handler inside the driver. This patch removes the AHCI_HFLAG_EDGE_IRQ support from libahci and moves edge irq handling to ahci_xgene. tj: Minor update to description. Signed-off-by: Suman Tripathi <stripathi@apm.com> Signed-off-by: Tejun Heo <tj@kenrel.org> (cherry picked from linux-next commit d867b95f) [ dannf: offset adjustments ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Suman Tripathi authored
This patch implements the capability to override the generic AHCI interrupt handler so that specific ahci drivers can implement their own custom interrupt handler routines. It also exports ahci_handle_port_intr so that custom irq_handler implementations can use it. tj: s/ahci_irq_handler/irq_handler/ and updated description. Signed-off-by: Suman Tripathi <stripathi@apm.com> Signed-off-by: Tejun Heo <tj@kernel.org> (cherry picked from linux-next commit f070d671) [ dannf: backported to v4.4 ] Signed-off-by: dann frazier <dann.frazier@canonical.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Nicholas Krause authored
This adds the needed check after the call to the function mraid_mm_alloc_kioc in order to make sure that this function has not returned NULL and therefore makes sure we do not deference a NULL pointer if one is returned by mraid_mm_alloc_kioc. Further more add needed comments explaining that this function call can return NULL if the list head is empty for the pointer passed in order to allow furture users to understand this required pointer check. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Acked-by: Sumit Saxena <sumit.saxena@avagotech.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> (cherry picked from commit 7296f62f) Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-