IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Function ima_appraise_flag() returns the flag to be set in
temp_ima_appraise depending on the hook identifier passed as an argument.
It is not necessary to set the flag again for the POLICY_CHECK hook.
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Function hash_long() accepts unsigned long, while currently only one byte
is passed from ima_hash_key(), which calculates a key for ima_htable.
Given that hashing the digest does not give clear benefits compared to
using the digest itself, remove hash_long() and return the modulus
calculated on the first two bytes of the digest with the number of slots.
Also reduce the depth of the hash table by doubling the number of slots.
Cc: stable@vger.kernel.org
Fixes: 3323eec921 ("integrity: IMA as an integrity service provider")
Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Acked-by: David.Laight@aculab.com (big endian system concerns)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This patch fixes the following warning and few other instances of
traversal of evm_config_xattrnames list:
[ 32.848432] =============================
[ 32.848707] WARNING: suspicious RCU usage
[ 32.848966] 5.7.0-rc1-00006-ga8d5875ce5f0b #1 Not tainted
[ 32.849308] -----------------------------
[ 32.849567] security/integrity/evm/evm_main.c:231 RCU-list traversed in non-reader section!!
Since entries are only added to the list and never deleted, use
list_for_each_entry_lockless() instead of list_for_each_entry_rcu for
traversing the list. Also, add a relevant comment in evm_secfs.c to
indicate this fact.
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org> (RCU viewpoint)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This patch fixes the return value of ima_write_policy() when a new policy
is directly passed to IMA and the current policy requires appraisal of the
file containing the policy. Currently, if appraisal is not in ENFORCE mode,
ima_write_policy() returns 0 and leads user space applications to an
endless loop. Fix this issue by denying the operation regardless of the
appraisal mode.
Cc: stable@vger.kernel.org # 4.10.x
Fixes: 19f8a84713 ("ima: measure and appraise the IMA policy itself")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This patch avoids a kernel panic due to accessing an error pointer set by
crypto_alloc_shash(). It occurs especially when there are many files that
require an unsupported algorithm, as it would increase the likelihood of
the following race condition:
Task A: *tfm = crypto_alloc_shash() <= error pointer
Task B: if (*tfm == NULL) <= *tfm is not NULL, use it
Task B: rc = crypto_shash_init(desc) <= panic
Task A: *tfm = NULL
This patch uses the IS_ERR_OR_NULL macro to determine whether or not a new
crypto context must be created.
Cc: stable@vger.kernel.org
Fixes: d46eb36995 ("evm: crypto hash replaced by shash")
Co-developed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Commit a408e4a86b ("ima: open a new file instance if no read
permissions") tries to create a new file descriptor to calculate a file
digest if the file has not been opened with O_RDONLY flag. However, if a
new file descriptor cannot be obtained, it sets the FMODE_READ flag to
file->f_flags instead of file->f_mode.
This patch fixes this issue by replacing f_flags with f_mode as it was
before that commit.
Cc: stable@vger.kernel.org # 4.20.x
Fixes: a408e4a86b ("ima: open a new file instance if no read permissions")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
The inode_smack cache is no longer used.
Remove it.
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
"smk_lock" mutex is used during inode instantiation in
smack_d_instantiate()function. It has been used to avoid
simultaneous access on same inode security structure.
Since smack related initialization is done only once i.e during
inode creation. If the inode has already been instantiated then
smack_d_instantiate() function just returns without doing
anything.
So it means mutex lock is required only during inode creation.
But since 2 processes can't create same inodes or files
simultaneously. Also linking or some other file operation can't
be done simultaneously when the file is getting created since
file lookup will fail before dentry inode linkup which is done
after smack initialization.
So no mutex lock is required in inode_smack structure.
It will save memory as well as improve some performance.
If 40000 inodes are created in system, it will save 1.5 MB on
32-bit systems & 2.8 MB on 64-bit systems.
Signed-off-by: Vishal Goel <vishal.goel@samsung.com>
Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
The mix of IS_ENABLED() and #ifdef checks has left a combination
that causes a warning about an unused variable:
security/smack/smack_lsm.c: In function 'smack_socket_connect':
security/smack/smack_lsm.c:2838:24: error: unused variable 'sip' [-Werror=unused-variable]
2838 | struct sockaddr_in6 *sip = (struct sockaddr_in6 *)sap;
Change the code to use C-style checks consistently so the compiler
can handle it correctly.
Fixes: 87fbfffcc8 ("broken ping to ipv6 linklocal addresses on debian buster")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
It is simpler to allocate them statically in the corresponding
structure, avoiding unnecessary kmalloc() calls and pointer
dereferencing.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: manual merging required in policydb.c]
Signed-off-by: Paul Moore <paul@paul-moore.com>
The value of rc is still zero from the last assignment when the error
path is taken. Fix it by setting it to -ENOMEM before the
hashtab_create() call.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e67b2ec9f6 ("selinux: store role transitions in a hash table")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
No need to traverse the hashtab to count its elements, hashtab already
tracks it for us.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Fix to return negative error code -ENOMEM from the kvcalloc() error
handling case instead of 0, as done elsewhere in this function.
Fixes: acdf52d97f ("selinux: convert to kvmalloc")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl6rPswUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMJBQ//dAU7VS01kQUUsFjd8xUIOk9aSbNy
gjFkzcpbTsS4Mhqk0FSP3mfqDWP3lvxt9gx6WfnCf+a2KE3eTtf9bISW2OB2evIl
9ydae6frJLiP6yIeAEZBb1PBQ6AxwBT8j8drKi57sOBC8rkmF66wiMaG2nybYW/j
rvkOQCFtWj/A3b+Y7y8fVs8sjTHWvcsvkN7kwYmmdjyn7h/C1Tqc6TOOrt1jtLUG
dgeak9bCIvK7JB/W6squ1iKqvkJhld7h5fZn6WB/6Xd1DKD1LVjGT8HsKpI/ei49
0tAybqaLv8WxVc5ZGcAGoTt/X0hq3lXRiMG1Qgmed85wxjrLEpU12L6yprEtgtao
0yY1JNizuC3Ehbi02o4gHf+RffLPWDrT8Kmu00/IuridCesNZCrEFpbAZmOwPU67
nFufU0YlSnsVJ63C8TMhkI2eg/VejGjN4I2PEgcxEbZKBBW+nAcJfoKl1y2tzEo9
ZNdZcetY9yJdpewjsF6VgsXs4qUrm1NUiG8pCXdK23+w/qYvZ4UqoYfoRNYIuRS0
nRN40OkRYN6GzDZ+NCPYqgIhoEpps0p96VYQI6mp+PyOpwlCq8epKVjXD/TkteVG
mevM2Ffy8xVaL47ufXxAHn+pA6F6Mdmo/rIwe+U5Olase96DTcFr90JPmVz58mcP
6lWZhje3wS3mSpk=
=HTrZ
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20200430' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux fixes from Paul Moore:
"Two more SELinux patches to fix problems in the v5.7-rcX releases.
Wei Yongjun's patch fixes a return code in an error path, and my patch
fixes a problem where we were not correctly applying access controls
to all of the netlink messages in the netlink_send LSM hook"
* tag 'selinux-pr-20200430' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: properly handle multiple messages in selinux_netlink_send()
selinux: fix error return code in cond_read_list()
Fix the SELinux netlink_send hook to properly handle multiple netlink
messages in a single sk_buff; each message is parsed and subject to
SELinux access control. Prior to this patch, SELinux only inspected
the first message in the sk_buff.
Cc: stable@vger.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull in Christoph Hellwig's series that changes the sysctl's ->proc_handler
methods to take kernel pointers instead. It gets rid of the set_fs address
space overrides used by BPF. As per discussion, pull in the feature branch
into bpf-next as it relates to BPF sysctl progs.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200427071508.GV23230@ZenIV.linux.org.uk/T/
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 60abd3181d ("selinux: convert cond_list to array")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from userspace in common code. This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.
As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
I thought I fixed the counting in filename_trans_read_helper() to count
the compat rule count correctly in the final version, but it's still
wrong. To really count the same thing as in the compat path, we'd need
to add up the cardinalities of stype bitmaps of all datums.
Since the kernel currently doesn't implement an ebitmap_cardinality()
function (and computing the proper count would just waste CPU cycles
anyway), just document that we use the field only in case of the old
format and stop updating it in filename_trans_read_helper().
Fixes: 4300590243 ("selinux: implement new format of filename transitions")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
kernel + tools/perf:
Alexey Budankov:
- Introduce CAP_PERFMON to kernel and user space.
callchains:
Adrian Hunter:
- Allow using Intel PT to synthesize callchains for regular events.
Kan Liang:
- Stitch LBR records from multiple samples to get deeper backtraces,
there are caveats, see the csets for details.
perf script:
Andreas Gerstmayr:
- Add flamegraph.py script
BPF:
Jiri Olsa:
- Synthesize bpf_trampoline/dispatcher ksymbol events.
perf stat:
Arnaldo Carvalho de Melo:
- Honour --timeout for forked workloads.
Stephane Eranian:
- Force error in fallback on :k events, to avoid counting nothing when
the user asks for kernel events but is not allowed to.
perf bench:
Ian Rogers:
- Add event synthesis benchmark.
tools api fs:
Stephane Eranian:
- Make xxx__mountpoint() more scalable
libtraceevent:
He Zhe:
- Handle return value of asprintf.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXp2LlQAKCRCyPKLppCJ+
J95oAP0ZihVUhESv/gdeX0IDE5g6Rd2V6LNcRj+jb7gX9NlQkwD/UfS454WV1ftQ
qTwrkKPzY/5Tm2cLuVE7r7fJ6naDHgU=
=FHm4
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-5.8-20200420' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core fixes and improvements from Arnaldo Carvalho de Melo:
kernel + tools/perf:
Alexey Budankov:
- Introduce CAP_PERFMON to kernel and user space.
callchains:
Adrian Hunter:
- Allow using Intel PT to synthesize callchains for regular events.
Kan Liang:
- Stitch LBR records from multiple samples to get deeper backtraces,
there are caveats, see the csets for details.
perf script:
Andreas Gerstmayr:
- Add flamegraph.py script
BPF:
Jiri Olsa:
- Synthesize bpf_trampoline/dispatcher ksymbol events.
perf stat:
Arnaldo Carvalho de Melo:
- Honour --timeout for forked workloads.
Stephane Eranian:
- Force error in fallback on :k events, to avoid counting nothing when
the user asks for kernel events but is not allowed to.
perf bench:
Ian Rogers:
- Add event synthesis benchmark.
tools api fs:
Stephane Eranian:
- Make xxx__mountpoint() more scalable
libtraceevent:
He Zhe:
- Handle return value of asprintf.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Before calculating a digest for each PCR bank, collisions were detected
with a SHA1 digest. This patch includes ima_hash_algo among the algorithms
used to calculate the template digest and checks collisions on that digest.
The position in the measurement entry array of the template digest
calculated with the IMA default hash algorithm is stored in the
ima_hash_algo_idx global variable and is determined at IMA initialization
time.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This patch modifies ima_calc_field_array_hash() to calculate a template
digest for each allocated PCR bank and SHA1. It also passes the tpm_digest
array of the template entry to ima_pcr_extend() or in case of a violation,
the pre-initialized digests array filled with 0xff.
Padding with zeros is still done if the mapping between TPM algorithm ID
and crypto ID is unknown.
This patch calculates again the template digest when a measurement list is
restored. Copying only the SHA1 digest (due to the limitation of the
current measurement list format) is not sufficient, as hash collision
detection will be done on the digest calculated with the IMA default hash
algorithm.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This patch creates a crypto_shash structure for each allocated PCR bank and
for SHA1 if a bank with that algorithm is not currently allocated.
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This patch dynamically allocates the array of tpm_digest structures in
ima_alloc_init_template() and ima_restore_template_data(). The size of the
array is equal to the number of PCR banks plus ima_extra_slots, to make
room for SHA1 and the IMA default hash algorithm, when PCR banks with those
algorithms are not allocated.
Calculating the SHA1 digest is mandatory, as SHA1 still remains the default
hash algorithm for the measurement list. When IMA will support the Crypto
Agile format, remaining digests will be also provided.
The position in the measurement entry array of the SHA1 digest is stored in
the ima_sha1_idx global variable and is determined at IMA initialization
time.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
In preparation for the patch that calculates a digest for each allocated
PCR bank, this patch passes to ima_calc_field_array_hash() the
ima_template_entry structure, so that digests can be directly stored in
that structure instead of ima_digest_data.
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Evaluate error in init_ima() before register_blocking_lsm_notifier() and
return if not zero.
Cc: stable@vger.kernel.org # 5.3.x
Fixes: b169424551 ("ima: use the lsm policy update notifier")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
boot_aggregate is the first entry of IMA measurement list. Its purpose is
to link pre-boot measurements to IMA measurements. As IMA was designed to
work with a TPM 1.2, the SHA1 PCR bank was always selected even if a
TPM 2.0 with support for stronger hash algorithms is available.
This patch first tries to find a PCR bank with the IMA default hash
algorithm. If it does not find it, it selects the SHA256 PCR bank for
TPM 2.0 and SHA1 for TPM 1.2. Ultimately, it selects SHA1 also for TPM 2.0
if the SHA256 PCR bank is not found.
If none of the PCR banks above can be found, boot_aggregate file digest is
filled with zeros, as for TPM bypass, making it impossible to perform a
remote attestation of the system.
Cc: stable@vger.kernel.org # 5.1.x
Fixes: 879b589210 ("tpm: retrieve digest size of unknown algorithms with PCR read")
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Implement a new, more space-efficient way of storing filename
transitions in the binary policy. The internal structures have already
been converted to this new representation; this patch just implements
reading/writing an equivalent represntation from/to the binary policy.
This new format reduces the size of Fedora policy from 7.6 MB to only
3.3 MB (with policy optimization enabled in both cases). With the
unconfined module disabled, the size is reduced from 3.3 MB to 2.4 MB.
The time to load policy into kernel is also shorter with the new format.
On Fedora Rawhide x86_64 it dropped from 157 ms to 106 ms; without the
unconfined module from 115 ms to 105 ms.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Now that context hash computation no longer depends on policydb, we can
simplify things by moving the context hashing completely under sidtab.
The hash is still cached in sidtab entries, but not for the in-flight
context structures.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Always hashing the string representation is inefficient. Just hash the
contents of the structure directly (using jhash). If the context is
invalid (str & len are set), then hash the string as before, otherwise
hash the structured data.
Since the context hashing function is now faster (about 10 times), this
patch decreases the overhead of security_transition_sid(), which is
called from many hooks.
The jhash function seemed as a good choice, since it is used as the
default hashing algorithm in rhashtable.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Jeff Vander Stoep <jeffv@google.com>
Tested-by: Jeff Vander Stoep <jeffv@google.com>
[PM: fixed some spelling errors in the comments pointed out by JVS]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Currently, they are stored in a linked list, which adds significant
overhead to security_transition_sid(). On Fedora, with 428 role
transitions in policy, converting this list to a hash table cuts down
its run time by about 50%. This was measured by running 'stress-ng --msg
1 --msg-ops 100000' under perf with and without this patch.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl6YmC0UHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPplBAAzu5Fi0grInLr/IGXQKN2ZWcnx6KC
OIo28vpBhie0Q9tRtHTux2ec57IBYGAVomhZDGWcHvVHdm84T3/+/Fnb/cL9FIBy
GX2XgQjvAIyIPsscnq47eHbGdAk8o9E1mxuGD7Sgyql5834j3XbRN1yoOMEXfIOg
0sDjv7/4EzIymI/jiEaZ6LyVA/bXT2L0CcXEyLD4RSUJEgBaejrx8k1jAwz2w/De
NoXUqSnRpzN+ti2T0u/kt77cnshmK7w5AyjedA340LAqtvpMIWseeFmeTvlxQeOK
bIZaTmwgGdkKo8hdgayns1/A3FNSr9lnlOOfn04/SpGHpGOvmC/b+xrw3ENJLHJG
r+hanFAKkUlYGVY3dK82g3gAbfRQL3n48Cb0qmujqlqfLLAwc5VG0AN8WfDm0c8D
kZEe3Hbf7NAx9KUOIfclcqYvDaCE7F6DyXJs2ToO0rHDyuWXJ6T6kPQtSGdB7Qd3
fzi8XsN6fS2yCxEDyymUxRt5V+cJ+eNUuc52p+RTes3xh+31TGeIWmRudeNFfDTx
XawXjypvZTxOfoo+3WcLq0qPVp9bc3lzORKAX28nSGb/6Ytijctf5iS3f1VmZVM8
whY7UiSkTCFwix4SE3MwzJ1+kzJVngHY2woYxC02E5Lw972tiVT8LORvLU6G6P2G
Nf4aDz3SNGiYM3o=
=/dym
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20200416' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux fix from Paul Moore:
"One small SELinux fix to ensure we cleanup properly on an error
condition"
* tag 'selinux-pr-20200416' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: free str on error in str_read()
If seq_file .next function does not change position index,
read after some lseek can generate unexpected output:
$ dd if=/proc/keys bs=1 # full usual output
0f6bfdf5 I--Q--- 2 perm 3f010000 1000 1000 user 4af2f79ab8848d0a: 740
1fb91b32 I--Q--- 3 perm 1f3f0000 1000 65534 keyring _uid.1000: 2
27589480 I--Q--- 1 perm 0b0b0000 0 0 user invocation_id: 16
2f33ab67 I--Q--- 152 perm 3f030000 0 0 keyring _ses: 2
33f1d8fa I--Q--- 4 perm 3f030000 1000 1000 keyring _ses: 1
3d427fda I--Q--- 2 perm 3f010000 1000 1000 user 69ec44aec7678e5a: 740
3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1
521+0 records in
521+0 records out
521 bytes copied, 0,00123769 s, 421 kB/s
But a read after lseek in middle of last line results in the partial
last line and then a repeat of the final line:
$ dd if=/proc/keys bs=500 skip=1
dd: /proc/keys: cannot skip to specified offset
g _uid_ses.1000: 1
3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1
0+1 records in
0+1 records out
97 bytes copied, 0,000135035 s, 718 kB/s
and a read after lseek beyond end of file results in the last line being
shown:
$ dd if=/proc/keys bs=1000 skip=1 # read after lseek beyond end of file
dd: /proc/keys: cannot skip to specified offset
3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1
0+1 records in
0+1 records out
76 bytes copied, 0,000119981 s, 633 kB/s
See https://bugzilla.kernel.org/show_bug.cgi?id=206283
Fixes: 1f4aace60b ("fs/seq_file.c: simplify seq_file iteration code ...")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce the CAP_PERFMON capability designed to secure system
performance monitoring and observability operations so that CAP_PERFMON
can assist CAP_SYS_ADMIN capability in its governing role for
performance monitoring and observability subsystems.
CAP_PERFMON hardens system security and integrity during performance
monitoring and observability operations by decreasing attack surface that
is available to a CAP_SYS_ADMIN privileged process [2]. Providing the access
to system performance monitoring and observability operations under CAP_PERFMON
capability singly, without the rest of CAP_SYS_ADMIN credentials, excludes
chances to misuse the credentials and makes the operation more secure.
Thus, CAP_PERFMON implements the principle of least privilege for
performance monitoring and observability operations (POSIX IEEE 1003.1e:
2.2.2.39 principle of least privilege: A security design principle that
states that a process or program be granted only those privileges
(e.g., capabilities) necessary to accomplish its legitimate function,
and only for the time that such privileges are actually required)
CAP_PERFMON meets the demand to secure system performance monitoring and
observability operations for adoption in security sensitive, restricted,
multiuser production environments (e.g. HPC clusters, cloud and virtual compute
environments), where root or CAP_SYS_ADMIN credentials are not available to
mass users of a system, and securely unblocks applicability and scalability
of system performance monitoring and observability operations beyond root
and CAP_SYS_ADMIN use cases.
CAP_PERFMON takes over CAP_SYS_ADMIN credentials related to system performance
monitoring and observability operations and balances amount of CAP_SYS_ADMIN
credentials following the recommendations in the capabilities man page [1]
for CAP_SYS_ADMIN: "Note: this capability is overloaded; see Notes to kernel
developers, below." For backward compatibility reasons access to system
performance monitoring and observability subsystems of the kernel remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN capability
usage for secure system performance monitoring and observability operations
is discouraged with respect to the designed CAP_PERFMON capability.
Although the software running under CAP_PERFMON can not ensure avoidance
of related hardware issues, the software can still mitigate these issues
following the official hardware issues mitigation procedure [2]. The bugs
in the software itself can be fixed following the standard kernel development
process [3] to maintain and harden security of system performance monitoring
and observability operations.
[1] http://man7.org/linux/man-pages/man7/capabilities.7.html
[2] https://www.kernel.org/doc/html/latest/process/embargoed-hardware-issues.html
[3] https://www.kernel.org/doc/html/latest/admin-guide/security-bugs.html
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: intel-gfx@lists.freedesktop.org
Cc: linux-doc@vger.kernel.org
Cc: linux-man@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: selinux@vger.kernel.org
Link: http://lore.kernel.org/lkml/5590d543-82c6-490a-6544-08e6a5517db0@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In commit 66f8e2f03c ("selinux: sidtab reverse lookup hash table") the
corresponding load is moved under the spin lock, so there is no race
possible and we can read the count directly. The smp_store_release() is
still needed to avoid racing with the lock-free readers.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
In [see "Fixes:"] I missed the fact that str_read() may give back an
allocated pointer even if it returns an error, causing a potential
memory leak in filename_trans_read_one(). Fix this by making the
function free the allocated string whenever it returns a non-zero value,
which also makes its behavior more obvious and prevents repeating the
same mistake in the future.
Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1461665 ("Resource leaks")
Fixes: c3a276111e ("selinux: optimize storage of filename transitions")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
fix below warnings reported by coccicheck
security/selinux/ss/mls.c:539:39-43: WARNING: Comparison to bool
security/selinux/ss/services.c:1815:46-50: WARNING: Comparison to bool
security/selinux/ss/services.c:1827:46-50: WARNING: Comparison to bool
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Original cgroup v2 eBPF code for filtering device access made it
possible to compile with CONFIG_CGROUP_DEVICE=n and still use the eBPF
filtering. Change
commit 4b7d4d453f ("device_cgroup: Export devcgroup_check_permission")
reverted this, making it required to set it to y.
Since the device filtering (and all the docs) for cgroup v2 is no longer
a "device controller" like it was in v1, someone might compile their
kernel with CONFIG_CGROUP_DEVICE=n. Then (for linux 5.5+) the eBPF
filter will not be invoked, and all processes will be allowed access
to all devices, no matter what the eBPF filter says.
Signed-off-by: Odin Ugedal <odin@ugedal.com>
Acked-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Currently it is possible to specify a state machine table with 0 length,
this is not valid as optional tables are specified by not defining
the table as present. Further this allows by-passing the base tables
range check against the next/check tables.
Fixes: d901d6a298 ("apparmor: dfa split verification of table headers")
Reported-by: Mike Salvatore <mike.salvatore@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl6AiV4ACgkQ+7dXa6fL
C2uCDg/+NevZHjgevGpFS9WByBtzFXKbewSSO0VZqS8RcFcy7+jRdTkcP66HJkGD
3JxfI4wQm2LOiaX8tHDlmWIPfp3G5Tnjae8peOEVBrCZ5WZMcD3CzquMv18kd8KK
5iiFzsTYDCywP/BwHCJgVQjPNpSp9drNCL5T+oql0nWeEUVAmiVziTIZgM9cAiyj
S/sfe76KxdNzaEEvpL5Mg/ieq1es/ssGA1jZ8jZI+YfN8mtBBH8KebckskKSgVTK
OjBLQAEanCbq3UsEqSsvEqbBpK7JkQJPOE153VRr6Nq/0MDSniwZjqIYOkgeB9pR
YStIrK9LyL/D0aMe8A52I7Ml1zuUUXb9zVAo3yLgubWPDYXvLs8n8Cgl8ZCQBFXy
0t86rlSq9SooGf3M2ket8Gk/XgymcRNP9LHr6MUGEO23l2ELBoO8hsWvUTAHoKVx
kn27S4YceW4+5UYfYm87ZQpUTbKPNATuBkts+QxiSrZMCnk82G6keA8JqNgKCe4d
xvUJew/JEGx2J0T8vYiBfpB1zEbtYdluglYyyzHpl0wkafYGRj1tTvwRZwQNhxw5
IFQvxEK2kVFTKcoLBGq909udb4QlQOfMAYal0u/4iCNiipNgxZawcWSfIMGN4qbG
tPvvBL2ocRmycmoMS7EHwWUSWIzwggi7zosDsM8jfLHBshBLeAE=
=vvN4
-----END PGP SIGNATURE-----
Merge tag 'keys-fixes-20200329' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull keyrings fixes from David Howells:
"Here's a couple of patches that fix a circular dependency between
holding key->sem and mm->mmap_sem when reading data from a key.
One potential issue is that a filesystem looking to use a key inside,
say, ->readpages() could deadlock if the key being read is the key
that's required and the buffer the key is being read into is on a page
that needs to be fetched.
The case actually detected is a bit more involved - with a filesystem
calling request_key() and locking the target keyring for write - which
could be being read"
* tag 'keys-fixes-20200329' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
KEYS: Avoid false positive ENOMEM error on key read
KEYS: Don't write out to userspace while holding key semaphore
Here are 3 SPDX patches for 5.7-rc1.
One fixes up the SPDX tag for a single driver, while the other two go
through the tree and add SPDX tags for all of the .gitignore files as
needed.
Nothing too complex, but you will get a merge conflict with your current
tree, that should be trivial to handle (one file modified by two things,
one file deleted.)
All 3 of these have been in linux-next for a while, with no reported
issues other than the merge conflict.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXodg5A8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykySQCgy9YDrkz7nWq6v3Gohl6+lW/L+rMAnRM4uTZm
m5AuCzO3Azt9KBi7NL+L
=2Lm5
-----END PGP SIGNATURE-----
Merge tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here are three SPDX patches for 5.7-rc1.
One fixes up the SPDX tag for a single driver, while the other two go
through the tree and add SPDX tags for all of the .gitignore files as
needed.
Nothing too complex, but you will get a merge conflict with your
current tree, that should be trivial to handle (one file modified by
two things, one file deleted.)
All three of these have been in linux-next for a while, with no
reported issues other than the merge conflict"
* tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
ASoC: MT6660: make spdxcheck.py happy
.gitignore: add SPDX License Identifier
.gitignore: remove too obvious comments
Pull integrity updates from Mimi Zohar:
"Just a couple of updates for linux-5.7:
- A new Kconfig option to enable IMA architecture specific runtime
policy rules needed for secure and/or trusted boot, as requested.
- Some message cleanup (eg. pr_fmt, additional error messages)"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: add a new CONFIG for loading arch-specific policies
integrity: Remove duplicate pr_fmt definitions
IMA: Add log statements for failure conditions
IMA: Update KBUILD_MODNAME for IMA files to ima
Pull networking updates from David Miller:
"Highlights:
1) Fix the iwlwifi regression, from Johannes Berg.
2) Support BSS coloring and 802.11 encapsulation offloading in
hardware, from John Crispin.
3) Fix some potential Spectre issues in qtnfmac, from Sergey
Matyukevich.
4) Add TTL decrement action to openvswitch, from Matteo Croce.
5) Allow paralleization through flow_action setup by not taking the
RTNL mutex, from Vlad Buslov.
6) A lot of zero-length array to flexible-array conversions, from
Gustavo A. R. Silva.
7) Align XDP statistics names across several drivers for consistency,
from Lorenzo Bianconi.
8) Add various pieces of infrastructure for offloading conntrack, and
make use of it in mlx5 driver, from Paul Blakey.
9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.
10) Lots of parallelization improvements during configuration changes
in mlxsw driver, from Ido Schimmel.
11) Add support to devlink for generic packet traps, which report
packets dropped during ACL processing. And use them in mlxsw
driver. From Jiri Pirko.
12) Support bcmgenet on ACPI, from Jeremy Linton.
13) Make BPF compatible with RT, from Thomas Gleixnet, Alexei
Starovoitov, and your's truly.
14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.
15) Fix sysfs permissions when network devices change namespaces, from
Christian Brauner.
16) Add a flags element to ethtool_ops so that drivers can more simply
indicate which coalescing parameters they actually support, and
therefore the generic layer can validate the user's ethtool
request. Use this in all drivers, from Jakub Kicinski.
17) Offload FIFO qdisc in mlxsw, from Petr Machata.
18) Support UDP sockets in sockmap, from Lorenz Bauer.
19) Fix stretch ACK bugs in several TCP congestion control modules,
from Pengcheng Yang.
20) Support virtual functiosn in octeontx2 driver, from Tomasz
Duszynski.
21) Add region operations for devlink and use it in ice driver to dump
NVM contents, from Jacob Keller.
22) Add support for hw offload of MACSEC, from Antoine Tenart.
23) Add support for BPF programs that can be attached to LSM hooks,
from KP Singh.
24) Support for multiple paths, path managers, and counters in MPTCP.
From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
and others.
25) More progress on adding the netlink interface to ethtool, from
Michal Kubecek"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
cxgb4/chcr: nic-tls stats in ethtool
net: dsa: fix oops while probing Marvell DSA switches
net/bpfilter: remove superfluous testing message
net: macb: Fix handling of fixed-link node
net: dsa: ksz: Select KSZ protocol tag
netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
net: stmmac: create dwmac-intel.c to contain all Intel platform
net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
net: dsa: bcm_sf2: Add support for matching VLAN TCI
net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
net: dsa: bcm_sf2: Disable learning for ASP port
net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
net: dsa: b53: Restore VLAN entries upon (re)configuration
net: dsa: bcm_sf2: Fix overflow checks
hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl6Ch6wUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPdcg/9FDMS/n0Xl1HQBUyu26EwLu3aUpNE
BdghXW1LKSTp7MrOENE60PGzZSAiC07ci1DqFd7zfLPZf2q5IwPwOBa/Avy8z95V
oHKqcMT6WO1SPOm/PxZn16FCKyET4gZDTXvHBAyiyFsbk36R522ZY615P9T3eLu/
ZA1NFsSjj68SqMCUlAWfeqjcbQiX63bryEpugOIg0qWy7R/+rtWxj9TjriZ+v9tq
uC45UcjBqphpmoPG8BifA3jjyclwO3DeQb5u7E8//HPPraGeB19ntsymUg7ljoGk
NrqCkZtv6E+FRCDTR5f0O7M1T4BWJodxw2NwngnTwKByLC25EZaGx80o+VyMt0eT
Pog+++JZaa5zZr2OYOtdlPVMLc2ALL6p/8lHOqFU3GKfIf04hWOm6/Lb2IWoXs3f
CG2b6vzoXYyjbF0Q7kxadb8uBY2S1Ds+CVu2HMBBsXsPdwbbtFWOT/6aRAQu61qn
PW+f47NR8By3SO6nMzWts2SZEERZNIEdSKeXHuR7As1jFMXrHLItreb4GCSPay5h
2bzRpxt2br5CDLh7Jv2pZnHtUqBWOSbslTix77+Z/hPKaNowvD9v3tc5hX87rDmB
dYXROD6/KoyXFYDcMdphlvORFhqGqd5bEYuHHum/VjSIRh237+/nxFY/vZ4i4bzU
2fvpAmUlVX1c4rw=
=LlWA
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux updates from Paul Moore:
"We've got twenty SELinux patches for the v5.7 merge window, the
highlights are below:
- Deprecate setting /sys/fs/selinux/checkreqprot to 1.
This flag was originally created to deal with legacy userspace and
the READ_IMPLIES_EXEC personality flag. We changed the default from
1 to 0 back in Linux v4.4 and now we are taking the next step of
deprecating it, at some point in the future we will take the final
step of rejecting 1.
- Allow kernfs symlinks to inherit the SELinux label of the parent
directory. In order to preserve backwards compatibility this is
protected by the genfs_seclabel_symlinks SELinux policy capability.
- Optimize how we store filename transitions in the kernel, resulting
in some significant improvements to policy load times.
- Do a better job calculating our internal hash table sizes which
resulted in additional policy load improvements and likely general
SELinux performance improvements as well.
- Remove the unused initial SIDs (labels) and improve how we handle
initial SIDs.
- Enable per-file labeling for the bpf filesystem.
- Ensure that we properly label NFS v4.2 filesystems to avoid a
temporary unlabeled condition.
- Add some missing XFS quota command types to the SELinux quota
access controls.
- Fix a problem where we were not updating the seq_file position
index correctly in selinuxfs.
- We consolidate some duplicated code into helper functions.
- A number of list to array conversions.
- Update Stephen Smalley's email address in MAINTAINERS"
* tag 'selinux-pr-20200330' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: clean up indentation issue with assignment statement
NFS: Ensure security label is set for root inode
MAINTAINERS: Update my email address
selinux: avtab_init() and cond_policydb_init() return void
selinux: clean up error path in policydb_init()
selinux: remove unused initial SIDs and improve handling
selinux: reduce the use of hard-coded hash sizes
selinux: Add xfs quota command types
selinux: optimize storage of filename transitions
selinux: factor out loop body from filename_trans_read()
security: selinux: allow per-file labeling for bpffs
selinux: generalize evaluate_cond_node()
selinux: convert cond_expr to array
selinux: convert cond_av_list to array
selinux: convert cond_list to array
selinux: sel_avc_get_stat_idx should increase position index
selinux: allow kernfs symlinks to inherit parent directory context
selinux: simplify evaluate_cond_node()
Documentation,selinux: deprecate setting checkreqprot to 1
selinux: move status variables out of selinux_ss
The assignment of e->type_names is indented one level too deep,
clean this up by removing the extraneous tab.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull EFI updates from Ingo Molnar:
"The EFI changes in this cycle are much larger than usual, for two
(positive) reasons:
- The GRUB project is showing signs of life again, resulting in the
introduction of the generic Linux/UEFI boot protocol, instead of
x86 specific hacks which are increasingly difficult to maintain.
There's hope that all future extensions will now go through that
boot protocol.
- Preparatory work for RISC-V EFI support.
The main changes are:
- Boot time GDT handling changes
- Simplify handling of EFI properties table on arm64
- Generic EFI stub cleanups, to improve command line handling, file
I/O, memory allocation, etc.
- Introduce a generic initrd loading method based on calling back
into the firmware, instead of relying on the x86 EFI handover
protocol or device tree.
- Introduce a mixed mode boot method that does not rely on the x86
EFI handover protocol either, and could potentially be adopted by
other architectures (if another one ever surfaces where one
execution mode is a superset of another)
- Clean up the contents of 'struct efi', and move out everything that
doesn't need to be stored there.
- Incorporate support for UEFI spec v2.8A changes that permit
firmware implementations to return EFI_UNSUPPORTED from UEFI
runtime services at OS runtime, and expose a mask of which ones are
supported or unsupported via a configuration table.
- Partial fix for the lack of by-VA cache maintenance in the
decompressor on 32-bit ARM.
- Changes to load device firmware from EFI boot service memory
regions
- Various documentation updates and minor code cleanups and fixes"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
efi/libstub/arm: Fix spurious message that an initrd was loaded
efi/libstub/arm64: Avoid image_base value from efi_loaded_image
partitions/efi: Fix partition name parsing in GUID partition entry
efi/x86: Fix cast of image argument
efi/libstub/x86: Use ULONG_MAX as upper bound for all allocations
efi: Fix a mistype in comments mentioning efivar_entry_iter_begin()
efi/libstub: Avoid linking libstub/lib-ksyms.o into vmlinux
efi/x86: Preserve %ebx correctly in efi_set_virtual_address_map()
efi/x86: Ignore the memory attributes table on i386
efi/x86: Don't relocate the kernel unless necessary
efi/x86: Remove extra headroom for setup block
efi/x86: Add kernel preferred address to PE header
efi/x86: Decompress at start of PE image load address
x86/boot/compressed/32: Save the output address instead of recalculating it
efi/libstub/x86: Deal with exit() boot service returning
x86/boot: Use unsigned comparison for addresses
efi/x86: Avoid using code32_start
efi/x86: Make efi32_pe_entry() more readable
efi/x86: Respect 32-bit ABI in efi32_pe_entry()
efi/x86: Annotate the LOADED_IMAGE_PROTOCOL_GUID with SYM_DATA
...
* The hooks are initialized using the definitions in
include/linux/lsm_hook_defs.h.
* The LSM can be enabled / disabled with CONFIG_BPF_LSM.
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-6-kpsingh@chromium.org
The information about the different types of LSM hooks is scattered
in two locations i.e. union security_list_options and
struct security_hook_heads. Rather than duplicating this information
even further for BPF_PROG_TYPE_LSM, define all the hooks with the
LSM_HOOK macro in lsm_hook_defs.h which is then used to generate all
the data structures required by the LSM framework.
The LSM hooks are defined as:
LSM_HOOK(<return_type>, <default_value>, <hook_name>, args...)
with <default_value> acccessible in security.c as:
LSM_RET_DEFAULT(<hook_name>)
Signed-off-by: KP Singh <kpsingh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: Florent Revest <revest@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: James Morris <jamorris@linux.microsoft.com>
Link: https://lore.kernel.org/bpf/20200329004356.27286-3-kpsingh@chromium.org
By allocating a kernel buffer with a user-supplied buffer length, it
is possible that a false positive ENOMEM error may be returned because
the user-supplied length is just too large even if the system do have
enough memory to hold the actual key data.
Moreover, if the buffer length is larger than the maximum amount of
memory that can be returned by kmalloc() (2^(MAX_ORDER-1) number of
pages), a warning message will also be printed.
To reduce this possibility, we set a threshold (PAGE_SIZE) over which we
do check the actual key length first before allocating a buffer of the
right size to hold it. The threshold is arbitrary, it is just used to
trigger a buffer length check. It does not limit the actual key length
as long as there is enough memory to satisfy the memory request.
To further avoid large buffer allocation failure due to page
fragmentation, kvmalloc() is used to allocate the buffer so that vmapped
pages can be used when there is not a large enough contiguous set of
pages available for allocation.
In the extremely unlikely scenario that the key keeps on being changed
and made longer (still <= buflen) in between 2 __keyctl_read_key()
calls, the __keyctl_read_key() calling loop in keyctl_read_key() may
have to be iterated a large number of times, but definitely not infinite.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Some .gitignore files have comments like "Generated files",
"Ignore generated files" at the header part, but they are
too obvious.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, when we add a new user key, the calltrace as below:
add_key()
key_create_or_update()
key_alloc()
__key_instantiate_and_link
generic_key_instantiate
key_payload_reserve
......
Since commit a08bf91ce2 ("KEYS: allow reaching the keys quotas exactly"),
we can reach max bytes/keys in key_alloc, but we forget to remove this
limit when we reserver space for payload in key_payload_reserve. So we
can only reach max keys but not max bytes when having delta between plen
and type->def_datalen. Remove this limit when instantiating the key, so we
can keep consistent with key_alloc.
Also, fix the similar problem in keyctl_chown_key().
Fixes: 0b77f5bfb4 ("keys: make the keyring quotas controllable through /proc/sys")
Fixes: a08bf91ce2 ("KEYS: allow reaching the keys quotas exactly")
Cc: stable@vger.kernel.org # 5.0.x
Cc: Eric Biggers <ebiggers@google.com>
Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Every time a new architecture defines the IMA architecture specific
functions - arch_ima_get_secureboot() and arch_ima_get_policy(), the IMA
include file needs to be updated. To avoid this "noise", this patch
defines a new IMA Kconfig IMA_SECURE_AND_OR_TRUSTED_BOOT option, allowing
the different architectures to select it.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Philipp Rudo <prudo@linux.ibm.com> (s390)
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
The avtab_init() and cond_policydb_init() functions always return
zero so mark them as returning void and update the callers not to
check for a return value.
Suggested-by: Stephen Smalley <stephen.smalley.work@gmail.com>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Commit e0ac568de1 ("selinux: reduce the use of hard-coded hash sizes")
moved symtab initialization out of policydb_init(), but left the cleanup
of symtabs from the error path. This patch fixes the oversight.
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The #define for formatting log messages, pr_fmt, is duplicated in the
files under security/integrity.
This change moves the definition to security/integrity/integrity.h and
removes the duplicate definitions in the other files under
security/integrity.
With this change, the messages in the following files will be prefixed
with 'integrity'.
security/integrity/platform_certs/platform_keyring.c
security/integrity/platform_certs/load_powerpc.c
security/integrity/platform_certs/load_uefi.c
security/integrity/iint.c
e.g. "integrity: Error adding keys to platform keyring %s\n"
And the messages in the following file will be prefixed with 'ima'.
security/integrity/ima/ima_mok.c
e.g. "ima: Allocating IMA blacklist keyring.\n"
For the rest of the files under security/integrity, there will be no
change in the message format.
Suggested-by: Shuah Khan <skhan@linuxfoundation.org>
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
process_buffer_measurement() does not have log messages for failure
conditions.
This change adds a log statement in the above function.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
The kbuild Makefile specifies object files for vmlinux in the $(obj-y)
lists. These lists depend on the kernel configuration[1].
The kbuild Makefile for IMA combines the object files for IMA into a
single object file namely ima.o. All the object files for IMA should be
combined into ima.o. But certain object files are being added to their
own $(obj-y). This results in the log messages from those modules getting
prefixed with their respective base file name, instead of "ima". This is
inconsistent with the log messages from the IMA modules that are combined
into ima.o.
This change fixes the above issue.
[1] Documentation\kbuild\makefiles.rst
Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Remove initial SIDs that have never been used or are no longer used by
the kernel from its string table, which is also used to generate the
SECINITSID_* symbols referenced in code. Update the code to
gracefully handle the fact that these can now be NULL. Stop treating
it as an error if a policy defines additional initial SIDs unknown to
the kernel. Do not load unused initial SID contexts into the sidtab.
Fix the incorrect usage of the name from the ocontext in error
messages when loading initial SIDs since these are not presently
written to the kernel policy and are therefore always NULL.
After this change, it is possible to safely reclaim and reuse some of
the unused initial SIDs without compatibility issues. Specifically,
unused initial SIDs that were being assigned the same context as the
unlabeled initial SID in policies can be reclaimed and reused for
another purpose, with existing policies still treating them as having
the unlabeled context and future policies having the option of mapping
them to a more specific context. For example, this could have been
used when the infiniband labeling support was introduced to define
initial SIDs for the default pkey and endport SIDs similar to the
handling of port/netif/node SIDs rather than always using
SECINITSID_UNLABELED as the default.
The set of safely reclaimable unused initial SIDs across all known
policies is igmp_packet (13), icmp_socket (14), tcp_socket (15), kmod
(24), policy (25), and scmp_packet (26); these initial SIDs were
assigned the same context as unlabeled in all known policies including
mls. If only considering non-mls policies (i.e. assuming that mls
users always upgrade policy with their kernels), the set of safely
reclaimable unused initial SIDs further includes file_labels (6), init
(7), sysctl_modprobe (16), and sysctl_fs (18) through sysctl_dev (23).
Adding new initial SIDs beyond SECINITSID_NUM to policy unfortunately
became a fatal error in commit 24ed7fdae6 ("selinux: use separate
table for initial SID lookup") and even before that it could cause
problems on a policy reload (collision between the new initial SID and
one allocated at runtime) ever since commit 42596eafdd ("selinux:
load the initial SIDs upon every policy load") so we cannot safely
start adding new initial SIDs to policies beyond SECINITSID_NUM (27)
until such a time as all such kernels do not need to be supported and
only those that include this commit are relevant. That is not a big
deal since we haven't added a new initial SID since 2004 (v2.6.7) and
we have plenty of unused ones we can reclaim if we truly need one.
If we want to avoid the wasted storage in initial_sid_to_string[]
and/or sidtab->isids[] for the unused initial SIDs, we could introduce
an indirection between the kernel initial SID values and the policy
initial SID values and just map the policy SID values in the ocontexts
to the kernel values during policy_load_isids(). Originally I thought
we'd do this by preserving the initial SID names in the kernel policy
and creating a mapping at load time like we do for the security
classes and permissions but that would require a new kernel policy
format version and associated changes to libsepol/checkpolicy and I'm
not sure it is justified. Simpler approach is just to create a fixed
mapping table in the kernel from the existing fixed policy values to
the kernel values. Less flexible but probably sufficient.
A separate selinux userspace change was applied in
8677ce5e8f
to enable removal of most of the unused initial SID contexts from
policies, but there is no dependency between that change and this one.
That change permits removing all of the unused initial SID contexts
from policy except for the fs and sysctl SID contexts. The initial
SID declarations themselves would remain in policy to preserve the
values of subsequent ones but the contexts can be dropped. If/when
the kernel decides to reuse one of them, future policies can change
the name and start assigning a context again without breaking
compatibility.
Here is how I would envision staging changes to the initial SIDs in a
compatible manner after this commit is applied:
1. At any time after this commit is applied, the kernel could choose
to reclaim one of the safely reclaimable unused initial SIDs listed
above for a new purpose (i.e. replace its NULL entry in the
initial_sid_to_string[] table with a new name and start using the
newly generated SECINITSID_name symbol in code), and refpolicy could
at that time rename its declaration of that initial SID to reflect its
new purpose and start assigning it a context going
forward. Existing/old policies would map the reclaimed initial SID to
the unlabeled context, so that would be the initial default behavior
until policies are updated. This doesn't depend on the selinux
userspace change; it will work with existing policies and userspace.
2. In 6 months or so we'll have another SELinux userspace release that
will include the libsepol/checkpolicy support for omitting unused
initial SID contexts.
3. At any time after that release, refpolicy can make that release its
minimum build requirement and drop the sid context statements (but not
the sid declarations) for all of the unused initial SIDs except for
fs and sysctl, which must remain for compatibility on policy
reload with old kernels and for compatibility with kernels that were
still using SECINITSID_SYSCTL (< 2.6.39). This doesn't depend on this
kernel commit; it will work with previous kernels as well.
4. After N years for some value of N, refpolicy decides that it no
longer cares about policy reload compatibility for kernels that
predate this kernel commit, and refpolicy drops the fs and sysctl
SID contexts from policy too (but retains the declarations).
5. After M years for some value of M, the kernel decides that it no
longer cares about compatibility with refpolicies that predate step 4
(dropping the fs and sysctl SIDs), and those two SIDs also become
safely reclaimable. This step is optional and need not ever occur unless
we decide that the need to reclaim those two SIDs outweighs the
compatibility cost.
6. After O years for some value of O, refpolicy decides that it no
longer cares about policy load (not just reload) compatibility for
kernels that predate this kernel commit, and both kernel and refpolicy
can then start adding and using new initial SIDs beyond 27. This does
not depend on the previous change (step 5) and can occur independent
of it.
Fixes: https://github.com/SELinuxProject/selinux-kernel/issues/12
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Instead allocate hash tables with just the right size based on the
actual number of elements (which is almost always known beforehand, we
just need to defer the hashtab allocation to the right time). The only
case when we don't know the size (with the current policy format) is the
new filename transitions hashtable. Here I just left the existing value.
After this patch, the time to load Fedora policy on x86_64 decreases
from 790 ms to 167 ms. If the unconfined module is removed, it decreases
from 750 ms to 122 ms. It is also likely that other operations are going
to be faster, mainly string_to_context_struct() or mls_compute_sid(),
but I didn't try to quantify that.
The memory usage of all hash table arrays increases from ~58 KB to
~163 KB (with Fedora policy on x86_64).
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This time, the set of changes for the EFI subsystem is much larger than
usual. The main reasons are:
- Get things cleaned up before EFI support for RISC-V arrives, which will
increase the size of the validation matrix, and therefore the threshold to
making drastic changes,
- After years of defunct maintainership, the GRUB project has finally started
to consider changes from the distros regarding UEFI boot, some of which are
highly specific to the way x86 does UEFI secure boot and measured boot,
based on knowledge of both shim internals and the layout of bootparams and
the x86 setup header. Having this maintenance burden on other architectures
(which don't need shim in the first place) is hard to justify, so instead,
we are introducing a generic Linux/UEFI boot protocol.
Summary of changes:
- Boot time GDT handling changes (Arvind)
- Simplify handling of EFI properties table on arm64
- Generic EFI stub cleanups, to improve command line handling, file I/O,
memory allocation, etc.
- Introduce a generic initrd loading method based on calling back into
the firmware, instead of relying on the x86 EFI handover protocol or
device tree.
- Introduce a mixed mode boot method that does not rely on the x86 EFI
handover protocol either, and could potentially be adopted by other
architectures (if another one ever surfaces where one execution mode
is a superset of another)
- Clean up the contents of struct efi, and move out everything that
doesn't need to be stored there.
- Incorporate support for UEFI spec v2.8A changes that permit firmware
implementations to return EFI_UNSUPPORTED from UEFI runtime services at
OS runtime, and expose a mask of which ones are supported or unsupported
via a configuration table.
- Various documentation updates and minor code cleanups (Heinrich)
- Partial fix for the lack of by-VA cache maintenance in the decompressor
on 32-bit ARM. Note that these patches were deliberately put at the
beginning so they can be used as a stable branch that will be shared with
a PR containing the complete fix, which I will send to the ARM tree.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEnNKg2mrY9zMBdeK7wjcgfpV0+n0FAl5S7WYACgkQwjcgfpV0
+n1jmQgAmwV3V8pbgB4mi4P2Mv8w5Zj5feUe6uXnTR2AFv5nygLcTzdxN+TU/6lc
OmZv2zzdsAscYlhuUdI/4t4cXIjHAZI39+UBoNRuMqKbkbvXCFscZANLxvJjHjZv
FFbgUk0DXkF0BCEDuSLNavidAv4b3gZsOmHbPfwuB8xdP05LbvbS2mf+2tWVAi2z
ULfua/0o9yiwl19jSS6iQEPCvvLBeBzTLW7x5Rcm/S0JnotzB59yMaeqD7jO8JYP
5PvI4WM/l5UfVHnZp2k1R76AOjReALw8dQgqAsT79Q7+EH3sNNuIjU6omdy+DFf4
tnpwYfeLOaZ1SztNNfU87Hsgnn2CGw==
=pDE3
-----END PGP SIGNATURE-----
Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/core
Pull EFI updates for v5.7 from Ard Biesheuvel:
This time, the set of changes for the EFI subsystem is much larger than
usual. The main reasons are:
- Get things cleaned up before EFI support for RISC-V arrives, which will
increase the size of the validation matrix, and therefore the threshold to
making drastic changes,
- After years of defunct maintainership, the GRUB project has finally started
to consider changes from the distros regarding UEFI boot, some of which are
highly specific to the way x86 does UEFI secure boot and measured boot,
based on knowledge of both shim internals and the layout of bootparams and
the x86 setup header. Having this maintenance burden on other architectures
(which don't need shim in the first place) is hard to justify, so instead,
we are introducing a generic Linux/UEFI boot protocol.
Summary of changes:
- Boot time GDT handling changes (Arvind)
- Simplify handling of EFI properties table on arm64
- Generic EFI stub cleanups, to improve command line handling, file I/O,
memory allocation, etc.
- Introduce a generic initrd loading method based on calling back into
the firmware, instead of relying on the x86 EFI handover protocol or
device tree.
- Introduce a mixed mode boot method that does not rely on the x86 EFI
handover protocol either, and could potentially be adopted by other
architectures (if another one ever surfaces where one execution mode
is a superset of another)
- Clean up the contents of struct efi, and move out everything that
doesn't need to be stored there.
- Incorporate support for UEFI spec v2.8A changes that permit firmware
implementations to return EFI_UNSUPPORTED from UEFI runtime services at
OS runtime, and expose a mask of which ones are supported or unsupported
via a configuration table.
- Various documentation updates and minor code cleanups (Heinrich)
- Partial fix for the lack of by-VA cache maintenance in the decompressor
on 32-bit ARM. Note that these patches were deliberately put at the
beginning so they can be used as a stable branch that will be shared with
a PR containing the complete fix, which I will send to the ARM tree.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Testing the value of the efi.get_variable function pointer is not
the right way to establish whether the platform supports EFI
variables at runtime. Instead, use the newly added granular check
that can test for the presence of each EFI runtime service
individually.
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Add Q_XQUOTAOFF, Q_XQUOTAON and Q_XSETQLIM to trigger filesystem quotamod
permission check.
Add Q_XGETQUOTA, Q_XGETQSTAT, Q_XGETQSTATV and Q_XGETNEXTQUOTA to trigger
filesystem quotaget permission check.
Signed-off-by: Richard Haines <richard_c_haines@btinternet.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Paul Moore <paul@paul-moore.com>
In these rules, each rule with the same (target type, target class,
filename) values is (in practice) always mapped to the same result type.
Therefore, it is much more efficient to group the rules by (ttype,
tclass, filename).
Thus, this patch drops the stype field from the key and changes the
datum to be a linked list of one or more structures that contain a
result type and an ebitmap of source types that map the given target to
the given result type under the given filename. The size of the hash
table is also incremented to 2048 to be more optimal for Fedora policy
(which currently has ~2500 unique (ttype, tclass, filename) tuples,
regardless of whether the 'unconfined' module is enabled).
Not only does this dramtically reduce memory usage when the policy
contains a lot of unconfined domains (ergo a lot of filename based
transitions), but it also slightly reduces memory usage of strongly
confined policies (modeled on Fedora policy with 'unconfined' module
disabled) and significantly reduces lookup times of these rules on
Fedora (roughly matches the performance of the rhashtable conversion
patch [1] posted recently to selinux@vger.kernel.org).
An obvious next step is to change binary policy format to match this
layout, so that disk space is also saved. However, since that requires
more work (including matching userspace changes) and this patch is
already beneficial on its own, I'm posting it separately.
Performance/memory usage comparison:
Kernel | Policy load | Policy load | Mem usage | Mem usage | openbench
| | (-unconfined) | | (-unconfined) | (createfiles)
-----------------|-------------|---------------|-----------|---------------|--------------
reference | 1,30s | 0,91s | 90MB | 77MB | 55 us/file
rhashtable patch | 0.98s | 0,85s | 85MB | 75MB | 38 us/file
this patch | 0,95s | 0,87s | 75MB | 75MB | 40 us/file
(Memory usage is measured after boot. With SELinux disabled the memory
usage was ~60MB on the same system.)
[1] https://lore.kernel.org/selinux/20200116213937.77795-1-dev@lynxeye.de/T/
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull IMA fixes from Mimi Zohar:
"Two bug fixes and an associated change for each.
The one that adds SM3 to the IMA list of supported hash algorithms is
a simple change, but could be considered a new feature"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: add sm3 algorithm to hash algorithm configuration list
crypto: rename sm3-256 to sm3 in hash_algo_name
efi: Only print errors about failing to get certs if EFI vars are found
x86/ima: use correct identifier for SetupMode variable
sm3 has been supported by the ima hash algorithm, but it is not
yet in the Kconfig configuration list. After adding, both ima and tpm2
can support sm3 well.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
If CONFIG_LOAD_UEFI_KEYS is enabled, the kernel attempts to load the certs
from the db, dbx and MokListRT EFI variables into the appropriate keyrings.
But it just assumes that the variables will be present and prints an error
if the certs can't be loaded, even when is possible that the variables may
not exist. For example the MokListRT variable will only be present if shim
is used.
So only print an error message about failing to get the certs list from an
EFI variable if this is found. Otherwise these printed errors just pollute
the kernel log ring buffer with confusing messages like the following:
[ 5.427251] Couldn't get size: 0x800000000000000e
[ 5.427261] MODSIGN: Couldn't get UEFI db list
[ 5.428012] Couldn't get size: 0x800000000000000e
[ 5.428023] Couldn't get UEFI MokListRT
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
It simplifies cleanup in the error path. This will be extra useful in
later patch.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add support for genfscon per-file labeling of bpffs files. This allows
for separate permissions for different pinned bpf objects, which may
be completely unrelated to each other.
Signed-off-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Steven Moreland <smoreland@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Both callers iterate the cond_list and call it for each node - turn it
into evaluate_cond_nodes(), which does the iteration for them.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since it is fixed-size after allocation and we know the size beforehand,
using a plain old array is simpler and more efficient.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since it is fixed-size after allocation and we know the size beforehand,
using a plain old array is simpler and more efficient.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since it is fixed-size after allocation and we know the size beforehand,
using a plain old array is simpler and more efficient.
While there, also fix signedness of some related variables/parameters.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl5ByJYUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMNLg/+Nj3gdWa5z0lsFuvUs87uM4HYHLtp
9/wENEaqy7W92LAAEW5y+6sK8Pm9/CO3xf2d40SAr/xjwBo2Fbnfmx7QwDMUCQS6
vyrLDiikSMyFU5j1AQDFuWjWF36FZlfqix+oXMETJ1tqxjIFxFN+rvbsDICPiQTl
VxBG4CWWzNtSllYDDj3KYcVaGzNb3nleb8n436gFoHS++12BaypVA0kBlV3BqzPi
mC0+gDuY06hwFmc/tAl8WndRwZpJ71rgsKJ5opgel61Zxf1J39QdTxeap9hhldJl
5FbtoiGpnMXtHzpRGh6BHag/2gGX/0+J+t8ZATuk4GRtt1ueyYw9kNnAMblooJSV
TwJlv1LLIKpAmSSJnKoN1AUAsxhS3dAmVyPuYWtmqEq8wq7YEYi5UnK1fH4ziI+b
5LmYD0D/FoNE1Dr1TV78HmM9i0+AteH5FhkS0V6GD3v+wO6vCb3hTYaQpMWhhgnY
/SEWbvLUO1fvQoKA8GT3UtmiUFdrqvrQFoL/l3weo9tvfyG3Walxo8s0w0r/DVUH
AevIhUVi+snSLhI95fF26wskrZtEE9gwl/BoMjfKse0HK2t77xBE3kZW88NnSXOU
Wk+ozfgibCnzk/Do/Y4iRr9GbgmCfscjOqlq1Yzyt/ZkUsG2egWHMzpritnzVEio
NSQ6n6kpbkRgkHc=
=0cTj
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20200210' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux fixes from Paul Moore:
"Two small fixes: one fixes a locking problem in the recently merged
label translation code, the other fixes an embarrassing 'binderfs' /
'binder' filesystem name check"
* tag 'selinux-pr-20200210' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix sidtab string cache locking
selinux: fix typo in filesystem name
If seq_file .next function does not change position index,
read after some lseek can generate unexpected output.
$ dd if=/sys/fs/selinux/avc/cache_stats # usual output
lookups hits misses allocations reclaims frees
817223 810034 7189 7189 6992 7037
1934894 1926896 7998 7998 7632 7683
1322812 1317176 5636 5636 5456 5507
1560571 1551548 9023 9023 9056 9115
0+1 records in
0+1 records out
189 bytes copied, 5,1564e-05 s, 3,7 MB/s
$# read after lseek to midle of last line
$ dd if=/sys/fs/selinux/avc/cache_stats bs=180 skip=1
dd: /sys/fs/selinux/avc/cache_stats: cannot skip to specified offset
056 9115 <<<< end of last line
1560571 1551548 9023 9023 9056 9115 <<< whole last line once again
0+1 records in
0+1 records out
45 bytes copied, 8,7221e-05 s, 516 kB/s
$# read after lseek beyond end of of file
$ dd if=/sys/fs/selinux/avc/cache_stats bs=1000 skip=1
dd: /sys/fs/selinux/avc/cache_stats: cannot skip to specified offset
1560571 1551548 9023 9023 9056 9115 <<<< generates whole last line
0+1 records in
0+1 records out
36 bytes copied, 9,0934e-05 s, 396 kB/s
https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Currently symlinks on kernel filesystems, like sysfs, are labeled on
creation with the parent filesystem root sid.
Allow symlinks to inherit the parent directory context, so fine-grained
kernfs labeling can be applied to symlinks too and checking contexts
doesn't complain about them.
For backward-compatibility this behavior is contained in a new policy
capability: genfs_seclabel_symlinks
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
It never fails, so it can just return void.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Deprecate setting the SELinux checkreqprot tunable to 1 via kernel
parameter or /sys/fs/selinux/checkreqprot. Setting it to 0 is left
intact for compatibility since Android and some Linux distributions
do so for security and treat an inability to set it as a fatal error.
Eventually setting it to 0 will become a no-op and the kernel will
stop using checkreqprot's value internally altogether.
checkreqprot was originally introduced as a compatibility mechanism
for legacy userspace and the READ_IMPLIES_EXEC personality flag.
However, if set to 1, it weakens security by allowing mappings to be
made executable without authorization by policy. The default value
for the SECURITY_SELINUX_CHECKREQPROT_VALUE config option was changed
from 1 to 0 in commit 2a35d196c1 ("selinux: change
CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default") and both Android
and Linux distributions began explicitly setting
/sys/fs/selinux/checkreqprot to 0 some time ago.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
It fits more naturally in selinux_state, since it reflects also global
state (the enforcing and policyload fields).
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull vfs file system parameter updates from Al Viro:
"Saner fs_parser.c guts and data structures. The system-wide registry
of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
the horror switch() in fs_parse() that would have to grow another case
every time something got added to that system-wide registry.
New syntax types can be added by filesystems easily now, and their
namespace is that of functions - not of system-wide enum members. IOW,
they can be shared or kept private and if some turn out to be widely
useful, we can make them common library helpers, etc., without having
to do anything whatsoever to fs_parse() itself.
And we already get that kind of requests - the thing that finally
pushed me into doing that was "oh, and let's add one for timeouts -
things like 15s or 2h". If some filesystem really wants that, let them
do it. Without somebody having to play gatekeeper for the variants
blessed by direct support in fs_parse(), TYVM.
Quite a bit of boilerplate is gone. And IMO the data structures make a
lot more sense now. -200LoC, while we are at it"
* 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
tmpfs: switch to use of invalfc()
cgroup1: switch to use of errorfc() et.al.
procfs: switch to use of invalfc()
hugetlbfs: switch to use of invalfc()
cramfs: switch to use of errofc() et.al.
gfs2: switch to use of errorfc() et.al.
fuse: switch to use errorfc() et.al.
ceph: use errorfc() and friends instead of spelling the prefix out
prefix-handling analogues of errorf() and friends
turn fs_param_is_... into functions
fs_parse: handle optional arguments sanely
fs_parse: fold fs_parameter_desc/fs_parameter_spec
fs_parser: remove fs_parameter_description name field
add prefix to fs_context->log
ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
new primitive: __fs_parse()
switch rbd and libceph to p_log-based primitives
struct p_log, variants of warnf() et.al. taking that one instead
teach logfc() to handle prefices, give it saner calling conventions
get rid of cg_invalf()
...
Unused now.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJeO2KSAAoJEDilFXyMQ8gRvdAP/iQU5AUl4an9ZZpWIcgbotF7
EyDQrvtvfdVaDMOdq4Pf7uGs9CFYCpuhOTGE3nN4lpIHmpuS8hPqAC04Nmezcsrg
oHPFsaGPShD6npQIwJhsrXIvj1ah7pAOn0uE7QCIKWd+dJGsWqL2uDtRZO1KV6Js
N2w+T4ulrIs0dRqx6P8pGx8igGdpBCkS3Hz39+Ac+ng5SXgrS8QFivVspp4UgYqV
3E4tDw0QRsxJAyOigphndsmjtRWjTh3BKDX9q0hWUtNfk17OfqIbsvT6GlOMa5FX
Ru2o6DG/DBjgZl+1nRomKTeEfmkD+8EQJVn8HXqsGdZRjcUeXJ1A9jw+c1OgHJg0
kYzC3EvFbj47x2N7EQiIRefOKJlGLEqjy4ASuEcgemLertBhu2bIp8g8fa0lgfpN
2RdGjXME7XnpVlmf8jhL+oJKvu8Us/5lw5sjDmx1lRA448aB+QMzUcj6bVBel8Ls
UbjjNM4LA3+G7AQdtfOhth6P5ueqlKDrXIN7j/sijmMXrysfY4tgk15mSG1TtbQb
y/9yVMsdatyRugdzEU/UyAN7D/RH011DVTztoans+SMvKFS9jxesIYSlopJJNDsI
MMd8Ptirxa8sStjLnealZAIqFqe+S58torxm85nFwxTzkRuYLCLKmFH0KEI904Gn
/oZZ6CIYoLYlerzQDNwF
=WBOT
-----END PGP SIGNATURE-----
Merge tag 'Smack-for-5.6' of git://github.com/cschaufler/smack-next
Pull smack fix from Casey Schaufler:
"One fix for an obscure error found using an old version of ping(1)
that did not use IPv6 sockets in the documented way"
* tag 'Smack-for-5.6' of git://github.com/cschaufler/smack-next:
broken ping to ipv6 linklocal addresses on debian buster
Avoiding taking a lock in an IRQ context is not enough to prevent
deadlocks, as discovered by syzbot:
===
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
5.5.0-syzkaller #0 Not tainted
-----------------------------------------------------
syz-executor.0/8927 [HC0[0]:SC0[2]:HE1:SE0] is trying to acquire:
ffff888027c94098 (&(&s->cache_lock)->rlock){+.+.}, at: spin_lock include/linux/spinlock.h:338 [inline]
ffff888027c94098 (&(&s->cache_lock)->rlock){+.+.}, at: sidtab_sid2str_put.part.0+0x36/0x880 security/selinux/ss/sidtab.c:533
and this task is already holding:
ffffffff898639b0 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
ffffffff898639b0 (&(&nf_conntrack_locks[i])->rlock){+.-.}, at: nf_conntrack_lock+0x17/0x70 net/netfilter/nf_conntrack_core.c:91
which would create a new lock dependency:
(&(&nf_conntrack_locks[i])->rlock){+.-.} -> (&(&s->cache_lock)->rlock){+.+.}
but this new dependency connects a SOFTIRQ-irq-safe lock:
(&(&nf_conntrack_locks[i])->rlock){+.-.}
[...]
other info that might help us debug this:
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&(&s->cache_lock)->rlock);
local_irq_disable();
lock(&(&nf_conntrack_locks[i])->rlock);
lock(&(&s->cache_lock)->rlock);
<Interrupt>
lock(&(&nf_conntrack_locks[i])->rlock);
*** DEADLOCK ***
[...]
===
Fix this by simply locking with irqsave/irqrestore and stop giving up on
!in_task(). It makes the locking a bit slower, but it shouldn't make a
big difference in real workloads. Under the scenario from [1] (only
cache hits) it only increased the runtime overhead from the
security_secid_to_secctx() function from ~2% to ~3% (it was ~5-65%
before introducing the cache).
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1733259
Fixes: d97bd23c2d ("selinux: cache the SID -> context string translation")
Reported-by: syzbot+61cba5033e2072d61806@syzkaller.appspotmail.com
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Correct the filesystem name to "binder" to enable genfscon per-file
labelling for binderfs.
Fixes: 7a4b519474 ("selinux: allow per-file labelling for binderfs")
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: slight style changes to the subj/description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
I am seeing ping failures to IPv6 linklocal addresses with Debian
buster. Easiest example to reproduce is:
$ ping -c1 -w1 ff02::1%eth1
connect: Invalid argument
$ ping -c1 -w1 ff02::1%eth1
PING ff02::01%eth1(ff02::1%eth1) 56 data bytes
64 bytes from fe80::e0:f9ff:fe0c:37%eth1: icmp_seq=1 ttl=64 time=0.059 ms
git bisect traced the failure to
commit b9ef5513c9 ("smack: Check address length before reading address family")
Arguably ping is being stupid since the buster version is not setting
the address family properly (ping on stretch for example does):
$ strace -e connect ping6 -c1 -w1 ff02::1%eth1
connect(5, {sa_family=AF_UNSPEC,
sa_data="\4\1\0\0\0\0\377\2\0\0\0\0\0\0\0\0\0\0\0\0\0\1\3\0\0\0"}, 28)
= -1 EINVAL (Invalid argument)
but the command works fine on kernels prior to this commit, so this is
breakage which goes against the Linux paradigm of "don't break userspace"
Cc: stable@vger.kernel.org
Reported-by: David Ahern <dsahern@gmail.com>
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
security/smack/smack_lsm.c | 41 +++++++++++++++++++----------------------
1 file changed, 19 insertions(+), 22 deletions(-)
This kunit update for Linux 5.6-rc1 consists of:
-- Support for building kunit as a module from Alan Maguire
-- AppArmor KUnit tests for policy unpack from Mike Salvatore
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAl4xz/wACgkQCwJExA0N
Qxyg2A//X0bnhN82oCchkTRW3GyGi5wTR2wGhoNzMZD0XUtCvn+4BlCSP20ttYdT
beiLCiewcuEdvXRyEV9Kikvet/67ovbjA/ce6ZrR7TlIHo8esKcy19/nu1OTvtI1
8eji1q7NSEV9iswz1ZoBAw+MTDHZfOI9qYY2UPcwjy7xWN84z2X1w+8UQ3EamOKd
6BfbohsYuuTTHhA2k1aUzvQcHqNz0YdH4yvNQpdunJXLUI04TeGZA6Ug66u6kWEd
1f5SSAu6r1vnU7DADrb1QwEDuIwL4KBuaMg2Rj5GLxTNp3wxmW9M2Dit+iN7+vNH
TS31kZW6KgxC5XuGVPENJaWlDX5Hm+5W8uiRZLNXsxDy927u53RzwrSZw/FbdbB1
HuPZZCzE1soWHdPIQz44HCCAg9XddypYlC1o4IYL1JkJknqG12ky4xgM8GRNCZAB
oUW3Ax3Lcr0EJALO/kFd/uEbl79PdmDk8uPMU1jtLyx5cs70yC3fsT2GB+DbP802
i/FxTtrOMGjU2OWcYfQcXapvZdgImf9nPsSZe3FJXjHfytNRbVZOZ2rHAMh03Keu
EBthDs6ejm6OUSGUXjngE9NaQKXsNSQ1Qor+6FrGnT4IxUMzWenudqHH7/dgF7Fr
fHlZGBilKMc/EYKb/6hj4kvEChrSIXj6TFknmI28I/epPiOr2gU=
=AFO4
-----END PGP SIGNATURE-----
Merge tag 'linux-kselftest-5.6-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest kunit updates from Shuah Khan:
"This kunit update consists of:
- Support for building kunit as a module from Alan Maguire
- AppArmor KUnit tests for policy unpack from Mike Salvatore"
* tag 'linux-kselftest-5.6-rc1-kunit' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: building kunit as a module breaks allmodconfig
kunit: update documentation to describe module-based build
kunit: allow kunit to be loaded as a module
kunit: remove timeout dependence on sysctl_hung_task_timeout_seconds
kunit: allow kunit tests to be loaded as a module
kunit: hide unexported try-catch interface in try-catch-impl.h
kunit: move string-stream.h to lib/kunit
apparmor: add AppArmor KUnit tests for policy unpack
Pull openat2 support from Al Viro:
"This is the openat2() series from Aleksa Sarai.
I'm afraid that the rest of namei stuff will have to wait - it got
zero review the last time I'd posted #work.namei, and there had been a
leak in the posted series I'd caught only last weekend. I was going to
repost it on Monday, but the window opened and the odds of getting any
review during that... Oh, well.
Anyway, openat2 part should be ready; that _did_ get sane amount of
review and public testing, so here it comes"
From Aleksa's description of the series:
"For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown
flags are present[1].
This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road
to being added to openat(2).
Furthermore, the need for some sort of control over VFS's path
resolution (to avoid malicious paths resulting in inadvertent
breakouts) has been a very long-standing desire of many userspace
applications.
This patchset is a revival of Al Viro's old AT_NO_JUMPS[3] patchset
(which was a variant of David Drysdale's O_BENEATH patchset[4] which
was a spin-off of the Capsicum project[5]) with a few additions and
changes made based on the previous discussion within [6] as well as
others I felt were useful.
In line with the conclusions of the original discussion of
AT_NO_JUMPS, the flag has been split up into separate flags. However,
instead of being an openat(2) flag it is provided through a new
syscall openat2(2) which provides several other improvements to the
openat(2) interface (see the patch description for more details). The
following new LOOKUP_* flags are added:
LOOKUP_NO_XDEV:
Blocks all mountpoint crossings (upwards, downwards, or through
absolute links). Absolute pathnames alone in openat(2) do not
trigger this. Magic-link traversal which implies a vfsmount jump is
also blocked (though magic-link jumps on the same vfsmount are
permitted).
LOOKUP_NO_MAGICLINKS:
Blocks resolution through /proc/$pid/fd-style links. This is done
by blocking the usage of nd_jump_link() during resolution in a
filesystem. The term "magic-links" is used to match with the only
reference to these links in Documentation/, but I'm happy to change
the name.
It should be noted that this is different to the scope of
~LOOKUP_FOLLOW in that it applies to all path components. However,
you can do openat2(NO_FOLLOW|NO_MAGICLINKS) on a magic-link and it
will *not* fail (assuming that no parent component was a
magic-link), and you will have an fd for the magic-link.
In order to correctly detect magic-links, the introduction of a new
LOOKUP_MAGICLINK_JUMPED state flag was required.
LOOKUP_BENEATH:
Disallows escapes to outside the starting dirfd's
tree, using techniques such as ".." or absolute links. Absolute
paths in openat(2) are also disallowed.
Conceptually this flag is to ensure you "stay below" a certain
point in the filesystem tree -- but this requires some additional
to protect against various races that would allow escape using
"..".
Currently LOOKUP_BENEATH implies LOOKUP_NO_MAGICLINKS, because it
can trivially beam you around the filesystem (breaking the
protection). In future, there might be similar safety checks done
as in LOOKUP_IN_ROOT, but that requires more discussion.
In addition, two new flags are added that expand on the above ideas:
LOOKUP_NO_SYMLINKS:
Does what it says on the tin. No symlink resolution is allowed at
all, including magic-links. Just as with LOOKUP_NO_MAGICLINKS this
can still be used with NOFOLLOW to open an fd for the symlink as
long as no parent path had a symlink component.
LOOKUP_IN_ROOT:
This is an extension of LOOKUP_BENEATH that, rather than blocking
attempts to move past the root, forces all such movements to be
scoped to the starting point. This provides chroot(2)-like
protection but without the cost of a chroot(2) for each filesystem
operation, as well as being safe against race attacks that
chroot(2) is not.
If a race is detected (as with LOOKUP_BENEATH) then an error is
generated, and similar to LOOKUP_BENEATH it is not permitted to
cross magic-links with LOOKUP_IN_ROOT.
The primary need for this is from container runtimes, which
currently need to do symlink scoping in userspace[7] when opening
paths in a potentially malicious container.
There is a long list of CVEs that could have bene mitigated by
having RESOLVE_THIS_ROOT (such as CVE-2017-1002101,
CVE-2017-1002102, CVE-2018-15664, and CVE-2019-5736, just to name a
few).
In order to make all of the above more usable, I'm working on
libpathrs[8] which is a C-friendly library for safe path resolution.
It features a userspace-emulated backend if the kernel doesn't support
openat2(2). Hopefully we can get userspace to switch to using it, and
thus get openat2(2) support for free once it's ready.
Future work would include implementing things like
RESOLVE_NO_AUTOMOUNT and possibly a RESOLVE_NO_REMOTE (to allow
programs to be sure they don't hit DoSes though stale NFS handles)"
* 'work.openat2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Documentation: path-lookup: include new LOOKUP flags
selftests: add openat2(2) selftests
open: introduce openat2(2) syscall
namei: LOOKUP_{IN_ROOT,BENEATH}: permit limited ".." resolution
namei: LOOKUP_IN_ROOT: chroot-like scoped resolution
namei: LOOKUP_BENEATH: O_BENEATH-like scoped resolution
namei: LOOKUP_NO_XDEV: block mountpoint crossing
namei: LOOKUP_NO_MAGICLINKS: block magic-link resolution
namei: LOOKUP_NO_SYMLINKS: block symlink resolution
namei: allow set_root() to produce errors
namei: allow nd_jump_link() to produce errors
nsfs: clean-up ns_get_path() signature to return int
namei: only return -ECHILD from follow_dotdot_rcu()
Pull security subsystem update from James Morris:
"Just one minor fix this time"
* 'for-v5.6' of git://git.kernel.org:/pub/scm/linux/kernel/git/jmorris/linux-security:
security: remove EARLY_LSM_COUNT which never used
Pull IMA updates from Mimi Zohar:
"Two new features - measuring certificates and querying IMA for a file
hash - and three bug fixes:
- Measuring certificates is like the rest of IMA, based on policy,
but requires loading a custom policy. Certificates loaded onto a
keyring, for example during early boot, before a custom policy has
been loaded, are queued and only processed after loading the custom
policy.
- IMA calculates and caches files hashes. Other kernel subsystems,
and possibly kernel modules, are interested in accessing these
cached file hashes.
The bug fixes prevent classifying a file short read (e.g. shutdown) as
an invalid file signature, add a missing blank when displaying the
securityfs policy rules containing LSM labels, and, lastly, fix the
handling of the IMA policy information for unknown LSM labels"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
IMA: Defined delayed workqueue to free the queued keys
IMA: Call workqueue functions to measure queued keys
IMA: Define workqueue for early boot key measurements
IMA: pre-allocate buffer to hold keyrings string
ima: ima/lsm policy rule loading logic bug fixes
ima: add the ability to query the cached hash of a given file
ima: Add a space after printing LSM rules for readability
IMA: fix measuring asymmetric keys Kconfig
IMA: Read keyrings= option from the IMA policy
IMA: Add support to limit measuring keys
KEYS: Call the IMA hook to measure keys
IMA: Define an IMA hook to measure keys
IMA: Add KEY_CHECK func to measure keys
IMA: Check IMA policy flag
ima: avoid appraise error for hash calc interrupt
Pull networking updates from David Miller:
1) Add WireGuard
2) Add HE and TWT support to ath11k driver, from John Crispin.
3) Add ESP in TCP encapsulation support, from Sabrina Dubroca.
4) Add variable window congestion control to TIPC, from Jon Maloy.
5) Add BCM84881 PHY driver, from Russell King.
6) Start adding netlink support for ethtool operations, from Michal
Kubecek.
7) Add XDP drop and TX action support to ena driver, from Sameeh
Jubran.
8) Add new ipv4 route notifications so that mlxsw driver does not have
to handle identical routes itself. From Ido Schimmel.
9) Add BPF dynamic program extensions, from Alexei Starovoitov.
10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes.
11) Add support for macsec HW offloading, from Antoine Tenart.
12) Add initial support for MPTCP protocol, from Christoph Paasch,
Matthieu Baerts, Florian Westphal, Peter Krystad, and many others.
13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu
Cherian, and others.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits)
net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC
udp: segment looped gso packets correctly
netem: change mailing list
qed: FW 8.42.2.0 debug features
qed: rt init valid initialization changed
qed: Debug feature: ilt and mdump
qed: FW 8.42.2.0 Add fw overlay feature
qed: FW 8.42.2.0 HSI changes
qed: FW 8.42.2.0 iscsi/fcoe changes
qed: Add abstraction for different hsi values per chip
qed: FW 8.42.2.0 Additional ll2 type
qed: Use dmae to write to widebus registers in fw_funcs
qed: FW 8.42.2.0 Parser offsets modified
qed: FW 8.42.2.0 Queue Manager changes
qed: FW 8.42.2.0 Expose new registers and change windows
qed: FW 8.42.2.0 Internal ram offsets modifications
MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver
Documentation: net: octeontx2: Add RVU HW and drivers overview
octeontx2-pf: ethtool RSS config support
octeontx2-pf: Add basic ethtool support
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl4vRu8UHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXPE1BAA0yg0npafRIrjIMU5IkpDh8TvywWF
DDcarqBXNSIXJtl3EWr7LynvKKqBs4jN7R0ZRMYc5e/6LrSUBvds4GTPm7dOOW4C
cIlAjXTlek2LvHf1/6aNE2SdlkNBYOYo//ifVH+zAn6VOQHGZXBd31oxwPLNW5mP
vVS7OIGhWPcviUebxD7mNmgS/ODoZS/ZL434RK07FhMnN/jEdfuNnu87uz7WAK1p
MWmqzB2tkwrj5uN5wRU6+9R82xYGbo6Xq5uEsFidMrlrn+cguuf+xPrrejT1qVnU
8r72MKKRjfObMRj1fQt3VC0feFt2WyC0qAk3XwKljmllGXZIzV1IPmrui9pLD5Ti
IhLgIEBtMpJgrYhFwl3yMe1EUwdQ/WAlbf8GnoIWyzm0oOo0kaN5BfrvlYtYYmN3
i2xpDOcQ0J4I3tA7zXMpD5tWzDzePxxadZ367qtwRp/AhbL4bnqbvP7vaPtZczz2
pTEGFYIbeqfLCwy2PWHZOVYj83bidmC0lQ3PTFsC26Upui750MdFa7toQV70Hiqo
EdpOzxUHbn6pPuGy7Rey26ybOiZPkL1q1Czoa6jbNyutv8ts2eZNyuCL25QKKzvE
42AvSzA0lt8taDbSbNu+FiexR619oEt15hSrHrRslunecumYfNjJyk85ZCloh+XL
dnD1bPytgl1G4i8=
=2jFm
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull SELinux update from Paul Moore:
"This is one of the bigger SELinux pull requests in recent years with
28 patches. Everything is passing our test suite and the highlights
are below:
- Mark CONFIG_SECURITY_SELINUX_DISABLE as deprecated. We're some time
away from actually attempting to remove this in the kernel, but the
only distro we know that still uses it (Fedora) is working on
moving away from this so we want to at least let people know we are
planning to remove it.
- Reorder the SELinux hooks to help prevent bad things when SELinux
is disabled at runtime. The proper fix is to remove the
CONFIG_SECURITY_SELINUX_DISABLE functionality (see above) and just
take care of it at boot time (e.g. "selinux=0").
- Add SELinux controls for the kernel lockdown functionality,
introducing a new SELinux class/permissions: "lockdown { integrity
confidentiality }".
- Add a SELinux control for move_mount(2) that reuses the "file {
mounton }" permission.
- Improvements to the SELinux security label data store lookup
functions to speed up translations between our internal label
representations and the visible string labels (both directions).
- Revisit a previous fix related to SELinux inode auditing and
permission caching and do it correctly this time.
- Fix the SELinux access decision cache to cleanup properly on error.
In some extreme cases this could limit the cache size and result in
a decrease in performance.
- Enable SELinux per-file labeling for binderfs.
- The SELinux initialized and disabled flags were wrapped with
accessors to ensure they are accessed correctly.
- Mark several key SELinux structures with __randomize_layout.
- Changes to the LSM build configuration to only build
security/lsm_audit.c when needed.
- Changes to the SELinux build configuration to only build the IB
object cache when CONFIG_SECURITY_INFINIBAND is enabled.
- Move a number of single-caller functions into their callers.
- Documentation fixes (/selinux -> /sys/fs/selinux).
- A handful of cleanup patches that aren't worth mentioning on their
own, the individual descriptions have plenty of detail"
* tag 'selinux-pr-20200127' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: (28 commits)
selinux: fix regression introduced by move_mount(2) syscall
selinux: do not allocate ancillary buffer on first load
selinux: remove redundant allocation and helper functions
selinux: remove redundant selinux_nlmsg_perm
selinux: fix wrong buffer types in policydb.c
selinux: reorder hooks to make runtime disable less broken
selinux: treat atomic flags more carefully
selinux: make default_noexec read-only after init
selinux: move ibpkeys code under CONFIG_SECURITY_INFINIBAND.
selinux: remove redundant msg_msg_alloc_security
Documentation,selinux: fix references to old selinuxfs mount point
selinux: deprecate disabling SELinux and runtime
selinux: allow per-file labelling for binderfs
selinuxfs: use scnprintf to get real length for inode
selinux: remove set but not used variable 'sidtab'
selinux: ensure the policy has been loaded before reading the sidtab stats
selinux: ensure we cleanup the internal AVC counters on error in avc_update()
selinux: randomize layout of key structures
selinux: clean up selinux_enabled/disabled/enforcing_boot
selinux: remove unnecessary selinux cred request
...
This macro is never used from it was introduced in commit e6b1db98cf
("security: Support early LSMs"), better to remove it.
Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: James Morris <jmorris@namei.org>
Keys queued for measurement should be freed if a custom IMA policy
was not loaded. Otherwise, the keys will remain queued forever
consuming kernel memory.
This patch defines a delayed workqueue to handle the above scenario.
The workqueue handler is setup to execute 5 minutes after IMA
initialization is completed.
If a custom IMA policy is loaded before the workqueue handler is
scheduled to execute, the workqueue task is cancelled and any queued keys
are processed for measurement. But if a custom policy was not loaded then
the queued keys are just freed when the delayed workqueue handler is run.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Reported-by: kernel test robot <rong.a.chen@intel.com> # sleeping
function called from invalid context
Reported-by: kbuild test robot <lkp@intel.com> # redefinition of
ima_init_key_queue() function.
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Measuring keys requires a custom IMA policy to be loaded. Keys should
be queued for measurement if a custom IMA policy is not yet loaded.
Keys queued for measurement, if any, should be processed when a custom
policy is loaded.
This patch updates the IMA hook function ima_post_key_create_or_update()
to queue the key if a custom IMA policy has not yet been loaded. And,
ima_update_policy() function, which is called when a custom IMA policy
is loaded, is updated to process queued keys.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Measuring keys requires a custom IMA policy to be loaded. Keys created
or updated before a custom IMA policy is loaded should be queued and
will be processed after a custom policy is loaded.
This patch defines a workqueue for queuing keys when a custom IMA policy
has not yet been loaded. An intermediate Kconfig boolean option namely
IMA_QUEUE_EARLY_BOOT_KEYS is used to declare the workqueue functions.
A flag namely ima_process_keys is used to check if the key should be
queued or should be processed immediately.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
ima_match_keyring() is called while holding rcu read lock. Since this
function executes in atomic context, it should not call any function
that can sleep (such as kstrdup()).
This patch pre-allocates a buffer to hold the keyrings string read from
the IMA policy and uses that to match the given keyring.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Fixes: e9085e0ad3 ("IMA: Add support to limit measuring keys")
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Keep the ima policy rules around from the beginning even if they appear
invalid at the time of loading, as they may become active after an lsm
policy load. However, loading a custom IMA policy with unknown LSM
labels is only safe after we have transitioned from the "built-in"
policy rules to a custom IMA policy.
Patch also fixes the rule re-use during the lsm policy reload and makes
some prints a bit more human readable.
Changelog:
v4:
- Do not allow the initial policy load refer to non-existing lsm rules.
v3:
- Fix too wide policy rule matching for non-initialized LSMs
v2:
- Fix log prints
Fixes: b169424551 ("ima: use the lsm policy update notifier")
Cc: Casey Schaufler <casey@schaufler-ca.com>
Reported-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Janne Karhunen <janne.karhunen@gmail.com>
Signed-off-by: Konsta Karsisto <konsta.karsisto@gmail.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
This allows other parts of the kernel (perhaps a stacked LSM allowing
system monitoring, eg. the proposed KRSI LSM [1]) to retrieve the hash
of a given file from IMA if it's present in the iint cache.
It's true that the existence of the hash means that it's also in the
audit logs or in /sys/kernel/security/ima/ascii_runtime_measurements,
but it can be difficult to pull that information out for every
subsequent exec. This is especially true if a given host has been up
for a long time and the file was first measured a long time ago.
It should be kept in mind that this function gives access to cached
entries which can be removed, for instance on security_inode_free().
This is based on Peter Moody's patch:
https://sourceforge.net/p/linux-ima/mailman/message/33036180/
[1] https://lkml.org/lkml/2019/9/10/393
Signed-off-by: Florent Revest <revest@google.com>
Reviewed-by: KP Singh <kpsingh@chromium.org>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
When reading ima_policy from securityfs, there is a missing
space between output string of LSM rules and the remaining
rules.
Signed-off-by: Clay Chang <clayc@hpe.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
The second check to ensure the xattrs are present and checked is
unneeded as this is already done in the profile attachment xmatch.
Signed-off-by: John Johansen <john.johansen@canonical.com>
There are cases where the a special out of band transition that can
not be triggered by input is useful in separating match conditions
in the dfa encoding.
The null_transition is currently used as an out of band transition
for match conditions that can not contain a \0 in their input
but apparmor needs an out of band transition for cases where
the match condition is allowed to contain any input character.
Achieve this by allowing for an explicit transition out of input
range that can only be triggered by code.
Signed-off-by: John Johansen <john.johansen@canonical.com>
The subset test is not taking into account the unconfined exception
which will cause profile transitions in the stacked confinement
case to fail when no_new_privs is applied.
This fixes a regression introduced in the fix for
https://bugs.launchpad.net/bugs/1839037
BugLink: https://bugs.launchpad.net/bugs/1844186
Signed-off-by: John Johansen <john.johansen@canonical.com>
commit 1180b4c757 ("apparmor: fix dangling symlinks to policy
rawdata after replacement") reworked how the rawdata symlink is
handled but failedto remove aafs_create_symlink which was reduced to a
useles stub.
Fixes: 1180b4c757 ("apparmor: fix dangling symlinks to policy rawdata after replacement")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John Johansen <john.johansen@canonical.com>
commit 2db154b3ea ("vfs: syscall: Add move_mount(2) to move mounts around")
introduced a new move_mount(2) system call and a corresponding new LSM
security_move_mount hook but did not implement this hook for any existing
LSM. This creates a regression for SELinux with respect to consistent
checking of mounts; the existing selinux_mount hook checks mounton
permission to the mount point path. Provide a SELinux hook
implementation for move_mount that applies this same check for
consistency. In the future we may wish to add a new move_mount
filesystem permission and check as well, but this addresses
the immediate regression.
Fixes: 2db154b3ea ("vfs: syscall: Add move_mount(2) to move mounts around")
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Check that a states diff encode flag is only set if diff encode is
enabled in the dfa header.
Signed-off-by: John Johansen <john.johansen@canonical.com>
Two strings which did not contain a data format specification should be put
into a sequence. Thus use the corresponding function “seq_puts”.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: John Johansen <john.johansen@canonical.com>
In security_load_policy(), we can defer allocating the newpolicydb
ancillary array to after checking state->initialized, thereby avoiding
the pointless allocation when loading policy the first time.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
[PM: merged portions by hand]
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This patch removes the inode, file, and superblock security blob
allocation functions and moves the associated code into the
respective LSM hooks. This patch also removes the inode_doinit()
function as it was a trivial wrapper around
inode_doinit_with_dentry() and called from one location in the code.
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
selinux_nlmsg_perm is used for only by selinux_netlink_send. Remove
the redundant function to simplify the code.
Fix a typo by suggestion from Stephen.
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Two places used u32 where there should have been __le32.
Fixes sparse warnings:
CHECK [...]/security/selinux/ss/services.c
[...]/security/selinux/ss/policydb.c:2669:16: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2669:16: expected unsigned int
[...]/security/selinux/ss/policydb.c:2669:16: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2674:24: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2674:24: expected unsigned int
[...]/security/selinux/ss/policydb.c:2674:24: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2675:24: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2675:24: expected unsigned int
[...]/security/selinux/ss/policydb.c:2675:24: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2676:24: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2676:24: expected unsigned int
[...]/security/selinux/ss/policydb.c:2676:24: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2681:32: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2681:32: expected unsigned int
[...]/security/selinux/ss/policydb.c:2681:32: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2701:16: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2701:16: expected unsigned int
[...]/security/selinux/ss/policydb.c:2701:16: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2706:24: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2706:24: expected unsigned int
[...]/security/selinux/ss/policydb.c:2706:24: got restricted __le32 [usertype]
[...]/security/selinux/ss/policydb.c:2707:24: warning: incorrect type in assignment (different base types)
[...]/security/selinux/ss/policydb.c:2707:24: expected unsigned int
[...]/security/selinux/ss/policydb.c:2707:24: got restricted __le32 [usertype]
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This patch adds vlan rtm definitions:
- NEWVLAN: to be used for creating vlans, setting options and
notifications
- DELVLAN: to be used for deleting vlans
- GETVLAN: used for dumping vlan information
Dumping vlans which can span multiple messages is added now with basic
information (vid and flags). We use nlmsg_parse() to validate the header
length in order to be able to extend the message with filtering
attributes later.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
kunit tests that do not support module build should depend
on KUNIT=y rather than just KUNIT in Kconfig, otherwise
they will trigger compilation errors for "make allmodconfig"
builds.
Fixes: 9fe124bf1b ("kunit: allow kunit to be loaded as a module")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Commit b1d9e6b064 ("LSM: Switch to lists of hooks") switched the LSM
infrastructure to use per-hook lists, which meant that removing the
hooks for a given module was no longer atomic. Even though the commit
clearly documents that modules implementing runtime revmoval of hooks
(only SELinux attempts this madness) need to take special precautions to
avoid race conditions, SELinux has never addressed this.
By inserting an artificial delay between the loop iterations of
security_delete_hooks() (I used 100 ms), booting to a state where
SELinux is enabled, but policy is not yet loaded, and running these
commands:
while true; do ping -c 1 <some IP>; done &
echo -n 1 >/sys/fs/selinux/disable
kill %1
wait
...I was able to trigger NULL pointer dereferences in various places. I
also have a report of someone getting panics on a stock RHEL-8 kernel
after setting SELINUX=disabled in /etc/selinux/config and rebooting
(without adding "selinux=0" to kernel command-line).
Reordering the SELinux hooks such that those that allocate structures
are removed last seems to prevent these panics. It is very much possible
that this doesn't make the runtime disable completely race-free, but at
least it makes the operation much less fragile.
Cc: stable@vger.kernel.org
Fixes: b1d9e6b064 ("LSM: Switch to lists of hooks")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The disabled/enforcing/initialized flags are all accessed concurrently
by threads so use the appropriate accessors that ensure atomicity and
document that it is expected.
Use smp_load/acquire...() helpers (with memory barriers) for the
initialized flag, since it gates access to the rest of the state
structures.
Note that the disabled flag is currently not used for anything other
than avoiding double disable, but it will be used for bailing out of
hooks once security_delete_hooks() is removed.
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
SELinux checks whether VM_EXEC is set in the VM_DATA_DEFAULT_FLAGS
during initialization and saves the result in default_noexec for use
in its mmap and mprotect hook function implementations to decide
whether to apply EXECMEM, EXECHEAP, EXECSTACK, and EXECMOD checks.
Mark default_noexec as ro_after_init to prevent later clearing it
and thereby disabling these checks. It is only set legitimately from
init code.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Move cache based pkey sid retrieval code which was added
with commit "409dcf31" under CONFIG_SECURITY_INFINIBAND.
As its going to alloc a new cache which impacts
low RAM devices which was enabled by default.
Suggested-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
[PM: checkpatch.pl cleanups, fixed capitalization in the description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
selinux_msg_msg_alloc_security only calls msg_msg_alloc_security but
do nothing else. And also msg_msg_alloc_security is just used by the
former.
Remove the redundant function to simplify the code.
Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add KUnit tests to test AppArmor unpacking of userspace policies.
AppArmor uses a serialized binary format for loading policies. To find
policy format documentation see
Documentation/admin-guide/LSM/apparmor.rst.
In order to write the tests against the policy unpacking code, some
static functions needed to be exposed for testing purposes. One of the
goals of this patch is to establish a pattern for which testing these
kinds of functions should be done in the future.
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Mike Salvatore <mike.salvatore@canonical.com>
Acked-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
As a result of the asymmetric public keys subtype Kconfig option being
defined as tristate, with the existing IMA Makefile, ima_asymmetric_keys.c
could be built as a kernel module. To prevent this from happening, this
patch defines and uses an intermediate Kconfig boolean option named
IMA_MEASURE_ASYMMETRIC_KEYS.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Suggested-by: James.Bottomley <James.Bottomley@HansenPartnership.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reported-by: kbuild test robot <lkp@intel.com> # ima_asymmetric_keys.c
is built as a kernel module.
Fixes: 88e70da170 ("IMA: Define an IMA hook to measure keys")
Fixes: cb1aa3823c ("KEYS: Call the IMA hook to measure keys")
[zohar@linux.ibm.com: updated patch description]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
selinuxfs was originally mounted on /selinux, and various docs and
kconfig help texts referred to nodes under it. In Linux 3.0,
/sys/fs/selinux was introduced as the preferred mount point for selinuxfs.
Fix all the old references to /selinux/ to /sys/fs/selinux/.
While we are there, update the description of the selinux boot parameter
to reflect the fact that the default value is always 1 since
commit be6ec88f41 ("selinux: Remove SECURITY_SELINUX_BOOTPARAM_VALUE")
and drop discussion of runtime disable since it is deprecated.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Deprecate the CONFIG_SECURITY_SELINUX_DISABLE functionality. The
code was originally developed to make it easier for Linux
distributions to support architectures where adding parameters to the
kernel command line was difficult. Unfortunately, supporting runtime
disable meant we had to make some security trade-offs when it came to
the LSM hooks, as documented in the Kconfig help text:
NOTE: selecting this option will disable the '__ro_after_init'
kernel hardening feature for security hooks. Please consider
using the selinux=0 boot parameter instead of enabling this
option.
Fortunately it looks as if that the original motivation for the
runtime disable functionality is gone, and Fedora/RHEL appears to be
the only major distribution enabling this capability at build time
so we are now taking steps to remove it entirely from the kernel.
The first step is to mark the functionality as deprecated and print
an error when it is used (what this patch is doing). As Fedora/RHEL
makes progress in transitioning the distribution away from runtime
disable, we will introduce follow-up patches over several kernel
releases which will block for increasing periods of time when the
runtime disable is used. Finally we will remove the option entirely
once we believe all users have moved to the kernel cmdline approach.
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Acked-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This patch allows genfscon per-file labeling for binderfs.
This is required to have separate permissions to allow
access to binder, hwbinder and vndbinder devices which are
relocating to binderfs.
Acked-by: Jeff Vander Stoep <jeffv@google.com>
Acked-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Hridya Valsaraju <hridya@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The return value of snprintf maybe over the size of TMPBUFLEN, use
scnprintf instead in sel_read_class and sel_read_perm.
Signed-off-by: liuyang34 <liuyang34@xiaomi.com>
[PM: cleaned up the description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
- performance regression: only get a label reference if the fast
path check fails
- fix aa_xattrs_match() may sleep while holding a RCU lock
- fix bind mounts aborting with -ENOMEM
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAl4RPEUACgkQBS82cBjV
w9gUwg/9EsFcegRD6T2NmiqUp7jY2KRjAf5JrKqZJvCbXFj9Mnurv9ug5A6b20JU
SbyXU1CNIxqPTmwPMRF1YGtq/vuAV5UIgb14pjepmsvzF5A/xpgKJucR0gokc5PU
6eFbPCQ4aXF83/d+/SiJfANP+oUY372b6cTzHKMouKWXBbeIG4F9vtVrPMEZlcjo
NNxDtmHNeFdASV17uA+8OKseGluPzPlWvnpq5va/Uz4avmxdeaBQxqh2N891IUqO
drRTpF6cgp/072cEoSrS+2dmliHOteS9785Vh4iLF4LVt8lsRjoT0Z66eGL175Ve
TYwp7EfgIwhcUYLZgKjz7t+wp3l7Yw5retlzbzbfFTIxTfGCKunUpZIUwssI+QzW
npMOU+0jTLZ9hVZICOicxzs40kHu8tR4dKYPHuYB2G3gNW8LGMEodVn2SBL1H6+2
r4vUjaLIsyUHBpfQDjWHjMEN/gdSekyBvRhpnC5qNMdOTnTeHSNow88igBxmGNhA
K9iymhYV0E1ZMw4KfGtyKZ2Zfd3E8F0ryH0cnavXYBYfvuVNs18M0TkaYwPb1+YH
02SyEGRnnhxtIO+GwLve6pmdRT9edZgL6Pc3yCaJ/e/g9FIpFgPbX44IxuO+XrL/
ku6NK9hXrpvk9Z/CAVUHYcRMSgQTz1lTsw9b9ooxgHQmsXuOUhM=
=uw+S
-----END PGP SIGNATURE-----
Merge tag 'apparmor-pr-2020-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor fixes from John Johansen:
- performance regression: only get a label reference if the fast path
check fails
- fix aa_xattrs_match() may sleep while holding a RCU lock
- fix bind mounts aborting with -ENOMEM
* tag 'apparmor-pr-2020-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: fix aa_xattrs_match() may sleep while holding a RCU lock
apparmor: only get a label reference if the fast path check fails
apparmor: fix bind mounts aborting with -ENOMEM
aa_xattrs_match() is unfortunately calling vfs_getxattr_alloc() from a
context protected by an rcu_read_lock. This can not be done as
vfs_getxattr_alloc() may sleep regardles of the gfp_t value being
passed to it.
Fix this by breaking the rcu_read_lock on the policy search when the
xattr match feature is requested and restarting the search if a policy
changes occur.
Fixes: 8e51f9087f ("apparmor: Add support for attaching profiles via xattr, presence and value")
Reported-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: John Johansen <john.johansen@canonical.com>
The common fast path check can be done under rcu_read_lock() and
doesn't need a reference count on the label. Only take a reference
count if entering the slow path.
Fixes reported hackbench regression
- sha1 79e178a57d ("Merge tag 'apparmor-pr-2019-12-03' of
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor")
hackbench -l (256000/#grp) -g #grp
128 groups 19.679 ±0.90%
- previous sha1 01d1dff646 ("Merge tag 's390-5.5-2' of
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux")
hackbench -l (256000/#grp) -g #grp
128 groups 3.1689 ±3.04%
Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Fixes: bce4e7e9c4 ("apparmor: reduce rcu_read_lock scope for aa_file_perm mediation")
Signed-off-by: John Johansen <john.johansen@canonical.com>
With commit df323337e5 ("apparmor: Use a memory pool instead per-CPU
caches, 2019-05-03"), AppArmor code was converted to use memory pools. In
that conversion, a bug snuck into the code that polices bind mounts that
causes all bind mounts to fail with -ENOMEM, as we erroneously error out
if `aa_get_buffer` returns a pointer instead of erroring out when it
does _not_ return a valid pointer.
Fix the issue by correctly checking for valid pointers returned by
`aa_get_buffer` to fix bind mounts with AppArmor.
Fixes: df323337e5 ("apparmor: Use a memory pool instead per-CPU caches")
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: John Johansen <john.johansen@canonical.com>
syzbot is reporting that there is a race at tomoyo_stat_update() [1].
Although it is acceptable to fail to track exact number of times policy
was updated, convert to atomic_t because this is not a hot path.
[1] https://syzkaller.appspot.com/bug?id=a4d7b973972eeed410596e6604580e0133b0fc04
Reported-by: syzbot <syzbot+efea72d4a0a1d03596cd@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
security/selinux/ss/services.c: In function security_port_sid:
security/selinux/ss/services.c:2346:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
security/selinux/ss/services.c: In function security_ib_endport_sid:
security/selinux/ss/services.c:2435:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
security/selinux/ss/services.c: In function security_netif_sid:
security/selinux/ss/services.c:2480:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
security/selinux/ss/services.c: In function security_fs_use:
security/selinux/ss/services.c:2831:17: warning: variable sidtab set but not used [-Wunused-but-set-variable]
Since commit 66f8e2f03c ("selinux: sidtab reverse lookup hash table")
'sidtab' is not used any more, so remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Check to make sure we have loaded a policy before we query the
sidtab's hash stats. Failure to do so could result in a kernel
panic/oops due to a dereferenced NULL pointer.
Fixes: 66f8e2f03c ("selinux: sidtab reverse lookup hash table")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
In AVC update we don't call avc_node_kill() when avc_xperms_populate()
fails, resulting in the avc->avc_cache.active_nodes counter having a
false value. In last patch this changes was missed , so correcting it.
Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Signed-off-by: Jaihind Yadav <jaihindyadav@codeaurora.org>
Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org>
[PM: merge fuzz, minor description cleanup]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Randomize the layout of key selinux data structures.
Initially this is applied to the selinux_state, selinux_ss,
policydb, and task_security_struct data structures.
NB To test/use this mechanism, one must install the
necessary build-time dependencies, e.g. gcc-plugin-devel on Fedora,
and enable CONFIG_GCC_PLUGIN_RANDSTRUCT in the kernel configuration.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Kees Cook <keescook@chromium.org>
[PM: double semi-colon fixed]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Rename selinux_enabled to selinux_enabled_boot to make it clear that
it only reflects whether SELinux was enabled at boot. Replace the
references to it in the MAC_STATUS audit log in sel_write_enforce()
with hardcoded "1" values because this code is only reachable if SELinux
is enabled and does not change its value, and update the corresponding
MAC_STATUS audit log in sel_write_disable(). Stop clearing
selinux_enabled in selinux_disable() since it is not used outside of
initialization code that runs before selinux_disable() can be reached.
Mark both selinux_enabled_boot and selinux_enforcing_boot as __initdata
since they are only used in initialization code.
Wrap the disabled field in the struct selinux_state with
CONFIG_SECURITY_SELINUX_DISABLE since it is only used for
runtime disable.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
-----BEGIN PGP SIGNATURE-----
iJYEABYIAD4WIQRE6pSOnaBC00OEHEIaerohdGur0gUCXfrLASAcamFya2tvLnNh
a2tpbmVuQGxpbnV4LmludGVsLmNvbQAKCRAaerohdGur0pZfAQD9F5Vjdqp3fWk+
pxt+eD9+xaD2MYuSVO2AEVBC949vdQD/TP7xnb66w7n9YtMtm9MgvysHAakJYeAe
l4XsHAiPHgI=
=CFIs
-----END PGP SIGNATURE-----
Merge tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmdd
Pull tpm fixes from Jarkko Sakkinen:
"Bunch of fixes for rc3"
* tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmdd:
tpm/tpm_ftpm_tee: add shutdown call back
tpm: selftest: cleanup after unseal with wrong auth/policy test
tpm: selftest: add test covering async mode
tpm: fix invalid locking in NONBLOCKING mode
security: keys: trusted: fix lost handle flush
tpm_tis: reserve chip for duration of tpm_tis_core_init
KEYS: asymmetric: return ENOMEM if akcipher_request_alloc() fails
KEYS: remove CONFIG_KEYS_COMPAT
The original code, before it was moved into security/keys/trusted-keys
had a flush after the blob unseal. Without that flush, the volatile
handles increase in the TPM until it becomes unusable and the system
either has to be rebooted or the TPM volatile area manually flushed.
Fix by adding back the lost flush, which we now have to export because
of the relocation of the trusted key code may cause the consumer to be
modular.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fixes: 2e19e10131 ("KEYS: trusted: Move TPM2 trusted keys code")
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
John Garry has reported that allmodconfig kernel on arm64 causes flood of
"RCU-list traversed in non-reader section!!" warning. I don't know what
change caused this warning, but this warning is safe because TOMOYO uses
SRCU lock instead. Let's suppress this warning by explicitly telling that
the caller is holding SRCU lock.
Reported-and-tested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
KEYS_COMPAT now always takes the value of COMPAT && KEYS. But the
security/keys/ directory is only compiled if KEYS is enabled, so in
practice KEYS_COMPAT is the same as COMPAT. Therefore, remove the
unnecessary KEYS_COMPAT and just use COMPAT directly.
(Also remove an outdated comment from compat.c.)
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Read "keyrings=" option, if specified in the IMA policy, and store in
the list of IMA rules when the configured IMA policy is read.
This patch defines a new policy token enum namely Opt_keyrings
and an option flag IMA_KEYRINGS for reading "keyrings=" option
from the IMA policy.
Updated ima_parse_rule() to parse "keyrings=" option in the policy.
Updated ima_policy_show() to display "keyrings=" option.
The following example illustrates how key measurement can be verified.
Sample "key" measurement rule in the IMA policy:
measure func=KEY_CHECK uid=0 keyrings=.ima|.evm template=ima-buf
Display "key" measurement in the IMA measurement list:
cat /sys/kernel/security/ima/ascii_runtime_measurements
10 faf3...e702 ima-buf sha256:27c915b8ddb9fae7214cf0a8a7043cc3eeeaa7539bcb136f8427067b5f6c3b7b .ima 308202863082...4aee
Verify "key" measurement data for a key added to ".ima" keyring:
cat /sys/kernel/security/integrity/ima/ascii_runtime_measurements | grep -m 1 "\.ima" | cut -d' ' -f 6 | xxd -r -p |tee ima-cert.der | sha256sum | cut -d' ' -f 1
The output of the above command should match the template hash
of the first "key" measurement entry in the IMA measurement list for
the key added to ".ima" keyring.
The file namely "ima-cert.der" generated by the above command
should be a valid x509 certificate (in DER format) and should match
the one that was used to import the key to the ".ima" keyring.
The certificate file can be verified using openssl tool.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Limit measuring keys to those keys being loaded onto a given set of
keyrings only and when the user id (uid) matches if uid is specified
in the policy.
This patch defines a new IMA policy option namely "keyrings=" that
can be used to specify a set of keyrings. If this option is specified
in the policy for "measure func=KEY_CHECK" then only the keys
loaded onto a keyring given in the "keyrings=" option are measured.
If uid is specified in the policy then the key is measured only if
the current user id matches the one specified in the policy.
Added a new parameter namely "keyring" (name of the keyring) to
process_buffer_measurement(). The keyring name is passed to
ima_get_action() to determine the required action.
ima_match_rules() is updated to check keyring in the policy, if
specified, for KEY_CHECK function.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Call the IMA hook from key_create_or_update() function to measure
the payload when a new key is created or an existing key is updated.
This patch adds the call to the IMA hook from key_create_or_update()
function to measure the key on key create or update.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Measure asymmetric keys used for verifying file signatures,
certificates, etc.
This patch defines a new IMA hook namely ima_post_key_create_or_update()
to measure the payload used to create a new asymmetric key or
update an existing asymmetric key.
Asymmetric key structure is defined only when
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE is defined. Since the IMA hook
measures asymmetric keys, the IMA hook is defined in a new file namely
ima_asymmetric_keys.c which is built only if
CONFIG_ASYMMETRIC_PUBLIC_KEY_SUBTYPE is defined.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Measure keys loaded onto any keyring.
This patch defines a new IMA policy func namely KEY_CHECK to
measure keys. Updated ima_match_rules() to check for KEY_CHECK
and ima_parse_rule() to handle KEY_CHECK.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
process_buffer_measurement() may be called prior to IMA being
initialized (for instance, when the IMA hook is called when
a key is added to the .builtin_trusted_keys keyring), which
would result in a kernel panic.
This patch adds the check in process_buffer_measurement()
to return immediately if IMA is not initialized yet.
Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
The integrity_kernel_read() call in ima_calc_file_hash_tfm() can return
a value of 0 before all bytes of the file are read. A value of 0 would
normally indicate an EOF. This has been observed if a user process is
causing a file appraisal and is terminated with a SIGTERM signal. The
most common occurrence of seeing the problem is if a shutdown or systemd
reload is initiated while files are being appraised.
The problem is similar to commit <f5e1040196db> (ima: always return
negative code for error) that fixed the problem in
ima_calc_file_hash_atfm().
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Patrick Callaghan <patrickc@linux.ibm.com>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
task_security_struct was obtained at the beginning of may_create
and selinux_inode_init_security, no need to obtain again.
may_create will be called very frequently when create dir and file.
Cc: Paul Moore <paul@paul-moore.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Eric Paris <eparis@parisplace.org>
Signed-off-by: Yang Guo <guoyang2@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
syzbot is reporting that use of SOCKET_I()->sk from open() can result in
use after free problem [1], for socket's inode is still reachable via
/proc/pid/fd/n despite destruction of SOCKET_I()->sk already completed.
At first I thought that this race condition applies to only open/getattr
permission checks. But James Morris has pointed out that there are more
permission checks where this race condition applies to. Thus, get rid of
tomoyo_get_socket_name() instead of conditionally bypassing permission
checks on sockets. As a side effect of this patch,
"socket:[family=\$:type=\$:protocol=\$]" in the policy files has to be
rewritten to "socket:[\$]".
[1] https://syzkaller.appspot.com/bug?id=73d590010454403d55164cca23bd0565b1eb3b74
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+0341f6a4d729d4e0acf1@syzkaller.appspotmail.com>
Reported-by: James Morris <jmorris@namei.org>
Fix avc_insert() to call avc_node_kill() if we've already allocated
an AVC node and the code fails to insert the node in the cache.
Fixes: fa1aa143ac ("selinux: extended permissions for ioctls")
Reported-by: rsiddoji@codeaurora.org
Suggested-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The lsm_audit code is only required when CONFIG_SECURITY is enabled.
It does not have a build dependency on CONFIG_AUDIT since audit.h
provides trivial static inlines for audit_log*() when CONFIG_AUDIT
is disabled. Hence, the Makefile should only add lsm_audit to the
obj lists based on CONFIG_SECURITY, not CONFIG_AUDIT.
Fixes: 59438b4647 ("security,lockdown,selinux: implement SELinux lockdown")
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Through a somewhat convoluted series of changes, we have ended up
with multiple unnecessary occurrences of (flags & MAY_NOT_BLOCK)
tests in selinux_inode_permission(). Clean it up and simplify.
No functional change.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
commit bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
passed down the rcu flag to the SELinux AVC, but failed to adjust the
test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY.
Previously, we only returned -ECHILD if generating an audit record with
LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission.
Move the handling of MAY_NOT_BLOCK to avc_audit() and its inlined
equivalent in selinux_inode_permission() immediately after we determine
that audit is required, and always fall back to ref-walk in this case.
Fixes: bda0be7ad9 ("security: make inode_follow_link RCU-walk aware")
Reported-by: Will Deacon <will@kernel.org>
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
This reverts commit e46e01eebb ("selinux: stop passing MAY_NOT_BLOCK
to the AVC upon follow_link"). The correct fix is to instead fall
back to ref-walk if audit is required irrespective of the specific
audit data type. This is done in the next commit.
Fixes: e46e01eebb ("selinux: stop passing MAY_NOT_BLOCK to the AVC upon follow_link")
Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Implement a SELinux hook for lockdown. If the lockdown module is also
enabled, then a denial by the lockdown module will take precedence over
SELinux, so SELinux can only further restrict lockdown decisions.
The SELinux hook only distinguishes at the granularity of integrity
versus confidentiality similar to the lockdown module, but includes the
full lockdown reason as part of the audit record as a hint in diagnosing
what triggered the denial. To support this auditing, move the
lockdown_reasons[] string array from being private to the lockdown
module to the security framework so that it can be used by the lsm audit
code and so that it is always available even when the lockdown module
is disabled.
Note that the SELinux implementation allows the integrity and
confidentiality reasons to be controlled independently from one another.
Thus, in an SELinux policy, one could allow operations that specify
an integrity reason while blocking operations that specify a
confidentiality reason. The SELinux hook implementation is
stricter than the lockdown module in validating the provided reason value.
Sample AVC audit output from denials:
avc: denied { integrity } for pid=3402 comm="fwupd"
lockdown_reason="/dev/mem,kmem,port" scontext=system_u:system_r:fwupd_t:s0
tcontext=system_u:system_r:fwupd_t:s0 tclass=lockdown permissive=0
avc: denied { confidentiality } for pid=4628 comm="cp"
lockdown_reason="/proc/kcore access"
scontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
tcontext=unconfined_u:unconfined_r:test_lockdown_integrity_t:s0-s0:c0.c1023
tclass=lockdown permissive=0
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
[PM: some merge fuzz do the the perf hooks]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Translating a context struct to string can be quite slow, especially if
the context has a lot of category bits set. This can cause quite
noticeable performance impact in situations where the translation needs
to be done repeatedly. A common example is a UNIX datagram socket with
the SO_PASSSEC option enabled, which is used e.g. by systemd-journald
when receiving log messages via datagram socket. This scenario can be
reproduced with:
cat /dev/urandom | base64 | logger &
timeout 30s perf record -p $(pidof systemd-journald) -a -g
kill %1
perf report -g none --pretty raw | grep security_secid_to_secctx
Before the caching introduced by this patch, computing the context
string (security_secid_to_secctx() function) takes up ~65% of
systemd-journald's CPU time (assuming a context with 1024 categories
set and Fedora x86_64 release kernel configs). After this patch
(assuming near-perfect cache hit ratio) this overhead is reduced to just
~2%.
This patch addresses the issue by caching a certain number (compile-time
configurable) of recently used context strings to speed up repeated
translations of the same context, while using only a small amount of
memory.
The cache is integrated into the existing sidtab table by adding a field
to each entry, which when not NULL contains an RCU-protected pointer to
a cache entry containing the cached string. The cache entries are kept
in a linked list sorted according to how recently they were used. On a
cache miss when the cache is full, the least recently used entry is
removed to make space for the new entry.
The patch migrates security_sid_to_context_core() to use the cache (also
a few other functions where it was possible without too much fuss, but
these mostly use the translation for logging in case of error, which is
rare).
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1733259
Cc: Michal Sekletar <msekleta@redhat.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
[PM: lots of merge fixups due to collisions with other sidtab patches]
Signed-off-by: Paul Moore <paul@paul-moore.com>
This replaces the reverse table lookup and reverse cache with a
hashtable which improves cache-miss reverse-lookup times from
O(n) to O(1)* and maintains the same performance as a reverse
cache hit.
This reduces the time needed to add a new sidtab entry from ~500us
to 5us on a Pixel 3 when there are ~10,000 sidtab entries.
The implementation uses the kernel's generic hashtable API,
It uses the context's string represtation as the hash source,
and the kernels generic string hashing algorithm full_name_hash()
to reduce the string to a 32 bit value.
This change also maintains the improvement introduced in
commit ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve
performance") which removed the need to keep the current sidtab
locked during policy reload. It does however introduce periodic
locking of the target sidtab while converting the hashtable. Sidtab
entries are never modified or removed, so the context struct stored
in the sid_to_context tree can also be used for the context_to_sid
hashtable to reduce memory usage.
This bug was reported by:
- On the selinux bug tracker.
BUG: kernel softlockup due to too many SIDs/contexts #37https://github.com/SELinuxProject/selinux-kernel/issues/37
- Jovana Knezevic on Android's bugtracker.
Bug: 140252993
"During multi-user performance testing, we create and remove users
many times. selinux_android_restorecon_pkgdir goes from 1ms to over
20ms after about 200 user creations and removals. Accumulated over
~280 packages, that adds a significant time to user creation,
making perf benchmarks unreliable."
* Hashtable lookup is only O(1) when n < the number of buckets.
Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
Reported-by: Stephen Smalley <sds@tycho.nsa.gov>
Reported-by: Jovana Knezevic <jovanak@google.com>
Reviewed-by: Stephen Smalley <sds@tycho.nsa.gov>
Tested-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: subj tweak, removed changelog from patch description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
at places where these are defined. Later patches will remove the unused
definition of FIELD_SIZEOF().
This patch is generated using following script:
EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"
git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
do
if [[ "$file" =~ $EXCLUDE_FILES ]]; then
continue
fi
sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' $file;
done
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
Co-developed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: David Miller <davem@davemloft.net> # for net
In preparation for LOOKUP_NO_MAGICLINKS, it's necessary to add the
ability for nd_jump_link() to return an error which the corresponding
get_link() caller must propogate back up to the VFS.
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
- increase left match history buffer size to provide inproved conflict
resolution in overlapping execution rules.
- switch buffer allocation to use a memory pool and GFP_KERNEL
where possible.
- add compression of policy blobs to reduce memory usage.
+ Cleanups
- fix spelling mistake "immutible" -> "immutable"
+ Bug fixes
- fix unsigned len comparison in update_for_len macro
- fix sparse warning for type-casting of current->real_cred
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE7cSDD705q2rFEEf7BS82cBjVw9gFAl3mvPUACgkQBS82cBjV
w9gM8hAArhbBiGHlYlsGCOws4+ObCSIAxPkKw9ZC+FjTOKE6uN+GDUM+s4TWjbkL
65NKGBqHfHIzRYHD6BNi5I3Yf0xKCXuMenZVptiDHYQ+65mCL6QlZOA5K2Mp67fY
uMKoOIMSAkDkLJHEsH8o1YURAlvY5DjK2XfSrc2GeaExnBZTisfhDwbYjv9OYI6U
JPDP361zzJMSpkcDf5WX5vVuvfjTnAXjfH3av61hiSNAzivd4P1Mp34ellOkz7Ya
Ch6K+32agVcE8LIbalRKhWVw7Fhfbys2+/nBZ0Tb5HPG0tRWbm+ueggOsp8/liWQ
Ik9NigK61lHjd5ttDrswD0UfslTxac2pPFhlYRYoSUSMITOjJke50Q12ZosK4wUY
pdsBiWVDo2W3/E9sretmFpWlzish8q3tNJU55aKD+FTo0yqMC3X7H/l9xGLuLUt/
vHwUcGZNSrAWqc8yMamzEvqj9e1DECMJZQIlE3YJgGLCkcO6LFY+5pSWSvMQIG7v
451oob3QalzqIDyh3OOxlA8pfUVyk9HL48Kw7+0ZJrbJK6pAjHZhE8gFVMPECB7b
n22XrABdPdjAFvlqCzkm4qZ5sjqdk8T9Iexc5bnrFvBW4teHnAX0xrk+gxVpnEYf
dV6ERcxmRjnZhT6FtOQkLOia3gIiAQVi6Rd9K6HHhPH83wNyjjI=
=lPsA
-----END PGP SIGNATURE-----
Merge tag 'apparmor-pr-2019-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull apparmor updates from John Johansen:
"Features:
- increase left match history buffer size to provide improved
conflict resolution in overlapping execution rules.
- switch buffer allocation to use a memory pool and GFP_KERNEL where
possible.
- add compression of policy blobs to reduce memory usage.
Cleanups:
- fix spelling mistake "immutible" -> "immutable"
Bug fixes:
- fix unsigned len comparison in update_for_len macro
- fix sparse warning for type-casting of current->real_cred"
* tag 'apparmor-pr-2019-12-03' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor:
apparmor: make it so work buffers can be allocated from atomic context
apparmor: reduce rcu_read_lock scope for aa_file_perm mediation
apparmor: fix wrong buffer allocation in aa_new_mount
apparmor: fix unsigned len comparison with less than zero
apparmor: increase left match history buffer size
apparmor: Switch to GFP_KERNEL where possible
apparmor: Use a memory pool instead per-CPU caches
apparmor: Force type-casting of current->real_cred
apparmor: fix spelling mistake "immutible" -> "immutable"
apparmor: fix blob compression when ns is forced on a policy load
apparmor: fix missing ZLIB defines
apparmor: fix blob compression build failure on ppc
apparmor: Initial implementation of raw policy blob compression
This is a series of cleanups for the y2038 work, mostly intended
for namespace cleaning: the kernel defines the traditional
time_t, timeval and timespec types that often lead to y2038-unsafe
code. Even though the unsafe usage is mostly gone from the kernel,
having the types and associated functions around means that we
can still grow new users, and that we may be missing conversions
to safe types that actually matter.
There are still a number of driver specific patches needed to
get the last users of these types removed, those have been
submitted to the respective maintainers.
Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJd3D+wAAoJEJpsee/mABjZfdcQAJvl6e+4ddKoDMIVJqVCE25N
meFRgA7S8jy6BefEVeUgI8TxK+amGO36szMBUEnZxSSxq9u+gd13m5bEK6Xq/ov7
4KTAiA3Irm/W5FBTktu1zc5ROIra1Xj7jLdubf8wEC3viSXIXB3+68Y28iBN7D2O
k9kSpwINC5lWeC8guZy2I+2yc4ywUEXao9nVh8C/J+FQtU02TcdLtZop9OhpAa8u
U19VVH3WHkQI7ZfLvBTUiYK6tlYTiYCnpr8l6sm850CnVv1fzBW+DzmVhPJ6FdFd
4m5staC0sQ6gVqtjVMBOtT5CdzREse6hpwbKo2GRWFroO5W9tljMOJJXHvv/f6kz
DxrpUmj37JuRbqAbr8KDmQqPo6M2CRkxFxjol1yh5ER63u1xMwLm/PQITZIMDvPO
jrFc2C2SdM2E9bKP/RMCVoKSoRwxCJ5IwJ2AF237rrU0sx/zB2xsrOGssx5CWEgc
3bbk6tDQujJJubnCfgRy1tTxpLZOHEEKw8YhFLLbR2LCtA9pA/0rfLLad16cjA5e
5jIHxfsFc23zgpzrJeB7kAF/9xgu1tlA5BotOs3VBE89LtWOA9nK5dbPXng6qlUe
er3xLCfS38ovhUw6DusQpaYLuaYuLM7DKO4iav9kuTMcY9GkbPk7vDD3KPGh2goy
hY5cSM8+kT1q/THLnUBH
=Bdbv
-----END PGP SIGNATURE-----
Merge tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground
Pull y2038 cleanups from Arnd Bergmann:
"y2038 syscall implementation cleanups
This is a series of cleanups for the y2038 work, mostly intended for
namespace cleaning: the kernel defines the traditional time_t, timeval
and timespec types that often lead to y2038-unsafe code. Even though
the unsafe usage is mostly gone from the kernel, having the types and
associated functions around means that we can still grow new users,
and that we may be missing conversions to safe types that actually
matter.
There are still a number of driver specific patches needed to get the
last users of these types removed, those have been submitted to the
respective maintainers"
Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/
* tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits)
y2038: alarm: fix half-second cut-off
y2038: ipc: fix x32 ABI breakage
y2038: fix typo in powerpc vdso "LOPART"
y2038: allow disabling time32 system calls
y2038: itimer: change implementation to timespec64
y2038: move itimer reset into itimer.c
y2038: use compat_{get,set}_itimer on alpha
y2038: itimer: compat handling to itimer.c
y2038: time: avoid timespec usage in settimeofday()
y2038: timerfd: Use timespec64 internally
y2038: elfcore: Use __kernel_old_timeval for process times
y2038: make ns_to_compat_timeval use __kernel_old_timeval
y2038: socket: use __kernel_old_timespec instead of timespec
y2038: socket: remove timespec reference in timestamping
y2038: syscalls: change remaining timeval to __kernel_old_timeval
y2038: rusage: use __kernel_old_timeval
y2038: uapi: change __kernel_time_t to __kernel_old_time_t
y2038: stat: avoid 'time_t' in 'struct stat'
y2038: ipc: remove __kernel_time_t reference from headers
y2038: vdso: powerpc: avoid timespec references
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl3dbJIUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXNEyA/+MYEvQz0BhGBdpHgoCLnUGbEafsxH
2pWnHSupsCi1/9gal4ztVqW6GMQAQe3C4PhEYAnJ0/sIvix5uBMDFehGEwxzj8aQ
fOAHYROOsQVQ4D2sfxBfliY0gouTTxFpCK95hvgXR1Y6WP6RBSowS5fmsvfqvz7i
qDDZvCLOg5ULPHvpmJR3wRjs8MIbZTOiKg1K0VjJgJGzdvEoFfm2faw6WiHkPwm5
GFiW5PABv1c1Jbe4NL1r85r5iPPZgt+4BygS2QWkx7IVoAvPGhtlkB4Eh0pSESbg
n141DwJ79uwI2UDUi0Zfcr1Y9OtH1YveUtxZUNi/NYSdxc9R4SjVZJC0TdjrwjWy
K2jKV6gTn6NBLPuk+HAlj29ETF07G/BJDNkaIGqJcJ97t3kjuODb+QA+eFyM/1t7
V1oGMtKm6qGhb6y2bMMzmgftFXB0qrppdklkj+Gs8bNEUTfxPRcJS7tk6CZgOTI0
D5Ikfeu0muflIqug7w7wK8I6s63ZMNoDxZ5Nyy3idm5ZjxcQXp1Ys5x6Axir4lwe
2tvXwgvNV+/1ZEXNBu8Yhh5EDnZ9Wp1ENd/MT8VlZ77KfAPNnhq48uBKjpQD3KY5
wPavTRHTu+mnElqkO2cnjDYVhMYrhtbTCeQBY40wLyQDzbqzCOfLav9jisJnblzU
R8iH6YlokDKPTeQ=
=x1yB
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
"Only three SELinux patches for v5.5:
- Remove the size limit on SELinux policies, the limitation was a
lingering vestige and no longer necessary.
- Allow file labeling before the policy is loaded. This should ease
some of the burden when the policy is initially loaded (no need to
relabel files), but it should also help enable some new system
concepts which dynamically create the root filesystem in the
initrd.
- Add support for the "greatest lower bound" policy construct which
is defined as the intersection of the MLS range of two SELinux
labels"
* tag 'selinux-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: default_range glblub implementation
selinux: allow labeling before policy is loaded
selinux: remove load size limit
Highlights:
- Infrastructure for secure boot on some bare metal Power9 machines. The
firmware support is still in development, so the code here won't actually
activate secure boot on any existing systems.
- A change to xmon (our crash handler / pseudo-debugger) to restrict it to
read-only mode when the kernel is lockdown'ed, otherwise it's trivial to drop
into xmon and modify kernel data, such as the lockdown state.
- Support for KASLR on 32-bit BookE machines (Freescale / NXP).
- Fixes for our flush_icache_range() and __kernel_sync_dicache() (VDSO) to work
with memory ranges >4GB.
- Some reworks of the pseries CMM (Cooperative Memory Management) driver to
make it behave more like other balloon drivers and enable some cleanups of
generic mm code.
- A series of fixes to our hardware breakpoint support to properly handle
unaligned watchpoint addresses.
Plus a bunch of other smaller improvements, fixes and cleanups.
Thanks to:
Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V, Anthony Steinhauser,
Cédric Le Goater, Chris Packham, Chris Smart, Christophe Leroy, Christopher M.
Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Hildenbrand,
Deb McLemore, Diana Craciun, Eric Richter, Geert Uytterhoeven, Greg
Kroah-Hartman, Greg Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason
Yan, Krzysztof Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M.
Rodrigues, Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes, Ravi
Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth, Tyrel
Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAl3hBycTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgApBEACk91MEQDYJ9MF9I6uN+85qb5p4pMsp
rGzqnpt+XFidbDAc3eP63pYfIDSo3jtkQ2YL7shAnDOTvkO0md+Vqkl9Aq/G6FIf
lDBlwbgkXMSxS/O2Lpvfn4NZAoK6dKmiV55LSgfliM62X3e2Saeg6TR55wBTgJ6/
SlYPDwZfcVHOAiFS3UmfB+hkiIZk+AI5Zr5VAZvT2ZmeH36yAWkq4JgJI1uAk6m1
/7iCnlfUjx/nl/BhnA3kjjmAgGCJ5s/WuVgwFMz47XpMBWGBhLWpMh/NqDTFb8ca
kpiVQoVPQe2xyO3pL/kOwBy6sii26ftfHDhLKMy1hJdEhVQzS5LerPIMeh1qsU8Q
hV/Cj+jfsrS/vBDOehj3jwx93t+861PmTOqgLnpYQ6Ltrt+2B/74+fufGMHE1kI3
Ffo7xvNw4sw6bSziDxDFqUx2P1dFN5D5EJsJsYM98ekkVAAkzNqCDRvfD2QI8Pif
VXWPYXqtNJTrVPJA0D7Yfo9FDNwhANd0f1zi7r/U5mVXBFUyKOlGqTQSkXgMrVeK
3I7wHPOVGgdA5UUkfcd3pcuqsY081U9E//o5PUfj8ybO5JCwly8NoatbG+xHmKia
a72uJT8MjCo9mGCHKDrwi9l/kqms6ZSv8RP+yMhGuB52YoiGc6PpVyab5jXIUd1N
yTtBlC0YGW1JYw==
=JHzg
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Infrastructure for secure boot on some bare metal Power9 machines.
The firmware support is still in development, so the code here
won't actually activate secure boot on any existing systems.
- A change to xmon (our crash handler / pseudo-debugger) to restrict
it to read-only mode when the kernel is lockdown'ed, otherwise it's
trivial to drop into xmon and modify kernel data, such as the
lockdown state.
- Support for KASLR on 32-bit BookE machines (Freescale / NXP).
- Fixes for our flush_icache_range() and __kernel_sync_dicache()
(VDSO) to work with memory ranges >4GB.
- Some reworks of the pseries CMM (Cooperative Memory Management)
driver to make it behave more like other balloon drivers and enable
some cleanups of generic mm code.
- A series of fixes to our hardware breakpoint support to properly
handle unaligned watchpoint addresses.
Plus a bunch of other smaller improvements, fixes and cleanups.
Thanks to: Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V,
Anthony Steinhauser, Cédric Le Goater, Chris Packham, Chris Smart,
Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio
Carvalho, Daniel Axtens, David Hildenbrand, Deb McLemore, Diana
Craciun, Eric Richter, Geert Uytterhoeven, Greg Kroah-Hartman, Greg
Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason Yan, Krzysztof
Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M. Rodrigues,
Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna
Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes,
Ravi Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth,
Tyrel Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing"
* tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (144 commits)
powerpc/fixmap: fix crash with HIGHMEM
x86/efi: remove unused variables
powerpc: Define arch_is_kernel_initmem_freed() for lockdep
powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp
powerpc: Avoid clang warnings around setjmp and longjmp
powerpc: Don't add -mabi= flags when building with Clang
powerpc: Fix Kconfig indentation
powerpc/fixmap: don't clear fixmap area in paging_init()
selftests/powerpc: spectre_v2 test must be built 64-bit
powerpc/powernv: Disable native PCIe port management
powerpc/kexec: Move kexec files into a dedicated subdir.
powerpc/32: Split kexec low level code out of misc_32.S
powerpc/sysdev: drop simple gpio
powerpc/83xx: map IMMR with a BAT.
powerpc/32s: automatically allocate BAT in setbat()
powerpc/ioremap: warn on early use of ioremap()
powerpc: Add support for GENERIC_EARLY_IOREMAP
powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt()
powerpc/8xx: use the fixmapped IMMR in cpm_reset()
powerpc/8xx: add __init to cpm1 init functions
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAl3O0OoACgkQ+7dXa6fL
C2tAwA//VH9Y81azemXFdflDF90sSH3TCASlKHVYHbBNAkH/QP5F00G4BEM4nNqH
F3x7qcU9vzfGdumF1pc90Yt6XSYlsQEGF+xMyMw/VS2wKs40yv+b/doVbzOWbN9C
NfrklgHeuuBk+JzU2llDisVqKRTLt4SmDpYu1ZdcchUQFZCCl3BpgdSEC+xXrHay
+KlRPVNMSd2kXMCDuSWrr71lVNdCTdf3nNC5p1i780+VrgpIBIG/jmiNdCcd7PLH
1aesPlr8UZY3+bmRtqe587fVRAhT2qA2xibKtyf9R0hrDtUKR4NSnpPmaeIjb26e
LhVntcChhYxQqzy/T4ScTDNVjpSlwi6QMo5DwAwzNGf2nf/v5/CZ+vGYDVdXRFHj
tgH1+8eDpHsi7jJp6E4cmZjiolsUx/ePDDTrQ4qbdDMO7fmIV6YQKFAMTLJepLBY
qnJVqoBq3qn40zv6tVZmKgWiXQ65jEkBItZhEUmcQRBiSbBDPweIdEzx/mwzkX7U
1gShGdut6YP4GX7BnOhkiQmzucS85mgkUfG43+mBfYXb+4zNTEjhhkqhEduz2SQP
xnjHxEM+MTGCj3PozIpJxNKzMTEceYY7cAUdNEMDQcHog7OCnIdGBIc7BPnsN8yA
CPzntwP4mmLfK3weq3PIGC6d9xfc9PpmiR9docxQOvE6sk2Ifeo=
=FKC7
-----END PGP SIGNATURE-----
Merge tag 'notifications-pipe-prep-20191115' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull pipe rework from David Howells:
"This is my set of preparatory patches for building a general
notification queue on top of pipes. It makes a number of significant
changes:
- It removes the nr_exclusive argument from __wake_up_sync_key() as
this is always 1. This prepares for the next step:
- Adds wake_up_interruptible_sync_poll_locked() so that poll can be
woken up from a function that's holding the poll waitqueue
spinlock.
- Change the pipe buffer ring to be managed in terms of unbounded
head and tail indices rather than bounded index and length. This
means that reading the pipe only needs to modify one index, not
two.
- A selection of helper functions are provided to query the state of
the pipe buffer, plus a couple to apply updates to the pipe
indices.
- The pipe ring is allowed to have kernel-reserved slots. This allows
many notification messages to be spliced in by the kernel without
allowing userspace to pin too many pages if it writes to the same
pipe.
- Advance the head and tail indices inside the pipe waitqueue lock
and use wake_up_interruptible_sync_poll_locked() to poke poll
without having to take the lock twice.
- Rearrange pipe_write() to preallocate the buffer it is going to
write into and then drop the spinlock. This allows kernel
notifications to then be added the ring whilst it is filling the
buffer it allocated. The read side is stalled because the pipe
mutex is still held.
- Don't wake up readers on a pipe if there was already data in it
when we added more.
- Don't wake up writers on a pipe if the ring wasn't full before we
removed a buffer"
* tag 'notifications-pipe-prep-20191115' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
pipe: Remove sync on wake_ups
pipe: Increase the writer-wakeup threshold to reduce context-switch count
pipe: Check for ring full inside of the spinlock in pipe_write()
pipe: Remove redundant wakeup from pipe_write()
pipe: Rearrange sequence in pipe_write() to preallocate slot
pipe: Conditionalise wakeup in pipe_read()
pipe: Advance tail pointer inside of wait spinlock in pipe_read()
pipe: Allow pipes to have kernel-reserved slots
pipe: Use head and tail pointers for the ring, not cursor and length
Add wake_up_interruptible_sync_poll_locked()
Remove the nr_exclusive argument from __wake_up_sync_key()
pipe: Reduce #inclusion of pipe_fs_i.h
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJd3YntAAoJEAx081l5xIa+dcQP/ikABkpm+q23FLKteRpL1rtX
xqlg5+KHW+YVCDls2BrINF6vYzyisoa8fNPlKMmOHse/IgMhFe9vBbCj1KQQOUR1
apNycI1wrcw/mn2WDikoIcF6C5cjqK9YVknnYoM6HnF1VmpGd1ecSGrOHrunEkrK
cMAWYIeqWGU8Gj/HUOitAFpLWFUMNle0BJuRoGLcoMusgS8yuCIEcpNzRhgL8fvJ
bW4imuyv24OjPoQzbKD0oQ0VIP86H0eM4LIeGZ2uyK/BSPKmMDqI4z4isUheS7RL
w4a6BdobMIdhew5dBXS0LsUJ3JniVJdHy123q9KgpmQAhGpiNoLT6BujfoUTUeWx
Mu0vM8Xmv9n4npdBYC+fLEFQXYJlu9uBA490jP84Kz6Fg1c6GyBebDY7/c2O4Zmg
7pvygmUF6boD6v2sIC/3161crgwU4g8zoxm2V4i9naxes2QB13LiEuJWlaI/FdxY
fd3zpglFGdoF1ThNne4QDh6gMKpXvjITyu/QxZeZ67Dt6i0Aqw9cRGHSpiVhYyDc
cx2hAp+rDvUi5SHkJKFpVImjB2DDn2xUG2uFMHz0cy9wNg203L3fRDi0hVtnM1+W
VpCxyLs2Upz6kEjDRVsfMZ9chCcWAWpVuKhtuuMUDw/IKnbP3uV8kzgJpVpaRVkD
76s5uYWHHBlk1IVlkOUP
=Hj7G
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Lots of stuff in here, though it hasn't been too insane this merge
apart from dealing with the security fun.
uapi:
- export different colorspace properties on DP vs HDMI
- new fourcc for ARM 16x16 block format
- syncobj: allow querying last submitted timeline value
- DRM_FORMAT_BIG_ENDIAN defined as unsigned
core:
- allow using gem vma manager in ttm
- connector/encoder/bridge doc fixes
- allow more than 3 encoders for a connector
- displayport mst suspend/resume reprobing support
- vram lazy unmapping, uniform vram mm and gem vram
- edid cleanups + AVI informframe bar info
- displayport helpers - dpcd parser added
dp_cec:
- Allow a connector to be associated with a cec device
ttm:
- pipelining with no_gpu_wait fix
- always keep BOs on the LRU
sched:
- allow free_job routine to sleep
i915:
- Block userptr from mappable GTT
- i915 perf uapi versioning
- OA stream dynamic reconfiguration
- make context persistence optional
- introduce DRM_I915_UNSTABLE Kconfig
- add fake lmem testing under unstable
- BT.2020 support for DP MSA
- struct mutex elimination
- Tigerlake display/PLL/power management improvements
- Jasper Lake PCH support
- refactor PMU for multiple GPUs
- Icelake firmware update
- Split out vga + switcheroo code
amdgpu:
- implement dma-buf import/export without helpers
- vega20 RAS enablement
- DC i2c over aux fixes
- renoir GPU reset
- DC HDCP support
- BACO support for CI/VI asics
- MSI-X support
- Arcturus EEPROM support
- Arcturus VCN encode support
- VCN dynamic powergating on RV/RV2
amdkfd:
- add navi12/14/renoir support to kfd
radeon:
- SI dpm fix ported from amdgpu
- fix bad DMA on ppc platforms
gma500:
- memory leak fixes
qxl:
- convert to new gem mmap
exynos:
- build warning fix
komeda:
- add aclk sysfs attribute
v3d:
- userspace cleanup uapi change
i810:
- fix for underflow in dispatch ioctls
ast:
- refactor show_cursor
mgag200:
- refactor show_cursor
arcgpu:
- encoder finding improvements
mediatek:
- mipi_tx, dsi and partial crtc support for MT8183 SoC
- rotation support
meson:
- add suspend/resume support
omap:
- misc refactors
tegra:
- DisplayPort support for Tegra 210, 186 and 194.
- IOMMU-backed DMA API fixes
panfrost:
- fix lockdep issue
- simplify devfreq integration
rcar-du:
- R8A774B1 SoC support
- fixes for H2 ES2.0
sun4i:
- vcc-dsi regulator support
virtio-gpu:
- vmexit vs spinlock fix
- move to gem shmem helpers
- handle large command buffers with cma"
* tag 'drm-next-2019-11-27' of git://anongit.freedesktop.org/drm/drm: (1855 commits)
drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10
drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhub
drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.
drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VF
drm/amdgpu/gfx10: re-init clear state buffer after gpu reset
merge fix for "ftrace: Rework event_create_dir()"
drm/amdgpu: Update Arcturus golden registers
drm/amdgpu/gfx10: fix out-of-bound mqd_backup array access
drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhalt
Revert "drm/amd/display: enable S/G for RAVEN chip"
drm/amdgpu: disable gfxoff on original raven
drm/amdgpu: remove experimental flag for Navi14
drm/amdgpu: disable gfxoff when using register read interface
drm/amdgpu/powerplay: properly set PP_GFXOFF_MASK (v2)
drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
drm/radeon: fix bad DMA from INTERRUPT_CNTL2
drm/amd/display: Fix debugfs on MST connectors
drm/amdgpu/nv: add asic func for fetching vbios from rom directly
drm/amdgpu: put flush_delayed_work at first
drm/amdgpu/vcn2.5: fix the enc loop with hw fini
...
Pull networking fixes from David Miller:
"This is mostly to fix the iwlwifi regression:
1) Flush GRO state properly in iwlwifi driver, from Alexander Lobakin.
2) Validate TIPC link name with properly length macro, from John
Rutherford.
3) Fix completion init and device query timeouts in ibmvnic, from
Thomas Falcon.
4) Fix SKB size calculation for netlink messages in psample, from
Nikolay Aleksandrov.
5) Similar kind of fix for OVS flow dumps, from Paolo Abeni.
6) Handle queue allocation failure unwind properly in gve driver, we
could try to release pages we didn't allocate. From Jeroen de
Borst.
7) Serialize TX queue SKB list accesses properly in mscc ocelot
driver. From Yangbo Lu"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
net: usb: aqc111: Use the correct style for SPDX License Identifier
net: phy: Use the correct style for SPDX License Identifier
net: wireless: intel: iwlwifi: fix GRO_NORMAL packet stalling
net: mscc: ocelot: use skb queue instead of skbs list
net: mscc: ocelot: avoid incorrect consuming in skbs list
gve: Fix the queue page list allocated pages count
net: inet_is_local_reserved_port() port arg should be unsigned short
openvswitch: fix flow command message size
net: phy: dp83869: Fix return paths to return proper values
net: psample: fix skb_over_panic
net: usbnet: Fix -Wcast-function-type
net: hso: Fix -Wcast-function-type
net: port < inet_prot_sock(net) --> inet_port_requires_bind_service(net, port)
ibmvnic: Serialize device queries
ibmvnic: Bound waits for device queries
ibmvnic: Terminate waiting device threads after loss of service
ibmvnic: Fix completion structure initialization
net-sctp: replace some sock_net(sk) with just 'net'
net: Fix a documentation bug wrt. ip_unprivileged_port_start
tipc: fix link name length check
Pull RCU updates from Ingo Molnar:
"The main changes in this cycle were:
- Dynamic tick (nohz) updates, perhaps most notably changes to force
the tick on when needed due to lengthy in-kernel execution on CPUs
on which RCU is waiting.
- Linux-kernel memory consistency model updates.
- Replace rcu_swap_protected() with rcu_prepace_pointer().
- Torture-test updates.
- Documentation updates.
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()
net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()
net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()
net/core: Replace rcu_swap_protected() with rcu_replace_pointer()
bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer()
fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer()
drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer()
drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer()
x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer()
rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
rcu: Suppress levelspread uninitialized messages
rcu: Fix uninitialized variable in nocb_gp_wait()
rcu: Update descriptions for rcu_future_grace_period tracepoint
rcu: Update descriptions for rcu_nocb_wake tracepoint
rcu: Remove obsolete descriptions for rcu_barrier tracepoint
rcu: Ensure that ->rcu_urgent_qs is set before resched IPI
workqueue: Convert for_each_wq to use built-in list check
rcu: Several rcu_segcblist functions can be static
rcu: Remove unused function hlist_bl_del_init_rcu()
Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch()
...
Pull perf updates from Ingo Molnar:
"The main kernel side changes in this cycle were:
- Various Intel-PT updates and optimizations (Alexander Shishkin)
- Prohibit kprobes on Xen/KVM emulate prefixes (Masami Hiramatsu)
- Add support for LSM and SELinux checks to control access to the
perf syscall (Joel Fernandes)
- Misc other changes, optimizations, fixes and cleanups - see the
shortlog for details.
There were numerous tooling changes as well - 254 non-merge commits.
Here are the main changes - too many to list in detail:
- Enhancements to core tooling infrastructure, perf.data, libperf,
libtraceevent, event parsing, vendor events, Intel PT, callchains,
BPF support and instruction decoding.
- There were updates to the following tools:
perf annotate
perf diff
perf inject
perf kvm
perf list
perf maps
perf parse
perf probe
perf record
perf report
perf script
perf stat
perf test
perf trace
- And a lot of other changes: please see the shortlog and Git log for
more details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (279 commits)
perf parse: Fix potential memory leak when handling tracepoint errors
perf probe: Fix spelling mistake "addrees" -> "address"
libtraceevent: Fix memory leakage in copy_filter_type
libtraceevent: Fix header installation
perf intel-bts: Does not support AUX area sampling
perf intel-pt: Add support for decoding AUX area samples
perf intel-pt: Add support for recording AUX area samples
perf pmu: When using default config, record which bits of config were changed by the user
perf auxtrace: Add support for queuing AUX area samples
perf session: Add facility to peek at all events
perf auxtrace: Add support for dumping AUX area samples
perf inject: Cut AUX area samples
perf record: Add aux-sample-size config term
perf record: Add support for AUX area sampling
perf auxtrace: Add support for AUX area sample recording
perf auxtrace: Move perf_evsel__find_pmu()
perf record: Add a function to test for kernel support for AUX area sampling
perf tools: Add kernel AUX area sampling definitions
perf/core: Make the mlock accounting simple again
perf report: Jump to symbol source view from total cycles view
...
Note that the sysctl write accessor functions guarantee that:
net->ipv4.sysctl_ip_prot_sock <= net->ipv4.ip_local_ports.range[0]
invariant is maintained, and as such the max() in selinux hooks is actually spurious.
ie. even though
if (snum < max(inet_prot_sock(sock_net(sk)), low) || snum > high) {
per logic is the same as
if ((snum < inet_prot_sock(sock_net(sk)) && snum < low) || snum > high) {
it is actually functionally equivalent to:
if (snum < low || snum > high) {
which is equivalent to:
if (snum < inet_prot_sock(sock_net(sk)) || snum < low || snum > high) {
even though the first clause is spurious.
But we want to hold on to it in case we ever want to change what what
inet_port_requires_bind_service() means (for example by changing
it from a, by default, [0..1024) range to some sort of set).
Test: builds, git 'grep inet_prot_sock' finds no other references
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking updates from David Miller:
"Another merge window, another pull full of stuff:
1) Support alternative names for network devices, from Jiri Pirko.
2) Introduce per-netns netdev notifiers, also from Jiri Pirko.
3) Support MSG_PEEK in vsock/virtio, from Matias Ezequiel Vara
Larsen.
4) Allow compiling out the TLS TOE code, from Jakub Kicinski.
5) Add several new tracepoints to the kTLS code, also from Jakub.
6) Support set channels ethtool callback in ena driver, from Sameeh
Jubran.
7) New SCTP events SCTP_ADDR_ADDED, SCTP_ADDR_REMOVED,
SCTP_ADDR_MADE_PRIM, and SCTP_SEND_FAILED_EVENT. From Xin Long.
8) Add XDP support to mvneta driver, from Lorenzo Bianconi.
9) Lots of netfilter hw offload fixes, cleanups and enhancements,
from Pablo Neira Ayuso.
10) PTP support for aquantia chips, from Egor Pomozov.
11) Add UDP segmentation offload support to igb, ixgbe, and i40e. From
Josh Hunt.
12) Add smart nagle to tipc, from Jon Maloy.
13) Support L2 field rewrite by TC offloads in bnxt_en, from Venkat
Duvvuru.
14) Add a flow mask cache to OVS, from Tonghao Zhang.
15) Add XDP support to ice driver, from Maciej Fijalkowski.
16) Add AF_XDP support to ice driver, from Krzysztof Kazimierczak.
17) Support UDP GSO offload in atlantic driver, from Igor Russkikh.
18) Support it in stmmac driver too, from Jose Abreu.
19) Support TIPC encryption and auth, from Tuong Lien.
20) Introduce BPF trampolines, from Alexei Starovoitov.
21) Make page_pool API more numa friendly, from Saeed Mahameed.
22) Introduce route hints to ipv4 and ipv6, from Paolo Abeni.
23) Add UDP segmentation offload to cxgb4, Rahul Lakkireddy"
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1857 commits)
libbpf: Fix usage of u32 in userspace code
mm: Implement no-MMU variant of vmalloc_user_node_flags
slip: Fix use-after-free Read in slip_open
net: dsa: sja1105: fix sja1105_parse_rgmii_delays()
macvlan: schedule bc_work even if error
enetc: add support Credit Based Shaper(CBS) for hardware offload
net: phy: add helpers phy_(un)lock_mdio_bus
mdio_bus: don't use managed reset-controller
ax88179_178a: add ethtool_op_get_ts_info()
mlxsw: spectrum_router: Fix use of uninitialized adjacency index
mlxsw: spectrum_router: After underlay moves, demote conflicting tunnels
bpf: Simplify __bpf_arch_text_poke poke type handling
bpf: Introduce BPF_TRACE_x helper for the tracing tests
bpf: Add bpf_jit_blinding_enabled for !CONFIG_BPF_JIT
bpf, testing: Add various tail call test cases
bpf, x86: Emit patchable direct jump as tail call
bpf: Constant map key tracking for prog array pokes
bpf: Add poke dependency tracking for prog array maps
bpf: Add initial poke descriptor table for jit images
bpf: Move owner type, jited info into array auxiliary data
...
In some situations AppArmor needs to be able to use its work buffers
from atomic context. Add the ability to specify when in atomic context
and hold a set of work buffers in reserve for atomic context to
reduce the chance that a large work buffer allocation will need to
be done.
Fixes: df323337e5 ("apparmor: Use a memory pool instead per-CPU caches")
Signed-off-by: John Johansen <john.johansen@canonical.com>
Now that the buffers allocation has changed and no longer needs
the full mediation under an rcu_read_lock, reduce the rcu_read_lock
scope to only where it is necessary.
Fixes: df323337e5 ("apparmor: Use a memory pool instead per-CPU caches")
Signed-off-by: John Johansen <john.johansen@canonical.com>
The sanity check in macro update_for_len checks to see if len
is less than zero, however, len is a size_t so it can never be
less than zero, so this sanity check is a no-op. Fix this by
making len a ssize_t so the comparison will work and add ulen
that is a size_t copy of len so that the min() macro won't
throw warnings about comparing different types.
Addresses-Coverity: ("Macro compares unsigned to 0")
Fixes: f1bd904175 ("apparmor: add the base fns() for domain labels")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: John Johansen <john.johansen@canonical.com>
Preparing for a change to the itimer internals, stop using the
do_setitimer() symbol and instead use a new higher-level interface.
The do_getitimer()/do_setitimer functions can now be made static,
allowing the compiler to potentially produce better object code.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl3IqJQeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGOiUH+gOEDwid5OODaFAd
CggXugdFIlBZefKqGVNW5sjgX8pxFWHXuEMC8iNb6QXtQZdFrI6LFf9hhUDmzQtm
6y1LPxxEiTZjObMEsBNylb7tyzgujFHcAlp0Zro3w/HLCqmYTSP3FF46i2u6KZfL
XhkpM4X7R7qxlfpdhlfESv/ElRGocZe6SwXfC7pcPo5flFcmkdu9ijqhNd/6CZ/h
Nf9rTsD/wEDVUelFbgVN+LJzlaB0tsyc4Zbof07n8OsFZjhdEOop8gfM/kTBLcyY
6bh66SfDScdsNnC/l8csbPjSZRx+i+nQs67DyhGNnsSAFgHBZdC4Tb/2mDCwhCLR
dUvuYZc=
=1N6F
-----END PGP SIGNATURE-----
Merge v5.4-rc7 into drm-next
We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Merge the secureboot support, as well as the IMA changes needed to
support it.
From Nayna's cover letter:
In order to verify the OS kernel on PowerNV systems, secure boot
requires X.509 certificates trusted by the platform. These are
stored in secure variables controlled by OPAL, called OPAL secure
variables. In order to enable users to manage the keys, the secure
variables need to be exposed to userspace.
OPAL provides the runtime services for the kernel to be able to
access the secure variables. This patchset defines the kernel
interface for the OPAL APIs. These APIs are used by the hooks, which
load these variables to the keyring and expose them to the userspace
for reading/writing.
Overall, this patchset adds the following support:
* expose secure variables to the kernel via OPAL Runtime API interface
* expose secure variables to the userspace via kernel sysfs interface
* load kernel verification and revocation keys to .platform and
.blacklist keyring respectively.
The secure variables can be read/written using simple linux
utilities cat/hexdump.
For example:
Path to the secure variables is: /sys/firmware/secvar/vars
Each secure variable is listed as directory.
$ ls -l
total 0
drwxr-xr-x. 2 root root 0 Aug 20 21:20 db
drwxr-xr-x. 2 root root 0 Aug 20 21:20 KEK
drwxr-xr-x. 2 root root 0 Aug 20 21:20 PK
The attributes of each of the secure variables are (for example: PK):
$ ls -l
total 0
-r--r--r--. 1 root root 4096 Oct 1 15:10 data
-r--r--r--. 1 root root 65536 Oct 1 15:10 size
--w-------. 1 root root 4096 Oct 1 15:12 update
The "data" is used to read the existing variable value using
hexdump. The data is stored in ESL format. The "update" is used to
write a new value using cat. The update is to be submitted as AUTH
file.
Fixes gcc '-Wunused-but-set-variable' warning:
security/keys/trusted-keys/trusted_tpm1.c: In function tpm_unseal:
security/keys/trusted-keys/trusted_tpm1.c:588:11: warning: variable keyhndl set but not used [-Wunused-but-set-variable]
Fixes: 00aa975bd031 ("KEYS: trusted: Create trusted keys subsystem")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Move TPM2 trusted keys code to trusted keys subsystem. The reason
being it's better to consolidate all the trusted keys code to a single
location so that it can be maintained sanely.
Also, utilize existing tpm_send() exported API which wraps the internal
tpm_transmit_cmd() API.
Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Move existing code to trusted keys subsystem. Also, rename files with
"tpm" as suffix which provides the underlying implementation.
Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Switch to utilize common heap based tpm_buf code for TPM based trusted
and asymmetric keys rather than using stack based tpm1_buf code. Also,
remove tpm1_buf code.
Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Move tpm_buf code to common include/linux/tpm.h header so that it can
be reused via other subsystems like trusted keys etc.
Also rename trusted keys and asymmetric keys usage of TPM 1.x buffer
implementation to tpm1_buf to avoid any compilation errors.
Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
The keys used to verify the Host OS kernel are managed by firmware as
secure variables. This patch loads the verification keys into the
.platform keyring and revocation hashes into .blacklist keyring. This
enables verification and loading of the kernels signed by the boot
time keys which are trusted by firmware.
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.ibm.com>
[mpe: Search by compatible in load_powerpc_certs(), not using format]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1573441836-3632-5-git-send-email-nayna@linux.ibm.com
The handlers to add the keys to the .platform keyring and blacklisted
hashes to the .blacklist keyring is common for both the uefi and powerpc
mechanisms of loading the keys/hashes from the firmware.
This patch moves the common code from load_uefi.c to keyring_handler.c
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Eric Richter <erichte@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1573441836-3632-4-git-send-email-nayna@linux.ibm.com
Asymmetric private keys are used to sign multiple files. The kernel
currently supports checking against blacklisted keys. However, if the
public key is blacklisted, any file signed by the blacklisted key will
automatically fail signature verification. Blacklisting the public key
is not fine enough granularity, as we might want to only blacklist a
particular file.
This patch adds support for checking against the blacklisted hash of
the file, without the appended signature, based on the IMA policy. It
defines a new policy option "appraise_flag=check_blacklist".
In addition to the blacklisted binary hashes stored in the firmware
"dbx" variable, the Linux kernel may be configured to load blacklisted
binary hashes onto the .blacklist keyring as well. The following
example shows how to blacklist a specific kernel module hash.
$ sha256sum kernel/kheaders.ko
77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3
kernel/kheaders.ko
$ grep BLACKLIST .config
CONFIG_SYSTEM_BLACKLIST_KEYRING=y
CONFIG_SYSTEM_BLACKLIST_HASH_LIST="blacklist-hash-list"
$ cat certs/blacklist-hash-list
"bin:77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3"
Update the IMA custom measurement and appraisal policy
rules (/etc/ima-policy):
measure func=MODULE_CHECK template=ima-modsig
appraise func=MODULE_CHECK appraise_flag=check_blacklist
appraise_type=imasig|modsig
After building, installing, and rebooting the kernel:
545660333 ---lswrv 0 0 \_ blacklist:
bin:77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3
measure func=MODULE_CHECK template=ima-modsig
appraise func=MODULE_CHECK appraise_flag=check_blacklist
appraise_type=imasig|modsig
modprobe: ERROR: could not insert 'kheaders': Permission denied
10 0c9834db5a0182c1fb0cdc5d3adcf11a11fd83dd ima-sig
sha256:3bc6ed4f0b4d6e31bc1dbc9ef844605abc7afdc6d81a57d77a1ec9407997c40
2 /usr/lib/modules/5.4.0-rc3+/kernel/kernel/kheaders.ko
10 82aad2bcc3fa8ed94762356b5c14838f3bcfa6a0 ima-modsig
sha256:3bc6ed4f0b4d6e31bc1dbc9ef844605abc7afdc6d81a57d77a1ec9407997c40
2 /usr/lib/modules/5.4.0rc3+/kernel/kernel/kheaders.ko sha256:77fa889b3
5a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3
3082029a06092a864886f70d010702a082028b30820287020101310d300b0609608648
016503040201300b06092a864886f70d01070131820264....
10 25b72217cc1152b44b134ce2cd68f12dfb71acb3 ima-buf
sha256:8b58427fedcf8f4b20bc8dc007f2e232bf7285d7b93a66476321f9c2a3aa132
b blacklisted-hash
77fa889b35a05338ec52e51591c1b89d4c8d1c99a21251d7c22b1a8642a6bad3
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
[zohar@linux.ibm.com: updated patch description]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1572492694-6520-8-git-send-email-zohar@linux.ibm.com
process_buffer_measurement() is limited to measuring the kexec boot
command line. This patch makes process_buffer_measurement() more
generic, allowing it to measure other types of buffer data (e.g.
blacklisted binary hashes or key hashes).
process_buffer_measurement() may be called directly from an IMA hook
or as an auxiliary measurement record. In both cases the buffer
measurement is based on policy. This patch modifies the function to
conditionally retrieve the policy defined PCR and template for the IMA
hook case.
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
[zohar@linux.ibm.com: added comment in process_buffer_measurement()]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1572492694-6520-6-git-send-email-zohar@linux.ibm.com
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.
The rest were (relatively) trivial in nature.
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver exposes EFI runtime services to user-space through an IOCTL
interface, calling the EFI services function pointers directly without
using the efivar API.
Disallow access to the /dev/efi_test character device when the kernel is
locked down to prevent arbitrary user-space to call EFI runtime services.
Also require CAP_SYS_ADMIN to open the chardev to prevent unprivileged
users to call the EFI runtime services, instead of just relying on the
chardev file mode bits for this.
The main user of this driver is the fwts [0] tool that already checks if
the effective user ID is 0 and fails otherwise. So this change shouldn't
cause any regression to this tool.
[0]: https://wiki.ubuntu.com/FirmwareTestSuite/Reference/uefivarinfo
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Matthew Garrett <mjg59@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: https://lkml.kernel.org/r/20191029173755.27149-7-ardb@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull RCU and LKMM changes from Paul E. McKenney:
- Documentation updates.
- Miscellaneous fixes.
- Dynamic tick (nohz) updates, perhaps most notably changes to
force the tick on when needed due to lengthy in-kernel execution
on CPUs on which RCU is waiting.
- Replace rcu_swap_protected() with rcu_prepace_pointer().
- Torture-test updates.
- Linux-kernel memory consistency model updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This commit replaces the use of rcu_swap_protected() with the more
intuitively appealing rcu_replace_pointer() as a step towards removing
rcu_swap_protected().
Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Reported-by: kbuild test robot <lkp@intel.com>
[ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Micah Morton <mortonm@chromium.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: <linux-security-module@vger.kernel.org>
Xmon should be either fully or partially disabled depending on the
kernel lockdown state.
Put xmon into read-only mode for lockdown=integrity and prevent user
entry into xmon when lockdown=confidentiality. Xmon checks the lockdown
state on every attempted entry:
(1) during early xmon'ing
(2) when triggered via sysrq
(3) when toggled via debugfs
(4) when triggered via a previously enabled breakpoint
The following lockdown state transitions are handled:
(1) lockdown=none -> lockdown=integrity
set xmon read-only mode
(2) lockdown=none -> lockdown=confidentiality
clear all breakpoints, set xmon read-only mode,
prevent user re-entry into xmon
(3) lockdown=integrity -> lockdown=confidentiality
clear all breakpoints, set xmon read-only mode,
prevent user re-entry into xmon
Suggested-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190907061124.1947-3-cmr@informatik.wtf
drm-next-5.5-2019-10-09:
amdgpu:
- Additional RAS enablement for vega20
- RAS page retirement and bad page storage in EEPROM
- No GPU reset with unrecoverable RAS errors
- Reserve vram for page tables rather than trying to evict
- Fix issues with GPU reset and xgmi hives
- DC i2c over aux fixes
- Direct submission for clears, PTE/PDE updates
- Improvements to help support recoverable GPU page faults
- Silence harmless SAD block messages
- Clean up code for creating a bo at a fixed location
- Initial DC HDCP support
- Lots of documentation fixes
- GPU reset for renoir
- Add IH clockgating support for soc15 asics
- Powerplay improvements
- DC MST cleanups
- Add support for MSI-X
- Misc cleanups and bug fixes
amdkfd:
- Query KFD device info by asic type rather than pci ids
- Add navi14 support
- Add renoir support
- Add navi12 support
- gfx10 trap handler improvements
- pasid cleanups
- Check against device cgroup
ttm:
- Return -EBUSY with pipelining with no_gpu_wait
radeon:
- Silence harmless SAD block messages
device_cgroup:
- Export devcgroup_check_permission
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com
In current mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl. This has a number of
limitations:
1. The sysctl is only a single value. Many types of accesses are controlled
based on the single value thus making the control very limited and
coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
all processes get access to perf_event_open(2) opening the door to
security issues.
This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.
5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
syscall itself. The hook is called from all the places that the
perf_event_paranoid sysctl is checked to keep it consistent with the
systctl. The hook gets passed a 'type' argument which controls CPU,
kernel and tracepoint accesses (in this context, CPU, kernel and
tracepoint have the same semantics as the perf_event_paranoid sysctl).
Additionally, I added an 'open' type which is similar to
perf_event_paranoid sysctl == 3 patch carried in Android and several other
distros but was rejected in mainline [1] in 2016.
2. perf_event_alloc: This allocates a new security object for the event
which stores the current SID within the event. It will be useful when
the perf event's FD is passed through IPC to another process which may
try to read the FD. Appropriate security checks will limit access.
3. perf_event_free: Called when the event is closed.
4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.
5. perf_event_write: Called from the ioctl(2) syscalls for the event.
[1] https://lwn.net/Articles/696240/
Since Peter had suggest LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.
To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: rostedt@goodmis.org
Cc: Yonghong Song <yhs@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: jeffv@google.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: primiano@google.com
Cc: Song Liu <songliubraving@fb.com>
Cc: rsavitski@google.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Matthew Garrett <matthewgarrett@google.com>
Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl2bu6kUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMsxhAAtoljww3Xur0JpD7y+g2yzKGZqn9F
ovqH103NOdpXY3vRN5TL0ZfKEWZz/a2Rjyjz/9+Ix5kKFQuaguk9TVenp4LuAWjy
yyo8aSArqwJEpPbrgQDRkjvq08zCcsHSQHwyR44L5MEB8w03Hr+GKFbroR7DkB8R
qthF5nRoarblEpdc88s3WbPN/Yz32zRwl3EppSRriIBSBUNr6OP5yO6YDvBdwJso
CvmQybMK/iGiZrDzm5jAXzUyI79MHkrrB55roNXIdam9Rnyb9Wqjt9SQgzDLTvO1
Z7c4pXqDn1iMSECAqR7EeKLmsEvnp8omDMqbZOsGiWwka93nuNM4NRhswMF6X3pf
EbmBAuj0CokWlRoJAxyxrw/Tn+KXWjyOpOMoNQR7dyyewenzPTWw4zLhiSsl4Epo
e1+3PDkJeZhlrtqMcQhep/OgfnPp/8FlgZXNkq1wsMK6SawIiwvxH3mpELE4I8Zk
3yzYZvnxIDNLcx6TmDgDcJyp+P/iuFGK707G6ogCoCK9VqyTs+nwdZn3s2o1KRDW
00LdiuXiqOyfdDthfY/q5suKJoWExh+K1dhQ7Llx169yx3uOjlnzTaSTt8dcvhkh
Y+Nf5pEk0MVgnldaIRy/Zzr4y81Q7QW6ZwD62NHCIhcSevYczFOP7K6V/mYFmDT1
xlCDPXeHyuR5DrM=
=btWt
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20191007' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinuxfix from Paul Moore:
"One patch to ensure we don't copy bad memory up into userspace"
* tag 'selinux-pr-20191007' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix context string corruption in convert_context()
A policy developer can now specify glblub as a default_range default and
the computed transition will be the intersection of the mls range of
the two contexts.
The glb (greatest lower bound) lub (lowest upper bound) of a range is calculated
as the greater of the low sensitivities and the lower of the high sensitivities
and the and of each category bitmap.
This can be used by MLS solution developers to compute a context that satisfies,
for example, the range of a network interface and the range of a user logging in.
Some examples are:
User Permitted Range | Network Device Label | Computed Label
---------------------|----------------------|----------------
s0-s1:c0.c12 | s0 | s0
s0-s1:c0.c12 | s0-s1:c0.c1023 | s0-s1:c0.c12
s0-s4:c0.c512 | s1-s1:c0.c1023 | s1-s1:c0.c512
s0-s15:c0,c2 | s4-s6:c0.c128 | s4-s6:c0,c2
s0-s4 | s2-s6 | s2-s4
s0-s4 | s5-s8 | INVALID
s5-s8 | s0-s4 | INVALID
Signed-off-by: Joshua Brindle <joshua.brindle@crunchydata.com>
[PM: subject lines and checkpatch.pl fixes]
Signed-off-by: Paul Moore <paul@paul-moore.com>
For AMD compute (amdkfd) driver.
All AMD compute devices are exported via single device node /dev/kfd. As
a result devices cannot be controlled individually using device cgroup.
AMD compute devices will rely on its graphics counterpart that exposes
/dev/dri/renderN node for each device. For each task (based on its
cgroup), KFD driver will check if /dev/dri/renderN node is accessible
before exposing it.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The ima/ and evm/ sub-directories contain built-in objects, so
obj-$(CONFIG_...) is the correct way to descend into them.
subdir-$(CONFIG_...) is redundant.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
I guess commit 15ea0e1e3e ("efi: Import certificates from UEFI Secure
Boot") attempted to add -fshort-wchar for building load_uefi.o, but it
has never worked as intended.
load_uefi.o is created in the platform_certs/ sub-directory. If you
really want to add -fshort-wchar, the correct code is:
$(obj)/platform_certs/load_uefi.o: KBUILD_CFLAGS += -fshort-wchar
But, you do not need to fix it.
Commit 8c97023cf0 ("Kbuild: use -fshort-wchar globally") had already
added -fshort-wchar globally. This code was unneeded in the first place.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
string_to_context_struct() may garble the context string, so we need to
copy back the contents again from the old context struct to avoid
storing the corrupted context.
Since string_to_context_struct() tokenizes (and therefore truncates) the
context string and we are later potentially copying it with kstrdup(),
this may eventually cause pieces of uninitialized kernel memory to be
disclosed to userspace (when copying to userspace based on the stored
length and not the null character).
How to reproduce on Fedora and similar:
# dnf install -y memcached
# systemctl start memcached
# semodule -d memcached
# load_policy
# load_policy
# systemctl stop memcached
# ausearch -m AVC
type=AVC msg=audit(1570090572.648:313): avc: denied { signal } for pid=1 comm="systemd" scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=process permissive=0 trawcon=73797374656D5F75007400000000000070BE6E847296FFFF726F6D000096FFFF76
Cc: stable@vger.kernel.org
Reported-by: Milos Malik <mmalik@redhat.com>
Fixes: ee1a84fdfe ("selinux: overhaul sidtab to fix bug and improve performance")
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Add two commands to add and delete list of link properties. Implement
the first property type along - alternative ifnames.
Each net device can have multiple alternative names.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the SELinux LSM prevents one from setting the
`security.selinux` xattr on an inode without a policy first being
loaded. However, this restriction is problematic: it makes it impossible
to have newly created files with the correct label before actually
loading the policy.
This is relevant in distributions like Fedora, where the policy is
loaded by systemd shortly after pivoting out of the initrd. In such
instances, all files created prior to pivoting will be unlabeled. One
then has to relabel them after pivoting, an operation which inherently
races with other processes trying to access those same files.
Going further, there are use cases for creating the entire root
filesystem on first boot from the initrd (e.g. Container Linux supports
this today[1], and we'd like to support it in Fedora CoreOS as well[2]).
One can imagine doing this in two ways: at the block device level (e.g.
laying down a disk image), or at the filesystem level. In the former,
labeling can simply be part of the image. But even in the latter
scenario, one still really wants to be able to set the right labels when
populating the new filesystem.
This patch enables this by changing behaviour in the following two ways:
1. allow `setxattr` if we're not initialized
2. don't try to set the in-core inode SID if we're not initialized;
instead leave it as `LABEL_INVALID` so that revalidation may be
attempted at a later time
Note the first hunk of this patch is mostly the same as a previously
discussed one[3], though it was part of a larger series which wasn't
accepted.
[1] https://coreos.com/os/docs/latest/root-filesystem-placement.html
[2] https://github.com/coreos/fedora-coreos-tracker/issues/94
[3] https://www.spinics.net/lists/linux-initramfs/msg04593.html
Co-developed-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Jonathan Lebon <jlebon@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Load size was limited to 64MB, this was legacy limitation due to vmalloc()
which was removed a while ago.
Signed-off-by: zhanglin <zhang.lin16@zte.com.cn>
[PM: removed comments in the description about 'real world use cases']
Signed-off-by: Paul Moore <paul@paul-moore.com>
Pull kernel lockdown mode from James Morris:
"This is the latest iteration of the kernel lockdown patchset, from
Matthew Garrett, David Howells and others.
From the original description:
This patchset introduces an optional kernel lockdown feature,
intended to strengthen the boundary between UID 0 and the kernel.
When enabled, various pieces of kernel functionality are restricted.
Applications that rely on low-level access to either hardware or the
kernel may cease working as a result - therefore this should not be
enabled without appropriate evaluation beforehand.
The majority of mainstream distributions have been carrying variants
of this patchset for many years now, so there's value in providing a
doesn't meet every distribution requirement, but gets us much closer
to not requiring external patches.
There are two major changes since this was last proposed for mainline:
- Separating lockdown from EFI secure boot. Background discussion is
covered here: https://lwn.net/Articles/751061/
- Implementation as an LSM, with a default stackable lockdown LSM
module. This allows the lockdown feature to be policy-driven,
rather than encoding an implicit policy within the mechanism.
The new locked_down LSM hook is provided to allow LSMs to make a
policy decision around whether kernel functionality that would allow
tampering with or examining the runtime state of the kernel should be
permitted.
The included lockdown LSM provides an implementation with a simple
policy intended for general purpose use. This policy provides a coarse
level of granularity, controllable via the kernel command line:
lockdown={integrity|confidentiality}
Enable the kernel lockdown feature. If set to integrity, kernel features
that allow userland to modify the running kernel are disabled. If set to
confidentiality, kernel features that allow userland to extract
confidential information from the kernel are also disabled.
This may also be controlled via /sys/kernel/security/lockdown and
overriden by kernel configuration.
New or existing LSMs may implement finer-grained controls of the
lockdown features. Refer to the lockdown_reason documentation in
include/linux/security.h for details.
The lockdown feature has had signficant design feedback and review
across many subsystems. This code has been in linux-next for some
weeks, with a few fixes applied along the way.
Stephen Rothwell noted that commit 9d1f8be5cf ("bpf: Restrict bpf
when kernel lockdown is in confidentiality mode") is missing a
Signed-off-by from its author. Matthew responded that he is providing
this under category (c) of the DCO"
* 'next-lockdown' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (31 commits)
kexec: Fix file verification on S390
security: constify some arrays in lockdown LSM
lockdown: Print current->comm in restriction messages
efi: Restrict efivar_ssdt_load when the kernel is locked down
tracefs: Restrict tracefs when the kernel is locked down
debugfs: Restrict debugfs when the kernel is locked down
kexec: Allow kexec_file() with appropriate IMA policy when locked down
lockdown: Lock down perf when in confidentiality mode
bpf: Restrict bpf when kernel lockdown is in confidentiality mode
lockdown: Lock down tracing and perf kprobes when in confidentiality mode
lockdown: Lock down /proc/kcore
x86/mmiotrace: Lock down the testmmiotrace module
lockdown: Lock down module params that specify hardware parameters (eg. ioport)
lockdown: Lock down TIOCSSERIAL
lockdown: Prohibit PCMCIA CIS storage when the kernel is locked down
acpi: Disable ACPI table override if the kernel is locked down
acpi: Ignore acpi_rsdp kernel param when the kernel has been locked down
ACPI: Limit access to custom_method when the kernel is locked down
x86/msr: Restrict MSR access when the kernel is locked down
x86: Lock down IO port access when the kernel is locked down
...
Pull integrity updates from Mimi Zohar:
"The major feature in this time is IMA support for measuring and
appraising appended file signatures. In addition are a couple of bug
fixes and code cleanup to use struct_size().
In addition to the PE/COFF and IMA xattr signatures, the kexec kernel
image may be signed with an appended signature, using the same
scripts/sign-file tool that is used to sign kernel modules.
Similarly, the initramfs may contain an appended signature.
This contained a lot of refactoring of the existing appended signature
verification code, so that IMA could retain the existing framework of
calculating the file hash once, storing it in the IMA measurement list
and extending the TPM, verifying the file's integrity based on a file
hash or signature (eg. xattrs), and adding an audit record containing
the file hash, all based on policy. (The IMA support for appended
signatures patch set was posted and reviewed 11 times.)
The support for appended signature paves the way for adding other
signature verification methods, such as fs-verity, based on a single
system-wide policy. The file hash used for verifying the signature and
the signature, itself, can be included in the IMA measurement list"
* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
ima: ima_api: Use struct_size() in kzalloc()
ima: use struct_size() in kzalloc()
sefltest/ima: support appended signatures (modsig)
ima: Fix use after free in ima_read_modsig()
MODSIGN: make new include file self contained
ima: fix freeing ongoing ahash_request
ima: always return negative code for error
ima: Store the measurement again when appraising a modsig
ima: Define ima-modsig template
ima: Collect modsig
ima: Implement support for module-style appended signatures
ima: Factor xattr_verify() out of ima_appraise_measurement()
ima: Add modsig appraise_type option for module-style appended signatures
integrity: Select CONFIG_KEYS instead of depending on it
PKCS#7: Introduce pkcs7_get_digest()
PKCS#7: Refactor verify_pkcs7_signature()
MODSIGN: Export module signature definitions
ima: initialize the "template" field with the default template
Commit 0b6cf6b97b ("tpm: pass an array of tpm_extend_digest structures to
tpm_pcr_extend()") modifies tpm_pcr_extend() to accept a digest for each
PCR bank. After modification, tpm_pcr_extend() expects that digests are
passed in the same order as the algorithms set in chip->allocated_banks.
This patch fixes two issues introduced in the last iterations of the patch
set: missing initialization of the TPM algorithm ID in the tpm_digest
structures passed to tpm_pcr_extend() by the trusted key module, and
unreleased locks in the TPM driver due to returning from tpm_pcr_extend()
without calling tpm_put_ops().
Cc: stable@vger.kernel.org
Fixes: 0b6cf6b97b ("tpm: pass an array of tpm_extend_digest structures to tpm_pcr_extend()")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
response to mechanically detected potential issues. The remaining
patch cleans up kernel-doc notations.
-----BEGIN PGP SIGNATURE-----
iQJLBAABCAA1FiEEC+9tH1YyUwIQzUIeOKUVfIxDyBEFAl2JI5wXHGNhc2V5QHNj
aGF1Zmxlci1jYS5jb20ACgkQOKUVfIxDyBEOJQ/5AXdQTd09LMp9jB54u9Usdm71
+kyJ/KudEja8/pCDDNboiXSfoagRqJ8AbuBAbGLtWLXc3smUcL1mncdfJDJAk88J
mbIB+qWMls5fC25udD+B2bF2py+eyVJ7dsnvHZg1mS5KUxYBMWVEqgX9zW0EFgNH
xd2/nB314GhULrfqagxxCd/HpbZ3GV1sM+BkfRPx2zm3gJ8xAuXm1xMMgchP9WqH
MFJDqk8r1wXCog8OkjQjAYR8zGRJTrP9W6UY9p1L6rp9rtfyPObBuAMLKv3WlXx8
Jz7idqSDNa49V7W3UrWcjXCunbjyPR7HszuuxhTC+EmB1MRU4IdX9I6ZdAaTuxEM
jFNwSSjIWRgXkJfLxrDX1ukFPU0JCd8ms7Lzw5YHq2TWt/V/7h4jyUCN8o9BN80r
7WzqdzT4v+Exc6TpqlpkHiQjJFL4ByEzNt3xNVZ3UFIyxnogVi45kL/78PsqDk/j
XWqM9bED8dBjM/K3EGqzj0mPCtILLnTm9ZyDvFF75jabf4rk0E354yGcuamoF+eM
UTT+3NTPQB/kI5i9av8ibGezInVVRQeHuI1/qIaD/Hsr8K7VJbqlB1k/rUxUZaSy
6g9e0mU2GLgM+eW0EKW0GWpV6/STqzskxu2TW46tobpOykwH9dNKJHhJzx7nEWJi
+5kMcGIvFCha6922/sM=
=QV1S
-----END PGP SIGNATURE-----
Merge tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next
Pull smack updates from Casey Schaufler:
"Four patches for v5.4. Nothing is major.
All but one are in response to mechanically detected potential issues.
The remaining patch cleans up kernel-doc notations"
* tag 'smack-for-5.4-rc1' of git://github.com/cschaufler/smack-next:
smack: use GFP_NOFS while holding inode_smack::smk_lock
security: smack: Fix possible null-pointer dereferences in smack_socket_sock_rcv_skb()
smack: fix some kernel-doc notations
Smack: Don't ignore other bprm->unsafe flags if LSM_UNSAFE_PTRACE is set
Pull SafeSetID fix from Micah Morton:
"Jann Horn sent some patches to fix some bugs in SafeSetID for 5.3.
After he had done his testing there were a couple small code tweaks
that went in and caused this bug.
From what I can see SafeSetID is broken in 5.3 and crashes the kernel
every time during initialization if you try to use it. I came across
this bug when backporting Jann's changes for 5.3 to older kernels
(4.14 and 4.19). I've tested on a Chrome OS device with those kernels
and verified that this change fixes things.
It doesn't seem super useful to have this bake in linux-next, since it
is completely broken in 5.3 and nobody noticed"
* tag 'safesetid-bugfix-5.4' of git://github.com/micah-morton/linux:
LSM: SafeSetID: Stop releasing uninitialized ruleset
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAl2BLvcUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXP9pA/+Ls9sRGZoEipycbgRnwkL9/6yFtn4
UCFGMP0eobrjL82i8uMOa/72Budsp3ZaZRxf36NpbMDPyB9ohp5jf7o1WFTELESv
EwxVvOMNwrxO2UbzRv3iywnhdPVJ4gHPa4GWfBHu2EEfhz3/Bv0tPIBdeXAbq4aC
R0p+M9X0FFEp9eP4ftwOvFGpbZ8zKo1kwgdvCnqLhHDkyqtapqO/ByCTe1VATERP
fyxjYDZNnITmI0plaIxCeeudklOTtVSAL4JPh1rk8rZIkUznZ4EBDHxdKiaz3j9C
ZtAthiAA9PfAwf4DZSPHnGsfINxeNBKLD65jZn/PUne/gNJEx4DK041X9HXBNwjv
OoArw58LCzxtTNZ//WB4CovRpeSdKvmKv0oh61k8cdQahLeHhzXE1wLQbnnBJLI3
CTsumIp4ZPEOX5r4ogdS3UIQpo3KrZump7VO85yUTRni150JpZR3egYpmcJ0So1A
QTPemBhC2CHJVTpycYZ9fVTlPeC4oNwosPmvpB8XeGu3w5JpuNSId+BDR/ZlQAmq
xWiIocGL3UMuPuJUrTGChifqBAgzK+gLa7S7RYPEnTCkj6LVQwsuP4gBXf75QTG4
FPwVcoMSDFxUDF0oFqwz4GfJlCxBSzX+BkWUn6jIiXKXBnQjU+1gu6KTwE25mf/j
snJznFk25hFYFaM=
=n4ht
-----END PGP SIGNATURE-----
Merge tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux updates from Paul Moore:
- Add LSM hooks, and SELinux access control hooks, for dnotify,
fanotify, and inotify watches. This has been discussed with both the
LSM and fs/notify folks and everybody is good with these new hooks.
- The LSM stacking changes missed a few calls to current_security() in
the SELinux code; we fix those and remove current_security() for
good.
- Improve our network object labeling cache so that we always return
the object's label, even when under memory pressure. Previously we
would return an error if we couldn't allocate a new cache entry, now
we always return the label even if we can't create a new cache entry
for it.
- Convert the sidtab atomic_t counter to a normal u32 with
READ/WRITE_ONCE() and memory barrier protection.
- A few patches to policydb.c to clean things up (remove forward
declarations, long lines, bad variable names, etc)
* tag 'selinux-pr-20190917' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
lsm: remove current_security()
selinux: fix residual uses of current_security() for the SELinux blob
selinux: avoid atomic_t usage in sidtab
fanotify, inotify, dnotify, security: add security hook for fs notifications
selinux: always return a secid from the network caches if we find one
selinux: policydb - rename type_val_to_struct_array
selinux: policydb - fix some checkpatch.pl warnings
selinux: shuffle around policydb.c to get rid of forward declarations
The first time a rule set is configured for SafeSetID, we shouldn't be
trying to release the previously configured ruleset, since there isn't
one. Currently, the pointer that would point to a previously configured
ruleset is uninitialized on first rule set configuration, leading to a
crash when we try to call release_ruleset with that pointer.
Acked-by: Jann Horn <jannh@google.com>
Signed-off-by: Micah Morton <mortonm@chromium.org>
No reason for these not to be const.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
If a request_key authentication token key gets revoked, there's a window in
which request_key_auth_describe() can see it with a NULL payload - but it
makes no check for this and something like the following oops may occur:
BUG: Kernel NULL pointer dereference at 0x00000038
Faulting instruction address: 0xc0000000004ddf30
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [...] request_key_auth_describe+0x90/0xd0
LR [...] request_key_auth_describe+0x54/0xd0
Call Trace:
[...] request_key_auth_describe+0x54/0xd0 (unreliable)
[...] proc_keys_show+0x308/0x4c0
[...] seq_read+0x3d0/0x540
[...] proc_reg_read+0x90/0x110
[...] __vfs_read+0x3c/0x70
[...] vfs_read+0xb4/0x1b0
[...] ksys_read+0x7c/0x130
[...] system_call+0x5c/0x70
Fix this by checking for a NULL pointer when describing such a key.
Also make the read routine check for a NULL pointer to be on the safe side.
[DH: Modified to not take already-held rcu lock and modified to also check
in the read routine]
Fixes: 04c567d931 ("[PATCH] Keys: Fix race between two instantiators of a key")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need to use selinux_cred() to fetch the SELinux cred blob instead
of directly using current->security or current_security(). There
were a couple of lingering uses of current_security() in the SELinux code
that were apparently missed during the earlier conversions. IIUC, this
would only manifest as a bug if multiple security modules including
SELinux are enabled and SELinux is not first in the lsm order. After
this change, there appear to be no other users of current_security()
in-tree; perhaps we should remove it altogether.
Fixes: bbd3662a83 ("Infrastructure management of the cred security blob")
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
inode_smack::smk_lock is taken during smack_d_instantiate(), which is
called during a filesystem transaction when creating a file on ext4.
Therefore to avoid a deadlock, all code that takes this lock must use
GFP_NOFS, to prevent memory reclaim from waiting for the filesystem
transaction to complete.
Reported-by: syzbot+0eefc1e06a77d327a056@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
In smack_socket_sock_rcv_skb(), there is an if statement
on line 3920 to check whether skb is NULL:
if (skb && skb->secmark != 0)
This check indicates skb can be NULL in some cases.
But on lines 3931 and 3932, skb is used:
ad.a.u.net->netif = skb->skb_iif;
ipv6_skb_to_auditdata(skb, &ad.a, NULL);
Thus, possible null-pointer dereferences may occur when skb is NULL.
To fix these possible bugs, an if statement is added to check skb.
These bugs are found by a static analysis tool STCheck written by us.
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
There is a logic bug in the current smack_bprm_set_creds():
If LSM_UNSAFE_PTRACE is set, but the ptrace state is deemed to be
acceptable (e.g. because the ptracer detached in the meantime), the other
->unsafe flags aren't checked. As far as I can tell, this means that
something like the following could work (but I haven't tested it):
- task A: create task B with fork()
- task B: set NO_NEW_PRIVS
- task B: install a seccomp filter that makes open() return 0 under some
conditions
- task B: replace fd 0 with a malicious library
- task A: attach to task B with PTRACE_ATTACH
- task B: execve() a file with an SMACK64EXEC extended attribute
- task A: while task B is still in the middle of execve(), exit (which
destroys the ptrace relationship)
Make sure that if any flags other than LSM_UNSAFE_PTRACE are set in
bprm->unsafe, we reject the execve().
Cc: stable@vger.kernel.org
Fixes: 5663884caa ("Smack: unify all ptrace accesses in the smack")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
If check_cached_key() returns a non-NULL value, we still need to call
key_type::match_free() to undo key_type::match_preparse().
Fixes: 7743c48e54 ("keys: Cache result of request_key*() temporarily in task_struct")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct ima_template_entry {
...
struct ima_field_data template_data[0]; /* template related data */
};
instance = kzalloc(sizeof(struct ima_template_entry) + count * sizeof(struct ima_field_data), GFP_NOFS);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_NOFS);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct foo {
int stuff;
struct boo entry[];
};
instance = kzalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, entry, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
If we can't parse the PKCS7 in the appended modsig, we will free the modsig
structure and then access one of its members to determine the error value.
Fixes: 39b0709636 ("ima: Implement support for module-style appended signatures")
Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Reviewed-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
As noted in Documentation/atomic_t.txt, if we don't need the RMW atomic
operations, we should only use READ_ONCE()/WRITE_ONCE() +
smp_rmb()/smp_wmb() where necessary (or the combined variants
smp_load_acquire()/smp_store_release()).
This patch converts the sidtab code to use regular u32 for the counter
and reverse lookup cache and use the appropriate operations instead of
atomic_get()/atomic_set(). Note that when reading/updating the reverse
lookup cache we don't need memory barriers as it doesn't need to be
consistent or accurate. We can now also replace some atomic ops with
regular loads (when under spinlock) and stores (for conversion target
fields that are always accessed under the master table's spinlock).
We can now also bump SIDTAB_MAX to U32_MAX as we can use the full u32
range again.
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Print the content of current->comm in messages generated by lockdown to
indicate a restriction that was hit. This makes it a bit easier to find
out what caused the message.
The message now patterned something like:
Lockdown: <comm>: <what> is restricted; see man kernel_lockdown.7
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
Tracefs may release more information about the kernel than desirable, so
restrict it when the kernel is locked down in confidentiality mode by
preventing open().
(Fixed by Ben Hutchings to avoid a null dereference in
default_file_open())
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: James Morris <jmorris@namei.org>
Disallow opening of debugfs files that might be used to muck around when
the kernel is locked down as various drivers give raw access to hardware
through debugfs. Given the effort of auditing all 2000 or so files and
manually fixing each one as necessary, I've chosen to apply a heuristic
instead. The following changes are made:
(1) chmod and chown are disallowed on debugfs objects (though the root dir
can be modified by mount and remount, but I'm not worried about that).
(2) When the kernel is locked down, only files with the following criteria
are permitted to be opened:
- The file must have mode 00444
- The file must not have ioctl methods
- The file must not have mmap
(3) When the kernel is locked down, files may only be opened for reading.
Normal device interaction should be done through configfs, sysfs or a
miscdev, not debugfs.
Note that this makes it unnecessary to specifically lock down show_dsts(),
show_devs() and show_call() in the asus-wmi driver.
I would actually prefer to lock down all files by default and have the
the files unlocked by the creator. This is tricky to manage correctly,
though, as there are 19 creation functions and ~1600 call sites (some of
them in loops scanning tables).
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Andy Shevchenko <andy.shevchenko@gmail.com>
cc: acpi4asus-user@lists.sourceforge.net
cc: platform-driver-x86@vger.kernel.org
cc: Matthew Garrett <mjg59@srcf.ucam.org>
cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <greg@kroah.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Matthew Garrett <matthewgarrett@google.com>
Signed-off-by: James Morris <jmorris@namei.org>
Systems in lockdown mode should block the kexec of untrusted kernels.
For x86 and ARM we can ensure that a kernel is trustworthy by validating
a PE signature, but this isn't possible on other architectures. On those
platforms we can use IMA digital signatures instead. Add a function to
determine whether IMA has or will verify signatures for a given event type,
and if so permit kexec_file() even if the kernel is otherwise locked down.
This is restricted to cases where CONFIG_INTEGRITY_TRUSTED_KEYRING is set
in order to prevent an attacker from loading additional keys at runtime.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Cc: Dmitry Kasatkin <dmitry.kasatkin@gmail.com>
Cc: linux-integrity@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
Disallow the use of certain perf facilities that might allow userspace to
access kernel data.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
bpf_read() and bpf_read_str() could potentially be abused to (eg) allow
private keys in kernel memory to be leaked. Disable them if the kernel
has been locked down in confidentiality mode.
Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: netdev@vger.kernel.org
cc: Chun-Yi Lee <jlee@suse.com>
cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: James Morris <jmorris@namei.org>
Disallow the creation of perf and ftrace kprobes when the kernel is
locked down in confidentiality mode by preventing their registration.
This prevents kprobes from being used to access kernel memory to steal
crypto data, but continues to allow the use of kprobes from signed
modules.
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: davem@davemloft.net
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
Disallow access to /proc/kcore when the kernel is locked down to prevent
access to cryptographic data. This is limited to lockdown
confidentiality mode and is still permitted in integrity mode.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
The testmmiotrace module shouldn't be permitted when the kernel is locked
down as it can be used to arbitrarily read and write MMIO space. This is
a runtime check rather than buildtime in order to allow configurations
where the same kernel may be run in both locked down or permissive modes
depending on local policy.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David Howells <dhowells@redhat.com
Signed-off-by: Matthew Garrett <mjg59@google.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: Thomas Gleixner <tglx@linutronix.de>
cc: Steven Rostedt <rostedt@goodmis.org>
cc: Ingo Molnar <mingo@kernel.org>
cc: "H. Peter Anvin" <hpa@zytor.com>
cc: x86@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
Provided an annotation for module parameters that specify hardware
parameters (such as io ports, iomem addresses, irqs, dma channels, fixed
dma buffers and other types).
Suggested-by: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Jessica Yu <jeyu@kernel.org>
Signed-off-by: James Morris <jmorris@namei.org>
Lock down TIOCSSERIAL as that can be used to change the ioport and irq
settings on a serial port. This only appears to be an issue for the serial
drivers that use the core serial code. All other drivers seem to either
ignore attempts to change port/irq or give an error.
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: Jiri Slaby <jslaby@suse.com>
Cc: linux-serial@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
Prohibit replacement of the PCMCIA Card Information Structure when the
kernel is locked down.
Suggested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Matthew Garrett <mjg59@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <jmorris@namei.org>
custom_method effectively allows arbitrary access to system memory, making
it possible for an attacker to circumvent restrictions on module loading.
Disable it if the kernel is locked down.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: linux-acpi@vger.kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
Writing to MSRs should not be allowed if the kernel is locked down, since
it could lead to execution of arbitrary code in kernel mode. Based on a
patch by Kees Cook.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
cc: x86@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>
IO port access would permit users to gain access to PCI configuration
registers, which in turn (on a lot of hardware) give access to MMIO
register space. This would potentially permit root to trigger arbitrary
DMA, so lock it down by default.
This also implicitly locks down the KDADDIO, KDDELIO, KDENABIO and
KDDISABIO console ioctls.
Signed-off-by: Matthew Garrett <mjg59@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
cc: x86@kernel.org
Signed-off-by: James Morris <jmorris@namei.org>