18139 Commits

Author SHA1 Message Date
Linus Torvalds
3eba620e7b - The usual round of smaller fixes and cleanups all over the tree

Merge tag 'x86_cleanups_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Borislav Petkov:

 - The usual round of smaller fixes and cleanups all over the tree

* tag 'x86_cleanups_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype
  x86/uaccess: Improve __try_cmpxchg64_user_asm() for x86_32
  x86: Fix various duplicate-word comment typos
  x86/boot: Remove superfluous type casting from arch/x86/boot/bitops.h
2022-10-04 10:24:11 -07:00
Linus Torvalds
193e2268a3 - More work by James Morse to disentangle the resctrl filesystem generic
code from the architectural one with the end goal of plugging ARM's MPAM
 implementation into it too so that the user interface remains the same
 
 - Properly restore the MSR_MISC_FEATURE_CONTROL value instead of blindly
 overwriting it to 0

Merge tag 'x86_cache_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cache resource control updates from Borislav Petkov:

 - More work by James Morse to disentangle the resctrl filesystem
   generic code from the architectural one with the end goal of plugging
   ARM's MPAM implementation into it too so that the user interface
   remains the same

 - Properly restore the MSR_MISC_FEATURE_CONTROL value instead of
   blindly overwriting it to 0

* tag 'x86_cache_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  x86/resctrl: Make resctrl_arch_rmid_read() return values in bytes
  x86/resctrl: Add resctrl_rmid_realloc_limit to abstract x86's boot_cpu_data
  x86/resctrl: Rename and change the units of resctrl_cqm_threshold
  x86/resctrl: Move get_corrected_mbm_count() into resctrl_arch_rmid_read()
  x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()
  x86/resctrl: Pass the required parameters into resctrl_arch_rmid_read()
  x86/resctrl: Abstract __rmid_read()
  x86/resctrl: Allow per-rmid arch private storage to be reset
  x86/resctrl: Add per-rmid arch private storage for overflow and chunks
  x86/resctrl: Calculate bandwidth from the previous __mon_event_count() chunks
  x86/resctrl: Allow update_mba_bw() to update controls directly
  x86/resctrl: Remove architecture copy of mbps_val
  x86/resctrl: Switch over to the resctrl mbps_val list
  x86/resctrl: Create mba_sc configuration in the rdt_domain
  x86/resctrl: Abstract and use supports_mba_mbps()
  x86/resctrl: Remove set_mba_sc()s control array re-initialisation
  x86/resctrl: Add domain offline callback for resctrl work
  x86/resctrl: Group struct rdt_hw_domain cleanup
  x86/resctrl: Add domain online callback for resctrl work
  x86/resctrl: Merge mon_capable and mon_enabled
  ...
2022-10-04 10:14:58 -07:00
Linus Torvalds
b5f0b11353 - Get rid of a single ksize() usage
- By popular demand, print the previous microcode revision an update
   was done over
 
 - Remove more code related to the now gone MICROCODE_OLD_INTERFACE
 
 - Document the problems stemming from microcode late loading

Merge tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode loader updates from Borislav Petkov:

 - Get rid of a single ksize() usage

 - By popular demand, print the previous microcode revision an update
   was done over

 - Remove more code related to the now gone MICROCODE_OLD_INTERFACE

 - Document the problems stemming from microcode late loading

* tag 'x86_microcode_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode/AMD: Track patch allocation size explicitly
  x86/microcode: Print previous version of microcode after reload
  x86/microcode: Remove ->request_microcode_user()
  x86/microcode: Document the whole late loading problem
2022-10-04 10:12:08 -07:00
Linus Torvalds
901735e51e - Drop misleading "RIP" from the opcodes dumping message
- Correct APM entry's Kconfig help text

Merge tag 'x86_misc_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 fixes from Borislav Petkov:

 - Drop misleading "RIP" from the opcodes dumping message

 - Correct APM entry's Kconfig help text

* tag 'x86_misc_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Don't mention RIP in "Code: "
  x86/Kconfig: Specify idle=poll instead of no-hlt
2022-10-04 10:00:27 -07:00
Linus Torvalds
8cded8fb12 - Make sure an INT3 is slapped after every unconditional retpoline JMP
as both vendors suggest
 
 - Clean up pciserial a bit

Merge tag 'x86_core_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core fixes from Borislav Petkov:

 - Make sure an INT3 is slapped after every unconditional retpoline JMP
   as both vendors suggest

 - Clean up pciserial a bit

* tag 'x86_core_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86,retpoline: Be sure to emit INT3 after JMP *%\reg
  x86/earlyprintk: Clean up pciserial
2022-10-04 09:46:22 -07:00
Linus Torvalds
5bb3a16dbe - Add support for locking the APIC in X2APIC mode to prevent SGX enclave leaks

Merge tag 'x86_apic_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 APIC update from Borislav Petkov:

 - Add support for locking the APIC in X2APIC mode to prevent SGX
   enclave leaks

* tag 'x86_apic_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Don't disable x2APIC if locked
2022-10-04 09:37:02 -07:00
Linus Torvalds
51eaa866a5 - Fix the APEI MCE callback handler to consult the hardware about the
granularity of the memory error instead of hard-coding it
 
 - Offline memory pages on Intel machines after 2 errors reported per page

Merge tag 'ras_core_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RAS updates from Borislav Petkov:

 - Fix the APEI MCE callback handler to consult the hardware about the
   granularity of the memory error instead of hard-coding it

 - Offline memory pages on Intel machines after 2 errors reported per
   page

* tag 'ras_core_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Retrieve poison range from hardware
  RAS/CEC: Reduce offline page threshold for Intel systems
2022-10-04 09:33:12 -07:00
Linus Torvalds
ba94a7a900 - Improve the documentation of a couple of SGX functions handling
backing storage

Merge tag 'x86_sgx_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SGX update from Borislav Petkov:

 - Improve the documentation of a couple of SGX functions handling
   backing storage

* tag 'x86_sgx_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Improve comments for sgx_encl_lookup/alloc_backing()
2022-10-04 09:17:44 -07:00
Linus Torvalds
f8475a6749 - Cleanup x86/rtc.c and delete duplicated functionality in favor of
using the respective functionality from the RTC library

Merge tag 'x86_timers_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 RTC cleanups from Borislav Petkov:

 - Cleanup x86/rtc.c and delete duplicated functionality in favor of
   using the respective functionality from the RTC library

* tag 'x86_timers_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/rtc: Rename mach_set_rtc_mmss() to mach_set_cmos_time()
  x86/rtc: Rewrite & simplify mach_get_cmos_time() by deleting duplicated functionality
2022-10-04 09:13:21 -07:00
Linus Torvalds
3339914a58 - Get TSC and CPU frequency from CPUID leaf 0x40000010 when the kernel
is running as a guest on the ACRN hypervisor

Merge tag 'x86_platform_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 platform update from Borislav Petkov:
 "A single x86/platform improvement when the kernel is running as an
  ACRN guest:

   - Get TSC and CPU frequency from CPUID leaf 0x40000010 when the
     kernel is running as a guest on the ACRN hypervisor"

* tag 'x86_platform_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/acrn: Set up timekeeping
2022-10-04 09:06:35 -07:00
Linus Torvalds
865dad2022 kcfi updates for v6.1-rc1
This replaces the prior support for Clang's standard Control Flow
 Integrity (CFI) instrumentation, which has required a lot of special
 conditions (e.g. LTO) and work-arounds. The current implementation
 ("Kernel CFI") is specific to C, directly designed for the Linux kernel,
 and takes advantage of architectural features like x86's IBT. This
 series retains arm64 support and adds x86 support. Additional "generic"
 architectural support is expected soon:
 https://github.com/samitolvanen/llvm-project/commits/kcfi_generic
 
 - treewide: Remove old CFI support details
 
 - arm64: Replace Clang CFI support with Clang KCFI support
 
 - x86: Introduce Clang KCFI support

Merge tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull kcfi updates from Kees Cook:
 "This replaces the prior support for Clang's standard Control Flow
  Integrity (CFI) instrumentation, which has required a lot of special
  conditions (e.g. LTO) and work-arounds.

  The new implementation ("Kernel CFI") is specific to C, directly
  designed for the Linux kernel, and takes advantage of architectural
  features like x86's IBT. This series retains arm64 support and adds
  x86 support.

  GCC support is expected in the future[1], and additional "generic"
  architectural support is expected soon[2].

  Summary:

   - treewide: Remove old CFI support details

   - arm64: Replace Clang CFI support with Clang KCFI support

   - x86: Introduce Clang KCFI support"

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107048 [1]
Link: https://github.com/samitolvanen/llvm-project/commits/kcfi_generic [2]

* tag 'kcfi-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (22 commits)
  x86: Add support for CONFIG_CFI_CLANG
  x86/purgatory: Disable CFI
  x86: Add types to indirectly called assembly functions
  x86/tools/relocs: Ignore __kcfi_typeid_ relocations
  kallsyms: Drop CONFIG_CFI_CLANG workarounds
  objtool: Disable CFI warnings
  objtool: Preserve special st_shndx indexes in elf_update_symbol
  treewide: Drop __cficanonical
  treewide: Drop WARN_ON_FUNCTION_MISMATCH
  treewide: Drop function_nocfi
  init: Drop __nocfi from __init
  arm64: Drop unneeded __nocfi attributes
  arm64: Add CFI error handling
  arm64: Add types to indirect called assembly functions
  psci: Fix the function type for psci_initcall_t
  lkdtm: Emit an indirect call for CFI tests
  cfi: Add type helper macros
  cfi: Switch to -fsanitize=kcfi
  cfi: Drop __CFI_ADDRESSABLE
  cfi: Remove CONFIG_CFI_CLANG_SHADOW
  ...
2022-10-03 17:11:07 -07:00
Linus Torvalds
534b0abc62 - Add the respective UP last level cache mask accessors in order not to
cause segfaults when lscpu accesses their representation in sysfs
 
 - Fix for a race in the alternatives batch patching machinery when
 kprobes are set

Merge tag 'x86_urgent_for_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Add the respective UP last level cache mask accessors in order not to
   cause segfaults when lscpu accesses their representation in sysfs

 - Fix for a race in the alternatives batch patching machinery when
   kprobes are set

* tag 'x86_urgent_for_v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cacheinfo: Add a cpu_llc_shared_mask() UP variant
  x86/alternative: Fix race in try_get_desc()
2022-10-02 09:30:35 -07:00
Nadav Amit
efd608fa74 x86/alternative: Fix race in try_get_desc()
I encountered some occasional crashes of poke_int3_handler() when
kprobes are set, while accessing desc->vec.

The text poke mechanism claims to have an RCU-like behavior, but it
does not appear that there is any quiescent state to ensure that
nobody holds a reference to desc. As a result, the following race
appears to be possible, which can lead to memory corruption.

  CPU0					CPU1
  ----					----
  text_poke_bp_batch()
  -> smp_store_release(&bp_desc, &desc)

  [ notice that desc is on
    the stack			]

					poke_int3_handler()

					[ int3 might be kprobe's
					  so sync events do not
					  help ]

					-> try_get_desc(descp=&bp_desc)
					   desc = __READ_ONCE(bp_desc)

					   if (!desc) [false, success]
  WRITE_ONCE(bp_desc, NULL);
  atomic_dec_and_test(&desc.refs)

  [ success, desc space on the stack
    is being reused and might have
    non-zero value. ]
					arch_atomic_inc_not_zero(&desc->refs)

					[ might succeed since desc points to
					  stack memory that was freed and might
					  be reused. ]

Fix this issue with a small backportable patch. Instead of trying to
make RCU-like behavior for bp_desc, just eliminate the unnecessary
level of indirection of bp_desc, and hold the whole descriptor as a
global.  Anyhow, there is only a single descriptor at any given
moment.
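
For illustration only, the idea of a single global descriptor guarded by
a reference count can be sketched in userspace C11. This is not the
kernel patch: struct patch_desc, begin_patching(), end_patching() and
put_desc() are hypothetical names, and C11 atomics stand in for the
kernel's atomic_t helpers. Because the descriptor has static storage,
a late reader can never dereference stale stack memory; it either takes
a reference or sees refs == 0 and backs off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's patching descriptor. */
struct patch_desc {
	const void *vec;	/* entries being patched */
	int nr_entries;
	atomic_int refs;	/* 0 means "no patching in progress" */
};

/* One global descriptor instead of a pointer to a stack object. */
static struct patch_desc bp_desc;

/* Increment *v unless it is zero; returns true on success. */
static bool inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is reloaded and rechecked against 0. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Reader side (the int3 handler in the real code). */
struct patch_desc *try_get_desc(void)
{
	return inc_not_zero(&bp_desc.refs) ? &bp_desc : NULL;
}

void put_desc(struct patch_desc *desc)
{
	assert(desc == &bp_desc);
	atomic_fetch_sub(&desc->refs, 1);
}

/* Writer side: publish the work, holding the initial reference. */
void begin_patching(const void *vec, int nr_entries)
{
	bp_desc.vec = vec;
	bp_desc.nr_entries = nr_entries;
	atomic_store(&bp_desc.refs, 1);
}

/* Writer side: drop the initial reference and wait for readers. */
void end_patching(void)
{
	if (atomic_fetch_sub(&bp_desc.refs, 1) != 1) {
		while (atomic_load(&bp_desc.refs) != 0)
			;	/* readers still inside the handler */
	}
}
```

A reader that arrives after end_patching() gets NULL from
try_get_desc() instead of a pointer into reused memory.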

Fixes: 1f676247f36a4 ("x86/alternatives: Implement a better poke_int3_handler() completion scheme")
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@kernel.org
Link: https://lkml.kernel.org/r/20220920224743.3089-1-namit@vmware.com
2022-09-27 22:50:26 +02:00
Linus Torvalds
a1375562c0 * A performance fix for recent large AMD systems that avoids an ancient
cpu idle hardware workaround.
 
  * A new Intel model number.  Folks like these upstream as soon as
    possible so that each developer doing feature development doesn't
    need to carry their own #define.
 
  * SGX fixes for a userspace crash and a rare kernel warning

Merge tag 'x86_urgent_for_v6.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Dave Hansen:

 - A performance fix for recent large AMD systems that avoids an ancient
   cpu idle hardware workaround

 - A new Intel model number. Folks like these upstream as soon as
   possible so that each developer doing feature development doesn't
   need to carry their own #define

 - SGX fixes for a userspace crash and a rare kernel warning

* tag 'x86_urgent_for_v6.0-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ACPI: processor idle: Practically limit "Dummy wait" workaround to old Intel systems
  x86/sgx: Handle VA page allocation failure for EAUG on PF.
  x86/sgx: Do not fail on incomplete sanitization on premature stop of ksgxd
  x86/cpu: Add CPU model numbers for Meteor Lake
2022-09-26 14:53:38 -07:00
Sami Tolvanen
3c516f89e1 x86: Add support for CONFIG_CFI_CLANG
With CONFIG_CFI_CLANG, the compiler injects a type preamble immediately
before each function and a check to validate the target function type
before indirect calls:

  ; type preamble
  __cfi_function:
    mov <id>, %eax
  function:
    ...
  ; indirect call check
    mov     -<id>,%r10d
    add     -0x4(%r11),%r10d
    je      .Ltmp1
    ud2
  .Ltmp1:
    call    __x86_indirect_thunk_r11

Add error handling code for the ud2 traps emitted for the checks, and
allow CONFIG_CFI_CLANG to be selected on x86_64.
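
The arithmetic of the indirect call check above can be modelled in plain
C. This is only a sketch of the comparison, not compiler output:
kcfi_check() is a hypothetical helper, and the type ids used with it are
made-up values. The caller loads the negated expected id, adds the id
stored in the target's preamble (the 32-bit immediate of the mov, found
4 bytes before the function entry), and the call is allowed only when
the sum wraps to zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Models:
 *     mov  -<id>, %r10d
 *     add  -0x4(%r11), %r10d
 *     je   .Ltmp1          ; sum is zero, types match
 *     ud2                  ; otherwise trap
 */
static bool kcfi_check(uint32_t expected_id, uint32_t preamble_id)
{
	uint32_t r10 = (uint32_t)0 - expected_id;	/* mov -<id>, %r10d */

	r10 += preamble_id;		/* add -0x4(%r11), %r10d */
	return r10 == 0;		/* true: take the call; false: ud2 */
}

/* Small self-check with made-up type ids. */
static void kcfi_check_demo(void)
{
	assert(kcfi_check(0x12345678u, 0x12345678u));
	assert(!kcfi_check(0x12345678u, 0x87654321u));
}
```

Using an add of the negated id rather than a compare of the id itself
means the caller's instruction bytes contain the negated value, not the
id; that reading of the design is an inference from the sequence shown
above.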

This produces the following oops on CFI failure (generated using lkdtm):

[   21.441706] CFI failure at lkdtm_indirect_call+0x16/0x20 [lkdtm]
(target: lkdtm_increment_int+0x0/0x10 [lkdtm]; expected type: 0x7e0c52a5)
[   21.444579] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   21.445296] CPU: 0 PID: 132 Comm: sh Not tainted
5.19.0-rc8-00020-g9f27360e674c #1
[   21.445296] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   21.445296] RIP: 0010:lkdtm_indirect_call+0x16/0x20 [lkdtm]
[   21.445296] Code: 52 1c c0 48 c7 c1 c5 50 1c c0 e9 25 48 2a cc 0f 1f
44 00 00 49 89 fb 48 c7 c7 50 b4 1c c0 41 ba 5b ad f3 81 45 03 53 f8
[   21.445296] RSP: 0018:ffffa9f9c02ffdc0 EFLAGS: 00000292
[   21.445296] RAX: 0000000000000027 RBX: ffffffffc01cb300 RCX: 385cbbd2e070a700
[   21.445296] RDX: 0000000000000000 RSI: c0000000ffffdfff RDI: ffffffffc01cb450
[   21.445296] RBP: 0000000000000006 R08: 0000000000000000 R09: ffffffff8d081610
[   21.445296] R10: 00000000bcc90825 R11: ffffffffc01c2fc0 R12: 0000000000000000
[   21.445296] R13: ffffa31b827a6000 R14: 0000000000000000 R15: 0000000000000002
[   21.445296] FS:  00007f08b42216a0(0000) GS:ffffa31b9f400000(0000)
knlGS:0000000000000000
[   21.445296] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   21.445296] CR2: 0000000000c76678 CR3: 0000000001940000 CR4: 00000000000006f0
[   21.445296] Call Trace:
[   21.445296]  <TASK>
[   21.445296]  lkdtm_CFI_FORWARD_PROTO+0x30/0x50 [lkdtm]
[   21.445296]  direct_entry+0x12d/0x140 [lkdtm]
[   21.445296]  full_proxy_write+0x5d/0xb0
[   21.445296]  vfs_write+0x144/0x460
[   21.445296]  ? __x64_sys_wait4+0x5a/0xc0
[   21.445296]  ksys_write+0x69/0xd0
[   21.445296]  do_syscall_64+0x51/0xa0
[   21.445296]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   21.445296] RIP: 0033:0x7f08b41a6fe1
[   21.445296] Code: be 07 00 00 00 41 89 c0 e8 7e ff ff ff 44 89 c7 89
04 24 e8 91 c6 02 00 8b 04 24 48 83 c4 68 c3 48 63 ff b8 01 00 00 03
[   21.445296] RSP: 002b:00007ffcdf65c2e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   21.445296] RAX: ffffffffffffffda RBX: 00007f08b4221690 RCX: 00007f08b41a6fe1
[   21.445296] RDX: 0000000000000012 RSI: 0000000000c738f0 RDI: 0000000000000001
[   21.445296] RBP: 0000000000000001 R08: fefefefefefefeff R09: fefefefeffc5ff4e
[   21.445296] R10: 00007f08b42222b0 R11: 0000000000000246 R12: 0000000000c738f0
[   21.445296] R13: 0000000000000012 R14: 00007ffcdf65c401 R15: 0000000000c70450
[   21.445296]  </TASK>
[   21.445296] Modules linked in: lkdtm
[   21.445296] Dumping ftrace buffer:
[   21.445296]    (ftrace buffer empty)
[   21.471442] ---[ end trace 0000000000000000 ]---
[   21.471811] RIP: 0010:lkdtm_indirect_call+0x16/0x20 [lkdtm]
[   21.472467] Code: 52 1c c0 48 c7 c1 c5 50 1c c0 e9 25 48 2a cc 0f 1f
44 00 00 49 89 fb 48 c7 c7 50 b4 1c c0 41 ba 5b ad f3 81 45 03 53 f8
[   21.474400] RSP: 0018:ffffa9f9c02ffdc0 EFLAGS: 00000292
[   21.474735] RAX: 0000000000000027 RBX: ffffffffc01cb300 RCX: 385cbbd2e070a700
[   21.475664] RDX: 0000000000000000 RSI: c0000000ffffdfff RDI: ffffffffc01cb450
[   21.476471] RBP: 0000000000000006 R08: 0000000000000000 R09: ffffffff8d081610
[   21.477127] R10: 00000000bcc90825 R11: ffffffffc01c2fc0 R12: 0000000000000000
[   21.477959] R13: ffffa31b827a6000 R14: 0000000000000000 R15: 0000000000000002
[   21.478657] FS:  00007f08b42216a0(0000) GS:ffffa31b9f400000(0000)
knlGS:0000000000000000
[   21.479577] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   21.480307] CR2: 0000000000c76678 CR3: 0000000001940000 CR4: 00000000000006f0
[   21.481460] Kernel panic - not syncing: Fatal exception

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220908215504.3686827-23-samitolvanen@google.com
2022-09-26 10:13:16 -07:00
Luciano Leão
30ea703a38 x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype
Include the header containing the prototype of init_ia32_feat_ctl(),
solving the following warning:

  $ make W=1 arch/x86/kernel/cpu/feat_ctl.o
  arch/x86/kernel/cpu/feat_ctl.c:112:6: warning: no previous prototype for ‘init_ia32_feat_ctl’ [-Wmissing-prototypes]
    112 | void init_ia32_feat_ctl(struct cpuinfo_x86 *c)

This warning appeared after commit

  5d5103595e9e5 ("x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup")

had moved the function init_ia32_feat_ctl()'s prototype from
arch/x86/kernel/cpu/cpu.h to arch/x86/include/asm/cpu.h.

Note that, before the commit mentioned above, the header include "cpu.h"
(arch/x86/kernel/cpu/cpu.h) was added by commit

  0e79ad863df43 ("x86/cpu: Fix a -Wmissing-prototypes warning for init_ia32_feat_ctl()")

solely to fix init_ia32_feat_ctl()'s missing prototype. So, the header
include "cpu.h" is no longer necessary.

  [ bp: Massage commit message. ]

Fixes: 5d5103595e9e5 ("x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup")
Signed-off-by: Luciano Leão <lucianorsleao@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nícolas F. R. A. Prado <n@nfraprado.net>
Link: https://lore.kernel.org/r/20220922200053.1357470-1-lucianorsleao@gmail.com
2022-09-26 17:06:27 +02:00
James Morse
f7b1843eca x86/resctrl: Make resctrl_arch_rmid_read() return values in bytes
resctrl_arch_rmid_read() returns a value in chunks, as read from the
hardware. This needs scaling to bytes by mon_scale, as provided by
the architecture code.

Now that resctrl_arch_rmid_read() performs the overflow and corrections
itself, it may as well return a value in bytes directly. This allows
the accesses to the architecture specific 'hw' structure to be removed.

Move the mon_scale conversion into resctrl_arch_rmid_read().
mbm_bw_count() is updated to calculate bandwidth from bytes.
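
As an illustration only, the chunks-to-bytes scaling described above can be
sketched in plain C. The struct and function names are hypothetical
stand-ins, not the kernel's definitions; only mon_scale comes from the
message.

```c
#include <stdint.h>

/* Hypothetical stand-in for the arch 'hw' structure: bytes per chunk. */
struct demo_hw_res {
	uint32_t mon_scale;
};

/* Scale a chunk count read from hardware into bytes. */
uint64_t demo_chunks_to_bytes(const struct demo_hw_res *hw, uint64_t chunks)
{
	return chunks * hw->mon_scale;
}
```

With the scaling done on the arch side, a caller never needs to look at
mon_scale itself.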

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-22-james.morse@arm.com
2022-09-23 14:25:05 +02:00
James Morse
d80975e264 x86/resctrl: Add resctrl_rmid_realloc_limit to abstract x86's boot_cpu_data
resctrl_rmid_realloc_threshold can be set by user-space. The maximum
value is specified by the architecture.

Currently max_threshold_occ_write() reads the maximum value from
boot_cpu_data.x86_cache_size, which is not portable to another
architecture.

Add resctrl_rmid_realloc_limit to describe the maximum size in bytes
that user-space can set the threshold to.
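
A minimal sketch of the idea, assuming the write handler simply rejects
values above the limit. The demo_* names and the -EINVAL return are
assumptions made for illustration, not taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical globals: the limit is set by arch code, in bytes. */
uint64_t demo_realloc_limit = 1024 * 1024;
uint64_t demo_realloc_threshold;

/* Accept a user-provided threshold only if it does not exceed the limit. */
int demo_set_threshold(uint64_t bytes)
{
	if (bytes > demo_realloc_limit)
		return -EINVAL;

	demo_realloc_threshold = bytes;
	return 0;
}
```

The check only compares two byte values, so it needs no knowledge of how a
given architecture discovers its cache size.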

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-21-james.morse@arm.com
2022-09-23 14:24:16 +02:00
James Morse
ae2328b529 x86/resctrl: Rename and change the units of resctrl_cqm_threshold
resctrl_cqm_threshold is stored in a hardware specific chunk size,
but exposed to user-space as bytes.

This means the filesystem parts of resctrl need to know how the hardware
counts, to convert the user provided byte value to chunks. The interface
between the architecture's resctrl code and the filesystem ought to
treat everything as bytes.

Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read()
still returns its value in chunks, so this needs converting to bytes.
As all the users have been touched, rename the variable to
resctrl_rmid_realloc_threshold, which describes what the value is for.

Neither r->num_rmid nor hw_res->mon_scale are guaranteed to be a power
of 2, so the existing code introduces a rounding error from resctrl's
theoretical fraction of the cache usage. This behaviour is kept as it
ensures the user visible value matches the value read from hardware
when the rmid will be reallocated.
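
The rounding mentioned above can be sketched as follows. This is an
illustration under the assumption that the threshold is the cache size
divided by the number of rmids, rounded down to a whole number of chunks;
the function names are hypothetical.

```c
#include <stdint.h>

/* Round a byte value down to a whole number of hardware chunks. */
uint64_t demo_round_to_chunks(uint64_t bytes, uint32_t mon_scale)
{
	return (bytes / mon_scale) * mon_scale;
}

/* Per-rmid share of the cache, expressed in bytes the hardware can report. */
uint64_t demo_default_threshold(uint64_t cache_bytes, uint32_t num_rmid,
				uint32_t mon_scale)
{
	return demo_round_to_chunks(cache_bytes / num_rmid, mon_scale);
}
```

With 1000000 bytes of cache, 3 rmids and a scale of 100, the exact share is
333333 bytes but the stored value is 333300, which is what a hardware read
scaled to bytes could actually match.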

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-20-james.morse@arm.com
2022-09-23 14:23:41 +02:00
James Morse
38f72f50d6 x86/resctrl: Move get_corrected_mbm_count() into resctrl_arch_rmid_read()
resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a counter. Currently the function returns
the MBM values in chunks directly from hardware. When reading a bandwidth
counter, get_corrected_mbm_count() must be used to correct the
value read.

get_corrected_mbm_count() is architecture specific, so this work should be
done in resctrl_arch_rmid_read().

Move the function calls. This allows the resctrl filesystem's chunks
value to be removed in favour of the architecture private version.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-19-james.morse@arm.com
2022-09-23 14:22:53 +02:00
James Morse
1d81d15db3 x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()
resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a counter. Currently the function returns
the MBM values in chunks directly from hardware. When reading a bandwidth
counter, mbm_overflow_count() must be used to correct for any possible
overflow.

mbm_overflow_count() is architecture specific, so its behaviour should
be part of resctrl_arch_rmid_read().

Move the mbm_overflow_count() calls into resctrl_arch_rmid_read().
This allows the resctrl filesystem's prev_msr to be removed in
favour of the architecture private version.
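
For readers unfamiliar with the overflow correction, one common way to
compute the delta of a counter narrower than 64 bits is sketched below. It
is an illustration, not a copy of mbm_overflow_count(); the name and the
width parameter are assumptions.

```c
#include <stdint.h>

/*
 * Delta between two reads of a 'width'-bit counter (1 <= width <= 64),
 * tolerating a single wrap-around between the reads.
 */
uint64_t demo_overflow_count(uint64_t prev, uint64_t cur, unsigned int width)
{
	unsigned int shift = 64 - width;
	uint64_t delta = (cur << shift) - (prev << shift);

	return delta >> shift;
}
```

Shifting both values to the top of a 64-bit word lets the unsigned
subtraction wrap at the counter's width, so no explicit "did it overflow"
test is needed.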

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-18-james.morse@arm.com
2022-09-23 14:22:20 +02:00
James Morse
8286618aca x86/resctrl: Pass the required parameters into resctrl_arch_rmid_read()
resctrl_arch_rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a hardware register. Currently the function
returns the MBM values in chunks directly from hardware.

To convert this to bytes, some correction and overflow calculations
are needed. These depend on the resource and domain structures.
Overflow detection requires the old chunks value. None of this
is available to resctrl_arch_rmid_read(). MPAM requires the
resource and domain structures to find the MMIO device that holds
the registers.

Pass the resource and domain to resctrl_arch_rmid_read(). This makes
rmid_dirty() too big. Instead merge it with its only caller, and the
name is kept as a local variable.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-17-james.morse@arm.com
2022-09-23 14:21:25 +02:00
James Morse
4d044c521a x86/resctrl: Abstract __rmid_read()
__rmid_read() selects the specified eventid and returns the counter
value from the MSR. The error handling is architecture specific, and
handled by the callers, rdtgroup_mondata_show() and __mon_event_count().

Error handling should be handled by architecture specific code, as
a different architecture may have different requirements. MPAM's
counters can report that they are 'not ready', requiring a second
read after a short delay. This should be hidden from resctrl.

Make __rmid_read() the architecture specific function for reading
a counter. Rename it resctrl_arch_rmid_read() and move the error
handling into it.

A read from a counter that hardware supports but resctrl does not
now returns -EINVAL instead of -EIO from the default case in
__mon_event_count(). It isn't possible for user-space to see this
change as resctrl doesn't expose counters it doesn't support.
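
A rough sketch of error handling living inside the arch read helper. The
bit positions and the mapping of each bit to an errno are assumptions made
for this example, and the raw value is passed in rather than read from an
MSR so the snippet stays self-contained.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed status bits in the raw counter value (illustrative only). */
#define DEMO_VAL_ERROR		(1ULL << 63)
#define DEMO_VAL_UNAVAIL	(1ULL << 62)

/* Decode a raw counter read: 0 and *val on success, -errno otherwise. */
int demo_arch_rmid_read(uint64_t raw, uint64_t *val)
{
	if (raw & DEMO_VAL_ERROR)
		return -EIO;
	if (raw & DEMO_VAL_UNAVAIL)
		return -EINVAL;

	*val = raw;
	return 0;
}
```

Callers then only see a value or an error code, which leaves room for
another architecture to retry internally before reporting anything.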

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-16-james.morse@arm.com
2022-09-23 14:17:20 +02:00
Kees Cook
712f210a45 x86/microcode/AMD: Track patch allocation size explicitly
In preparation for reducing the use of ksize(), record the actual
allocation size for later memcpy(). This avoids copying extra
(uninitialized!) bytes into the patch buffer when the requested
allocation size isn't exactly the size of a kmalloc bucket.
Additionally, fix potential future issues where runtime bounds checking
will notice that the buffer was allocated to a smaller value than
returned by ksize().
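
The pattern can be illustrated in userspace C, with malloc() standing in
for kmalloc(). The struct and helpers below are hypothetical and only show
the idea of recording the requested size next to the buffer.

```c
#include <stdlib.h>
#include <string.h>

struct demo_patch {
	void *data;
	size_t size;	/* requested size, recorded at allocation time */
};

/* Allocate a buffer of exactly 'size' bytes and remember that size. */
int demo_patch_init(struct demo_patch *p, const void *src, size_t size)
{
	p->data = malloc(size);
	if (!p->data)
		return -1;

	memcpy(p->data, src, size);
	p->size = size;
	return 0;
}

/* Copy out using the recorded size, never the allocator's bucket size. */
size_t demo_patch_copy(const struct demo_patch *p, void *dst, size_t dst_len)
{
	if (p->size > dst_len)
		return 0;

	memcpy(dst, p->data, p->size);
	return p->size;
}
```

Because the copy length comes from the recorded size, bytes beyond what was
requested are never read, whatever the allocator actually handed out.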

Fixes: 757885e94a22 ("x86, microcode, amd: Early microcode patch loading support for AMD")
Suggested-by: Daniel Micay <danielmicay@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/lkml/CA+DvKQ+bp7Y7gmaVhacjv9uF6Ar-o4tet872h4Q8RPYPJjcJQA@mail.gmail.com/
2022-09-23 13:46:26 +02:00
James Morse
fea62d370d x86/resctrl: Allow per-rmid arch private storage to be reset
To abstract the rmid counters into a helper that returns the number
of bytes counted, architecture specific per-rmid state is needed.

It needs to be possible to reset this hidden state, as the values
may outlive the life of an rmid, or the mount time of the filesystem.

mon_event_read() is called with first = true when an rmid is first
allocated in mkdir_mondata_subdir(). Add resctrl_arch_reset_rmid()
and call it from __mon_event_count()'s rr->first check.
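
A minimal sketch of such a reset, assuming the hidden state is a small
per-rmid struct kept in an array. The layout, array size and names are
illustrative and not taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical per-rmid private state kept by the arch code. */
struct demo_arch_mbm_state {
	uint64_t chunks;
	uint64_t prev_msr;
};

#define DEMO_NUM_RMID 4
struct demo_arch_mbm_state demo_state[DEMO_NUM_RMID];

/* Forget everything recorded for one rmid; out-of-range rmids are ignored. */
void demo_reset_rmid(unsigned int rmid)
{
	if (rmid < DEMO_NUM_RMID)
		memset(&demo_state[rmid], 0, sizeof(demo_state[rmid]));
}
```

Calling the reset on the first read of a freshly allocated rmid keeps
values from a previous user of that rmid from leaking into the new totals.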

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-15-james.morse@arm.com
2022-09-23 12:49:04 +02:00
James Morse
48dbe31a24 x86/resctrl: Add per-rmid arch private storage for overflow and chunks
A renamed __rmid_read() is intended as the function that an
architecture agnostic resctrl filesystem driver can use to
read a value in bytes from a counter. Currently the function returns
the MBM values in chunks directly from hardware. For bandwidth
counters the resctrl filesystem uses this to calculate the number of
bytes ever seen.

MPAM's scaling of counters can be changed at runtime, reducing the
resolution but increasing the range. When this is changed the prev_msr
values need to be converted by the architecture code.

Add an array for per-rmid private storage. The prev_msr and chunks
values will move here to allow resctrl_arch_rmid_read() to always
return the number of bytes read by this counter without assistance
from the filesystem. The values are moved in later patches when
the overflow and correction calls are moved into __rmid_read().

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-14-james.morse@arm.com
2022-09-22 17:46:09 +02:00
James Morse
30442571ec x86/resctrl: Calculate bandwidth from the previous __mon_event_count() chunks
mbm_bw_count() is only called by the mbm_handle_overflow() worker once a
second. It reads the hardware register, calculates the bandwidth and
updates m->prev_bw_msr which is used to hold the previous hardware register
value.

Operating directly on hardware register values makes it difficult to make
this code architecture independent, so that it can be moved to /fs/,
making the mba_sc feature something resctrl supports with no additional
support from the architecture.
Prior to calling mbm_bw_count(), mbm_update() reads from the same hardware
register using __mon_event_count().

Change mbm_bw_count() to use the current chunks value most recently saved
by __mon_event_count(). This removes an extra call to __rmid_read().
Instead of using m->prev_msr to calculate the number of chunks seen,
use the rr->val that was updated by __mon_event_count(). This removes an
extra call to mbm_overflow_count() and get_corrected_mbm_count().
Calculating bandwidth like this means mbm_bw_count() no longer operates
on hardware register values directly.
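
The change in approach can be sketched as below: bandwidth is derived from
the difference between two saved chunk totals rather than from a fresh
register read. The names, the one-second interval and the shift by 20 to
get MB are assumptions for illustration.

```c
#include <stdint.h>

/* Hypothetical software state kept between two overflow-worker runs. */
struct demo_mbm_state {
	uint64_t prev_bw_chunks;	/* chunk total at the previous run */
	uint64_t prev_bw;		/* bandwidth computed at the previous run */
};

/* Bandwidth in MB for one interval, from already-corrected chunk totals. */
uint64_t demo_bw_count(struct demo_mbm_state *m, uint64_t cur_chunks,
		       uint32_t mon_scale)
{
	uint64_t delta = cur_chunks - m->prev_bw_chunks;
	uint64_t bw = (delta * mon_scale) >> 20;

	m->prev_bw_chunks = cur_chunks;
	m->prev_bw = bw;
	return bw;
}
```

Nothing in this helper touches a register, which is the property that lets
code like it live outside the architecture directory.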

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-13-james.morse@arm.com
2022-09-22 17:44:57 +02:00
James Morse
ff6357bb50 x86/resctrl: Allow update_mba_bw() to update controls directly
update_mba_bw() calculates a new control value for the MBA resource
based on the user provided mbps_val and the current measured
bandwidth. Some control values need remapping by delay_bw_map().

It does this by calling wrmsrl() directly. This needs splitting
up to be done by an architecture specific helper, so that the
remainder can eventually be moved to /fs/.

Add resctrl_arch_update_one() to apply one configuration value
to the provided resource and domain. This avoids the staging
and cross-calling that is only needed with changes made by
user-space. delay_bw_map() moves to be part of the arch code,
to maintain the 'percentage control' view of MBA resources
in resctrl.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-12-james.morse@arm.com
2022-09-22 17:43:44 +02:00
James Morse
b58d4eb1f1 x86/resctrl: Remove architecture copy of mbps_val
The resctrl arch code provides a second configuration array mbps_val[]
for the MBA software controller.

Since resctrl switched over to allocating and freeing its own array
when needed, nothing uses the arch code version.

Remove it.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-11-james.morse@arm.com
2022-09-22 17:37:16 +02:00
James Morse
6ce1560d35 x86/resctrl: Switch over to the resctrl mbps_val list
Updates to resctrl's software controller follow the same path as
other configuration updates, but they don't modify the hardware state.
rdtgroup_schemata_write() uses parse_line() and the resource's
parse_ctrlval() function to stage the configuration.
resctrl_arch_update_domains() then updates the mbps_val[] array
instead, and resctrl_arch_update_domains() skips the rdt_ctrl_update()
call that would update hardware.

This complicates the interface between resctrl's filesystem parts
and architecture specific code. It should be possible for mba_sc
to be completely implemented by the filesystem parts of resctrl. This
would allow it to work on a second architecture with no additional code.
resctrl_arch_update_domains() using the mbps_val[] array prevents this.

Change parse_bw() to write the configuration value directly to the
mbps_val[] array in the domain structure. Change rdtgroup_schemata_write()
to skip the call to resctrl_arch_update_domains(), meaning all the
mba_sc specific code in resctrl_arch_update_domains() can be removed.
On the read-side, show_doms() and update_mba_bw() are changed to read
the mbps_val[] array from the domain structure. With this,
resctrl_arch_get_config() no longer needs to consider mba_sc resources.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-10-james.morse@arm.com
2022-09-22 17:34:08 +02:00
James Morse
781096d971 x86/resctrl: Create mba_sc configuration in the rdt_domain
To support resctrl's MBA software controller, the architecture must provide
a second configuration array to hold the mbps_val[] from user-space.

This complicates the interface between the architecture specific code and
the filesystem portions of resctrl that will move to /fs/, to allow
multiple architectures to support resctrl.

Make the filesystem parts of resctrl create an array for the mba_sc
values. The software controller can be changed to use this, allowing
the architecture code to only consider the values configured in hardware.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-9-james.morse@arm.com
2022-09-22 17:17:59 +02:00
James Morse
b045c21586 x86/resctrl: Abstract and use supports_mba_mbps()
To determine whether the mba_MBps option to resctrl should be supported,
resctrl tests the boot CPUs' x86_vendor.

This isn't portable, and needs abstracting behind a helper so this check
can be part of the filesystem code that moves to /fs/.

Re-use the tests set_mba_sc() does to determine if the mba_sc is supported
on this system. An 'alloc_capable' test is added so that support for the
controls isn't implied by the 'delay_linear' property, which is always
true for MPAM. Because mbm_update() only updates mba_sc if the mbm_local
counters are enabled, supports_mba_mbps() checks is_mbm_local_enabled()
(instead of using is_mbm_enabled(), which checks both).
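
Read as a predicate, the three conditions named above combine roughly as
follows. This is a sketch with the inputs passed as plain booleans; the
function name and parameters are hypothetical.

```c
#include <stdbool.h>

/*
 * mba_MBps is offered only when local MBM counters are enabled, the MBA
 * resource can allocate, and its delay values are linear.
 */
bool demo_supports_mba_mbps(bool mbm_local_enabled, bool alloc_capable,
			    bool delay_linear)
{
	return mbm_local_enabled && alloc_capable && delay_linear;
}
```

The alloc_capable term matters on a platform where delay_linear is always
true, since linearity alone would otherwise imply support.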

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-8-james.morse@arm.com
2022-09-22 16:10:11 +02:00
James Morse
1644dfe727 x86/resctrl: Remove set_mba_sc()s control array re-initialisation
set_mba_sc() enables the 'software controller' to regulate the bandwidth
based on the byte counters. This can be managed entirely in the parts
of resctrl that move to /fs/, without any extra support from the
architecture specific code. set_mba_sc() is called by rdt_enable_ctx()
during mount and unmount. It currently resets the arch code's ctrl_val[]
and mbps_val[] arrays.

The ctrl_val[] was already reset when the domain was created, and by
reset_all_ctrls() when the filesystem was last unmounted. Doing the work
in set_mba_sc() is not necessary as the values are already at their
defaults due to the creation of the domain, or were previously reset
during umount(), or are about to be reset during umount().

Add a reset of the mbps_val[] in reset_all_ctrls(), allowing the code in
set_mba_sc() that reaches in to the architecture specific structures to
be removed.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-7-james.morse@arm.com
2022-09-22 16:08:20 +02:00
James Morse
798fd4b9ac x86/resctrl: Add domain offline callback for resctrl work
Because domains are exposed to user-space via resctrl, the filesystem
must update its state when CPU hotplug callbacks are triggered.

Some of this work is common to any architecture that would support
resctrl, but the work is tied up with the architecture code to
free the memory.

Move the monitor subdir removal and the cancelling of the mbm/limbo
works into a new resctrl_offline_domain() call. These bits are not
specific to the architecture. Grouping them in one function allows
that code to be moved to /fs/ and re-used by another architecture.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-6-james.morse@arm.com
2022-09-22 15:42:40 +02:00
James Morse
7add3af417 x86/resctrl: Group struct rdt_hw_domain cleanup
domain_add_cpu() and domain_remove_cpu() need to kfree() the child
arrays that were allocated by domain_setup_ctrlval().

As this memory is moved around, and new arrays are created, adjusting
the error handling cleanup code becomes noisier.

To simplify this, move all the kfree() calls into a domain_free() helper.
This depends on struct rdt_hw_domain being kzalloc()d, allowing it to
unconditionally kfree() all the child arrays.
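
The same idea in userspace C, with calloc()/free() standing in for
kzalloc()/kfree(). The struct layout and names are illustrative only. The
sketch assumes zeroed memory reads back as NULL pointers, which holds on
the platforms Linux supports.

```c
#include <stdlib.h>

/* Hypothetical domain with child arrays that may or may not be allocated. */
struct demo_hw_domain {
	unsigned int *ctrl_val;
	unsigned int *mbps_val;
};

/* Zeroed allocation: every child pointer starts out NULL. */
struct demo_hw_domain *demo_domain_alloc(void)
{
	return calloc(1, sizeof(struct demo_hw_domain));
}

/* free(NULL) is a no-op, so no per-field "was it allocated" checks. */
void demo_domain_free(struct demo_hw_domain *d)
{
	if (!d)
		return;

	free(d->ctrl_val);
	free(d->mbps_val);
	free(d);
}
```

Every error path can call the one helper regardless of how far setup got,
which is what keeps the cleanup code quiet as more arrays are added.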

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-5-james.morse@arm.com
2022-09-22 15:27:15 +02:00
James Morse
3a7232cdf1 x86/resctrl: Add domain online callback for resctrl work
Because domains are exposed to user-space via resctrl, the filesystem
must update its state when CPU hotplug callbacks are triggered.

Some of this work is common to any architecture that would support
resctrl, but the work is tied up with the architecture code to
allocate the memory.

Move domain_setup_mon_state(), the monitor subdir creation call and the
mbm/limbo workers into a new resctrl_online_domain() call. These bits
are not specific to the architecture. Grouping them in one function
allows that code to be moved to /fs/ and re-used by another architecture.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-4-james.morse@arm.com
2022-09-22 15:13:27 +02:00
James Morse
bab6ee7368 x86/resctrl: Merge mon_capable and mon_enabled
mon_enabled and mon_capable are always set as a pair by
rdt_get_mon_l3_config().

There is no point having two values.

Merge them together.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-3-james.morse@arm.com
2022-09-22 14:43:08 +02:00
James Morse
4d269ed485 x86/resctrl: Kill off alloc_enabled
rdt_resources_all[] used to have extra entries for L2CODE/L2DATA.
These were hidden from resctrl by the alloc_enabled value.

Now that the L2/L2CODE/L2DATA resources have been merged together,
alloc_enabled doesn't mean anything, it always has the same value as
alloc_capable which indicates allocation is supported by this resource.

Remove alloc_enabled and its helpers.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Cristian Marussi <cristian.marussi@arm.com>
Link: https://lore.kernel.org/r/20220902154829.30399-2-james.morse@arm.com
2022-09-22 14:34:33 +02:00
Jiri Slaby
5258b80e60 x86/dumpstack: Don't mention RIP in "Code: "
Commit

  238c91115cd0 ("x86/dumpstack: Fix misleading instruction pointer error message")

changed the "Code:" line in bug reports when RIP is an invalid pointer.
In particular, the report currently says (for example):

  BUG: kernel NULL pointer dereference, address: 0000000000000000
  ...
  RIP: 0010:0x0
  Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.

That

  Unable to access opcode bytes at RIP 0xffffffffffffffd6.

is quite confusing as the RIP value is 0, not -42. That -42 comes from
"regs->ip - PROLOGUE_SIZE", because Code is dumped with some prologue
(and epilogue).

So do not mention "RIP" on this line in this context.
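
The arithmetic behind the confusing address can be checked with a few
lines of C. The constant below is inferred from the -42 quoted above and
the names are hypothetical.

```c
#include <stdint.h>

#define DEMO_PROLOGUE_SIZE 42	/* implied by the -42 in the message */

/* Start address of the opcode dump for a given instruction pointer. */
uint64_t demo_code_dump_start(uint64_t ip)
{
	return ip - DEMO_PROLOGUE_SIZE;
}
```

For ip = 0 the unsigned subtraction wraps to 0xffffffffffffffd6, the value
shown on the "Code:" line, which is an address near the top of the address
space and not the faulting RIP.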

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/b772c39f-c5ae-8f17-fe6e-6a2bc4d1f83b@kernel.org
2022-09-20 16:11:54 +02:00
Peter Zijlstra
8c03af3e09 x86,retpoline: Be sure to emit INT3 after JMP *%\reg
Both AMD and Intel recommend using INT3 after an indirect JMP. Make sure
to emit one when rewriting the retpoline JMP irrespective of compiler
SLS options or even CONFIG_SLS.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Link: https://lkml.kernel.org/r/Yxm+QkFPOhrVSH6q@hirez.programming.kicks-ass.net
2022-09-15 16:13:53 +02:00
Haitao Huang
81fa6fd13b x86/sgx: Handle VA page allocation failure for EAUG on PF.
VM_FAULT_NOPAGE is expected behaviour for -EBUSY failure path, when
augmenting a page, as this means that the reclaimer thread has been
triggered, and the intention is just to round-trip in ring-3, and
retry with a new page fault.

Fixes: 5a90d2c3f5ef ("x86/sgx: Support adding of pages to an initialized enclave")
Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Tested-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20220906000221.34286-3-jarkko@kernel.org
2022-09-08 13:28:31 -07:00
Jarkko Sakkinen
133e049a3f x86/sgx: Do not fail on incomplete sanitization on premature stop of ksgxd
Unsanitized pages trigger WARN_ON() unconditionally, which can panic the
whole computer, if /proc/sys/kernel/panic_on_warn is set.

In sgx_init(), if misc_register() fails or misc_register() succeeds but
neither sgx_drv_init() nor sgx_vepc_init() succeeds, then ksgxd will be
prematurely stopped. This may leave unsanitized pages, which will result in a
false warning.

Refine __sgx_sanitize_pages() to return:

1. Zero when the sanitization process is complete or ksgxd has been
   requested to stop.
2. The number of unsanitized pages otherwise.
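
The return convention listed above can be modelled with a toy loop. This
is a sketch only: pages are booleans, and the stop request is simulated by
an index, both of which are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * page_ok[i] says whether page i could be sanitized. 'stop_at' simulates a
 * stop request arriving before page 'stop_at' is processed; pass n for
 * "never". Returns 0 when done or stopped, else the pages left unsanitized.
 */
size_t demo_sanitize_pages(const bool *page_ok, size_t n, size_t stop_at)
{
	size_t left = 0;

	for (size_t i = 0; i < n; i++) {
		if (i == stop_at)
			return 0;	/* stop requested: not a failure */
		if (!page_ok[i])
			left++;
	}

	return left;
}
```

A caller can then warn only on a non-zero result, so an early stop no
longer looks like a sanitization failure.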

Fixes: 51ab30eb2ad4 ("x86/sgx: Replace section->init_laundry_list with sgx_dirty_page_list")
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-sgx/20220825051827.246698-1-jarkko@kernel.org/T/#u
Link: https://lkml.kernel.org/r/20220906000221.34286-2-jarkko@kernel.org
2022-09-08 13:27:44 -07:00
Sebastian Andrzej Siewior
8cbb2b50ee asm-generic: Conditionally enable do_softirq_own_stack() via Kconfig.
Remove the CONFIG_PREEMPT_RT symbol from the ifdef around
do_softirq_own_stack() and move it to Kconfig instead.

Enable softirq stacks based on SOFTIRQ_ON_OWN_STACK which depends on
HAVE_SOFTIRQ_ON_OWN_STACK and its default value is set to !PREEMPT_RT.
This ensures that softirq stacks are not used on PREEMPT_RT and avoids
a 'select' statement on an option which has a 'depends' statement.

Link: https://lore.kernel.org/YvN5E%2FPrHfUhggr7@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-09-05 17:20:55 +02:00
Ashok Raj
7fce8d6ecc x86/microcode: Print previous version of microcode after reload
Print both old and new versions of microcode after a reload is complete
because knowing the previous microcode version is sometimes important
from a debugging perspective.

  [ bp: Massage commit message. ]

Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20220829181030.722891-1-ashok.raj@intel.com
2022-09-02 08:01:58 +02:00
Daniel Sneddon
b8d1d16360 x86/apic: Don't disable x2APIC if locked
The APIC supports two modes, legacy APIC (or xAPIC), and Extended APIC
(or x2APIC).  X2APIC mode is mostly compatible with legacy APIC, but
it disables the memory-mapped APIC interface in favor of one that uses
MSRs.  The APIC mode is controlled by the EXT bit in the APIC MSR.

The MMIO/xAPIC interface has some problems, most notably the APIC LEAK
[1].  This bug allows an attacker to use the APIC MMIO interface to
extract data from the SGX enclave.

Introduce support for a new feature that will allow the BIOS to lock
the APIC in x2APIC mode.  If the APIC is locked in x2APIC mode and the
kernel tries to disable the APIC or revert to legacy APIC mode, a GP
fault will occur.

Introduce support for a new MSR (IA32_XAPIC_DISABLE_STATUS) and handle
the new locked mode when the LEGACY_XAPIC_DISABLED bit is set by
preventing the kernel from trying to disable the x2APIC.

On platforms with the IA32_XAPIC_DISABLE_STATUS MSR, if SGX or TDX are
enabled the LEGACY_XAPIC_DISABLED bit will be set by the BIOS.  If
legacy APIC is required, then SGX and TDX need to be disabled in the
BIOS.

[1]: https://aepicleak.com/aepicleak.pdf

Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Link: https://lkml.kernel.org/r/20220816231943.1152579-1-daniel.sneddon@linux.intel.com
2022-08-31 14:34:11 -07:00
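
The x2APIC lock handling described in the x86/apic commit above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: the `ModelCpu` class, the function names, and the bit position chosen for LEGACY_XAPIC_DISABLED are assumptions made for illustration. It shows the idea of checking the lock bit first, so that the request to leave x2APIC mode is skipped rather than triggering a GP fault.

```python
# User-space model only. Names and the bit position are illustrative assumptions.
from dataclasses import dataclass

# Assumed position of the lock bit inside IA32_XAPIC_DISABLE_STATUS.
LEGACY_XAPIC_DISABLED = 1 << 0


class GeneralProtectionFault(Exception):
    """Stands in for the GP fault raised when a locked x2APIC is disabled."""


@dataclass
class ModelCpu:
    has_status_msr: bool          # platform provides IA32_XAPIC_DISABLE_STATUS
    status_msr: int = 0           # value the BIOS left in that MSR
    x2apic_enabled: bool = True

    def hw_disable_x2apic(self) -> None:
        # Models the hardware: leaving x2APIC mode faults when it is locked.
        if self.has_status_msr and self.status_msr & LEGACY_XAPIC_DISABLED:
            raise GeneralProtectionFault("x2APIC is locked by the BIOS")
        self.x2apic_enabled = False


def x2apic_locked(cpu: ModelCpu) -> bool:
    """True if the BIOS locked the APIC in x2APIC mode."""
    return cpu.has_status_msr and bool(cpu.status_msr & LEGACY_XAPIC_DISABLED)


def disable_x2apic(cpu: ModelCpu) -> bool:
    """Return True if x2APIC was disabled, False if the request was skipped."""
    if x2apic_locked(cpu):
        return False              # never touch the hardware, so no fault occurs
    cpu.hw_disable_x2apic()
    return True
```

Checking the lock state before touching the hardware keeps the fault path unreachable in normal operation; the exception in the model exists only to make a missing check visible.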
Kohei Tarumizu
499c8bb469 x86/resctrl: Fix to restore to original value when re-enabling hardware prefetch register
The current pseudo_lock.c code overwrites the value of
MSR_MISC_FEATURE_CONTROL with 0 even if the original value is not 0.
Therefore, modify it to save and restore the original value.

Fixes: 018961ae5579 ("x86/intel_rdt: Pseudo-lock region creation/removal core")
Fixes: 443810fe6160 ("x86/intel_rdt: Create debugfs files for pseudo-locking testing")
Fixes: 8a2fc0e1bc0c ("x86/intel_rdt: More precise L2 hit/miss measurements")
Signed-off-by: Kohei Tarumizu <tarumizu.kohei@fujitsu.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/eb660f3c2010b79a792c573c02d01e8e841206ad.1661358182.git.reinette.chatre@intel.com
2022-08-31 11:42:17 -07:00
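
The save-and-restore pattern from the x86/resctrl commit above can be sketched as follows. This is a user-space model, not the pseudo_lock.c code: the MSR is simulated with a dictionary, and `rdmsr`, `wrmsr`, `prefetch_disabled`, and `measure` are hypothetical helpers. The MSR address is used only as a dictionary key and should be treated as an assumption.

```python
# User-space model only. The MSR file is a dict; helper names are hypothetical.
from contextlib import contextmanager

MSR_MISC_FEATURE_CONTROL = 0x1A4  # assumed address, used here only as a key


def rdmsr(msrs: dict, addr: int) -> int:
    """Read a simulated MSR (unset registers read as 0)."""
    return msrs.get(addr, 0)


def wrmsr(msrs: dict, addr: int, value: int) -> None:
    """Write a simulated MSR."""
    msrs[addr] = value


@contextmanager
def prefetch_disabled(msrs: dict, disable_bits: int):
    """Disable hardware prefetchers, then restore the ORIGINAL MSR value."""
    saved = rdmsr(msrs, MSR_MISC_FEATURE_CONTROL)      # save before modifying
    wrmsr(msrs, MSR_MISC_FEATURE_CONTROL, disable_bits)
    try:
        yield
    finally:
        # The fix: write back what was there, not a hard-coded 0.
        wrmsr(msrs, MSR_MISC_FEATURE_CONTROL, saved)


def measure(msrs: dict, disable_bits: int) -> int:
    """Return the MSR value seen while the prefetchers are disabled."""
    with prefetch_disabled(msrs, disable_bits):
        return rdmsr(msrs, MSR_MISC_FEATURE_CONTROL)
```

With a register that starts out non-zero, for example `{MSR_MISC_FEATURE_CONTROL: 0x5}`, the value after `measure()` returns is 0x5 again, whereas blindly writing 0 would have lost it.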
Peter Zijlstra
bc12b70f7d x86/earlyprintk: Clean up pciserial
While working on a GRUB patch to support PCI-serial, a number of
cleanups were suggested that apply to the code I took inspiration from.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>   # pci_ids.h
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lkml.kernel.org/r/YwdeyCEtW+wa+QhH@worktop.programming.kicks-ass.net
2022-08-29 12:19:25 +02:00
Jane Chu
f9781bb18e x86/mce: Retrieve poison range from hardware
When memory poison consumption machine checks fire, MCE notifier
handlers like nfit_handle_mce() record the impacted physical address
range which is reported by the hardware in the MCi_MISC MSR. The error
information includes data about blast radius, i.e. how many cachelines
the hardware determined to be impacted. A recent change

  7917f9cdb503 ("acpi/nfit: rely on mce->misc to determine poison granularity")

updated nfit_handle_mce() to stop hard coding the blast radius value of
1 cacheline, and instead rely on the blast radius reported in 'struct
mce' which can be up to 4K (64 cachelines).

It turns out that apei_mce_report_mem_error() had a similar problem in
that it hard coded a blast radius of 4K rather than reading the blast
radius from the error information. Fix apei_mce_report_mem_error() to
convey the proper poison granularity.

Signed-off-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/7ed50fd8-521e-cade-77b1-738b8bfb8502@oracle.com
Link: https://lore.kernel.org/r/20220826233851.1319100-1-jane.chu@oracle.com
2022-08-29 09:33:42 +02:00
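
The blast radius arithmetic from the x86/mce commit above can be worked through in a short sketch. It assumes that the low six bits of the MCi_MISC value hold the least significant valid address bit, and that a hardware-reported physical address mask can be turned into that value by finding its lowest set bit. The function names and the `lsb_from_addr_mask` helper are hypothetical and are not the kernel's implementation.

```python
# Illustrative arithmetic only. Field layout and helper names are assumptions.
PAGE_SHIFT = 12        # 4K pages
CACHELINE_BYTES = 64


def misc_addr_lsb(misc: int) -> int:
    """Least significant valid address bit (assumed: low six bits of MCi_MISC)."""
    return misc & 0x3F


def poison_range(addr: int, misc: int) -> tuple[int, int]:
    """Return (start, size in bytes) of the poisoned range around addr."""
    size = 1 << misc_addr_lsb(misc)
    start = addr & ~(size - 1)     # align down to the reported granularity
    return start, size


def lsb_from_addr_mask(mask: int) -> int:
    """Derive the granularity from an address mask, capped at one page."""
    if mask == 0:
        return PAGE_SHIFT          # no information: fall back to 4K
    lowest_set_bit = (mask & -mask).bit_length() - 1
    return min(lowest_set_bit, PAGE_SHIFT)
```

An LSB of 6 gives a 64-byte range (one cacheline), while an LSB of 12 gives 4096 bytes, which is the 64-cacheline upper bound mentioned in the commit message.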
Linus Torvalds
2f23a7c914 Misc fixes:
  - Fix PAT on Xen, which caused i915 driver failures
  - Fix compat INT 80 entry crash on Xen PV guests
  - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs
  - Fix RSB stuffing regressions
  - Fix ORC unwinding on ftrace trampolines
  - Add Intel Raptor Lake CPU model number
  - Fix (work around) a SEV-SNP bootloader bug providing bogus values in
    boot_params->cc_blob_address, by ignoring the value on !SEV-SNP bootups.
  - Fix SEV-SNP early boot failure
  - Fix the objtool list of noreturn functions and annotate snp_abort(),
    which bug confused objtool on gcc-12.
  - Fix the documentation for retbleed
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmMLgUERHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1hU1hAAj9bKO+D07gROpkRhXpLvbqm+mUqk6It8
 qEuyLkC/xvD9N//Pya0ZPPE+l23EHQxM/L0tXCYwsc8Zlx3fr678FQktHZuxULaP
 qlGmwub8PvsyMesXBGtgHJ4RT8L07FelBR+E/TpqCSbv/geM7mcc0ojyOKo+OmbD
 vbsUTDNqsMYndzePUBrv/vJRqVLGlbd1a22DKE/YiYBbZu5hughNUB0bHYhYdsml
 mutcRsVTKRbZOi4wiuY/pnveMf/z9wAwKOWd9EBaDWILygVSHkBJJQyhHigBh3F8
 H1hFn1q4ybULdrAUT4YND9Tjb6U8laU0fxg8Z43bay7bFXXElqoj3W2qka9NjAim
 QvWSFvTYQRTIWt6sCshVsUBWj6fBZdHxcJ8Fh+ucJJp+JWl9/aD0A41vK2Jx0LIt
 p8YzBRKEqd1Q/7QD855BArN7HNtQGgShNt4oPVZ7nPnjuQt0+lngu8NR+6X3RKpX
 r8rordyPvzgPL6W+1uV+8hnz1w+YU2xplAbK+zTijwgJVgyf8khSlZQNpFKULrou
 zjtzo/2nB+4C4bvfetNnaOGhi1/AdCHZHyZE35rotpd73SLHvdOrH0Ll9oCVfVrC
 UWbC1E67cHQw97Ni/4CrCsJRBULK01uyszVCxlEkSYInf0UsKlnmy+TqxizwsCVy
 reYQ0ePyWg0=
 =ZpCJ
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 fixes from Ingo Molnar:

 - Fix PAT on Xen, which caused i915 driver failures

 - Fix compat INT 80 entry crash on Xen PV guests

 - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs

 - Fix RSB stuffing regressions

 - Fix ORC unwinding on ftrace trampolines

 - Add Intel Raptor Lake CPU model number

 - Fix (work around) a SEV-SNP bootloader bug providing bogus values in
   boot_params->cc_blob_address, by ignoring the value on !SEV-SNP
   bootups.

 - Fix SEV-SNP early boot failure

 - Fix the objtool list of noreturn functions and annotate snp_abort(),
   which bug confused objtool on gcc-12.

 - Fix the documentation for retbleed

* tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/ABI: Mention retbleed vulnerability info file for sysfs
  x86/sev: Mark snp_abort() noreturn
  x86/sev: Don't use cc_platform_has() for early SEV-SNP calls
  x86/boot: Don't propagate uninitialized boot_params->cc_blob_address
  x86/cpu: Add new Raptor Lake CPU model number
  x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry
  x86/nospec: Fix i386 RSB stuffing
  x86/nospec: Unwreck the RSB stuffing
  x86/bugs: Add "unknown" reporting for MMIO Stale Data
  x86/entry: Fix entry_INT80_compat for Xen PV guests
  x86/PAT: Have pat_enabled() properly reflect state when running on Xen
2022-08-28 10:10:23 -07:00
Borislav Petkov
8c61eafd22 x86/microcode: Remove ->request_microcode_user()
181b6f40e9ea ("x86/microcode: Rip out the OLD_INTERFACE")

removed the old microcode loading interface but forgot to remove the
related ->request_microcode_user() functionality which it uses.

Rip it out now too.

Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220825075445.28171-1-bp@alien8.de
2022-08-26 11:56:08 +02:00