IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull hyperv fixes from Wei Liu:
- clarify a comment (Michael Kelley)
- change a pr_warn() to pr_info() (Olaf Hering)
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Clarify comment on x2apic mode
hv_balloon: disable warning when floor reached
Merge more updates from Andrew Morton:
"155 patches.
Subsystems affected by this patch series: mm (dax, debug, thp,
readahead, page-poison, util, memory-hotplug, zram, cleanups), misc,
core-kernel, get_maintainer, MAINTAINERS, lib, bitops, checkpatch,
binfmt, ramfs, autofs, nilfs, rapidio, panic, relay, kgdb, ubsan,
romfs, and fault-injection"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
lib, uaccess: add failure injection to usercopy functions
lib, include/linux: add usercopy failure capability
ROMFS: support inode blocks calculation
ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang
sched.h: drop in_ubsan field when UBSAN is in trap mode
scripts/gdb/tasks: add headers and improve spacing format
scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command
kernel/relay.c: drop unneeded initialization
panic: dump registers on panic_on_warn
rapidio: fix the missed put_device() for rio_mport_add_riodev
rapidio: fix error handling path
nilfs2: fix some kernel-doc warnings for nilfs2
autofs: harden ioctl table
ramfs: fix nommu mmap with gaps in the page cache
mm: remove the now-unnecessary mmget_still_valid() hack
mm/gup: take mmap_lock in get_dump_page()
binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
coredump: rework elf/elf_fdpic vma_dump_size() into common helper
coredump: refactor page range dumping into common helper
coredump: let dump_emit() bail out on short writes
...
Pull another Hyper-V update from Wei Liu:
"One patch from Michael to get VMbus interrupt from ACPI DSDT"
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Add parsing of VMbus interrupt in ACPI DSDT
On ARM64, Hyper-V now specifies the interrupt to be used by VMbus
in the ACPI DSDT. This information is not used on x86 because the
interrupt vector must be hardcoded. But update the generic
VMbus driver to do the parsing and pass the information to the
architecture specific code that sets up the Linux IRQ. Update
consumers of the interrupt to get it from an architecture specific
function.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1597434304-40631-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pull Hyper-V updates from Wei Liu:
- a series from Boqun Feng to support page size larger than 4K
- a few miscellaneous clean-ups
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv: clocksource: Add notrace attribute to read_hv_sched_clock_*() functions
x86/hyperv: Remove aliases with X64 in their name
PCI: hv: Document missing hv_pci_protocol_negotiation() parameter
scsi: storvsc: Support PAGE_SIZE larger than 4K
Driver: hv: util: Use VMBUS_RING_SIZE() for ringbuffer sizes
HID: hyperv: Use VMBUS_RING_SIZE() for ringbuffer sizes
Input: hyperv-keyboard: Use VMBUS_RING_SIZE() for ringbuffer sizes
hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
hv: hyperv.h: Introduce some hvpfn helper functions
Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv header
Drivers: hv: Use HV_HYP_PAGE in hv_synic_enable_regs()
Drivers: hv: vmbus: Introduce types of GPADL
Drivers: hv: vmbus: Move __vmbus_open()
Drivers: hv: vmbus: Always use HV_HYP_PAGE_SIZE for gpadl
drivers: hv: remove cast from hyperv_die_event
For a Hyper-V vmbus, the size of the ringbuffer has two requirements:
1) it has to take one PAGE_SIZE for the header
2) it has to be PAGE_SIZE aligned so that double-mapping can work
VMBUS_RING_SIZE() could calculate a correct ringbuffer size which
fulfills both requirements, therefore use it to make sure vmbus work
when PAGE_SIZE != HV_HYP_PAGE_SIZE (4K).
Note that since the argument for VMBUS_RING_SIZE() is the size of
payload (data part), so it will be minus 4k (the size of header when
PAGE_SIZE = 4k) than the original value to keep the ringbuffer total
size unchanged when PAGE_SIZE = 4k.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-11-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The
types of GPADL are purely the concept in the guest, IOW the hypervisor
treat them as the same.
The reason of introducing the types for GPADL is to support guests whose
page size is not 4k (the page size of Hyper-V hypervisor). In these
guests, both the headers and the data parts of the ringbuffers need to
be aligned to the PAGE_SIZE, because 1) some of the ringbuffers will be
mapped into userspace and 2) we use "double mapping" mechanism to
support fast wrap-around, and "double mapping" relies on ringbuffers
being page-aligned. However, the Hyper-V hypervisor only uses 4k
(HV_HYP_PAGE_SIZE) headers. Our solution to this is that we always make
the headers of ringbuffers take one guest page and when GPADL is
established between the guest and hypervisor, the only first 4k of
header is used. To handle this special case, we need the types of GPADL
to differ different guest memory usage for GPADL.
Type enum is introduced along with several general interfaces to
describe the differences between normal buffer GPADL and ringbuffer
GPADL.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-4-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Since the hypervisor always uses 4K as its page size, the size of PFNs
used for gpadl should be HV_HYP_PAGE_SIZE rather than PAGE_SIZE, so
adjust this accordingly as the preparation for supporting 16K/64K page
size guests. No functional changes on x86, since PAGE_SIZE is always 4k
(equals to HV_HYP_PAGE_SIZE).
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-2-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pull hyperv fixes from Wei Liu:
"Two patches from Michael and Dexuan to fix vmbus hanging issues"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload
Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()
After we Stop and later Start a VM that uses Accelerated Networking (NIC
SR-IOV), currently the VF vmbus device's Instance GUID can change, so after
vmbus_bus_resume() -> vmbus_request_offers(), vmbus_onoffer() can not find
the original vmbus channel of the VF, and hence we can't complete()
vmbus_connection.ready_for_resume_event in check_ready_for_resume_event(),
and the VM hangs in vmbus_bus_resume() forever.
Fix the issue by adding a timeout, so the resuming can still succeed, and
the saved state is not lost, and according to my test, the user can disable
Accelerated Networking and then will be able to SSH into the VM for
further recovery. Also prevent the VM in question from suspending again.
The host will be fixed so in future the Instance GUID will stay the same
across hibernation.
Fixes: d8bd2d442b ("Drivers: hv: vmbus: Resume after fixing up old primary channels")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200905025555.45614-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pull hyperv fixes from Wei Liu:
"Two patches from Vineeth to improve Hyper-V timesync facility"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv_utils: drain the timesync packets on onchannelcallback
hv_utils: return error if host timesysnc update is stale
Pull hyper-v fixes from Wei Liu:
- fix oops reporting on Hyper-V
- make objtool happy
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Make hv_setup_sched_clock inline
Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops
Pull hyperv updates from Wei Liu:
- A patch series from Andrea to improve vmbus code
- Two clean-up patches from Alexander and Randy
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyperv: hyperv.h: drop a duplicated word
tools: hv: change http to https in hv_kvp_daemon.c
Drivers: hv: vmbus: Remove the lock field from the vmbus_channel struct
scsi: storvsc: Introduce the per-storvsc_device spinlock
Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list updaters)
Drivers: hv: vmbus: Use channel_mutex in channel_vp_mapping_show()
Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list readers)
Drivers: hv: vmbus: Replace cpumask_test_cpu(, cpu_online_mask) with cpu_online()
Drivers: hv: vmbus: Remove the numa_node field from the vmbus_channel struct
Drivers: hv: vmbus: Remove the target_vp field from the vmbus_channel struct
When the kernel panics, one page of kmsg data may be collected and sent to
Hyper-V to aid in diagnosing the failure. The collected kmsg data typically
contains 50 to 100 lines, each of which has a log level prefix that isn't
very useful from a diagnostic standpoint. So tell kmsg_dump_get_buffer()
to not include the log level, enabling more information that *is* useful to
fit in the page.
Requesting in stable kernels, since many kernels running in production are
stable releases.
Cc: stable@vger.kernel.org
Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1593210497-114310-1-git-send-email-joseph.salisbury@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
The primitive currently uses channel->lock to protect the loop over
sc_list w.r.t. list additions/deletions but it doesn't protect the
target_cpu(s) loads w.r.t. a concurrent target_cpu_store(): while the
data races on target_cpu are hardly of any concern here, replace the
channel->lock critical section with a channel_mutex critical section
and extend the latter to include the loads of target_cpu; this same
pattern is also used in hv_synic_cleanup().
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200617164642.37393-6-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pull hyper-v updates from Wei Liu:
- a series from Andrea to support channel reassignment
- a series from Vitaly to clean up Vmbus message handling
- a series from Michael to clean up and augment hyperv-tlfs.h
- patches from Andy to clean up GUID usage in Hyper-V code
- a few other misc patches
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (29 commits)
Drivers: hv: vmbus: Resolve more races involving init_vp_index()
Drivers: hv: vmbus: Resolve race between init_vp_index() and CPU hotplug
vmbus: Replace zero-length array with flexible-array
Driver: hv: vmbus: drop a no long applicable comment
hyper-v: Switch to use UUID types directly
hyper-v: Replace open-coded variant of %*phN specifier
hyper-v: Supply GUID pointer to printf() like functions
hyper-v: Use UUID API for exporting the GUID (part 2)
asm-generic/hyperv: Add definitions for Get/SetVpRegister hypercalls
x86/hyperv: Split hyperv-tlfs.h into arch dependent and independent files
x86/hyperv: Remove HV_PROCESSOR_POWER_STATE #defines
KVM: x86: hyperv: Remove duplicate definitions of Reference TSC Page
drivers: hv: remove redundant assignment to pointer primary_channel
scsi: storvsc: Re-init stor_chns when a channel interrupt is re-assigned
Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type
Drivers: hv: vmbus: Synchronize init_vp_index() vs. CPU hotplug
Drivers: hv: vmbus: Remove the unused HV_LOCALIZED channel affinity logic
PCI: hv: Prepare hv_compose_msi_msg() for the VMBus-channel-interrupt-to-vCPU reassignment functionality
Drivers: hv: vmbus: Use a spin lock for synchronizing channel scheduling vs. channel removal
hv_utils: Always execute the fcopy and vss callbacks in a tasklet
...
init_vp_index() uses the (per-node) hv_numa_map[] masks to record the
CPUs allocated for channel interrupts at a given time, and distribute
the performance-critical channels across the available CPUs: in part.,
the mask of "candidate" target CPUs in a given NUMA node, for a newly
offered channel, is determined by XOR-ing the node's CPU mask and the
node's hv_numa_map. This operation/mechanism assumes that no offline
CPUs is set in the hv_numa_map mask, an assumption that does not hold
since such mask is currently not updated when a channel is removed or
assigned to a different CPU.
To address the issues described above, this adds hooks in the channel
removal path (hv_process_channel_removal()) and in target_cpu_store()
in order to clear, resp. to update, the hv_numa_map[] masks as needed.
This also adds a (missed) update of the masks in init_vp_index() (cf.,
e.g., the memory-allocation failure path in this function).
Like in the case of init_vp_index(), such hooks require to determine
if the given channel is performance critical. init_vp_index() does
this by parsing the channel's offer, it can not rely on the device
data structure (device_obj) to retrieve such information because the
device data structure has not been allocated/linked with the channel
by the time that init_vp_index() executes. A similar situation may
hold in hv_is_alloced_cpu() (defined below); the adopted approach is
to "cache" the device type of the channel, as computed by parsing the
channel's offer, in the channel structure itself.
Fixes: 7527810573 ("Drivers: hv: vmbus: Introduce the CHANNELMSG_MODIFYCHANNEL message type")
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200522171901.204127-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
vmbus_process_offer() does two things (among others):
1) first, it sets the channel's target CPU with cpu_hotplug_lock;
2) it then adds the channel to the channel list(s) with channel_mutex.
Since cpu_hotplug_lock is released before (2), the channel's target CPU
(as designated in (1)) can be deemed "free" by hv_synic_cleanup() and go
offline before the channel is added to the list.
Fix the race condition by "extending" the cpu_hotplug_lock critical
section to include (2) (and (1)), nesting the channel_mutex critical
section within the cpu_hotplug_lock critical section as done elsewhere
(hv_synic_cleanup(), target_cpu_store()) in the hyperv drivers code.
Move even further by extending the channel_mutex critical section to
include (1) (and (2)): this change allows to remove (the now redundant)
bind_channel_to_cpu_lock, and generally simplifies the handling of the
target CPUs (that are now always modified with channel_mutex held).
Fixes: d570aec0f2 ("Drivers: hv: vmbus: Synchronize init_vp_index() vs. CPU hotplug")
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200522171901.204127-2-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pull Hyper-V fixes from Wei Liu:
- Two patches from Dexuan fixing suspension bugs
- Three cleanup patches from Andy and Michael
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyper-v: Remove internal types from UAPI header
hyper-v: Use UUID API for exporting the GUID
x86/hyperv: Suspend/resume the VP assist page for hibernation
Drivers: hv: Move AEOI determination to architecture dependent code
Drivers: hv: vmbus: Fix Suspend-to-Idle for Generation-2 VM
VMBus version 4.1 and later support the CHANNELMSG_MODIFYCHANNEL(22)
message type which can be used to request Hyper-V to change the vCPU
that a channel will interrupt.
Introduce the CHANNELMSG_MODIFYCHANNEL message type, and define the
vmbus_send_modifychannel() function to send CHANNELMSG_MODIFYCHANNEL
requests to the host via a hypercall. The function is then used to
define a sysfs "store" operation, which allows to change the (v)CPU
the channel will interrupt by using the sysfs interface. The feature
can be used for load balancing or other purposes.
One interesting catch here is that Hyper-V can *not* currently ACK
CHANNELMSG_MODIFYCHANNEL messages with the promise that (after the ACK
is sent) the channel won't send any more interrupts to the "old" CPU.
The peculiarity of the CHANNELMSG_MODIFYCHANNEL messages is problematic
if the user want to take a CPU offline, since we don't want to take a
CPU offline (and, potentially, "lose" channel interrupts on such CPU)
if the host is still processing a CHANNELMSG_MODIFYCHANNEL message
associated to that CPU.
It is worth mentioning, however, that we have been unable to observe
the above mentioned "race": in all our tests, CHANNELMSG_MODIFYCHANNEL
requests appeared *as if* they were processed synchronously by the host.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-11-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
[ wei: fix conflict in channel_mgmt.c ]
Signed-off-by: Wei Liu <wei.liu@kernel.org>
init_vp_index() may access the cpu_online_mask mask via its calls of
cpumask_of_node(). Make sure to protect these accesses with a
cpus_read_lock() critical section.
Also, remove some (hardcoded) instances of CPU(0) from init_vp_index()
and replace them with VMBUS_CONNECT_CPU. The connect CPU can not go
offline, since Hyper-V does not provide a way to change it.
Finally, order the accesses of target_cpu from init_vp_index() and
hv_synic_cleanup() by relying on the channel_mutex; this is achieved
by moving the call of init_vp_index() into vmbus_process_offer().
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-10-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Since vmbus_chan_sched() dereferences the ring buffer pointer, we have
to make sure that the ring buffer data structures don't get freed while
such dereferencing is happening. Current code does this by sending an
IPI to the CPU that is allowed to access that ring buffer from interrupt
level, cf., vmbus_reset_channel_cb(). But with the new functionality
to allow changing the CPU that a channel will interrupt, we can't be
sure what CPU will be running the vmbus_chan_sched() function for a
particular channel, so the current IPI mechanism is infeasible.
Instead synchronize vmbus_chan_sched() and vmbus_reset_channel_cb() by
using the (newly introduced) per-channel spin lock "sched_lock". Move
the test for onchannel_callback being NULL before the "switch" control
statement in vmbus_chan_sched(), in order to not access the ring buffer
if the vmbus_reset_channel_cb() has been completed on the channel.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-7-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
The fcopy and vss callback functions could be running in a tasklet
at the same time they are called in hv_poll_channel(). Current code
serializes the invocations of these functions, and their accesses to
the channel ring buffer, by sending an IPI to the CPU that is allowed
to access the ring buffer, cf. hv_poll_channel(). This IPI mechanism
becomes infeasible if we allow changing the CPU that a channel will
interrupt. Instead modify the callback wrappers to always execute
the fcopy and vss callbacks in a tasklet, thus mirroring the solution
for the kvp callback functions adopted since commit a3ade8cc47
("HV: properly delay KVP packets when negotiation is in progress").
This will ensure that the callback function can't run on two CPUs at
the same time.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-6-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
When Hyper-V sends an interrupt to the guest, the guest has to figure
out which channel the interrupt is associated with. Hyper-V sets a bit
in a memory page that is shared with the guest, indicating a particular
"relid" that the interrupt is associated with. The current Linux code
then uses a set of per-CPU linked lists to map a given "relid" to a
pointer to a channel structure.
This design introduces a synchronization problem if the CPU that Hyper-V
will interrupt for a certain channel is changed. If the interrupt comes
on the "old CPU" and the channel was already moved to the per-CPU list
of the "new CPU", then the relid -> channel mapping will fail and the
interrupt is dropped. Similarly, if the interrupt comes on the new CPU
but the channel was not moved to the per-CPU list of the new CPU, then
the mapping will fail and the interrupt is dropped.
Relids are integers ranging from 0 to 2047. The mapping from relids to
channel structures can be done by setting up an array with 2048 entries,
each entry being a pointer to a channel structure (hence total size ~16K
bytes, which is not a problem). The array is global, so there are no
per-CPU linked lists to update. The array can be searched and updated
by loading from/storing to the array at the specified index. With no
per-CPU data structures, the above mentioned synchronization problem is
avoided and the relid2channel() function gets simpler.
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20200406001514.19876-4-parri.andrea@gmail.com
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>