xen: features and fixes for 4.13-rc1

-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJZXdVXAAoJELDendYovxMvVA0IAITmvH21SDTFiilKCOrxhCv0
 W3q3cOhZA4D+UtTqqIm/os/et08n72864s0mUFoY4PxETaUsb1jBav7z7Tod2c6B
 wh26UgIAhVO3ZewFSmpdPYoW0l3elC5JUMkVMfwSvHkROaU+YDEYUsLWGuIHZiiy
 V/kIskcKe08HLObU//BMjfFusmMHmQSg+TruyqRWodlWj4Rwm7q5fNZ/xaap1UCM
 O7GcHyq1k699w5YYTlIEkLWsX/pGM+auGSlT1xdjJEc2bpjH8ps0xbvAn6dsAKsE
 yoDyxQWtX2wBUXCqF0hXYAB2r1iFx2aFfLQjwc7p+V6BvxpWwSsC7Ur4QIDnm3E=
 =OLb7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Other than fixes and cleanups it contains:

   - support > 32 VCPUs at domain restore

   - support for new sysfs nodes related to Xen

   - some performance tuning for Linux running as Xen guest"

* tag 'for-linus-4.13-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/xen: allow userspace access during hypercalls
  x86: xen: remove unnecessary variable in xen_foreach_remap_area()
  xen: allocate page for shared info page from low memory
  xen: avoid deadlock in xenbus driver
  xen: add sysfs node for hypervisor build id
  xen: sync include/xen/interface/version.h
  xen: add sysfs node for guest type
  doc,xen: document hypervisor sysfs nodes for xen
  xen/vcpu: Handle xen_vcpu_setup() failure at boot
  xen/vcpu: Handle xen_vcpu_setup() failure in hotplug
  xen/pv: Fix OOPS on restore for a PV, !SMP domain
  xen/pvh*: Support > 32 VCPUs at domain restore
  xen/vcpu: Simplify xen_vcpu related code
  xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU
  xen: avoid type warning in xchg_xen_ulong
  xen: fix HYPERVISOR_dm_op() prototype
  xen: don't print error message in case of missing Xenstore entry
  arm/xen: Adjust one function call together with a variable assignment
  arm/xen: Delete an error message for a failed memory allocation in __set_phys_to_machine_multi()
  arm/xen: Improve a size determination in __set_phys_to_machine_multi()
Linus Torvalds 2017-07-06 19:11:24 -07:00
commit 6e6c5b9606
25 changed files with 542 additions and 167 deletions


@ -0,0 +1,119 @@
What: /sys/hypervisor/compilation/compile_date
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Contains the build time stamp of the Xen hypervisor
Might return "<denied>" in case of special security settings
in the hypervisor.
What: /sys/hypervisor/compilation/compiled_by
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Contains information who built the Xen hypervisor
Might return "<denied>" in case of special security settings
in the hypervisor.
What: /sys/hypervisor/compilation/compiler
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Compiler which was used to build the Xen hypervisor
Might return "<denied>" in case of special security settings
in the hypervisor.
What: /sys/hypervisor/properties/capabilities
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Space separated list of supported guest system types. Each type
is in the format: <class>-<major>.<minor>-<arch>
With:
<class>: "xen" -- x86: paravirtualized, arm: standard
"hvm" -- x86 only: fully virtualized
<major>: major guest interface version
<minor>: minor guest interface version
<arch>: architecture, e.g.:
"x86_32": 32 bit x86 guest without PAE
"x86_32p": 32 bit x86 guest with PAE
"x86_64": 64 bit x86 guest
"armv7l": 32 bit arm guest
"aarch64": 64 bit arm guest
What: /sys/hypervisor/properties/changeset
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Changeset of the hypervisor (git commit)
Might return "<denied>" in case of special security settings
in the hypervisor.
What: /sys/hypervisor/properties/features
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Features the Xen hypervisor supports for the guest as defined
in include/xen/interface/features.h printed as a hex value.
What: /sys/hypervisor/properties/pagesize
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Default page size of the hypervisor printed as a hex value.
Might return "0" in case of special security settings
in the hypervisor.
What: /sys/hypervisor/properties/virtual_start
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Virtual address of the hypervisor as a hex value.
What: /sys/hypervisor/type
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Type of hypervisor:
"xen": Xen hypervisor
What: /sys/hypervisor/uuid
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
UUID of the guest as known to the Xen hypervisor.
What: /sys/hypervisor/version/extra
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
The Xen version is in the format <major>.<minor><extra>
This is the <extra> part of it.
Might return "<denied>" in case of special security settings
in the hypervisor.
What: /sys/hypervisor/version/major
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
The Xen version is in the format <major>.<minor><extra>
This is the <major> part of it.
What: /sys/hypervisor/version/minor
Date: March 2009
KernelVersion: 2.6.30
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
The Xen version is in the format <major>.<minor><extra>
This is the <minor> part of it.
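
The capabilities format documented above, <class>-<major>.<minor>-<arch>, can be split with a single sscanf() call. The following is a minimal userspace sketch, not part of this commit; the names struct xen_cap and parse_xen_cap() are hypothetical and chosen only for illustration, and the buffer sizes are assumptions.

```c
#include <stdio.h>

/* Hypothetical container for one capabilities entry. */
struct xen_cap {
	char cls[16];        /* "xen" or "hvm" */
	unsigned int major;  /* major guest interface version */
	unsigned int minor;  /* minor guest interface version */
	char arch[32];       /* e.g. "x86_64", "aarch64" */
};

/*
 * Parse one space-delimited entry such as "hvm-3.0-x86_32p".
 * Returns 0 on success, -1 if the entry does not match the format.
 */
int parse_xen_cap(const char *entry, struct xen_cap *out)
{
	int n = sscanf(entry, "%15[^-]-%u.%u-%31s",
		       out->cls, &out->major, &out->minor, out->arch);

	return n == 4 ? 0 : -1;
}
```

A caller would typically read /sys/hypervisor/properties/capabilities, split the line on spaces, and pass each token to the parser.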


@ -1,8 +1,19 @@
What: /sys/hypervisor/guest_type
Date: June 2017
KernelVersion: 4.13
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Type of guest:
"Xen": standard guest type on arm
"HVM": fully virtualized guest (x86)
"PV": paravirtualized guest (x86)
"PVH": fully virtualized guest without legacy emulation (x86)
What: /sys/hypervisor/pmu/pmu_mode
Date: August 2015
KernelVersion: 4.3
Contact: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Description:
Description: If running under Xen:
Describes mode that Xen's performance-monitoring unit (PMU)
uses. Accepted values are
"off" -- PMU is disabled
@ -17,7 +28,16 @@ What: /sys/hypervisor/pmu/pmu_features
Date: August 2015
KernelVersion: 4.3
Contact: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Description:
Description: If running under Xen:
Describes Xen PMU features (as an integer). A set bit indicates
that the corresponding feature is enabled. See
include/xen/interface/xenpmu.h for available features
What: /sys/hypervisor/properties/buildid
Date: June 2017
KernelVersion: 4.13
Contact: xen-devel@lists.xenproject.org
Description: If running under Xen:
Build id of the hypervisor, needed for hypervisor live patching.
Might return "<denied>" in case of special security settings
in the hypervisor.
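
A consumer of the new guest_type node might map the four documented strings to an enum. The sketch below is a userspace illustration only; enum guest_kind and classify_guest_type() are hypothetical names, and the handling of a trailing newline assumes the node is read as a single line.

```c
#include <string.h>

/* Hypothetical classification of /sys/hypervisor/guest_type values. */
enum guest_kind {
	GUEST_UNKNOWN,
	GUEST_XEN_ARM,  /* "Xen": standard guest type on arm */
	GUEST_HVM,      /* "HVM": fully virtualized guest (x86) */
	GUEST_PV,       /* "PV":  paravirtualized guest (x86) */
	GUEST_PVH       /* "PVH": HVM without legacy emulation (x86) */
};

/* Compare the text up to an optional trailing newline against the table. */
enum guest_kind classify_guest_type(const char *text)
{
	static const struct {
		const char *name;
		enum guest_kind kind;
	} table[] = {
		{ "Xen", GUEST_XEN_ARM },
		{ "HVM", GUEST_HVM },
		{ "PV",  GUEST_PV },
		{ "PVH", GUEST_PVH },
	};
	size_t len = strcspn(text, "\n");
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strlen(table[i].name) == len &&
		    strncmp(text, table[i].name, len) == 0)
			return table[i].kind;
	}
	return GUEST_UNKNOWN;
}
```

Matching on the full length avoids treating "PVH" as "PV", which a plain prefix comparison would do.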


@ -14228,6 +14228,8 @@ F: drivers/xen/
F: arch/x86/include/asm/xen/
F: include/xen/
F: include/uapi/xen/
F: Documentation/ABI/stable/sysfs-hypervisor-xen
F: Documentation/ABI/testing/sysfs-hypervisor-xen
XEN HYPERVISOR ARM
M: Stefano Stabellini <sstabellini@kernel.org>


@ -16,7 +16,7 @@ static inline int xen_irqs_disabled(struct pt_regs *regs)
return raw_irqs_disabled_flags(regs->ARM_cpsr);
}
#define xchg_xen_ulong(ptr, val) atomic64_xchg(container_of((ptr), \
#define xchg_xen_ulong(ptr, val) atomic64_xchg(container_of((long long*)(ptr),\
atomic64_t, \
counter), (val))


@ -144,17 +144,17 @@ bool __set_phys_to_machine_multi(unsigned long pfn,
return true;
}
p2m_entry = kzalloc(sizeof(struct xen_p2m_entry), GFP_NOWAIT);
if (!p2m_entry) {
pr_warn("cannot allocate xen_p2m_entry\n");
p2m_entry = kzalloc(sizeof(*p2m_entry), GFP_NOWAIT);
if (!p2m_entry)
return false;
}
p2m_entry->pfn = pfn;
p2m_entry->nr_pages = nr_pages;
p2m_entry->mfn = mfn;
write_lock_irqsave(&p2m_lock, irqflags);
if ((rc = xen_add_phys_to_mach_entry(p2m_entry)) < 0) {
rc = xen_add_phys_to_mach_entry(p2m_entry);
if (rc < 0) {
write_unlock_irqrestore(&p2m_lock, irqflags);
return false;
}


@ -43,6 +43,7 @@
#include <asm/page.h>
#include <asm/pgtable.h>
#include <asm/smap.h>
#include <xen/interface/xen.h>
#include <xen/interface/sched.h>
@ -50,6 +51,8 @@
#include <xen/interface/platform.h>
#include <xen/interface/xen-mca.h>
struct xen_dm_op_buf;
/*
* The hypercall asms have to meet several constraints:
* - Work on 32- and 64-bit.
@ -214,10 +217,12 @@ privcmd_call(unsigned call,
__HYPERCALL_DECLS;
__HYPERCALL_5ARG(a1, a2, a3, a4, a5);
stac();
asm volatile("call *%[call]"
: __HYPERCALL_5PARAM
: [call] "a" (&hypercall_page[call])
: __HYPERCALL_CLOBBER5);
clac();
return (long)__res;
}
@ -474,9 +479,13 @@ HYPERVISOR_xenpmu_op(unsigned int op, void *arg)
static inline int
HYPERVISOR_dm_op(
domid_t dom, unsigned int nr_bufs, void *bufs)
domid_t dom, unsigned int nr_bufs, struct xen_dm_op_buf *bufs)
{
return _hypercall3(int, dm_op, dom, nr_bufs, bufs);
int ret;
stac();
ret = _hypercall3(int, dm_op, dom, nr_bufs, bufs);
clac();
return ret;
}
static inline void


@ -106,15 +106,83 @@ int xen_cpuhp_setup(int (*cpu_up_prepare_cb)(unsigned int),
return rc >= 0 ? 0 : rc;
}
static void clamp_max_cpus(void)
static int xen_vcpu_setup_restore(int cpu)
{
#ifdef CONFIG_SMP
if (setup_max_cpus > MAX_VIRT_CPUS)
setup_max_cpus = MAX_VIRT_CPUS;
#endif
int rc = 0;
/* Any per_cpu(xen_vcpu) is stale, so reset it */
xen_vcpu_info_reset(cpu);
/*
* For PVH and PVHVM, setup online VCPUs only. The rest will
* be handled by hotplug.
*/
if (xen_pv_domain() ||
(xen_hvm_domain() && cpu_online(cpu))) {
rc = xen_vcpu_setup(cpu);
}
void xen_vcpu_setup(int cpu)
return rc;
}
/*
* On restore, set the vcpu placement up again.
* If it fails, then we're in a bad state, since
* we can't back out from using it...
*/
void xen_vcpu_restore(void)
{
int cpu, rc;
for_each_possible_cpu(cpu) {
bool other_cpu = (cpu != smp_processor_id());
bool is_up;
if (xen_vcpu_nr(cpu) == XEN_VCPU_ID_INVALID)
continue;
/* Only Xen 4.5 and higher support this. */
is_up = HYPERVISOR_vcpu_op(VCPUOP_is_up,
xen_vcpu_nr(cpu), NULL) > 0;
if (other_cpu && is_up &&
HYPERVISOR_vcpu_op(VCPUOP_down, xen_vcpu_nr(cpu), NULL))
BUG();
if (xen_pv_domain() || xen_feature(XENFEAT_hvm_safe_pvclock))
xen_setup_runstate_info(cpu);
rc = xen_vcpu_setup_restore(cpu);
if (rc)
pr_emerg_once("vcpu restore failed for cpu=%d err=%d. "
"System will hang.\n", cpu, rc);
/*
* In case xen_vcpu_setup_restore() fails, do not bring up the
* VCPU. This helps us avoid the resulting OOPS when the VCPU
* accesses pvclock_vcpu_time via xen_vcpu (which is NULL.)
* Note that this does not improve the situation much -- now the
* VM hangs instead of OOPSing -- with the VCPUs that did not
* fail, spinning in stop_machine(), waiting for the failed
* VCPUs to come up.
*/
if (other_cpu && is_up && (rc == 0) &&
HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL))
BUG();
}
}
void xen_vcpu_info_reset(int cpu)
{
if (xen_vcpu_nr(cpu) < MAX_VIRT_CPUS) {
per_cpu(xen_vcpu, cpu) =
&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
} else {
/* Set to NULL so that if somebody accesses it we get an OOPS */
per_cpu(xen_vcpu, cpu) = NULL;
}
}
int xen_vcpu_setup(int cpu)
{
struct vcpu_register_vcpu_info info;
int err;
@ -123,11 +191,11 @@ void xen_vcpu_setup(int cpu)
BUG_ON(HYPERVISOR_shared_info == &xen_dummy_shared_info);
/*
* This path is called twice on PVHVM - first during bootup via
* smp_init -> xen_hvm_cpu_notify, and then if the VCPU is being
* hotplugged: cpu_up -> xen_hvm_cpu_notify.
* As we can only do the VCPUOP_register_vcpu_info once lets
* not over-write its result.
* This path is called on PVHVM at bootup (xen_hvm_smp_prepare_boot_cpu)
* and at restore (xen_vcpu_restore). Also called for hotplugged
* VCPUs (cpu_init -> xen_hvm_cpu_prepare_hvm).
* However, the hypercall can only be done once (see below) so if a VCPU
* is offlined and comes back online then let's not redo the hypercall.
*
* For PV it is called during restore (xen_vcpu_restore) and bootup
* (xen_setup_vcpu_info_placement). The hotplug mechanism does not
@ -135,44 +203,46 @@ void xen_vcpu_setup(int cpu)
*/
if (xen_hvm_domain()) {
if (per_cpu(xen_vcpu, cpu) == &per_cpu(xen_vcpu_info, cpu))
return;
}
if (xen_vcpu_nr(cpu) < MAX_VIRT_CPUS)
per_cpu(xen_vcpu, cpu) =
&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
if (!xen_have_vcpu_info_placement) {
if (cpu >= MAX_VIRT_CPUS)
clamp_max_cpus();
return;
return 0;
}
if (xen_have_vcpu_info_placement) {
vcpup = &per_cpu(xen_vcpu_info, cpu);
info.mfn = arbitrary_virt_to_mfn(vcpup);
info.offset = offset_in_page(vcpup);
/* Check to see if the hypervisor will put the vcpu_info
structure where we want it, which allows direct access via
a percpu-variable.
N.B. This hypercall can _only_ be called once per CPU. Subsequent
calls will error out with -EINVAL. This is due to the fact that
hypervisor has no unregister variant and this hypercall does not
allow to over-write info.mfn and info.offset.
/*
* Check to see if the hypervisor will put the vcpu_info
* structure where we want it, which allows direct access via
* a percpu-variable.
* N.B. This hypercall can _only_ be called once per CPU.
* Subsequent calls will error out with -EINVAL. This is due to
* the fact that hypervisor has no unregister variant and this
* hypercall does not allow to over-write info.mfn and
* info.offset.
*/
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, xen_vcpu_nr(cpu),
&info);
err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info,
xen_vcpu_nr(cpu), &info);
if (err) {
printk(KERN_DEBUG "register_vcpu_info failed: err=%d\n", err);
pr_warn_once("register_vcpu_info failed: cpu=%d err=%d\n",
cpu, err);
xen_have_vcpu_info_placement = 0;
clamp_max_cpus();
} else {
/* This cpu is using the registered vcpu info, even if
later ones fail to. */
/*
* This cpu is using the registered vcpu info, even if
* later ones fail to.
*/
per_cpu(xen_vcpu, cpu) = vcpup;
}
}
if (!xen_have_vcpu_info_placement)
xen_vcpu_info_reset(cpu);
return ((per_cpu(xen_vcpu, cpu) == NULL) ? -ENODEV : 0);
}
void xen_reboot(int reason)
{
struct sched_shutdown r = { .reason = reason };
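
The fallback behaviour of xen_vcpu_setup() and xen_vcpu_info_reset() in the hunks above can be modelled in plain userspace C. This is a simplified sketch, not kernel code: every model_* name is hypothetical, the VCPU id is assumed to equal the CPU number, the limit of 32 is taken from the "> 32 VCPUs" wording of the commit message, and the placement hypercall is reduced to a placement_ok flag.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_VIRT_CPUS 32  /* assumed size of the shared_info vcpu_info array */
#define MODEL_NR_CPUS       64  /* arbitrary bound for this model */

struct vcpu_info_model { int pending; };

static struct vcpu_info_model shared_info_slots[MODEL_MAX_VIRT_CPUS];
static struct vcpu_info_model percpu_info[MODEL_NR_CPUS];
static struct vcpu_info_model *vcpu_ptr[MODEL_NR_CPUS];

/* Point at the shared_info slot if one exists, otherwise NULL. */
void model_vcpu_info_reset(int cpu)
{
	vcpu_ptr[cpu] = (cpu < MODEL_MAX_VIRT_CPUS) ?
			&shared_info_slots[cpu] : NULL;
}

/*
 * placement_ok stands in for a successful VCPUOP_register_vcpu_info call.
 * Without placement, CPUs beyond the shared array have no vcpu_info and
 * the setup fails with -ENODEV.
 */
int model_vcpu_setup(int cpu, int placement_ok)
{
	if (placement_ok)
		vcpu_ptr[cpu] = &percpu_info[cpu];
	else
		model_vcpu_info_reset(cpu);

	return vcpu_ptr[cpu] == NULL ? -ENODEV : 0;
}
```

In this model, CPU 40 succeeds only when placement works, which mirrors why the callers in the diff now check the return value instead of clamping setup_max_cpus.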


@ -1,5 +1,6 @@
#include <linux/cpu.h>
#include <linux/kexec.h>
#include <linux/memblock.h>
#include <xen/features.h>
#include <xen/events.h>
@ -10,9 +11,11 @@
#include <asm/reboot.h>
#include <asm/setup.h>
#include <asm/hypervisor.h>
#include <asm/e820/api.h>
#include <asm/xen/cpuid.h>
#include <asm/xen/hypervisor.h>
#include <asm/xen/page.h>
#include "xen-ops.h"
#include "mmu.h"
@ -20,37 +23,34 @@
void __ref xen_hvm_init_shared_info(void)
{
int cpu;
struct xen_add_to_physmap xatp;
static struct shared_info *shared_info_page;
u64 pa;
if (HYPERVISOR_shared_info == &xen_dummy_shared_info) {
/*
* Search for a free page starting at 4kB physical address.
* Low memory is preferred to avoid an EPT large page split up
* by the mapping.
* Starting below X86_RESERVE_LOW (usually 64kB) is fine as
* the BIOS used for HVM guests is well behaved and won't
* clobber memory other than the first 4kB.
*/
for (pa = PAGE_SIZE;
!e820__mapped_all(pa, pa + PAGE_SIZE, E820_TYPE_RAM) ||
memblock_is_reserved(pa);
pa += PAGE_SIZE)
;
memblock_reserve(pa, PAGE_SIZE);
HYPERVISOR_shared_info = __va(pa);
}
if (!shared_info_page)
shared_info_page = (struct shared_info *)
extend_brk(PAGE_SIZE, PAGE_SIZE);
xatp.domid = DOMID_SELF;
xatp.idx = 0;
xatp.space = XENMAPSPACE_shared_info;
xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
xatp.gpfn = virt_to_pfn(HYPERVISOR_shared_info);
if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
BUG();
HYPERVISOR_shared_info = (struct shared_info *)shared_info_page;
/* xen_vcpu is a pointer to the vcpu_info struct in the shared_info
* page, we use it in the event channel upcall and in some pvclock
* related functions. We don't need the vcpu_info placement
* optimizations because we don't use any pv_mmu or pv_irq op on
* HVM.
* When xen_hvm_init_shared_info is run at boot time only vcpu 0 is
* online but xen_hvm_init_shared_info is run at resume time too and
* in that case multiple vcpus might be online. */
for_each_online_cpu(cpu) {
/* Leave it to be NULL. */
if (xen_vcpu_nr(cpu) >= MAX_VIRT_CPUS)
continue;
per_cpu(xen_vcpu, cpu) =
&HYPERVISOR_shared_info->vcpu_info[xen_vcpu_nr(cpu)];
}
}
static void __init init_hvm_pv_info(void)
@ -106,7 +106,7 @@ static void xen_hvm_crash_shutdown(struct pt_regs *regs)
static int xen_cpu_up_prepare_hvm(unsigned int cpu)
{
int rc;
int rc = 0;
/*
* This can happen if CPU was offlined earlier and
@ -121,7 +121,9 @@ static int xen_cpu_up_prepare_hvm(unsigned int cpu)
per_cpu(xen_vcpu_id, cpu) = cpu_acpi_id(cpu);
else
per_cpu(xen_vcpu_id, cpu) = cpu;
xen_vcpu_setup(cpu);
rc = xen_vcpu_setup(cpu);
if (rc)
return rc;
if (xen_have_vector_callback && xen_feature(XENFEAT_hvm_safe_pvclock))
xen_setup_timer(cpu);
@ -130,9 +132,8 @@ static int xen_cpu_up_prepare_hvm(unsigned int cpu)
if (rc) {
WARN(1, "xen_smp_intr_init() for CPU %d failed: %d\n",
cpu, rc);
return rc;
}
return 0;
return rc;
}
static int xen_cpu_dead_hvm(unsigned int cpu)
@ -154,6 +155,13 @@ static void __init xen_hvm_guest_init(void)
xen_hvm_init_shared_info();
/*
* xen_vcpu is a pointer to the vcpu_info struct in the shared_info
* page, we use it in the event channel upcall and in some pvclock
* related functions.
*/
xen_vcpu_info_reset(0);
xen_panic_handler_init();
if (xen_feature(XENFEAT_hvm_callback_vector))
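
The low-memory search in xen_hvm_init_shared_info() above walks page by page from 4kB until it finds a page that is RAM and not reserved. A userspace sketch of that scan follows; the predicate callbacks, the demo_* helpers, and the upper limit are assumptions added for illustration (the kernel loop has no explicit bound), and none of these names exist in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL

typedef bool (*page_pred)(uint64_t pa);

/*
 * Return the physical address of the first page at or above 4kB that is
 * RAM and not reserved, or 0 if none is found below 'limit'.
 */
uint64_t find_low_free_page(page_pred is_ram, page_pred is_reserved,
			    uint64_t limit)
{
	uint64_t pa;

	for (pa = MODEL_PAGE_SIZE; pa + MODEL_PAGE_SIZE <= limit;
	     pa += MODEL_PAGE_SIZE) {
		if (is_ram(pa) && !is_reserved(pa))
			return pa;
	}
	return 0;
}

/* Demo memory map: RAM starts at 8kB, and the page at 8kB is reserved. */
bool demo_is_ram(uint64_t pa)
{
	return pa >= 0x2000;
}

bool demo_is_reserved(uint64_t pa)
{
	return pa == 0x2000;
}
```

With the demo map, the page at 4kB is skipped because it is not RAM and the page at 8kB because it is reserved, so the scan settles on the next page.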


@ -89,8 +89,6 @@
void *xen_initial_gdt;
RESERVE_BRK(shared_info_page_brk, PAGE_SIZE);
static int xen_cpu_up_prepare_pv(unsigned int cpu);
static int xen_cpu_dead_pv(unsigned int cpu);
@ -107,35 +105,6 @@ struct tls_descs {
*/
static DEFINE_PER_CPU(struct tls_descs, shadow_tls_desc);
/*
* On restore, set the vcpu placement up again.
* If it fails, then we're in a bad state, since
* we can't back out from using it...
*/
void xen_vcpu_restore(void)
{
int cpu;
for_each_possible_cpu(cpu) {
bool other_cpu = (cpu != smp_processor_id());
bool is_up = HYPERVISOR_vcpu_op(VCPUOP_is_up, xen_vcpu_nr(cpu),
NULL);
if (other_cpu && is_up &&
HYPERVISOR_vcpu_op(VCPUOP_down, xen_vcpu_nr(cpu), NULL))
BUG();
xen_setup_runstate_info(cpu);
if (xen_have_vcpu_info_placement)
xen_vcpu_setup(cpu);
if (other_cpu && is_up &&
HYPERVISOR_vcpu_op(VCPUOP_up, xen_vcpu_nr(cpu), NULL))
BUG();
}
}
static void __init xen_banner(void)
{
unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
@ -960,30 +929,43 @@ void xen_setup_shared_info(void)
HYPERVISOR_shared_info =
(struct shared_info *)fix_to_virt(FIX_PARAVIRT_BOOTMAP);
#ifndef CONFIG_SMP
/* In UP this is as good a place as any to set up shared info */
xen_setup_vcpu_info_placement();
#endif
xen_setup_mfn_list_list();
if (system_state == SYSTEM_BOOTING) {
#ifndef CONFIG_SMP
/*
* Now that shared info is set up we can start using routines that
* point to pvclock area.
* In UP this is as good a place as any to set up shared info.
* Limit this to boot only, at restore vcpu setup is done via
* xen_vcpu_restore().
*/
xen_setup_vcpu_info_placement();
#endif
/*
* Now that shared info is set up we can start using routines
* that point to pvclock area.
*/
if (system_state == SYSTEM_BOOTING)
xen_init_time_ops();
}
}
/* This is called once we have the cpu_possible_mask */
void xen_setup_vcpu_info_placement(void)
void __ref xen_setup_vcpu_info_placement(void)
{
int cpu;
for_each_possible_cpu(cpu) {
/* Set up direct vCPU id mapping for PV guests. */
per_cpu(xen_vcpu_id, cpu) = cpu;
xen_vcpu_setup(cpu);
/*
* xen_vcpu_setup(cpu) can fail -- in which case it
* falls back to the shared_info version for cpus
* where xen_vcpu_nr(cpu) < MAX_VIRT_CPUS.
*
* xen_cpu_up_prepare_pv() handles the rest by failing
* them in hotplug.
*/
(void) xen_vcpu_setup(cpu);
}
/*
@ -1332,9 +1314,17 @@ asmlinkage __visible void __init xen_start_kernel(void)
*/
acpi_numa = -1;
#endif
/* Don't do the full vcpu_info placement stuff until we have a
possible map and a non-dummy shared_info. */
per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
/* Let's presume PV guests always boot on vCPU with id 0. */
per_cpu(xen_vcpu_id, 0) = 0;
/*
* Setup xen_vcpu early because start_kernel needs it for
* local_irq_disable(), irqs_disabled().
*
* Don't do the full vcpu_info placement stuff until we have
* the cpu_possible_mask and a non-dummy shared_info.
*/
xen_vcpu_info_reset(0);
WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_pv, xen_cpu_dead_pv));
@ -1431,9 +1421,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
#endif
xen_raw_console_write("about to get started...\n");
/* Let's presume PV guests always boot on vCPU with id 0. */
per_cpu(xen_vcpu_id, 0) = 0;
/* We need this for printk timestamps */
xen_setup_runstate_info(0);
xen_efi_init();
@ -1451,6 +1439,9 @@ static int xen_cpu_up_prepare_pv(unsigned int cpu)
{
int rc;
if (per_cpu(xen_vcpu, cpu) == NULL)
return -ENODEV;
xen_setup_timer(cpu);
rc = xen_smp_intr_init(cpu);


@ -499,7 +499,7 @@ static unsigned long __init xen_foreach_remap_area(unsigned long nr_pages,
void __init xen_remap_memory(void)
{
unsigned long buf = (unsigned long)&xen_remap_buf;
unsigned long mfn_save, mfn, pfn;
unsigned long mfn_save, pfn;
unsigned long remapped = 0;
unsigned int i;
unsigned long pfn_s = ~0UL;
@ -515,8 +515,7 @@ void __init xen_remap_memory(void)
pfn = xen_remap_buf.target_pfn;
for (i = 0; i < xen_remap_buf.size; i++) {
mfn = xen_remap_buf.mfns[i];
xen_update_mem_tables(pfn, mfn);
xen_update_mem_tables(pfn, xen_remap_buf.mfns[i]);
remapped++;
pfn++;
}
@ -530,8 +529,6 @@ void __init xen_remap_memory(void)
pfn_s = xen_remap_buf.target_pfn;
len = xen_remap_buf.size;
}
mfn = xen_remap_mfn;
xen_remap_mfn = xen_remap_buf.next_area_mfn;
}


@ -1,4 +1,5 @@
#include <linux/smp.h>
#include <linux/cpu.h>
#include <linux/slab.h>
#include <linux/cpumask.h>
#include <linux/percpu.h>
@ -114,6 +115,36 @@ int xen_smp_intr_init(unsigned int cpu)
return rc;
}
void __init xen_smp_cpus_done(unsigned int max_cpus)
{
int cpu, rc, count = 0;
if (xen_hvm_domain())
native_smp_cpus_done(max_cpus);
if (xen_have_vcpu_info_placement)
return;
for_each_online_cpu(cpu) {
if (xen_vcpu_nr(cpu) < MAX_VIRT_CPUS)
continue;
rc = cpu_down(cpu);
if (rc == 0) {
/*
* Reset vcpu_info so this cpu cannot be onlined again.
*/
xen_vcpu_info_reset(cpu);
count++;
} else {
pr_warn("%s: failed to bring CPU %d down, error %d\n",
__func__, cpu, rc);
}
}
WARN(count, "%s: brought %d CPUs offline\n", __func__, count);
}
void xen_smp_send_reschedule(int cpu)
{
xen_send_IPI_one(cpu, XEN_RESCHEDULE_VECTOR);


@ -14,6 +14,8 @@ extern void xen_smp_intr_free(unsigned int cpu);
int xen_smp_intr_init_pv(unsigned int cpu);
void xen_smp_intr_free_pv(unsigned int cpu);
void xen_smp_cpus_done(unsigned int max_cpus);
void xen_smp_send_reschedule(int cpu);
void xen_smp_send_call_function_ipi(const struct cpumask *mask);
void xen_smp_send_call_function_single_ipi(int cpu);


@ -12,7 +12,8 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
native_smp_prepare_boot_cpu();
/*
* Setup vcpu_info for boot CPU.
* Setup vcpu_info for boot CPU. Secondary CPUs get their vcpu_info
* in xen_cpu_up_prepare_hvm().
*/
xen_vcpu_setup(0);
@ -27,10 +28,20 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
static void __init xen_hvm_smp_prepare_cpus(unsigned int max_cpus)
{
int cpu;
native_smp_prepare_cpus(max_cpus);
WARN_ON(xen_smp_intr_init(0));
xen_init_lock_cpu(0);
for_each_possible_cpu(cpu) {
if (cpu == 0)
continue;
/* Set default vcpu_id to make sure that we don't use cpu-0's */
per_cpu(xen_vcpu_id, cpu) = XEN_VCPU_ID_INVALID;
}
}
#ifdef CONFIG_HOTPLUG_CPU
@ -60,4 +71,5 @@ void __init xen_hvm_smp_init(void)
smp_ops.send_call_func_ipi = xen_smp_send_call_function_ipi;
smp_ops.send_call_func_single_ipi = xen_smp_send_call_function_single_ipi;
smp_ops.smp_prepare_boot_cpu = xen_hvm_smp_prepare_boot_cpu;
smp_ops.smp_cpus_done = xen_smp_cpus_done;
}


@ -371,10 +371,6 @@ static int xen_pv_cpu_up(unsigned int cpu, struct task_struct *idle)
return 0;
}
static void xen_pv_smp_cpus_done(unsigned int max_cpus)
{
}
#ifdef CONFIG_HOTPLUG_CPU
static int xen_pv_cpu_disable(void)
{
@ -469,7 +465,7 @@ static irqreturn_t xen_irq_work_interrupt(int irq, void *dev_id)
static const struct smp_ops xen_smp_ops __initconst = {
.smp_prepare_boot_cpu = xen_pv_smp_prepare_boot_cpu,
.smp_prepare_cpus = xen_pv_smp_prepare_cpus,
.smp_cpus_done = xen_pv_smp_cpus_done,
.smp_cpus_done = xen_smp_cpus_done,
.cpu_up = xen_pv_cpu_up,
.cpu_die = xen_pv_cpu_die,


@ -8,15 +8,10 @@
void xen_hvm_post_suspend(int suspend_cancelled)
{
int cpu;
if (!suspend_cancelled)
if (!suspend_cancelled) {
xen_hvm_init_shared_info();
xen_vcpu_restore();
}
xen_callback_vector();
xen_unplug_emulated_devices();
if (xen_feature(XENFEAT_hvm_safe_pvclock)) {
for_each_online_cpu(cpu) {
xen_setup_runstate_info(cpu);
}
}
}


@ -78,7 +78,8 @@ bool xen_vcpu_stolen(int vcpu);
extern int xen_have_vcpu_info_placement;
void xen_vcpu_setup(int cpu);
int xen_vcpu_setup(int cpu);
void xen_vcpu_info_reset(int cpu);
void xen_setup_vcpu_info_placement(void);
#ifdef CONFIG_SMP


@ -1303,10 +1303,9 @@ void rebind_evtchn_irq(int evtchn, int irq)
}
/* Rebind an evtchn so that it gets delivered to a specific cpu */
static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu)
{
struct evtchn_bind_vcpu bind_vcpu;
int evtchn = evtchn_from_irq(irq);
int masked;
if (!VALID_EVTCHN(evtchn))
@ -1338,12 +1337,13 @@ static int rebind_irq_to_cpu(unsigned irq, unsigned tcpu)
return 0;
}
EXPORT_SYMBOL_GPL(xen_rebind_evtchn_to_cpu);
static int set_affinity_irq(struct irq_data *data, const struct cpumask *dest,
bool force)
{
unsigned tcpu = cpumask_first_and(dest, cpu_online_mask);
int ret = rebind_irq_to_cpu(data->irq, tcpu);
int ret = xen_rebind_evtchn_to_cpu(evtchn_from_irq(data->irq), tcpu);
if (!ret)
irq_data_update_effective_affinity(data, cpumask_of(tcpu));


@ -421,6 +421,36 @@ static void evtchn_unbind_from_user(struct per_user_data *u,
del_evtchn(u, evtchn);
}
static DEFINE_PER_CPU(int, bind_last_selected_cpu);
static void evtchn_bind_interdom_next_vcpu(int evtchn)
{
unsigned int selected_cpu, irq;
struct irq_desc *desc;
unsigned long flags;
irq = irq_from_evtchn(evtchn);
desc = irq_to_desc(irq);
if (!desc)
return;
raw_spin_lock_irqsave(&desc->lock, flags);
selected_cpu = this_cpu_read(bind_last_selected_cpu);
selected_cpu = cpumask_next_and(selected_cpu,
desc->irq_common_data.affinity, cpu_online_mask);
if (unlikely(selected_cpu >= nr_cpu_ids))
selected_cpu = cpumask_first_and(desc->irq_common_data.affinity,
cpu_online_mask);
this_cpu_write(bind_last_selected_cpu, selected_cpu);
/* unmask expects irqs to be disabled */
xen_rebind_evtchn_to_cpu(evtchn, selected_cpu);
raw_spin_unlock_irqrestore(&desc->lock, flags);
}
static long evtchn_ioctl(struct file *file,
unsigned int cmd, unsigned long arg)
{
@ -478,8 +508,10 @@ static long evtchn_ioctl(struct file *file,
break;
rc = evtchn_bind_to_user(u, bind_interdomain.local_port);
if (rc == 0)
if (rc == 0) {
rc = bind_interdomain.local_port;
evtchn_bind_interdom_next_vcpu(rc);
}
break;
}
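
The selection logic in evtchn_bind_interdom_next_vcpu() above is a round-robin over the CPUs that are both in the IRQ affinity mask and online. The sketch below models it with 64-bit masks in userspace; next_cpu_round_robin() is a hypothetical name, and the model assumes at most 64 CPUs, whereas the kernel uses cpumask_next_and() and cpumask_first_and() on real cpumasks.

```c
#include <stdint.h>

/*
 * Return the next CPU after 'last' that is set in both masks, wrapping
 * to the first usable CPU. Returns nr_cpus if no CPU is usable.
 * Assumes nr_cpus <= 64 so that each CPU maps to one bit.
 */
unsigned int next_cpu_round_robin(unsigned int last, uint64_t affinity,
				  uint64_t online, unsigned int nr_cpus)
{
	uint64_t usable = affinity & online;
	unsigned int cpu;

	/* Counterpart of cpumask_next_and(): search strictly after 'last'. */
	for (cpu = last + 1; cpu < nr_cpus; cpu++) {
		if (usable & (1ULL << cpu))
			return cpu;
	}

	/* Counterpart of cpumask_first_and(): wrap around to the start. */
	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (usable & (1ULL << cpu))
			return cpu;
	}

	return nr_cpus;
}
```

Storing the result as the new "last" value, as the diff does through bind_last_selected_cpu, spreads successive bindings across the usable CPUs.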


@ -278,8 +278,16 @@ static void sysrq_handler(struct xenbus_watch *watch, const char *path,
err = xenbus_transaction_start(&xbt);
if (err)
return;
if (xenbus_scanf(xbt, "control", "sysrq", "%c", &sysrq_key) < 0) {
pr_err("Unable to read sysrq code in control/sysrq\n");
err = xenbus_scanf(xbt, "control", "sysrq", "%c", &sysrq_key);
if (err < 0) {
/*
* The Xenstore watch fires directly after registering it and
* after a suspend/resume cycle. So ENOENT is no error but
* might happen in those cases.
*/
if (err != -ENOENT)
pr_err("Error %d reading sysrq code in control/sysrq\n",
err);
xenbus_transaction_end(xbt, 1);
return;
}


@ -50,6 +50,35 @@ static int __init xen_sysfs_type_init(void)
return sysfs_create_file(hypervisor_kobj, &type_attr.attr);
}
static ssize_t guest_type_show(struct hyp_sysfs_attr *attr, char *buffer)
{
const char *type;
switch (xen_domain_type) {
case XEN_NATIVE:
/* ARM only. */
type = "Xen";
break;
case XEN_PV_DOMAIN:
type = "PV";
break;
case XEN_HVM_DOMAIN:
type = xen_pvh_domain() ? "PVH" : "HVM";
break;
default:
return -EINVAL;
}
return sprintf(buffer, "%s\n", type);
}
HYPERVISOR_ATTR_RO(guest_type);
static int __init xen_sysfs_guest_type_init(void)
{
return sysfs_create_file(hypervisor_kobj, &guest_type_attr.attr);
}
/* xen version attributes */
static ssize_t major_show(struct hyp_sysfs_attr *attr, char *buffer)
{
@ -327,12 +356,40 @@ static ssize_t features_show(struct hyp_sysfs_attr *attr, char *buffer)
HYPERVISOR_ATTR_RO(features);
static ssize_t buildid_show(struct hyp_sysfs_attr *attr, char *buffer)
{
ssize_t ret;
struct xen_build_id *buildid;
ret = HYPERVISOR_xen_version(XENVER_build_id, NULL);
if (ret < 0) {
if (ret == -EPERM)
ret = sprintf(buffer, "<denied>");
return ret;
}
buildid = kmalloc(sizeof(*buildid) + ret, GFP_KERNEL);
if (!buildid)
return -ENOMEM;
buildid->len = ret;
ret = HYPERVISOR_xen_version(XENVER_build_id, buildid);
if (ret > 0)
ret = sprintf(buffer, "%s", buildid->buf);
kfree(buildid);
return ret;
}
HYPERVISOR_ATTR_RO(buildid);
static struct attribute *xen_properties_attrs[] = {
&capabilities_attr.attr,
&changeset_attr.attr,
&virtual_start_attr.attr,
&pagesize_attr.attr,
&features_attr.attr,
&buildid_attr.attr,
NULL
};
@ -471,6 +528,9 @@ static int __init hyper_sysfs_init(void)
ret = xen_sysfs_type_init();
if (ret)
goto out;
ret = xen_sysfs_guest_type_init();
if (ret)
goto guest_type_out;
ret = xen_sysfs_version_init();
if (ret)
goto version_out;
@ -502,6 +562,8 @@ uuid_out:
comp_out:
sysfs_remove_group(hypervisor_kobj, &version_group);
version_out:
sysfs_remove_file(hypervisor_kobj, &guest_type_attr.attr);
guest_type_out:
sysfs_remove_file(hypervisor_kobj, &type_attr.attr);
out:
return ret;
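
buildid_show() above uses a two-call pattern: the first XENVER_build_id call with a NULL argument returns the required size, and the second fills a buffer of that size. A userspace sketch of the pattern follows. fake_build_id_query() is a hypothetical stand-in for the hypercall, the four-byte id is invented test data, and struct build_id only mirrors the layout of struct xen_build_id from the diff.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct build_id {
	uint32_t len;         /* IN: size of buf[] */
	unsigned char buf[];  /* OUT: build id bytes */
};

static const unsigned char fake_id[] = { 0xde, 0xad, 0xbe, 0xef };

/* Hypothetical stand-in for HYPERVISOR_xen_version(XENVER_build_id, arg). */
int fake_build_id_query(struct build_id *arg)
{
	if (!arg)
		return (int)sizeof(fake_id);  /* size query */
	if (arg->len < sizeof(fake_id))
		return -1;                    /* buffer too small */
	memcpy(arg->buf, fake_id, sizeof(fake_id));
	return (int)sizeof(fake_id);
}

/* Size query first, then allocate and fetch. The caller frees the result. */
struct build_id *fetch_build_id(void)
{
	struct build_id *id;
	int n = fake_build_id_query(NULL);

	if (n < 0)
		return NULL;

	id = malloc(sizeof(*id) + (size_t)n);
	if (!id)
		return NULL;

	id->len = (uint32_t)n;
	if (fake_build_id_query(id) < 0) {
		free(id);
		return NULL;
	}
	return id;
}
```

Allocating sizeof(*id) plus the reported length matches the kmalloc() call in the diff and keeps the flexible array member sized to exactly what was requested.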


@ -299,17 +299,7 @@ static int process_msg(void)
mutex_lock(&xb_write_mutex);
list_for_each_entry(req, &xs_reply_list, list) {
if (req->msg.req_id == state.msg.req_id) {
if (req->state == xb_req_state_wait_reply) {
req->msg.type = state.msg.type;
req->msg.len = state.msg.len;
req->body = state.body;
req->state = xb_req_state_got_reply;
list_del(&req->list);
req->cb(req);
} else {
list_del(&req->list);
kfree(req);
}
err = 0;
break;
}
@ -317,6 +307,15 @@ static int process_msg(void)
mutex_unlock(&xb_write_mutex);
if (err)
goto out;
if (req->state == xb_req_state_wait_reply) {
req->msg.type = state.msg.type;
req->msg.len = state.msg.len;
req->body = state.body;
req->state = xb_req_state_got_reply;
req->cb(req);
} else
kfree(req);
}
mutex_unlock(&xs_response_mutex);
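
The process_msg() change above moves the request callback out of the region protected by xb_write_mutex: the request is looked up under the lock, and the callback runs after the lock is released. The single-threaded sketch below illustrates that ordering only. All names are hypothetical, the lock is modelled as a plain flag rather than a real mutex, and the list is a simple singly linked list instead of the kernel's list_head.

```c
#include <stddef.h>

struct request {
	int id;
	struct request *next;
	void (*cb)(struct request *req);
};

static struct request *reply_list;
static int list_lock_held;       /* flag standing in for a mutex */
int cb_ran_with_lock_held = -1;  /* set by record_cb() for inspection */

static void list_lock(void)   { list_lock_held = 1; }
static void list_unlock(void) { list_lock_held = 0; }

void add_request(struct request *req)
{
	list_lock();
	req->next = reply_list;
	reply_list = req;
	list_unlock();
}

/* Example callback: records whether the list lock was held when it ran. */
void record_cb(struct request *req)
{
	(void)req;
	cb_ran_with_lock_held = list_lock_held;
}

/* Unlink the matching request under the lock, run its callback after. */
int deliver_reply(int id)
{
	struct request **pp, *req = NULL;

	list_lock();
	for (pp = &reply_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			req = *pp;
			*pp = req->next;
			break;
		}
	}
	list_unlock();

	if (!req)
		return -1;

	req->cb(req);  /* may take other locks without nesting under ours */
	return 0;
}
```

Because the request is already unlinked when the callback runs, no other path can find it through the list, so the callback does not need the list lock.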


@ -39,6 +39,8 @@
#include <xen/interface/sched.h>
#include <xen/interface/platform.h>
struct xen_dm_op_buf;
long privcmd_call(unsigned call, unsigned long a1,
unsigned long a2, unsigned long a3,
unsigned long a4, unsigned long a5);
@ -53,7 +55,8 @@ int HYPERVISOR_physdev_op(int cmd, void *arg);
int HYPERVISOR_vcpu_op(int cmd, int vcpuid, void *extra_args);
int HYPERVISOR_tmem_op(void *arg);
int HYPERVISOR_vm_assist(unsigned int cmd, unsigned int type);
int HYPERVISOR_dm_op(domid_t domid, unsigned int nr_bufs, void *bufs);
int HYPERVISOR_dm_op(domid_t domid, unsigned int nr_bufs,
struct xen_dm_op_buf *bufs);
int HYPERVISOR_platform_op_raw(void *arg);
static inline int HYPERVISOR_platform_op(struct xen_platform_op *op)
{


@ -58,6 +58,7 @@ void evtchn_put(unsigned int evtchn);
void xen_send_IPI_one(unsigned int cpu, enum ipi_vector vector);
void rebind_evtchn_irq(int evtchn, int irq);
int xen_rebind_evtchn_to_cpu(int evtchn, unsigned tcpu);
static inline void notify_remote_via_evtchn(int port)
{


@ -63,4 +63,19 @@ struct xen_feature_info {
/* arg == xen_domain_handle_t. */
#define XENVER_guest_handle 8
#define XENVER_commandline 9
struct xen_commandline {
char buf[1024];
};
/*
* Return value is the number of bytes written, or XEN_Exx on error.
* Calling with empty parameter returns the size of build_id.
*/
#define XENVER_build_id 10
struct xen_build_id {
uint32_t len; /* IN: size of buf[]. */
unsigned char buf[];
};
#endif /* __XEN_PUBLIC_VERSION_H__ */


@ -15,6 +15,8 @@ static inline uint32_t xen_vcpu_nr(int cpu)
return per_cpu(xen_vcpu_id, cpu);
}
#define XEN_VCPU_ID_INVALID U32_MAX
void xen_arch_pre_suspend(void);
void xen_arch_post_suspend(int suspend_cancelled);