755548 Commits

Author SHA1 Message Date
Prarit Bhargava
139dd0e07c tools/power turbostat: rename num_cores_per_pkg to num_cores_per_node
turbostat incorrectly assumes that there is one node per package.  As a
result num_cores_per_pkg is not correctly named and is actually
num_cores_per_node.

Rename num_cores_per_pkg to num_cores_per_node.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:46 -04:00
Prarit Bhargava
8cb48b32a5 tools/power turbostat: track thread ID in cpu_topology
The code can be simplified if the cpu_topology *cpus tracks the thread
IDs.  This removes an additional file lookup and simplifies the counter
initialization code.

Add thread ID to cpu_topology information and cleanup the counter
initialization code.

v2: prevent thread_id from being overwritten

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:46 -04:00
Prarit Bhargava
ef6057417a tools/power turbostat: Calculate additional node information for a package
The code currently assumes each package has exactly one node.  This is not
the case for AMD systems and Intel systems with COD.  AMD systems also
may re-enumerate each node's core IDs starting at 0 (for example, an AMD
processor may have two nodes, each with core IDs from 0 to 7).  In order
to properly enumerate the cores we need to track both the physical and
logical node IDs.

Add physical_node_id to track the node ID assigned by the kernel, and
logical_node_id, used by turbostat to track the nodes per package, i.e. a
0-based count within the package.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:46 -04:00
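
Note on ef6057417a: the physical/logical node ID split described above can be sketched roughly as follows. This is only an illustration under stated assumptions. The struct is a hypothetical, trimmed-down record whose field names follow the commit message, and assign_logical_node_ids() is an invented helper, not the function turbostat uses.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down topology record (names follow the commit
 * message, not necessarily the turbostat source). */
struct cpu_topology {
	int physical_package_id;
	int physical_node_id;	/* node ID as assigned by the kernel */
	int logical_node_id;	/* 0-based count within the package, -1 = unset */
};

/*
 * Number the nodes of each package 0, 1, 2, ... in order of first
 * appearance, so that every CPU on the same kernel node shares one
 * logical ID.
 */
void assign_logical_node_ids(struct cpu_topology *cpus, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++)
		cpus[i].logical_node_id = -1;

	for (i = 0; i < n; i++) {
		int next = 0;

		if (cpus[i].logical_node_id != -1)
			continue;

		/* Next free logical ID among the nodes already seen in this package. */
		for (j = 0; j < i; j++)
			if (cpus[j].physical_package_id == cpus[i].physical_package_id &&
			    cpus[j].logical_node_id >= next)
				next = cpus[j].logical_node_id + 1;

		/* Tag every CPU that sits on the same package and kernel node. */
		for (j = i; j < n; j++)
			if (cpus[j].physical_package_id == cpus[i].physical_package_id &&
			    cpus[j].physical_node_id == cpus[i].physical_node_id)
				cpus[j].logical_node_id = next;
	}
}
```

With two packages that each hold two kernel nodes (0, 1 and 2, 3), both packages end up with logical node IDs 0 and 1.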
Len Brown
0e2d8f058f tools/power turbostat: Fix node and siblings lookup data
The turbostat code only looks at thread_siblings_list to determine if
processing units/threads are on the same core.  This works well on
Intel systems which have a shared L1 instruction and data cache.  This
does not work on AMD systems which have shared L1 instruction cache but
separate L1 data caches.  Other utilities also check sibling's core ID
to determine if the processing unit shares the same core.

Additionally, the cpu_topology *cpus list used in topology_probe() can
be used elsewhere in the code to simplify things.

Export *cpus to the entire turbostat code, and add Processing Unit/Thread
IDs information to each cpu_topology struct.  Confirm that the thread
is on the same core as indicated by thread_siblings_list.

[v2]: Fixup CPU_* usage that caused gcc malloc error.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:46 -04:00
Prarit Bhargava
843c57916d tools/power turbostat: set max_num_cpus equal to the cpumask length
Future fixes will use sysfs files that contain cpumask output.  The code
needs to know the length of the cpumask in order to determine which cpus
are set in a cpumask.  Currently topo.max_cpu_num is the maximum cpu
number.  It can be increased to the maximum value of cpus represented in
cpumasks.

Set max_num_cpus to the length of a cpumask.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:46 -04:00
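
Note on 843c57916d: to show why the cpumask length matters, here is a minimal sketch of testing one CPU in a cpumask string. It assumes the sysfs text format is comma-separated groups of hex digits with the most significant group first (for example "00000000,00000005"). Both helper names are hypothetical and do not come from turbostat.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if `cpu` is set in a sysfs-style hex cpumask string, else 0. */
int cpumask_test_cpu(const char *mask, int cpu)
{
	size_t len = strlen(mask);
	int bit = 0;	/* bit index of the lowest bit of the current hex digit */

	if (cpu < 0)
		return 0;

	/* Walk from the rightmost (least significant) digit to the left. */
	while (len > 0) {
		unsigned char c = (unsigned char)mask[--len];
		int nibble;

		if (c == ',' || c == '\n')
			continue;	/* group separator or trailing newline */
		if (!isxdigit(c))
			return 0;	/* malformed input */

		nibble = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
		if (cpu >= bit && cpu < bit + 4)
			return (nibble >> (cpu - bit)) & 1;
		bit += 4;
	}
	return 0;	/* cpu lies beyond the length of the mask */
}

/* Number of CPUs the mask can represent: 4 bits per hex digit. */
int cpumask_length_bits(const char *mask)
{
	int bits = 0;

	for (; *mask; mask++)
		if (isxdigit((unsigned char)*mask))
			bits += 4;
	return bits;
}
```

Any CPU number at or above cpumask_length_bits() cannot be represented in the mask, which is the bound the commit message refers to.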
Chen Yu
023fe0ac97 tools/power turbostat: if --num_iterations, print for specific number of iterations
During testing it is useful to print only a specific number of iterations.
When --num_iterations is specified, for example with this patch applied:

turbostat -i 5 -n 4
will capture 4 samples at a 5-second interval.

[lenb: renamed to --num_iterations from --iterations]

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:45 -04:00
Srinivas Pandruvada
997e53950e tools/power turbostat: Add Cannon Lake support
All MSRs related to turbostat are the same as on Kaby Lake.
Even though the SDM claims that core C3 residency can be read from MSR 0x662,
the read of this MSR fails on the CNL platform. Hence the C3 MSR read
and display are disabled.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:45 -04:00
Len Brown
9d4eab02a7 tools/power turbostat: delete duplicate #defines
The SNB_C1_AUTO_UNDEMOTE definition should have been deleted once
it was copied into msr-index.h.  One copy of the truth is better --
particularly when Matt needs to fix it:-)

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:45 -04:00
Matt Turner
a00072a24a x86: msr-index.h: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
According to the Intel Software Developers' Manual, Vol. 4, Order No.
335592, these macros have been reversed since they were added in the
initial turbostat commit. The reversed definitions were presumably
copied from turbostat.c to this file.

Fixes: 9c63a650bb ("tools/power/x86/turbostat: share kernel MSR #defines")
Signed-off-by: Matt Turner <mattst88@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:45 -04:00
Matt Turner
e0d34648b4 tools/power turbostat: Correct SNB_C1/C3_AUTO_UNDEMOTE defines
According to the Intel Software Developers' Manual, Vol. 4, Order No.
335592, these macros have been reversed since they were added.

Fixes: 889facbee3 ("tools/power turbostat: v3.0: monitor Watts and Temperature")
Signed-off-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:44 -04:00
Len Brown
0748eaf0cf tools/power turbostat: add POLL and POLL% column
Like the "C1" and "C1%" column, the new POLL and POLL% columns
show invocations and residency% during the measurement interval.

While it didn't seem important to track in the past,
we've recently found some Linux cpuidle bugs related to POLL%.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:44 -04:00
Len Brown
4bd1f8f21a tools/power turbostat: Fix --hide Pk%pc10
The column header for PC10 residency is "Pk%pc10"
This is missing the 'g' that others have, eg Pkg%pc6,
to allow tab-delimited columns to fit into 8-columns.

However, --hide Pk%pc10 did not work, it was still looking for the 'g'.
This was confusing, because --list shows the correct "Pk%pc10"

Reported-by: Wendy Wang <wendy.wang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:44 -04:00
Len Brown
be0e54c4eb tools/power turbostat: Build-in "Low Power Idle" counters support
Linux 4.15 exports the ACPI Low Power Idle Table's
counters in /sys/devices/system/cpu/cpuidle/

low_power_idle_cpu_residency_us

	Show this in the "CPU%LPI" column.

	Today this reflects the "North Complex"
	residency in PC10, so expect it to
	closely follow "Pk%pc10".

low_power_idle_system_residency_us

	Show this in the "SYS%LPI" column.

	Today, this reflects that the North Complex is in PC10,
	plus the PCH is sufficiently quiescent
	to save additional power via the "S0ix"
	system state, as measured by the
	PCH SLP_S0 counter.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 23:12:40 -04:00
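
Note on be0e54c4eb: a rough sketch of how a residency counter in microseconds could be turned into a percentage column such as "CPU%LPI". The sysfs path comes from the commit message. The helper names and the formula (counter delta divided by interval length) are assumptions made for illustration, not a copy of the turbostat code.

```c
#include <stdio.h>

#define LPI_CPU_PATH \
	"/sys/devices/system/cpu/cpuidle/low_power_idle_cpu_residency_us"

/* Read one unsigned counter from a sysfs file; 0 on success, -1 on failure. */
int read_counter_us(const char *path, unsigned long long *val)
{
	FILE *fp = fopen(path, "r");
	int ok;

	if (!fp)
		return -1;
	ok = (fscanf(fp, "%llu", val) == 1);
	fclose(fp);
	return ok ? 0 : -1;
}

/* Share of the interval spent in low power idle, in percent. */
double lpi_percent(unsigned long long before_us, unsigned long long after_us,
		   double interval_sec)
{
	if (interval_sec <= 0.0)
		return 0.0;
	return 100.0 * (double)(after_us - before_us) / (interval_sec * 1e6);
}
```

A caller would read LPI_CPU_PATH at the start and at the end of each interval and pass both samples to lpi_percent().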
Laura Abbott
e29dc460d6 tools/power turbostat: Don't make man pages executable
rpm-lint flagged these as being executable:

kernel-tools.x86_64: W: spurious-executable-perm /usr/share/man/man8/turbostat.8.gz
kernel-tools.x86_64: W: spurious-executable-perm /usr/share/man/man8/x86_energy_perf_policy.8.gz

Fix this

Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 17:15:09 -04:00
Len Brown
94d6ab4b11 tools/power turbostat: remove blank lines
When the user requests to collect and show columns
that are not present on every row (eg. for every CPU)
turbostat still prints an (empty) line for every CPU.
Update so no blank lines are printed.

old:
	# turbostat --quiet --show Pkg%pc6
	Pkg%pc6
	9.12
	9.12

	Pkg%pc6
	9.12
	9.12

new:
	# turbostat --quiet --show Pkg%pc6
	Pkg%pc6
	9.12
	9.12
	Pkg%pc6
	9.12
	9.12

Reported-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 17:15:09 -04:00
Artem Bityutskiy
3e8b62bf0c tools/power turbostat: a small C-states dump readability improvement
Improve readability a little bit by changing this output:

 MSR_PKG_CST_CONFIG_CONTROL: 0x00008407 (locked: pkg-cstate-limit=7: unlimited, automatic-c-state-conversion=off)

to this output:

 MSR_PKG_CST_CONFIG_CONTROL: 0x00008407 (locked, pkg-cstate-limit=7 (unlimited), automatic-c-state-conversion=off)

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 17:15:08 -04:00
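
Note on 3e8b62bf0c: the new output format can be reproduced with a sketch like the one below. The bit positions (lock at bit 15, automatic C-state conversion at bit 16, limit in the low four bits) are assumptions chosen to be consistent with the sample value 0x00008407. The limit name is passed in by the caller because the mapping from limit value to name is platform-specific. format_pkg_cst_config() is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed bit layout for this sketch only. */
#define CST_LIMIT_MASK	0xFULL
#define CST_LOCK_BIT	(1ULL << 15)
#define CST_AUTO_CONV	(1ULL << 16)

/* Format the decoded MSR in the "new" style; returns snprintf()'s result. */
int format_pkg_cst_config(char *buf, size_t len, uint64_t msr,
			  const char *limit_name)
{
	return snprintf(buf, len,
		"MSR_PKG_CST_CONFIG_CONTROL: 0x%08llx (%slocked, "
		"pkg-cstate-limit=%d (%s), automatic-c-state-conversion=%s)",
		(unsigned long long)msr,
		(msr & CST_LOCK_BIT) ? "" : "UN",
		(int)(msr & CST_LIMIT_MASK),
		limit_name,
		(msr & CST_AUTO_CONV) ? "on" : "off");
}
```

Called with 0x00008407 and the name "unlimited", this yields the second sample line quoted in the commit message.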
Artem Bityutskiy
ac980e1357 tools/power turbostat: dump BDX, SKX automatic C-state conversion bit
BDX and SKX have a bit that tells them to PROMOTE shallow
C-state requests to MWAIT(C6).  It is generally a BIOS bug
if this bit is set.  As we have encountered that BIOS bug,
let's print this bit in turbostat debug output.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 17:15:08 -04:00
Len Brown
733ef0f8e7 tools/power turbostat: do not hard-code 25MHz crystal on SKX
Some SKX use a 24 MHz crystal, so do not hard code 25 MHz.

Also, SKX crystal is not exact, because SKX uses an EMI reduction
circuit that costs a fraction of a percent.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 17:15:08 -04:00
Len Brown
46c2797826 tools/power turbostat: fix possible sprintf buffer overflow
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 17:14:56 -04:00
Thomas Gleixner
65441ba9c4 irqchip updates for 4.18
- Support for Meson-AXG GPIO irqchip
 - Large stm32 irqchip rework (suspend/resume, hierarchical domains)

Merge tag 'irqchip-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates for 4.18 from Marc Zyngier:

 - Support for Meson-AXG GPIO irqchip

 - Large stm32 irqchip rework (suspend/resume, hierarchical domains)
2018-06-01 22:19:50 +02:00
Arnaldo Carvalho de Melo
0b3a18387f perf tools intel-pt-decoder: Update insn.h from the kernel sources
To pick up the changes in:

  ee6a7354a3 ("kprobes/x86: Prohibit probing on exception masking instructions")

That doesn't entail changes in tooling, but silences this perf build
warning:

  Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-o3wfwjnyh7r8l0gi9q3y9f44@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01 16:13:18 -03:00
Arnaldo Carvalho de Melo
a20d23bb7b tools headers: Sync x86 cpufeatures.h with the kernel sources
To pick up changes found in these csets:

 11fb068349 x86/speculation: Add virtualized speculative store bypass disable support
 d1035d9718 x86/cpufeatures: Add FEATURE_ZEN
 52817587e7 x86/cpufeatures: Disentangle SSBD enumeration
 7eb8956a7f x86/cpufeatures: Disentangle MSR_SPEC_CTRL enumeration from IBRS
 e7c587da12 x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP
 9f65fb2937 x86/bugs: Rename _RDS to _SSBD
 764f3c2158 x86/bugs/AMD: Add support to disable RDS on Fam[15,16,17]h if requested
 24f7fc83b9 x86/bugs: Provide boot parameters for the spec_store_bypass_disable mitigation
 0cc5fa00b0 x86/cpufeatures: Add X86_FEATURE_RDS
 c456442cd3 x86/bugs: Expose /sys/../spec_store_bypass

The usage of this file in tools doesn't use the newly added X86_FEATURE_
defines:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o
  LD       /tmp/build/perf/bench/perf-in.o
  LD       /tmp/build/perf/perf-in.o

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-mrwyauyov8c7s048abg26khg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01 16:13:16 -03:00
Arnaldo Carvalho de Melo
63b89a19cc tools headers: Synchronize prctl.h ABI header
To pick up changes from:

  $ git log --oneline -2 -i include/uapi/linux/prctl.h
  356e4bfff2 prctl: Add force disable speculation
  b617cfc858 prctl: Add speculation control prctls

  $ tools/perf/trace/beauty/prctl_option.sh > before.c
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after.c
  $ diff -u before.c after.c
  --- before.c	2018-06-01 10:39:53.834073962 -0300
  +++ after.c	2018-06-01 10:42:11.307985394 -0300
  @@ -35,6 +35,8 @@
          [42] = "GET_THP_DISABLE",
          [45] = "SET_FP_MODE",
          [46] = "GET_FP_MODE",
  +       [52] = "GET_SPECULATION_CTRL",
  +       [53] = "SET_SPECULATION_CTRL",
   };
   static const char *prctl_set_mm_options[] = {
 	  [1] = "START_CODE",
  $

This will be used by 'perf trace' to show these strings when beautifying
the prctl syscall args. At some point we'll be able to say something
like:

	'perf trace --all-cpus -e prctl(option=*SPEC*)'

to filter an arg by name.

  This silences this warning when building tools/perf:

    Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zztsptwhc264r8wg44tqh5gp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01 16:13:15 -03:00
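
Note on 63b89a19cc: a minimal sketch of how a sparse string table like the generated one can be consulted safely. Only a few entries from the diff above are reproduced. prctl_option_name() is a hypothetical lookup helper and not the function 'perf trace' uses.

```c
#include <stddef.h>

/* Excerpt of the generated table; unlisted indexes are NULL. */
static const char *prctl_options[] = {
	[42] = "GET_THP_DISABLE",
	[45] = "SET_FP_MODE",
	[46] = "GET_FP_MODE",
	[52] = "GET_SPECULATION_CTRL",
	[53] = "SET_SPECULATION_CTRL",
};

/* Return the option name, or NULL for gaps and out-of-range values. */
const char *prctl_option_name(int option)
{
	size_t n = sizeof(prctl_options) / sizeof(prctl_options[0]);

	if (option < 0 || (size_t)option >= n)
		return NULL;
	return prctl_options[option];
}
```

Because designated initializers leave gaps as NULL, a caller has to handle a NULL result, for instance by printing the raw number instead.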
Arnaldo Carvalho de Melo
0d690fc043 perf trace beauty prctl: Default header_dir to cwd to work without parms
Useful when checking the effects of header syncs for the files it uses
as input to generate string tables. In retrospect this is how it
should have been done from day 1, without requiring header_dir to be set
in the Makefile. Everything will change later so that the only parms
common to all generators will be $(srctree) and $(beauty_outdir).

So, to see what it generates, just call it without any parameters:

  $ tools/perf/trace/beauty/prctl_option.sh
  static const char *prctl_options[] = {
	  [1] = "SET_PDEATHSIG",
	  [2] = "GET_PDEATHSIG",
	  [3] = "GET_DUMPABLE",
	  [4] = "SET_DUMPABLE",
	  [5] = "GET_UNALIGN",
	  [6] = "SET_UNALIGN",
	  [7] = "GET_KEEPCAPS",
	  [8] = "SET_KEEPCAPS",
	  [9] = "GET_FPEMU",
	  [10] = "SET_FPEMU",
	  [11] = "GET_FPEXC",
	  [12] = "SET_FPEXC",
	  [13] = "GET_TIMING",
	  [14] = "SET_TIMING",
	  [15] = "SET_NAME",
	  [16] = "GET_NAME",
	  [19] = "GET_ENDIAN",
	  [20] = "SET_ENDIAN",
	  [21] = "GET_SECCOMP",
	  [22] = "SET_SECCOMP",
	  [25] = "GET_TSC",
	  [26] = "SET_TSC",
	  [27] = "GET_SECUREBITS",
	  [28] = "SET_SECUREBITS",
	  [29] = "SET_TIMERSLACK",
	  [30] = "GET_TIMERSLACK",
	  [35] = "SET_MM",
	  [36] = "SET_CHILD_SUBREAPER",
	  [37] = "GET_CHILD_SUBREAPER",
	  [38] = "SET_NO_NEW_PRIVS",
	  [39] = "GET_NO_NEW_PRIVS",
	  [40] = "GET_TID_ADDRESS",
	  [41] = "SET_THP_DISABLE",
	  [42] = "GET_THP_DISABLE",
	  [45] = "SET_FP_MODE",
	  [46] = "GET_FP_MODE",
  };
  static const char *prctl_set_mm_options[] = {
	  [1] = "START_CODE",
	  [2] = "END_CODE",
	  [3] = "START_DATA",
	  [4] = "END_DATA",
	  [5] = "START_STACK",
	  [6] = "START_BRK",
	  [7] = "BRK",
	  [8] = "ARG_START",
	  [9] = "ARG_END",
	  [10] = "ENV_START",
	  [11] = "ENV_END",
	  [12] = "AUXV",
	  [13] = "EXE_FILE",
	  [14] = "MAP",
	  [15] = "MAP_SIZE",
  };
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-qtotspuztydjttxi7k6mec6h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01 16:13:06 -03:00
Daniele Palmas
9f7c728332 net: usb: cdc_mbim: add flag FLAG_SEND_ZLP
Testing Telit LM940 with ICMP packets > 14552 bytes revealed that
the modem needs FLAG_SEND_ZLP to work properly; otherwise the cdc
mbim data interface will no longer be responsive.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01 14:01:42 -04:00
David S. Miller
8a11801581 Merge branch 'tunnel-mtus'
Nicolas Dichtel says:

====================
ip[6] tunnels: fix mtu calculations

The first patch restores the possibility to bind an ip4 tunnel to an
interface with a large mtu.
The second patch was spotted after the first fix. I also target it to net
because it fixes the max mtu value that can be used for ipv6 tunnels.

v2: remove the 0xfff8 in ip_tunnel_newlink()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01 13:56:31 -04:00
Nicolas Dichtel
f7ff1fde94 ip6_tunnel: remove magic mtu value 0xFFF8
I don't know where this value comes from (probably a copy and paste and
paste and paste ...).
Let's use standard values which are a bit greater.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01 13:56:30 -04:00
Nicolas Dichtel
82612de1c9 ip_tunnel: restore binding to ifaces with a large mtu
After commit f6cc9c054e, the following conf is broken (note that the
default loopback mtu is 65536, ie IP_MAX_MTU + 1):

$ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev lo
add tunnel "gre0" failed: Invalid argument
$ ip l a type dummy
$ ip l s dummy1 up
$ ip l s dummy1 mtu 65535
$ ip tunnel add gre1 mode gre local 10.125.0.1 remote 10.125.0.2 dev dummy1
add tunnel "gre0" failed: Invalid argument

dev_set_mtu() does not allow setting an mtu that is too large.
First, let's cap the mtu returned by ip_tunnel_bind_dev(). Second, remove
the magic value 0xFFF8 and use IP_MAX_MTU instead.
0xFFF8 seems to have been there for ages; I don't know why this value was used.

With a recent kernel, it's also possible to set a mtu > IP_MAX_MTU:
$ ip l s dummy1 mtu 66000
After that patch, it's also possible to bind an ip tunnel on that kind of
interface.

CC: Petr Machata <petrm@mellanox.com>
CC: Ido Schimmel <idosch@mellanox.com>
Link: https://git.kernel.org/pub/scm/linux/kernel/git/davem/netdev-vger-cvs.git/commit/?id=e5afd356a411a
Fixes: f6cc9c054e ("ip_tunnel: Emit events for post-register MTU changes")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01 13:56:29 -04:00
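
Note on 82612de1c9: the capping idea can be sketched as below. IP_MAX_MTU is 65535 according to the commit message (loopback's 65536 is IP_MAX_MTU + 1). The helper tunnel_mtu_from_lower() and its overhead parameter are hypothetical and only illustrate the clamp. They are not the body of ip_tunnel_bind_dev().

```c
#define IP_MAX_MTU 0xFFFFU	/* 65535, per the commit message */

/*
 * Hypothetical helper: derive a tunnel MTU from the lower device's MTU
 * minus the encapsulation overhead, never exceeding IP_MAX_MTU.
 */
unsigned int tunnel_mtu_from_lower(unsigned int lower_mtu,
				   unsigned int overhead)
{
	unsigned int mtu = lower_mtu > overhead ? lower_mtu - overhead : 0;

	return mtu < IP_MAX_MTU ? mtu : IP_MAX_MTU;
}
```

With the clamp in place, a lower device with an mtu of 66000 still yields a tunnel mtu that dev_set_mtu() can accept.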
David S. Miller
ccfde6e27d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2018-05-31

1) Avoid possible overflow of the offset variable
   in _decode_session6(), this fixes an infinite
   loop there. From Eric Dumazet.

2) We may use an error pointer in the error path of
   xfrm_bundle_create(). Fix this by returning this
   pointer directly to the caller.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01 13:25:41 -04:00
Bastian Germann
c9bdf29154 hwmon: (asus_atk0110) Make use of device managed memory
Use devm_* variants of kstrdup and kzalloc. Get rid of kfree cleanups.

Signed-off-by: Bastian Germann <bastiangermann@fishpost.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01 09:38:36 -07:00
Bastian Germann
3c60726d21 hwmon: (asus_atk0110) Replace deprecated device register call
Make the asus_atk0110 driver use hwmon_device_register_with_groups instead
of the deprecated hwmon_device_register.
Construct the expected attribute_group array from the sensor list which
contains all needed attributes.
Remove the manual sysfs file creation and deletion that are now taken care
of by the (un)register calls via the attribute_group array.

Signed-off-by: Bastian Germann <bastiangermann@fishpost.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01 09:34:54 -07:00
Len Brown
fd3933ca7b tools/power turbostat: fix MSR_IA32_MISC_ENABLE MWAIT printout
MSR_IA32_MISC_ENABLE[18] is the MWAIT ENABLE bit, not DISABLE bit...

so

MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST No-MWAIT PREFETCH TURBO)

should print as:

MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT PREFETCH TURBO)

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:06 -04:00
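
Note on fd3933ca7b: the corrected polarity of bit 18 can be checked with a few lines of C. mwait_label() and the macro name are hypothetical. Only the bit number and the two sample strings come from the commit message.

```c
#include <stdint.h>

/* MSR_IA32_MISC_ENABLE[18]: set means MWAIT is enabled. */
#define MISC_ENABLE_MWAIT_BIT	(1ULL << 18)

/* Label to print for the MWAIT bit of MSR_IA32_MISC_ENABLE. */
const char *mwait_label(uint64_t msr)
{
	return (msr & MISC_ENABLE_MWAIT_BIT) ? "MWAIT" : "No-MWAIT";
}
```

The sample value 0x00850089 has bit 18 set, so it should be labeled "MWAIT".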
Artem Bityutskiy
47936f944e tools/power turbostat: fix printing on input
The recent patch that implements table printing on a keypress introduced a
regression - turbostat prints the table almost continuously if it is run from a
daemon program.

The problem is also easy to reproduce like this:

echo | turbostat

The reason is that we cannot assume that stdin is always a TTY. It can be many
things.

This patch fixes the problem by limiting the new keypress functionality to
TTYs only. If stdin is not a TTY, we just sleep for the full interval time.

While at it, clean up 'do_sleep()' to return no value, as callers do not expect
that anyway.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:05 -04:00
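
Note on 47936f944e: a minimal sketch of the "keypress only on a TTY" rule, using isatty(3), select(2) and nanosleep(2). wait_interval() is a hypothetical stand-in for turbostat's do_sleep(). Unlike the real function it returns a value, purely so the behaviour can be observed.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/*
 * Wait for up to interval_ms milliseconds. Input on fd may end the wait
 * early only if fd is a TTY. Returns 1 if input ended the wait early,
 * 0 if the full interval elapsed.
 */
int wait_interval(int fd, long interval_ms)
{
	struct timeval tv;
	fd_set readfds;
	char c;

	if (!isatty(fd)) {
		/* Pipe, file, /dev/null, ...: ignore input and just sleep. */
		struct timespec ts;

		ts.tv_sec = interval_ms / 1000;
		ts.tv_nsec = (interval_ms % 1000) * 1000000L;
		nanosleep(&ts, NULL);
		return 0;
	}

	tv.tv_sec = interval_ms / 1000;
	tv.tv_usec = (interval_ms % 1000) * 1000;
	FD_ZERO(&readfds);
	FD_SET(fd, &readfds);

	if (select(fd + 1, &readfds, NULL, NULL, &tv) > 0) {
		/* Consume one pending byte so it does not retrigger at once. */
		if (read(fd, &c, 1) < 0)
			return 0;
		return 1;
	}
	return 0;
}
```

With stdin redirected from a pipe, as in "echo | turbostat", the isatty() branch is taken and pending input no longer shortens the interval.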
Len Brown
b9ad8ee0da tools/power turbostat: end current interval upon newline input
In turbostat interval mode, a newline typed on standard input
will now conclude the current interval.  Data will immediately
be collected and printed for that interval, and the next interval
will be started.

This is similar to the recently added SIGUSR1 feature.
But that is for use by programs, while this is for interactive use.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:05 -04:00
Len Brown
072119606a tools/power turbostat: on SIGUSR1: sample, print and continue
Interval-mode turbostat now catches and discards SIGUSR1.

Thus, SIGUSR1 can be used to tell turbostat to cut short
the current measurement interval.  Turbostat will then start
the next measurement interval using the regular interval length.

This can be used to give turbostat variable intervals.
Invoke turbostat with --interval LARGE_NUMBER_SEC
and have a program that has permission to send it a SIGUSR1
always before LARGE_NUMBER_SEC expires.

It may also be useful to use "--enable Time_Of_Day_Seconds"
to observe the actual interval length.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:04 -04:00
Len Brown
8aa2ed0b28 tools/power turbostat: on SIGINT: sample, print and exit
When running in interval-mode, catch interrupts
and print a final data record before exiting.

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:04 -04:00
Len Brown
3f44a5c62b tools/power turbostat: add --enable Time_Of_Day_Seconds
Add a Time_Of_Day_Seconds column showing when measurement
for each row was completed.  Units are [sec.subsec] since Epoch,
as reported by gettimeofday(2).

While useful to correlate turbostat output with other tools,
this built-in column is disabled, by default.

Add the "--enable" option to enable such disabled-by-default
built-in columns:

"--enable Time_Of_Day_Seconds"
"--enable usec"

"--enable all", will enable all disabled-by-default built-in counters.

When "--debug" is used, all disabled-by-default columns are enabled,
unless explicitly skipped using "--hide"

Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:04 -04:00
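
Note on 3f44a5c62b: the [sec.subsec] format can be produced from a struct timeval as sketched below. The exact field widths and the helper name format_time_of_day() are assumptions for illustration.

```c
#include <stdio.h>
#include <sys/time.h>

/* Format a timestamp as seconds.microseconds since the Epoch. */
int format_time_of_day(char *buf, size_t len, const struct timeval *tv)
{
	return snprintf(buf, len, "%10ld.%06ld",
			(long)tv->tv_sec, (long)tv->tv_usec);
}
```

In a measurement loop, gettimeofday(&tv, NULL) would be called when a row is completed and the result passed to this helper.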
Artem Bityutskiy
2085e12441 tools/power turbostat: fix Skylake Xeon package C-state display
Turbostat neglects to display all package C-states for some Skylake Xeon BIOS configurations.

This is due to a typo in the table decoding MSR_PKG_CST_CONFIG_CONTROL (0x000000e2)

Here we fix that typo, according to Intel SDM, vol 4, Table 2-41 -
"MSRs Supported by Intel® Xeon® Processor Scalable Family with DisplayFamily_DisplayModel 06_55H".

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:13:03 -04:00
Len Brown
41a233dcbe MAINTAINERS: add turbostat utility
Signed-off-by: Len Brown <len.brown@intel.com>
2018-06-01 12:12:49 -04:00
Colin Ian King
fb8eefd3b4 hwmon: (k10temp) Make function get_raw_temp static
The function get_raw_temp is local to the source and does not need to
be in global scope, so make it static.

Cleans up sparse warning:
drivers/hwmon/k10temp.c:149:14: warning: symbol 'get_raw_temp' was not
declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-06-01 08:56:35 -07:00
Damien Thébault
a95691bc54 net: dsa: b53: Add BCM5389 support
This patch adds support for the BCM5389 switch connected through MDIO.

Signed-off-by: Damien Thébault <damien.thebault@vitec.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-01 11:15:42 -04:00
Javier González
9cfd5a9538 lightnvm: pblk: take bitmap alloc. out of critical section
pblk allocates line bitmaps within the line lock unnecessarily. In order
to take pressure out of the fast path, allocate line bitmaps outside
of this lock and refactor accordingly.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg
cc9c9a00b1 lightnvm: pblk: kick writer on new flush points
Unless we kick the writer directly when setting a new flush point, the
user risks having to wait for up to one second (the default timeout for
the write thread to be kicked) for the IO to complete.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg
b06be2873d lightnvm: pblk: only try to recover lines with written smeta
When switching between different lun configurations, there is no
guarantee that all lines that contain closed/open chunks have some
valid data to recover.

Check that the smeta chunk has been written to instead. Also
skip bad lines (that do not have enough good chunks).

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Javier González
87cc40bbe3 lightnvm: pblk: remove unnecessary bio_get/put
In the read path, pblk gets a reference to the incoming bio and puts it
after ending the bio. Though this behavior is correct, it is unnecessary
since pblk is the one putting the bio, therefore, it cannot disappear
underneath it.

Removing this reference allows cleaning up rqd->bio and avoids pointer
bouncing for the different read paths. Now, the incoming bio always
resides in the read context and pblk's internal bios (if any) reside in
rqd->bio.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Marcin Dziegielewski
4a828884c6 lightnvm: pblk: add possibility to set write buffer size manually
In some cases, users may want to set write buffer size manually, e.g. to
adjust it to specific workload. This patch provides the possibility
to set write buffer size via module parameter feature.

Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com>
Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Igor Konopko
fbadca7396 lightnvm: fix partial read error path
When an error occurs during bio_add_page on the partial read path, pblk
tries to free pages twice.

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Igor Konopko
f142ac0b5d lightnvm: proper error handling for pblk_bio_add_pages
Currently, when bio_pc_add_page fails in pblk_bio_add_pages, two issues
occur if the caller is pblk_rb_read_to_bio(). The first is in
pblk_bio_free_pages, which tries to free pages that were not allocated
from our mempool. The second is a warning from dma_pool_free, because we
try to free a NULL dma pointer.

This commit fixes both issues.

Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg
6cf17a2f83 lightnvm: pblk: fix smeta write error path
Smeta write errors were previously ignored. Skip these
lines instead and throw them back on the free
list, so the chunks will go through a reset cycle
before we attempt to use the line again.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00
Hans Holmberg
48b8d20895 lightnvm: pblk: garbage collect lines with failed writes
Write failures should not happen under normal circumstances,
so in order to bring the chunk back into a known state as soon
as possible, evacuate all the valid data out of the line and let the
fw judge if the block can be written to in the next reset cycle.

Do this by introducing a new gc list for lines with failed writes,
and ensure that the rate limiter allocates a small portion of
the write bandwidth to get the job done.

The lba list is saved in memory for use during gc as we
cannot guarantee that the emeta data is readable if a write
error occurred.

Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Reviewed-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-06-01 09:02:53 -06:00