Commit Graph

4241 Commits

Author SHA1 Message Date
Simeon Ehrig
50bc19f6b5 Add some complex cuda template kernel test cases. 2018-06-25 08:29:07 +02:00
Simeon Ehrig
2011246c17 Rework CUDA device tests.
- add cudaDeviceSynchronize() at every kernel launch
- fix a small address bug at cudaMemcpy when a host array is used
- in parallel test cases, replace the fixed thread number with a variable
- rework the shared memory kernel
2018-06-25 08:29:07 +02:00
Simeon Ehrig
e7b0e22ae8 Add test cases for CUDA features.
- CUDA __constant__ memory
- CUDA global __device__ memory
- CUDA __host__ prefix
- CUDA kernel launch with arguments
- CUDA templated kernels
- CUDA shared memory with dynamic runtime
- CUDA Streams
- test if CUDA device is available
2018-06-25 08:29:07 +02:00
Simeon Ehrig
d14ab2daec Improve the search of the clang instance for CUDA.
Previously, the clang++ bundled with cling could not be found unless cling was started from inside the bin folder ('./cling -xcuda'). Now it is also possible to start cling with, for example, 'bin/cling -xcuda'.

Fix a bug that prevented starting './cling -xcuda -fsyntax-only'.
2018-06-25 08:29:07 +02:00
Simeon Ehrig
2b61966c0d Fix temp-path bug.
In some cases, the path of the cling temp folder contains non-printable characters at the end.
Change the handling of the path string to solve this problem.
2018-06-25 08:29:07 +02:00
simeon
b87dfcf9b1 Improve Error Message. 2018-06-25 08:29:07 +02:00
Simeon Ehrig
eb5cfc5f3f The CUDA compiler can handle variable declarations from the prompt.
It is now possible to declare variables at the prompt that are visible to other statements.

The problem was that cling wraps every statement in a function to get valid input, so each variable was visible only inside its own wrapper function. To solve this, cling changes the local variable declaration into a global declaration.

The implementation for the CUDA compiler checks whether this unwrapping has happened. If it has, the C++ code of the unwrapped variable declaration (from the AST printer) is written to the .cu file instead of the raw input.
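
The scoping problem can be pictured with a minimal sketch. The wrapper names and the variable `promptCounter` are hypothetical and are not the names cling generates; the sketch only assumes the behaviour described in this message.

```cpp
#include <cassert>

// Two prompt inputs, in order:
//   int promptCounter = 40;
//   promptCounter + 2

// Naive wrapping: the declaration is local to the first wrapper function.
void wrapper_1() { int promptCounter = 40; (void)promptCounter; }
// A second wrapper could not see it:
// int wrapper_2() { return promptCounter + 2; }  // error: not declared in this scope

// After unwrapping, the declaration lives at global scope instead ...
int promptCounter = 40;
// ... so a later wrapped statement can use it.
int wrapper_2() { return promptCounter + 2; }
```

For the device side this is why the unwrapped declaration, not the raw input, has to reach the .cu file.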
2018-06-25 08:29:07 +02:00
Simeon Ehrig
c18f074cbd Add workaround for clang PCH-bug
At the moment, we use PCH files to extend the AST of the existing device code with new lines of code. In detail, to create a new PTX file, we take the CUDA code (.cu file) and a PCH file with the existing AST as input and generate a new PCH file that contains the whole AST. The PCH file is then compiled to a PTX file.

A bug in clang prevents generating more than 5 new PCH files. The bug is not easy to fix, so I wrote a small workaround: instead of using a PCH file that contains the AST, we generate a complete new AST from all .cu files every time.

The workaround is temporary and should be removed once clang is patched.
2018-06-25 08:29:07 +02:00
Simeon Ehrig
454c359c51 Arguments for clang nvptx and fatbinary can be set via cling arguments.
It is now possible to set some arguments of clang nvptx and fatbinary via arguments at cling start. The arguments are filtered, so not every argument is supported at the moment. The arguments cannot be changed at runtime because the PCH files forbid it. For example, clang nvptx uses the optimization level that is set when cling starts.

At the moment, the handling of debug options for clang nvptx is simple: if any debug option is detected, just -g is added to the clang nvptx invocation.

Additional PTX options for clang nvptx do not work at the moment. There is a problem with parsing them at cling start.
2018-06-25 08:29:07 +02:00
Simeon Ehrig
4882fbe886 Rework the include path handling of the CUDA device compiler.
I replaced copies of the include paths with a pointer to the headerSearchOptions. Explicit handling of the include paths is no longer necessary. Include paths that were passed as arguments at start also work.
2018-06-25 08:29:07 +02:00
Simeon Ehrig
58d99cf4f9 CUDA device compiler can use include paths from cling runtime.
This new function allows the CUDA device compiler to search for headers in folders that were declared at runtime via the .I command.
2018-06-25 08:29:07 +02:00
Simeon Ehrig
6652d1a7c9 Add CUDA device compiler, which generates CUDA PTX code at runtime.
The class IncrementalCUDADeviceCompiler uses external tools to generate PTX and CUDA fatbin files. It runs the tools clang and fatbinary via llvm::sys::ExecuteAndWait. The class also handles adding new code to the existing code. The steps of the compiler pipeline are:
- clang: CUDA C++ + previous PCH -> PCH
- clang: PCH -> PTX
- fatbinary: PTX -> fatbin

There is no selection of code: every input to cling is passed to the IncrementalCUDADeviceCompiler.
2018-06-25 08:29:07 +02:00
Simeon Ehrig
90454f964a Add function to detect C++ attributes on function definitions.
It is now possible to define functions with C++ attributes without the .rawInput mode, for example functions like `[[ noreturn ]] foo() { ... }` or `[[deprecated]] [[nodiscard]] int bar(){ … }`.
2018-06-19 13:44:58 +02:00
Axel Naumann
c33c5fb033 cling cpt travis fold: misspelled "end". 2018-06-14 10:14:12 +02:00
Axel Naumann
fa5aa3b2cc cling cpt: make sure TRAVIS_BUILD_DIR is found if set. 2018-06-14 09:44:07 +02:00
Axel Naumann
203956e373 cling README: typo. Thanks, Damien L-G! 2018-06-14 09:38:52 +02:00
Axel Naumann
9cf018c299 cling travis: create a new log section when *running*...
... not when echoing what is going to be run.
2018-06-13 17:44:09 +02:00
Axel Naumann
0302690afe cling travis: CMake and travis are uncooperative, cannot use ccache with clang:
COMPILER="ccache clang" gets lost in CMake; using ccache does not work as there is no ccache-wrapper for clang-3.9.
So just use clang-3.9 without ccache.
2018-06-13 17:29:08 +02:00
Axel Naumann
f82914110f cling travis: convince ccache to kick in. 2018-06-13 15:44:07 +02:00
Axel Naumann
79acadaf1d cling travis: specify osx image. Fix compiler name for a build. 2018-06-13 15:44:07 +02:00
Axel Naumann
931612fb1a cling travis: teach travis to do two brew tasks. Again. 2018-06-13 12:59:07 +02:00
Axel Naumann
4f1d3d7d22 cling travis: move cpt log fold into cpt.py. 2018-06-13 12:14:13 +02:00
Axel Naumann
7a5c13431a cling travis: travis merged two lines into one command. 2018-06-13 12:14:13 +02:00
Yuka Takahashi
891b279bcc Revert "Revert "Add the cwg to the prebuilt module cache path." (#2160)"
This reverts commit 011aa8200277cd31957e222afd9b37415458b31f.

This is a revert of a revert. I reverted the first commit because adding
"." to prebuiltmodulepath was causing failures in runtime modules, but
now we skip "." in TCling::LazyFunctionCreatorAutoloadForModule, so it
no longer matters if "." is present.
2018-06-12 22:59:08 +02:00
Yuka Takahashi
befa982fe3 Fix nightlies by autoloading dependency libraries
We had test failures in runtime nightlies such as this one:
https://epsft-jenkins.cern.ch/view/ROOT/job/root-nightly-runtime-cxxmodules/95/BUILDTYPE=Debug,COMPILER=gcc62,LABEL=slc6/testReport/junit/projectroot.roottest.root.math/smatrix/roottest_root_math_smatrix_testKalman/

Failures were due to what @pcanal commented in #2135: some .so files in
roottest don't have external linkage. (This means that if you call
    dlopen(libfoo.so), the dynamic loader can't find the dependency libraries
    and emits an "undefined symbol" error when it tries to initialize global
    variables in libfoo.so but can't find the symbol definitions.)
With pch, rootmap files provided the information about the dependency libraries.

However, we stopped generating rootmap files in #2127, and that's why we
got these failures. To fix this issue, I implemented a callback to
TCling which gets called when DynamicLibraryManager fails. The callback
passes the error message to TCling, which handles it if it contains "undefined error".
2018-06-12 22:59:08 +02:00
Axel Naumann
2d17c46c83 cling travis: travis_fold:begin: is spelled travis_fold:start:. 2018-06-12 21:29:07 +02:00
Axel Naumann
6d9cdb9b4f cling travis: add "compiler" tag to GCC-7@Mac build. 2018-06-12 21:29:07 +02:00
Axel Naumann
e2fa4916c9 cling travis: remove stray "--overwrite". 2018-06-12 21:29:07 +02:00
Axel Naumann
e9c2cfcb61 cling travis: oclint is in the way for brew install gcc@7.
See https://github.com/travis-ci/travis-ci/issues/8826
2018-06-12 21:06:16 +02:00
Axel Naumann
58b6f65d2b cling travis: Install coreutils; Trusty has no "timeout" apt package. 2018-06-12 21:06:16 +02:00
Axel Naumann
69f1fa243b cling travis: help find timeout:
Even though timeout existed, the script decided to call gtimeout on Linux, which does not exist.
2018-06-12 21:06:16 +02:00
Axel Naumann
52c4e692ce cling travis: remove leading "@"; no idea why the example had it. 2018-06-12 21:06:16 +02:00
Axel Naumann
2d768ab1ad cling travis: indent of comments. 2018-06-12 16:09:10 +02:00
Axel Naumann
c7ac88c7b2 cling travis: add mac os ccache to PATH. 2018-06-12 14:44:11 +02:00
Axel Naumann
436b28ca62 cling-travis: update images, less builds, newer comp, log folding. 2018-06-12 12:44:08 +02:00
Axel Naumann
681e0c5e1d Update cling readme (cling issue #210). 2018-06-12 10:44:10 +02:00
Saagar Jha
d57fbe37d5 Handle Control+C and Control+D (mostly) correctly 2018-06-12 08:41:07 +02:00
Saagar Jha
68c93f9743 Fix unfortunate typo 2018-06-12 08:41:03 +02:00
Guilherme Amadio
cbaac95d07 Replace deprecated std::ptr_fun and std::mem_fun with equivalent code
Removed in C++17. No functionality change intended.
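
As a minimal sketch of this kind of replacement (not the code touched by this commit; `Token`, `trimLeft` and `lengths` are hypothetical names), a lambda can stand in for `std::ptr_fun` and `std::mem_fn` for `std::mem_fun`:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical type used only to have a member function to call.
struct Token {
  std::string text;
  std::size_t length() const { return text.size(); }
};

// Pre-C++17 style: std::not1(std::ptr_fun<int, int>(std::isspace)).
// Replacement: a lambda wrapping the same call.
std::string trimLeft(std::string text) {
  auto firstNonSpace = std::find_if(
      text.begin(), text.end(),
      [](unsigned char c) { return !std::isspace(c); });
  text.erase(text.begin(), firstNonSpace);
  return text;
}

// Pre-C++17 style: std::mem_fun_ref(&Token::length) (or std::mem_fun for pointers).
// Replacement: std::mem_fn, which accepts both references and pointers.
std::vector<std::size_t> lengths(const std::vector<Token>& tokens) {
  std::vector<std::size_t> result(tokens.size());
  std::transform(tokens.begin(), tokens.end(), result.begin(),
                 std::mem_fn(&Token::length));
  return result;
}
```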
2018-06-12 08:40:00 +02:00
Bertrand Bellenot
a561c61888 disable unicode (UTF-8) in the console for the time being, since it causes problems on Windows 10 2018-06-12 08:40:00 +02:00
Raphael Isemann
345a13cef6 Don't assume fileno is a function.
This otherwise leads to compilation errors when we do `::fileno`
as OpenBSD implemented fileno as a macro (which seems to be allowed).
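
A minimal sketch of the idea, assuming a POSIX `<stdio.h>`; the helper name `streamDescriptor` is hypothetical. If fileno is a function-like macro, `::fileno(stream)` expands to `::` followed by an expression, which does not compile, whereas the unqualified call works for both a macro and a function:

```cpp
#include <cassert>
#include <stdio.h>  // FILE and POSIX fileno (function or macro)

// Unqualified call: valid whether fileno is a real function or a macro.
inline int streamDescriptor(FILE* stream) {
  return fileno(stream);
}
```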
2018-06-12 08:40:00 +02:00
Bertrand Bellenot
ed135557e7 disambiguate the else branch 2018-06-12 08:40:00 +02:00
Bertrand Bellenot
c4990cff45 Fix uninitialized fDefaultAttributes (makes terminal black on black) 2018-06-12 08:40:00 +02:00
Yuka Takahashi
0d478d0fc4 Revert "Add the cwg to the prebuilt module cache path." (#2160)
This reverts commit 5298b418eec4129351888f41cb7c3bfc90161e22.

This commit was mistakenly committed. PR was opened in #1730, but it was
closed and moved to #1761. I didn't notice this and created another PR
in #1980.

This change was causing 100+ failures in runtime cxxmodules nightlies.
(Eg. https://epsft-jenkins.cern.ch/job/root-pullrequests-build/29183/testReport/junit/projectroot/runtutorials/tutorial_fit_FittingDemo/)
We want to have **proper** PrebuildModulesPaths whose information is
extracted from LD_LIBRARY_PATH and DYLD_LIBRARY_PATH, not a random ".".

Because of this commit, we were trying to autoload libraries generated
by roottest on-demand (for example "./h1analysisTreeReader_C.so"). This
is not intended behavior: these autogenerated libraries are
already loaded by roottest, and what we want to do is to load **proper**
libraries like libHist.so instead.
2018-06-07 15:59:45 +02:00
Yuka Takahashi
35a8988d50 Autoload fewer libraries
In the previous allmodules&autoloading patch, we used a callback from
DeserializationListener to get the Decl and loaded the corresponding
libraries. It worked, but the performance was bad because ROOT was
loading too many libraries.

In this patch, we use TCling::LazyFunctionCreatorAutoloadForModule. This
function gets a callback when "mangled_name" is not found in the loaded
libraries, so we have to load the corresponding library and look it up
again.

I used an unordered_map to store pairs of mangled identifier and library.
As an optimization, I hash the mangled name and store the library not by
name but as a uint8, keeping the uint8-to-name information in another vector.
I also tried std::map, but unordered_map was more performant. There are
better hash tables, like:
https://probablydance.com/2018/05/28/a-new-fast-hash-table-in-response-to-googles-new-fast-hash-table/
We can try them if this part becomes critical.
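
The data layout described above could look roughly like this. It is a sketch, not the code in TCling: the class name `SymbolLibraryIndex` and its methods are hypothetical, and it keys the map by the mangled string (letting unordered_map do the hashing) rather than by a precomputed hash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Maps a mangled symbol name to the library that defines it.
class SymbolLibraryIndex {
public:
  // Record that `mangledName` is defined in `library`.
  void add(const std::string& mangledName, const std::string& library) {
    fSymbolToLib[mangledName] = internLibrary(library);
  }

  // Return the library for `mangledName`, or an empty string if unknown.
  std::string lookup(const std::string& mangledName) const {
    auto it = fSymbolToLib.find(mangledName);
    return it == fSymbolToLib.end() ? std::string() : fLibraries[it->second];
  }

private:
  // Store each library name once and refer to it by a one-byte index.
  std::uint8_t internLibrary(const std::string& library) {
    for (std::size_t i = 0; i < fLibraries.size(); ++i)
      if (fLibraries[i] == library)
        return static_cast<std::uint8_t>(i);
    assert(fLibraries.size() < 256 && "a uint8 index allows at most 256 libraries");
    fLibraries.push_back(library);
    return static_cast<std::uint8_t>(fLibraries.size() - 1);
  }

  std::unordered_map<std::string, std::uint8_t> fSymbolToLib;
  std::vector<std::string> fLibraries;  // index -> library name
};
```

The one-byte value keeps each map entry small; the cost is the limit of 256 distinct libraries, which the sketch checks with an assertion.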

With this patch:
```
Processing tutorials/hsimple.C...
hsimple   : Real Time =   0.04 seconds Cpu Time =   0.03 seconds
(TFile *) 0x562b37a14fe0
Processing /home/yuka/CERN/ROOT/memory.C...
cpu  time = 0.362307 seconds
sys  time = 0.039741 seconds
res  memory = 278.215 Mbytes
vir  memory = 448.973 Mbytes
```

W/o this patch:
```
Processing tutorials/hsimple.C...
hsimple   : Real Time =   0.08 seconds Cpu Time =   0.07 seconds
(TFile *) 0x5563018a1d30
Processing /home/yuka/CERN/ROOT/memory.C...
cpu  time = 1.524314 seconds
sys  time = 0.157075 seconds
res  memory = 546.867 Mbytes
vir  memory = 895.184 Mbytes
```

So it improves CPU time by about 4x and memory by about 2x.
2018-06-02 10:44:38 +02:00
Axel Naumann
eb9fbe9f2c Unroll the pointer-check cache loop. 2018-05-24 23:14:32 +02:00
Nathan Daly
f2bcf29b1c Allow cpt.py to handle double-digit version numbers
Before this commit, cpt.py attempted `"3.11.1" < "3.4.3"`, but this
incorrectly returns `True`. This commit adds a function that splits the
string into version identifiers and checks them all individually.
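
The comparison can be sketched as follows. cpt.py itself is Python and its actual helper is not shown here; this C++ version, with the hypothetical names `splitVersion` and `versionLess`, only illustrates the technique and assumes purely numeric, dot-separated components.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Split "3.11.1" into {3, 11, 1}.
std::vector<int> splitVersion(const std::string& version) {
  std::vector<int> parts;
  std::stringstream stream(version);
  std::string part;
  while (std::getline(stream, part, '.'))
    parts.push_back(std::stoi(part));
  return parts;
}

// True if version `a` is older than version `b`.
bool versionLess(const std::string& a, const std::string& b) {
  // Vectors compare element by element, so {3, 4, 3} < {3, 11, 1}
  // because 4 < 11, unlike the character-wise string comparison.
  return splitVersion(a) < splitVersion(b);
}
```

Comparing the raw strings goes wrong because '1' sorts before '4', so "3.11.1" looks smaller than "3.4.3".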
2018-05-24 11:29:51 +02:00
Axel Naumann
332a707e55 Allow passing of cling::Interpreter constructor flags. 2018-05-22 23:29:52 +02:00
Axel Naumann
7682a69042 Do not override preproc (un)defines if already set. 2018-05-22 23:29:52 +02:00
Axel Naumann
673a1d63a4 cling PR 233 (#2038)
Add LLVM module pass to generate unique CUDA module ctor/dtor names.

This LLVM module pass addresses the following problem. Every LLVM module has a CUDA ctor and dtor (if a CUDA fatbinary exists), with at least one function call to register the fatbinary. The ctor/dtor can also include function calls to register global functions and variables at runtime, depending on the user's code. The lazy compilation detects functions by name: if the name (symbol) already exists, it uses the existing translation; otherwise it translates the function on first use (but never translates it twice). Without the module pass, cling will always use the translation of the first module.

The test case uses the reflection of the gCling interpreter object. It takes two random modules and compares the symbols of the CUDA module ctor and dtor.

Also add a function that changes the symbols of the CUDA module ctor and dtor to preprocessor-compliant symbols.
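
A toy model of the name collision, under stated assumptions: `LazySymbolTable` and `uniqueName` are hypothetical, the "translation" is just a string, and the suffix scheme is illustrative rather than the one the pass uses.

```cpp
#include <cassert>
#include <map>
#include <string>

// Name-based lazy compilation: each symbol is translated once and cached by name.
class LazySymbolTable {
public:
  // Return the cached translation for `symbol`; translate `body` only if the name is new.
  const std::string& getOrTranslate(const std::string& symbol,
                                    const std::string& body) {
    auto it = fTranslations.find(symbol);
    if (it == fTranslations.end())
      it = fTranslations.emplace(symbol, "translated:" + body).first;
    return it->second;
  }

private:
  std::map<std::string, std::string> fTranslations;
};

// Illustrative renaming: make the symbol unique per module while keeping it
// a valid identifier.
std::string uniqueName(const std::string& symbol, const std::string& moduleName) {
  return symbol + "_" + moduleName;
}
```

With identical ctor names, the second module's ctor body is never translated because the first module's entry is found by name; with per-module names each ctor gets its own translation.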
2018-05-18 11:14:07 +02:00