IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
- To enable the CUDA test, lit detects the `libcudart.so` in
`LD_LIBRARY_PATH`. Now lit also set the CUDA SDK root of
`libcudart.so` as cling parameter (`--cuda-path`) in the tests.
- Pass through the environment variable `CUDA_VISIBLE_DEVICES`.
This makes all functions end up in the same text section, which is
important for TCling on macOS to catch exceptions from constructors:
Stack unwinding requires information about program addresses to find
out which objects to destroy and what code should be called to handle
the exception. These addresses are relocated against a single __text
section when loading the produced MachO binary, which breaks if the
call sites of global constructors end up in a separate init section.
Fixes ROOT-10703 and ROOT-10962
This commit allows the user to enable/disable specific interpreter capabilities
without requiring to `#include` the heavier weight `Interpreter.h` (that also
has dependencies on llvm).
The only feature covered at the moment is definition shadowing.
Closes cling issue #360.
(std::pair<std::__strip_reference_wrapper<const char *>::__type, std::__strip_reference_wrapper<int>::__type>) { "s", 10 }
i.e. the array size is not part of the type.
When using these modifiers, a type can consist of multiple tokens
as for example "unsigned int". However the keyword "int" can also
be omitted, making "unsigned" itself also a valid type.
Note that this only handles the most basic case for the modifiers.
The size modifiers "short", "long", and "long long" would need
similar treatment. Moreover the standard permits any order for the
type specifiers, ie "unsigned long long int" and "long int unsigned
long" are both valid (and actually the same type).
- it is more similar to the interface of cling::Interpreter
- replace function compileDeviceCode() with process()
- add declare() and parse() functions
- the functions have only the argument input, because the rest of the missing arguments (e.g. Transaction) requires modifications at the transaction system
- it also fixes a bug in the I/O system of the xeus-cling kernel
Replaced the old version of the PTX compiler which used external tools and llvm::ExecuteAndWait with an internal implementation. The new incremental PTX compiler uses a modified version of the cling::Interpreter instance. The instance can process the PTX built-ins and generates LLVM IR. The LLVM IR is be compiled to PTX via an additional NVPTX backend implemented in the IncrementalCUDADeviceCompiler.
The new implementation has many advantages:
- it's much faster than the old version
- less error-prone because the ASTPrinter and some unnecessary cling transformations are avoided
- reduction of problems in searching for external tools (can be very complicated on HPC systems)
The IncrementalCUDADeviceCompiler is moved from the cling::IncrementalParser to the cling::Interpreter, because the second interpreter needs the input without wrappers or transformations.
- change CUDA to CUDAHost and add CUDADevice to the InvocationOptions
- in the PTX mode, some ASTTransformer will not be used, which are useful for the x86 mode
This fixes ROOT-10333.
Since the printValue set of functions always takes a pointer and dereferences it without any checks,
the previous implementation was attempting to valuePrint the pointee (rather than the pointer value)
but this lead to a segmentation fault whenever the smart pointer was set to nullptr.
Now, the valuePrinting for the smart pointers behaves the same as for regular pointer.
The user might use utilities which print on cout and expects the output
to be shown immediately.
This patch automatically flushes std::cout after each execution of a wrapper.
This is the final version of "printValue" discussion.
We agreed that printValue interface should be altered to ToString
interface, which can be invoked `gInterpreter->ToString(XYZ)`. (ToString is in TCling and toString is in Interpreter :D)
This patch contains:
- Implementation of toString in Interpreter.cpp
- Re-Implementation of ClingPrintValue to use ToString because I changed to use Evaluate some time ago.
- Removing of RVec version of printValue which wasn't used at all
- Fix test/vecops_rvec.cxx, printValue is never supposed to be called by a normal user.
Clang allows third party shared libraries to provide user-defined
extensions. For example, a custom libTemplateInstantiation.so can
visualize all template instantiation chains in clang. To enable it
one needs to pass a set of options such as -fplugin.
Cling should be able to inherently work with clang plugins. However,
cling still does not make full use of the clang driver where the plugin
setup is handled.
This patch enables plugins in cling and extends them in some aspects.
In particular, cling allows loading of plugins from shared libraries
but also if they are linked to the same library where cling is. This is
very useful in cases where cling runs itself in a shared library (eg
libCling). Users of libCling (such as ROOT) prefer to keep all llvm and
clang related symbols local to avoid symbol clashes if there is another
version of clang and llvm linked against a package. This can be done by
dlopen-ing libCling with RTLD_LOCAL visibility mode. Then the only way
for clang plugins to work in this scenario is to be linked to libCling.
Patch by Aleksandr Efremov and me.
- little changes at comments and code style
- try to use const in IncrementalCUDADeviceCompiler, where is possible
- move CUDA device code compiler instance to IncrementalParser
- change the members of CuArgs to const and adjust the setCuArgs method
- use std::vector<string> instead llvm::Smallvector<const char *> to build argv for executeAndWait
- improve the error messages of generatePCH(), generatePTX() and generateFatbinary()
- replace m_Counter with a copy in IncrementalCUDADeviceCompiler to avoid involuntary changes
- add cudaDeviceSynchronize() at every kernel launch
- remove small address bug at cudaMemcpy, if host array is used
- in parallel test cases, replace fixes thread number with variable
- overworked shared memory kernel
- CUDA __constant__ memory
- CUDA global __device__ memory
- CUDA __host__ prefix
- CUDA kernel launch with arguments
- CUDA templated kernels
- CUDA shared memory with dynamic runtime
- CUDA Streams
- test if CUDA device is available
Add llvm module pass to generate unique cuda module ctor/dtor names.
This llvm module pass address the follow problem. Every llvm module has a cuda ctor and dtor (if a cuda fatbinary exist), with at least a function call to register the fatbinary. The ctor/dtor can also include function calls to register global functions and variables at runtime, depending on user's code. The lazy compilation detects functions by the name. If the name (symbol) already exists it uses the existing translation. Otherwise it translates the function on first use (but it never translates twice). Without the module pass, Cling will always use the translation of the first module.
The testcase use the reflection of the gCling interpreter object. It takes two random modules and compare the symbols of the cuda module ctor and dtor.
Also add function, which change the symbol of the cuda module ctor and dtor to preprocessor compliant symbols.