12 Commits

Author SHA1 Message Date
Simeon Ehrig
be5ea3a651 Fixed CUDA mode for Clang/LLVM 9 upgrade
- fix a bug caused by executing a transaction in the device interpreter
- fix a warning from the device compiler
- update test cases
2021-02-25 20:44:19 +01:00
Simeon Ehrig
e47bb75f9f Allow configuring the CUDA sm level for Cling CUDA tests
- the CUDA sm level can be set via the CLING_TEST_CUDA_SM_LEVEL
environment variable (e.g. "35") before running the tests
2021-02-18 09:29:04 +01:00
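To illustrate the commit above, here is a minimal sketch of how a lit configuration could turn the environment variable into a compiler flag. The helper name `cuda_arch_flag` is hypothetical. The use of Clang's `--cuda-gpu-arch` option is an assumption, since the commit message does not say which flag the tests actually pass.

```python
import os


def cuda_arch_flag(environ=os.environ):
    """Return a GPU architecture flag built from CLING_TEST_CUDA_SM_LEVEL.

    Hypothetical helper: the mapping to --cuda-gpu-arch is an assumption,
    not taken from the Cling test suite.
    """
    level = environ.get("CLING_TEST_CUDA_SM_LEVEL", "").strip()
    if not level:
        # Variable not set: let the compiler pick its own default.
        return None
    if not level.isdigit():
        raise ValueError("CLING_TEST_CUDA_SM_LEVEL must be numeric, e.g. '35'")
    return "--cuda-gpu-arch=sm_" + level
```

With `CLING_TEST_CUDA_SM_LEVEL=35` set before the test run, this helper would yield `--cuda-gpu-arch=sm_35`.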
Simeon Ehrig
b2bfd6e19f Extend the cling-test to deal with CUDA SDKs that are not in the default location
- To enable the CUDA test, lit detects the `libcudart.so` in
`LD_LIBRARY_PATH`. Now lit also sets the CUDA SDK root of
`libcudart.so` as a cling parameter (`--cuda-path`) in the tests.
- Pass through the environment variable `CUDA_VISIBLE_DEVICES`.
2021-02-10 15:18:14 +01:00
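The detection described in the commit above can be sketched roughly as follows. The function names are hypothetical. Taking the SDK root to be the parent of the directory that holds `libcudart.so` (e.g. `<root>/lib64`) is an assumption about the usual SDK layout, not something the commit message confirms.

```python
import os


def find_cuda_root(ld_library_path):
    """Return the assumed CUDA SDK root for the first libcudart.so found.

    Hypothetical helper: assumes the library lives one level below the
    SDK root, as in <root>/lib64/libcudart.so.
    """
    for directory in ld_library_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, "libcudart.so")):
            return os.path.dirname(os.path.abspath(directory))
    return None


def cuda_path_flag(environ=os.environ):
    """Build the --cuda-path argument, or None if no CUDA runtime is found."""
    root = find_cuda_root(environ.get("LD_LIBRARY_PATH", ""))
    return "--cuda-path=" + root if root else None
```

If no directory in `LD_LIBRARY_PATH` contains `libcudart.so`, both helpers return `None`, which a test configuration could treat as "CUDA tests disabled".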
Simeon Ehrig
ad8d5e1137 Changes for Pull Request #284
- add Author to CUDA test cases
- optimize DeviceKernelInliner
- improve some comments
- remove deprecated opt level variables
- change interface of IncrementalCUDADeviceCompiler::process() and IncrementalCUDADeviceCompiler::declare()
2019-11-07 19:29:15 +01:00
Simeon Ehrig
f63b935c68 Refactor public interface of cling::IncrementalCUDADeviceCompiler
- it is more similar to the interface of cling::Interpreter
- replace function compileDeviceCode() with process()
- add declare() and parse() functions
- the functions take only the argument input, because the remaining arguments (e.g. Transaction) require modifications to the transaction system
- it also fixes a bug in the I/O system of the xeus-cling kernel
2019-11-07 19:29:15 +01:00
Simeon Ehrig
64fe3f7d6d Setting a new include path at runtime in the PTX compiler now works 2019-11-07 19:29:15 +01:00
Simeon Ehrig
0d9d8be5b9 Support for the define argument (-D) in the CUDA mode 2019-11-07 19:29:15 +01:00
Simeon Ehrig
9a4418b3c0 Improvements for Pull Request #240
- minor changes to comments and code style
- use const in IncrementalCUDADeviceCompiler where possible
- move CUDA device code compiler instance to IncrementalParser
- change the members of CuArgs to const and adjust the setCuArgs method
- use std::vector<string> instead of llvm::SmallVector<const char *> to build argv for executeAndWait
- improve the error messages of generatePCH(), generatePTX() and generateFatbinary()
- replace m_Counter with a copy in IncrementalCUDADeviceCompiler to avoid unintended changes
2018-06-25 08:29:07 +02:00
Simeon Ehrig
68cffbb853 Improve CUDA device code testcases. 2018-06-25 08:29:07 +02:00
Simeon Ehrig
50bc19f6b5 Add some complex cuda template kernel test cases. 2018-06-25 08:29:07 +02:00
Simeon Ehrig
2011246c17 Rework CUDA device tests.
- add cudaDeviceSynchronize() after every kernel launch
- fix a small address bug in cudaMemcpy when a host array is used
- in parallel test cases, replace the fixed thread number with a variable
- rework the shared memory kernel
2018-06-25 08:29:07 +02:00
Simeon Ehrig
e7b0e22ae8 Add test cases for CUDA features.
- CUDA __constant__ memory
- CUDA global __device__ memory
- CUDA __host__ prefix
- CUDA kernel launch with arguments
- CUDA templated kernels
- CUDA shared memory with dynamic runtime
- CUDA Streams
- test if CUDA device is available
2018-06-25 08:29:07 +02:00