How are warps scheduled on CUDA SMs?
How to properly apply thread synchronization in a CUDA application?
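Within a block, the usual tool is the __syncthreads() barrier, which makes shared-memory writes from one thread visible to the others before execution continues; across blocks, synchronization normally means ending the kernel or calling cudaDeviceSynchronize() on the host. A minimal sketch (kernel name and layout are illustrative, and the element count is assumed to be a multiple of the block size):

    // Each block reverses its tile of `in` via shared memory.
    // __syncthreads() guarantees every store to tile[] is visible
    // before any thread reads an element written by a neighbouring thread.
    __global__ void reverseTile(const float *in, float *out) {
        extern __shared__ float tile[];                  // dynamic shared memory
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        tile[threadIdx.x] = in[i];                       // stage one element per thread
        __syncthreads();                                 // block-wide barrier
        out[i] = tile[blockDim.x - 1 - threadIdx.x];     // read another thread's element
    }
    // Host launch, e.g.: reverseTile<<<n / 256, 256, 256 * sizeof(float)>>>(d_in, d_out);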
Is this a CUDA thread synchronization issue or something else?
Appendix B lists the mathematical functions supported in CUDA.
Are general reads and writes to global memory atomic in CUDA if...
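For what it's worth, plain read-modify-write sequences on global memory (such as `counter[0] += 1` issued from many threads) are not atomic; atomic intrinsics like atomicAdd are needed when threads update the same location. A small sketch with illustrative names:

    // atomicAdd performs the read-modify-write on global memory as one
    // indivisible operation, so concurrent updates from many threads are safe.
    __global__ void countEvens(const int *values, int n, int *counter) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n && values[i] % 2 == 0)
            atomicAdd(counter, 1);
    }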
How to query the current performance state of your GPU with CUDA?
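One caveat: the CUDA runtime API does not expose the performance state (P-state) directly; NVML, which ships with the NVIDIA driver, does. A hedged host-side sketch (include nvml.h and link with -lnvidia-ml):

    #include <nvml.h>
    #include <stdio.h>

    int main(void) {
        nvmlDevice_t dev;
        nvmlPstates_t pstate;
        if (nvmlInit() != NVML_SUCCESS) return 1;        // initialise NVML
        nvmlDeviceGetHandleByIndex(0, &dev);             // first GPU in the system
        nvmlDeviceGetPerformanceState(dev, &pstate);     // P0 (max performance) .. P15 (min)
        printf("Current performance state: P%d\n", (int)pstate);
        nvmlShutdown();
        return 0;
    }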
Is there a performance penalty for a CUDA method not running in sync?
Cumulative sum in two dimensions over an array in a nested loop: CUDA implementation?
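As a hedged starting point (simple but not work-efficient), one thread can scan each row of a row-major array sequentially; names and layout are illustrative:

    // Naive 2-D cumulative sum along rows: one thread per row.
    __global__ void rowCumSum(float *data, int rows, int cols) {
        int row = blockIdx.x * blockDim.x + threadIdx.x;
        if (row >= rows) return;
        float running = 0.0f;
        for (int col = 0; col < cols; ++col) {           // sequential scan along the row
            running += data[row * cols + col];
            data[row * cols + col] = running;
        }
    }
    // Launch, e.g.: rowCumSum<<<(rows + 255) / 256, 256>>>(d_data, rows, cols);
    // A second pass along the columns would complete the full 2-D cumulative sum.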
How to measure the time spent on data transfer from host to device in CUDA?
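A common approach is to bracket the cudaMemcpy with CUDA events; a minimal sketch with an illustrative transfer size:

    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        const size_t bytes = 64 * 1024 * 1024;           // illustrative 64 MB transfer
        float *h = (float *)malloc(bytes), *d;
        cudaMalloc(&d, bytes);

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice); // host-to-device copy being timed
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);                      // wait until the copy has finished

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);          // elapsed time in milliseconds
        printf("H2D copy: %.3f ms (%.2f GB/s)\n", ms, bytes / ms / 1.0e6);

        cudaEventDestroy(start); cudaEventDestroy(stop);
        cudaFree(d); free(h);
        return 0;
    }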
You likely created a new .cpp file using the "CUDA C Bitreverse Application" template.
In this paper, we implement efficient matrix multiplication on the GPU using NVIDIA's CUDA.
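The paper's optimized kernel is not reproduced here; as a hedged baseline, a naive CUDA kernel computes one element of C = A * B per thread (shared-memory tiling is the usual next step):

    // Naive matrix multiply: C = A * B for square N x N row-major matrices.
    __global__ void matMul(const float *A, const float *B, float *C, int N) {
        int row = blockIdx.y * blockDim.y + threadIdx.y;
        int col = blockIdx.x * blockDim.x + threadIdx.x;
        if (row < N && col < N) {
            float sum = 0.0f;
            for (int k = 0; k < N; ++k)
                sum += A[row * N + k] * B[k * N + col];  // dot product of row and column
            C[row * N + col] = sum;
        }
    }
    // Launch, e.g.: dim3 block(16, 16); dim3 grid((N + 15) / 16, (N + 15) / 16);
    //               matMul<<<grid, block>>>(dA, dB, dC, N);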
Using the CUDA C language and CUDA texture memory, a parallel implementation of image stretching is realized.
We do take every opportunity to discuss the ability to run CUDA with anyone who's interested.
This document is divided into the following chapters: Chapter 1 is an introduction to CUDA and the GPU.
That being said, as of CUDA 4.0, by default there is one context created per process, not per thread.
A highly parallel algorithm for computing the simplification error of triangular meshes is implemented using CUDA.
Can I somehow run X11 on the Intel integrated graphics in my Optimus laptop and debug CUDA code on the NVIDIA GPU?
The results showed that CUDA greatly speeds up the computation, works well for real-time target tracking on the host computer, and can be applied to many other fields.
Multiple NPN240s can be linked to a single host or multiple hosts to create multi-node CUDA GPU clusters capable of thousands of GFLOPS.
Experiments comparing against the CPU's computing power show that CUDA's ability to process data in parallel is very strong.
What counts more when CUDA kernel execution speed is of vital importance: the frequency of the cores, or the number of SMs?
Abstract: CUDA is a parallel computing architecture introduced by NVIDIA, mainly used for large-scale, data-intensive computing.
The CUDA driver and Toolkit must be installed before running the precompiled examples or compiling the example source code.
Each CUDA context has its own virtual memory space, so you cannot use a pointer from one context inside another context.
CUDA is a scalable parallel computing model introduced by NVIDIA precisely to take full advantage of the GPU's parallel capability.
The CUDA application runs entirely on the target machine, so the application's console or UI will be seen only on the target machine.
Please note that the CUDA Debugger for Linux has been tested only on 32-bit Red Hat Enterprise Linux (RHEL) 5.x, but it may work on other distros as well.
CUDA gives full play to the advantages of the GPU's streaming multiprocessor array and greatly improves the efficiency of parallel computing programs.
The core part of the ray tracing computation is adapted to the advantages and limitations of CUDA so as to maximize the degree of parallelization.
Any GPU device has a device driver, so targeting it makes more sense than generating CUDA or OpenCL code, which would require users to install other SDKs.