If you vectorize this code, each instruction operates on 16 values at once, because you are treating the values as bytes.
The problem is that, to be vectorized, the code must execute the same sequence of instructions for every element of the vector.
Remember that the point of removing the branch is to make the code vectorizable.
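As a minimal sketch of what removing a branch can look like, consider a byte-wise threshold written two ways. The function names, the threshold operation, and the parameter `t` are hypothetical and chosen only for illustration; they do not come from the code these sentences refer to. Whether a given compiler actually vectorizes either loop depends on the compiler, target, and flags.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy version: each element may take a different control-flow path. */
void threshold_branchy(uint8_t *dst, const uint8_t *src, size_t n, uint8_t t)
{
    for (size_t i = 0; i < n; i++) {
        if (src[i] > t)
            dst[i] = 255;
        else
            dst[i] = 0;
    }
}

/* Branch-free version: the comparison yields 0 or 1, and negating it
 * produces an all-zeros or all-ones byte (0x00 or 0xFF). Every element
 * goes through the same sequence of operations. */
void threshold_branchless(uint8_t *dst, const uint8_t *src, size_t n, uint8_t t)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = (uint8_t)-(src[i] > t);
    }
}
```

Both functions produce the same output; the second expresses the choice as arithmetic on a mask, which is the form that maps naturally onto a vector compare instruction.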
There are also VMX vector extensions for the IBM PowerPC 970 processor, which can improve the performance of vectorized code.
Based on data dependence analysis, we present a program transformation algorithm that generates optimized vector code exploiting partial reuse of vector registers when an application is vectorized.