
Does allocating a large block of memory on the stack hurt performance?

塑料袋

#11 | Posted 2012-06-20 17:11

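wangzhen11aaa wrote on 2012-06-20 17:03:
Reply to #7 塑料袋
The conclusion contradicts the opening claim that a 1-D array is not cache friendly, which proves that a 1-D array is cache friendly.

Our position is that nothing at all is cache friendly, so of course the conclusion is also that a 1-D array is not cache friendly.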


wangzhen11aaa

#12 | Posted 2012-06-20 17:36

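Reply to #11 塑料袋
Induction hypothesis:
A 1-D array is cache friendly.
Assume an N-D array is cache friendly.
An (N+1)-D array is a 1-D array of N-D arrays, so it is cache friendly.
Conclusion: an N-D array is cache friendly.

Induction hypothesis: an N-D array is not cache friendly.
A 1-D array is not cache friendly.
An N-D array is not cache friendly.
An (N+1)-D array is a 1-D array of N-D arrays. not && not = ?

No conclusion follows.

So the answer depends directly on whether a 1-D array is cache friendly.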


塑料袋

#13 | Posted 2012-06-20 17:47

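wangzhen11aaa wrote on 2012-06-20 17:36:
Reply to #11 塑料袋
Induction hypothesis:
A 1-D array is cache friendly.
Assume an N-D array is cache friendly.
An (N+1)-D array is a 1-D array of N-D arrays, so it is cache friendly.
Conclusion: an N-D array is cache friendly.

Sorry, I can't follow this!

That doesn't seem to be how induction works.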



wwwsq

#15 | Posted 2012-06-20 18:21

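塑料袋 wrote on 2012-06-20 16:53:
Shouldn't this question be looked at this way: our conclusion is that the claim "a 1-D array is CPU cache friendly" does not hold

A 1-D array that is read sequentially is cache friendly, because the CPU will prefetch.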

For example:
char ay[10000];
for (i = 0; i < 10000; i++)
{
    sum += ay[i];
}
The CPU will prefetch ay, which raises the CPU cache hit rate.

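A 2-D array is usually read and written with jumps between accesses, so prefetching cannot help. Does that mean the CPU cache hit rate for 2-D array accesses is very low?
char aa[100][1000];
for (i = 0; i < 100; i++)
{
    sum += aa[i][0];  /* the forum software swallowed the "[i]" subscript; column 0 is assumed here */
}
Because the elements read from aa[0] and aa[1] are not adjacent (they sit 1000 bytes apart), the CPU's prefetch mechanism cannot kick in.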
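
To make the access patterns in this post concrete, here is a minimal C sketch. It rests on one thing C guarantees: a 2-D array is a single contiguous block laid out row by row, so a full row-major walk touches memory in the same order as a 1-D walk, while reading one element per row jumps by a whole row each time. The helper names (fill_array, row_stride_bytes, sum_row_major, sum_col_major, sum_first_column) are made up for illustration and the fill pattern is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 100, COLS = 1000 };

/* Fill the array with a small, arbitrary repeating pattern. */
static void fill_array(char a[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            a[i][j] = (char)((i + j) % 7);
}

/* Distance in bytes from the start of one row to the start of the next. */
static ptrdiff_t row_stride_bytes(char (*a)[COLS])
{
    return (char *)(a + 1) - (char *)a;
}

/* Row-major walk: consecutive accesses are adjacent in memory. */
static long sum_row_major(char a[ROWS][COLS])
{
    long sum = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            sum += a[i][j];
    return sum;
}

/* Column-major walk: consecutive accesses are COLS bytes apart. */
static long sum_col_major(char a[ROWS][COLS])
{
    long sum = 0;
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            sum += a[i][j];
    return sum;
}

/* The loop from the post: one element per row, a COLS-byte stride. */
static long sum_first_column(char a[ROWS][COLS])
{
    long sum = 0;
    assert(a != NULL);
    for (int i = 0; i < ROWS; i++)
        sum += a[i][0];
    return sum;
}
```

Both full walks return the same sum; only the order of the memory accesses differs. Whether the strided versions are measurably slower depends on the CPU and compiler, so it is worth timing them on the target machine rather than assuming.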


folklore

#16 | Posted 2012-06-20 18:35

Reply to #6 wwwsq

To the CPU, there is no difference between a 2-D and a 1-D array.

In many cases, the cache replacement policy is based on the actual memory address.
My English is poor, so I will use the following figure to describe what I mean.

Of course this is just an "as is" model (I am using the model that is easiest to understand); not all CPUs behave like this.

The CPU cache is split into 128-byte blocks. The cache capacity is 128K, so there are 1K cache lines available.

+------+ cache line 0: 0~127
+------+ cache line 1: 128~255
+------+ cache line 2: 256~...

The cache replacement policy simply uses the following mapping function:
aa = actual address
ca = cache line address

ca = aa & ((1 << (10 + 7)) - 1) ; reduce modulo 128K, the cache size
If an access misses the cache, the CPU reads one cache line around the accessed memory.

Now suppose memory address 0 is accessed, via code such as int a=*(int*)0;
If the CPU finds that this memory is not in the cache, it loads ((char*)0)[0...127] into cache line 0.
The program may then access addresses near 0, such as 3, 4, 10, ...; all of these will be found already in
the cache, so the program is sped up by caching.

So if no memory access happens, nothing happens to the cache.
But any memory access may change a cache line (if the access misses the cache).

Don't worry about whether the object you access is a 2-D or a 1-D array; the caching policy cannot tell what they are.
It is based only on the memory address (as long as no more complex caching policy is applied).
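
The simplified model in this post can be sketched in C as a direct-mapped cache with 128-byte lines and 1K lines. This is a toy under the post's own assumptions, not how any particular CPU works; the names toy_cache, toy_cache_access and count_misses are hypothetical, and the arrays are assumed to start at address 0.

```c
#include <assert.h>
#include <string.h>

enum {
    LINE_BITS  = 7,                 /* 128-byte cache lines */
    INDEX_BITS = 10,                /* 1K lines, 128K in total */
    LINE_COUNT = 1 << INDEX_BITS
};

struct toy_cache {
    unsigned long tag[LINE_COUNT];   /* which memory block each line holds */
    unsigned char valid[LINE_COUNT]; /* 0 until the line is first filled */
    unsigned long hits;
    unsigned long misses;
};

/* Record one byte access; on a miss, the whole 128-byte line is loaded. */
static void toy_cache_access(struct toy_cache *c, unsigned long addr)
{
    unsigned long block = addr >> LINE_BITS;
    unsigned long index = block & (LINE_COUNT - 1);
    unsigned long tag   = block >> INDEX_BITS;

    if (c->valid[index] && c->tag[index] == tag) {
        c->hits++;
    } else {
        c->valid[index] = 1;
        c->tag[index] = tag;
        c->misses++;
    }
}

/* Start from an empty cache, touch n addresses spaced stride bytes apart,
 * and return the number of misses. Hit and miss counts remain in *c. */
static unsigned long count_misses(struct toy_cache *c, unsigned long base,
                                  unsigned long stride, unsigned long n)
{
    assert(c != NULL);
    memset(c, 0, sizeof *c);
    for (unsigned long i = 0; i < n; i++)
        toy_cache_access(c, base + i * stride);
    return c->misses;
}
```

In this model, reading 10000 consecutive bytes misses 79 times (once per 128-byte line), while 100 reads spaced 1000 bytes apart miss every time. The cache never looks at array dimensions; only the addresses matter.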


folklore

#17 | Posted 2012-06-20 18:37

Reply to #15 wwwsq

It depends on the cache line size.

In many cases what you think is right, if one has to answer "right" or "wrong".


wwwsq

#18 | Posted 2012-06-20 20:20

Last edited by wwwsq on 2012-06-20 20:20

folklore wrote on 2012-06-20 18:35:
Reply to #6 wwwsq

Got it. Thanks~~


塑料袋

#19 | Posted 2012-06-20 21:10

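wwwsq wrote on 2012-06-20 18:21:
A 1-D array that is read sequentially is cache friendly, because the CPU will prefetch.

For example:

Prefetching is far more complicated than you imagine; there are too many issues to consider.

Put simply, there are two broad categories:

1) The CPU has no sophisticated read-prediction logic: because of the pipeline, prefetching happens in the interval between the moment a load instruction has been decoded and the moment the load should actually execute. The decoded load instruction is itself the hint for the prefetch, since it says which address to fetch. In this case even an 800-dimensional array gets prefetched, and this is how the vast majority of RISC CPUs behave.

2) The CPU has very sophisticated read-prediction logic: I don't know much about this case. I have heard that some CISC CPUs even have an outrageous "pre-write", but I don't know whether that is true. Still, if the prefetch logic is that complex, it cannot possibly support only simple sequential reads; at the very least it should be like the kernel's disk-to-memory readahead: sequential reads, interleaved reads, ...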
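
As a rough illustration of why prediction logic need not be limited to sequential reads, here is a toy stride predictor in C. It is a hypothetical sketch, not a description of any real CPU's prefetcher: it guesses the next address only after seeing the same stride twice in a row. The names stride_predictor, predictor_observe and count_predicted are invented for this example.

```c
#include <assert.h>
#include <string.h>

struct stride_predictor {
    long last_addr;        /* most recent address seen */
    long last_stride;      /* difference between the last two addresses */
    long predicted;        /* guess for the next address */
    int have_last;
    int have_stride;
    int have_prediction;
    unsigned long correct; /* accesses that matched the previous guess */
};

/* Feed one address to the predictor and update its guess for the next. */
static void predictor_observe(struct stride_predictor *p, long addr)
{
    if (p->have_prediction && p->predicted == addr)
        p->correct++;

    p->have_prediction = 0;
    if (p->have_last) {
        long stride = addr - p->last_addr;
        /* Only predict once the same stride has been seen twice in a row. */
        if (p->have_stride && stride == p->last_stride) {
            p->predicted = addr + stride;
            p->have_prediction = 1;
        }
        p->last_stride = stride;
        p->have_stride = 1;
    }
    p->last_addr = addr;
    p->have_last = 1;
}

/* Count how many of n accesses, spaced stride bytes apart, were predicted. */
static unsigned long count_predicted(long base, long stride, unsigned long n)
{
    struct stride_predictor p;

    assert(stride != 0);
    memset(&p, 0, sizeof p);
    for (unsigned long i = 0; i < n; i++)
        predictor_observe(&p, base + (long)i * stride);
    return p.correct;
}
```

Under this toy rule, a constant 1000-byte stride is just as predictable as a 1-byte stride: after the first three accesses, every later address is guessed correctly. Irregular access patterns produce no correct guesses.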


wwwsq

#20 | Posted 2012-06-20 22:31

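Last edited by wwwsq on 2012-06-20 22:39

塑料袋 wrote on 2012-06-20 21:10:
Prefetching is far more complicated than you imagine; there are too many issues to consider.

Put simply, there are two broad categories:

"Sequential reads, interleaved reads": is that reliable? Don't pull my leg...

Also, does the Intel Sandy Bridge Xeon CPU have sophisticated read-prediction logic?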

