Post #1

Posted 2006-03-09 18:14

Originally posted by albcamus on 2006-3-9 15:37

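>My understanding of barrier() is that it is only a compiler barrier; adding this barrier to the code causes cache invalidation,
>whereas mb is a hardware barrier: while the code runs, the CPU will refrain from reordering cache accesses.
Thank you very ...

I happen to have been reading the Solaris locking code these past few days, and it touches on a similar question. Compare the x86 and SPARC locks and you will see that mutex_enter on x86 uses only lock, with no explicit barriers, while SPARC is different. A look at the manual shows that lock plays a double role here: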

AMD64 Architecture Programmer's Manual, Volume 2, System Programming.

"Read/write barrier instructions
force all prior reads or writes to complete before
subsequent reads or writes are executed....
...
Serializing instructions, I/O instructions, and locked
instructions can also be used as read/write barriers." - Page 198, 199

"Locked Instructions - Before completing a locked instruction
(an instruction executed using the LOCK prefix), all
previous reads and writes must be written to memory, and
the locked instruction must complete before completing
subsequent writes." - Page 206

http://cvs.opensolaris.org/sourc ... ia32/ml/lock_prim.s

ENTRY_NP(mutex_enter)
        movq    %gs:CPU_THREAD, %rdx    /* rdx = thread ptr */
        xorl    %eax, %eax              /* rax = 0 (unheld adaptive) */
        lock                            /* the lock prefix also acts as a barrier here */
        cmpxchgq %rdx, (%rdi)           /* acquire the lock */
        jnz     mutex_vector_enter
.mutex_enter_lockstat_patch_point:
        ret

http://cvs.opensolaris.org/sourc ... c/v9/ml/lock_prim.s

ENTRY(mutex_enter)
        mov     THREAD_REG, %o1
        casx    [%o0], %g0, %o1         ! try to acquire as adaptive (acquire the lock)
        brnz,pn %o1, 1f                 ! locked or wrong type
        membar  #LoadLoad               ! explicit memory barrier instruction
.mutex_enter_lockstat_patch_point:
        retl
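
For comparison, the same idea can be sketched in portable C11 atomics. This is only an illustration, not the Solaris implementation: the names `sketch_mutex_t`, `sketch_mutex_tryenter`, `sketch_mutex_exit` and `sketch_mutex_demo` are made up here, and the lock word is assumed to hold 0 when free and a non-zero owner id when held. The compare-and-swap asks for acquire ordering; on x86 a compiler would normally emit a single `lock cmpxchg` for it, while a more weakly ordered CPU may need a separate barrier instruction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 means unheld, otherwise it holds the owner id. */
typedef struct {
    _Atomic uintptr_t owner;
} sketch_mutex_t;

/* Try once to take the lock for `self` (must be non-zero).
 * A successful compare-and-swap has acquire ordering, so accesses in the
 * critical section cannot be moved ahead of it. */
static bool sketch_mutex_tryenter(sketch_mutex_t *m, uintptr_t self)
{
    uintptr_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->owner, &expected, self,
        memory_order_acquire, memory_order_relaxed);
}

/* Release the lock. Release ordering keeps critical-section accesses
 * from moving past the store that frees the lock. */
static void sketch_mutex_exit(sketch_mutex_t *m)
{
    atomic_store_explicit(&m->owner, 0, memory_order_release);
}

/* Usage: a second owner cannot get in until the first one leaves. */
static void sketch_mutex_demo(void)
{
    sketch_mutex_t m = { 0 };
    bool first = sketch_mutex_tryenter(&m, 1);
    bool second = sketch_mutex_tryenter(&m, 2);
    sketch_mutex_exit(&m);
    bool third = sketch_mutex_tryenter(&m, 2);
    assert(first && !second && third);
}
```

A real adaptive mutex would spin or block when the first attempt fails (the `mutex_vector_enter` path above); the sketch leaves that part out.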

[ Last edited by Solaris12 on 2006-3-10 10:35 ]

Solaris12

Post #2

Posted 2006-03-10 11:06

Originally posted by albcamus on 2006-3-9 15:37
But the article you pointed me to says:
A given CPU always perceives its own memory operations as occurring in program order. That is, memory-reordering issues arise only when a CPU is observing other CPUs' memory operations.
It seems that when only one agent accesses memory, a barrier is never needed. A barrier is needed only when two or more agents (CPUs, DMA controllers) access memory and one of them observes the other.

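That inference is correct. To keep its pipeline busy, a CPU does reorder memory accesses, but it does so only after analysing the dependencies between instructions, so even with reordering, the accesses that touch the same location are still performed in order.

So on a single-CPU system the programmer can still assume that the CPU executes instructions in the order the program specifies. Correctness is guaranteed entirely by the CPU itself.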
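
The other side of this rule is that ordering has to be requested explicitly as soon as a second CPU is watching. Here is a minimal sketch of that case, assuming C11 atomics and POSIX threads; the names `payload`, `ready`, `producer`, `consumer` and `run_message_passing` are invented for the example. One thread writes the data and then sets a flag, the other waits for the flag and then reads the data. The release store and the acquire load are what stop the two steps from being observed in the wrong order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int payload;        /* plain data, published through `ready` */
static atomic_int ready;   /* 0 = not published yet, 1 = published */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;          /* step 1: write the data */
    /* step 2: publish; release keeps step 1 from moving below this store */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    /* Spin until published; acquire keeps the read below from moving above. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;
    *(int *)arg = payload;
    return NULL;
}

/* Runs one producer and one consumer; returns the value the consumer saw. */
static int run_message_passing(void)
{
    pthread_t p, c;
    int seen = 0;
    int rc;

    payload = 0;
    atomic_store(&ready, 0);
    rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With the release/acquire pair in place, `run_message_passing()` returns 42. If both accesses to `ready` were relaxed, the language would no longer promise that the consumer sees the new `payload`.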

[ Last edited by Solaris12 on 2006-3-10 11:09 ]

Solaris12

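Post #3

Posted 2006-03-10 13:44

Originally posted by albcamus on 2006-3-10 11:27
One more point, about snooping and cache coherence on SMP:

Every IA32 CPU has to implement the MESI protocol (M: Modified; E: Exclusive; S: Shared; I: Invalid)

I just discussed this with someone. On an SMP system, the root causes of memory reordering are probably the following:

1. Modern CPUs execute instructions in parallel, so the order in which memory writes and reads actually take place is not deterministic.

2. The data and instruction buffers inside each CPU, and coherence between the caches of the different CPUs.

So the principles mentioned earlier can be understood as follows: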

  1. A given CPU always perceives its own memory operations as occurring in program order. That is, memory-reordering issues arise only when a CPU is observing other CPUs' memory operations.

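  The CPU takes care of any reordering on a uniprocessor system by itself; only on SMP systems does the kernel programmer have to deal with memory reordering. The reasons are the two points above.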

  2. An operation is reordered with a store only if the operation accesses a different location than does the store.

  If the reordered instructions include a store, then the other operation must be accessing a memory location unrelated to the one the store accesses.

  3. Aligned simple loads and stores are atomic.

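  Simple load and store operations on aligned data are atomic. This implies that a load or store of unaligned data may not look atomic to other CPUs: it can be split into several accesses, and another CPU may observe them partially completed.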

  4. Linux-kernel synchronization primitives contain any needed memory barriers, which is a good reason to use these primitives.

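  The synchronization primitives of any operating system include the memory barrier instructions they need; the Solaris code I showed earlier is no exception.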
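
Principle 2 can be illustrated with the classic "store buffering" test. This is only a sketch, assuming C11 atomics and POSIX threads; the names `x`, `y`, `r0`, `r1`, `cpu0`, `cpu1` and `count_both_zero` are invented for the example. Each thread stores to one location and then loads the other one. Without a barrier the load may be performed before the store becomes visible, so both threads can read 0. A full fence between the store and the load, roughly the role mb() plays in the kernel, rules that outcome out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;   /* the two shared locations */
static int r0, r1;        /* what each thread read; checked after join */

static void *cpu0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier */
    r0 = atomic_load_explicit(&y, memory_order_relaxed);
    return NULL;
}

static void *cpu1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* full barrier */
    r1 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Runs the test `rounds` times; returns how often both threads read 0. */
static int count_both_zero(int rounds)
{
    int both_zero = 0;

    for (int i = 0; i < rounds; i++) {
        pthread_t a, b;
        int rc;

        atomic_store(&x, 0);
        atomic_store(&y, 0);
        rc = pthread_create(&a, NULL, cpu0, NULL);
        assert(rc == 0);
        rc = pthread_create(&b, NULL, cpu1, NULL);
        assert(rc == 0);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        if (r0 == 0 && r1 == 0)
            both_zero++;
    }
    return both_zero;
}
```

With both fences present, `count_both_zero(1000)` returns 0. Removing the fences makes the "both read 0" result legal, although how often it actually shows up depends on the hardware and on timing.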

[ Last edited by Solaris12 on 2006-3-10 14:30 ]

[Recommended] A discussion on LKML of a draft barrier document