Chinaunix

Subject: mmap shared memory not updating promptly

Author: tqyou85    Time: 2013-08-05 10:51
Subject: mmap shared memory not updating promptly
I use mmap to map kernel memory into user space. The kernel writes to this memory, updating it once every 1 ms, and user space reads it. In testing, the data read in user space sometimes lags the kernel's updates by several milliseconds.
I am not sure whether some mmap parameter is set incorrectly; any help would be appreciated.
Kernel code:
p = get_zeroed_page(GFP_KERNEL);    /* one shared page, allocated at init time */

static int int_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
    unsigned long physics = p - PAGE_OFFSET;         /* lowmem virtual -> physical, i.e. __pa(p) */
    unsigned long mypfn = physics >> PAGE_SHIFT;
    unsigned long vmsize = vma->vm_end - vma->vm_start;
    unsigned long psize = PAGE_SIZE - offset;

    /* Refuse mappings larger than the single shared page. */
    if (vmsize > psize)
        return -ENXIO;

    vma->vm_flags |= VM_IO | VM_SHARED;
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);

    if (remap_pfn_range(vma, vma->vm_start, mypfn, vmsize, vma->vm_page_prot) != 0)
        return -EAGAIN;

    return 0;
}
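For context, a minimal sketch (not part of the original post) of the writer side described above: an hrtimer callback that bumps a counter in the shared page once per millisecond. Treating the first word of the page as a sequence counter, and the helper names, are assumptions for illustration only.

/* Hypothetical writer side: publish a new value in the shared page every 1 ms.
 * 'p' is the page obtained from get_zeroed_page() above. */
#include <linux/hrtimer.h>
#include <linux/ktime.h>

static struct hrtimer update_timer;

static enum hrtimer_restart update_fn(struct hrtimer *t)
{
    u32 *seq = (u32 *)p;                  /* kernel virtual address of the page */

    (*seq)++;                             /* the value the user-space reader watches */

    hrtimer_forward_now(t, ms_to_ktime(1));
    return HRTIMER_RESTART;
}

static void start_updates(void)           /* call once after the page is allocated */
{
    hrtimer_init(&update_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
    update_timer.function = update_fn;
    hrtimer_start(&update_timer, ms_to_ktime(1), HRTIMER_MODE_REL);
}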
User-space code:
s32 test_init(void)
{
    fd = open(FILE_DEVICE, O_RDWR);
    if (fd < 0)
    {
        return OSP_ERROR;
    }

    p = mmap(NULL, getpagesize(),
             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)    /* original tested 'ospmmap' here; 'p' is what mmap() returned */
    {
        close(fd);
        return OSP_ERROR;
    }

    return 0;
}
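And a minimal sketch of the reading side of the test, assuming the kernel publishes a monotonically increasing counter at offset 0 of the page (as in the writer sketch above); the function name and output format are illustrative only:

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Busy-poll the first word of the mapping and timestamp every change;
 * a lag of several milliseconds behind the kernel's 1 ms updates shows
 * up as gaps between consecutive timestamps. */
static void poll_counter(volatile uint32_t *seq)
{
    uint32_t last = *seq;

    for (;;) {
        uint32_t now = *seq;

        if (now != last) {
            struct timespec ts;
            clock_gettime(CLOCK_MONOTONIC, &ts);
            printf("seq=%u at %ld.%09ld\n", now, (long)ts.tv_sec, ts.tv_nsec);
            last = now;
        }
    }
}

/* usage after test_init() succeeds:  poll_counter((volatile uint32_t *)p); */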

Author: 瀚海书香    Time: 2013-08-05 13:12
Reply to #1 tqyou85
I use mmap to map kernel memory into user space. The kernel writes to this memory, updating it once every 1 ms, and user space reads it. In testing, the data read in user space sometimes lags the kernel's updates by several milliseconds.
I am not sure whether some mmap parameter is set incorrectly; any help would be appreciated.


Kernel space and user space are accessing the same physical memory, so there should be no inconsistency. I have used this mechanism in a previous project to share data between user space and the kernel, and never ran into this problem.

I can't spot a problem in the code you posted.
Author: tqyou85    Time: 2013-08-05 14:43
Last edited by tqyou85 at 2013-08-05 14:43

Reply to #2 瀚海书香

I modified the kernel code and commented out the following two lines:
vma->vm_flags |= VM_IO | VM_SHARED;
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
That solved the problem; the root cause still needs careful analysis.


   
Author: embeddedlwp    Time: 2013-08-05 15:18
Reply to #3 tqyou85


Does commenting out only
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
solve the problem?

   
Author: 瀚海书香    Time: 2013-08-05 16:05
Reply to #4 embeddedlwp

In principle it ought to be set non-cached, shouldn't it? That's what the code in drivers/char/mem.c does.
   
Author: tqyou85    Time: 2013-08-05 16:10
Last edited by tqyou85 at 2013-08-05 16:12
embeddedlwp posted at 2013-08-05 15:18
Reply to #3 tqyou85


I tested it: commenting out only vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); is enough to solve the problem.
Author: embeddedlwp    Time: 2013-08-05 16:27
Reply to #6 tqyou85

Which platform?
   
Author: 瀚海书香    Time: 2013-08-05 16:27
Reply to #6 tqyou85

Take a look at the commit message of this patch; on x86 it should work with or without it.

From: Jean-Samuel Chenard <jsamch@gmail.com>
To: Greg KH <greg@kroah.com>
Cc: Hans J Koch <hjk@linutronix.de>, linux-kernel@vger.kernel.org,
Juergen Beisert <juergen127@kreuzholzen.de>
Date: Fri, 14 Mar 2008 11:19:49 +0100
Subject: Add pgprot_noncached() to UIO mmap code

Mapping of physical memory in UIO needs pgprot_noncached() to ensure
that IO memory is not cached. Without pgprot_noncached(), it (accidentally)
works on x86 and arm, but fails on PPC.

Signed-off-by: Jean-Samuel Chenard <jsamch@gmail.com>
Signed-off-by: Hans J Koch <hjk@linutronix.de>

---
drivers/uio/uio.c |    2 ++
1 file changed, 2 insertions(+)
Index: linux-2.6.25-rc/drivers/uio/uio.c
===================================================================
--- linux-2.6.25-rc.orig/drivers/uio/uio.c        2008-03-14 11:00:59.000000000 +0100
+++ linux-2.6.25-rc/drivers/uio/uio.c        2008-03-14 11:03:13.000000000 +0100
@@ -470,6 +470,8 @@

        vma->vm_flags |= VM_IO | VM_RESERVED;

+        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+
        return remap_pfn_range(vma,
                               vma->vm_start,
                               idev->info->mem[mi].addr >> PAGE_SHIFT,
   
Author: tqyou85    Time: 2013-08-05 17:07
Last edited by tqyou85 at 2013-08-05 17:09
embeddedlwp posted at 2013-08-05 16:27
Reply to #6 tqyou85

Which platform?


Reply to #8 瀚海书香

The PPC platform.
Author: blake326    Time: 2013-08-08 13:42
You made the user-space mapping non-cached, so user space reads physical memory directly:
        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
The kernel, on the other hand, writes through its data cache, so there is inevitably some delay before the data is written back to physical memory.
Naturally, what user space reads lags behind.


If you remove vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);, both user space and the kernel go through the cache. There can in principle be a cache-aliasing problem (a user-space cache line and a kernel cache line both referring to the same physical memory), but most architectures resolve that aliasing automatically; ARMv7, for example, handles VIPT dcache aliasing in hardware.


Author: luoyan_xy    Time: 2013-08-08 22:15
Haha, I have run into this PPC problem too. When a hardware DMA engine shares data with the upper layers, the cache is not refreshed automatically. At the time, I flushed the cache with flush_cache_range before accessing the data.
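Roughly, the workaround described here would look like the following on the kernel side (a sketch only; it assumes the consumer knows the vma of the shared mapping and that one page was written by the DMA engine):

#include <linux/mm.h>
#include <asm/cacheflush.h>

/* Discard the CPU's stale cached copy of the shared page before reading
 * data that a DMA engine (a non-coherent writer) has placed in memory. */
static void sync_shared_page(struct vm_area_struct *vma)
{
    flush_cache_range(vma, vma->vm_start, vma->vm_start + PAGE_SIZE);
}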
Author: 瀚海书香    Time: 2013-08-09 13:13
Reply to #11 luoyan_xy
Haha, I have run into this PPC problem too. When a hardware DMA engine shares data with the upper layers, the cache is not refreshed automatically. At the time, I flushed the cache with flush_cache_range before accessing the data.


I haven't used it on the PPC platform. Thanks for sharing.
Author: arm-linux-gcc    Time: 2013-08-10 11:58
PPC has PIPT caches, so cache aliasing cannot occur.
The OP's original code had pgprot_noncached, so the kernel was writing through its data cache while the application was reading physical memory directly, hence the inconsistency.
After the OP removed pgprot_noncached, both the kernel and the application read and write through the data cache, and because PPC is PIPT they hit the very same cache line, so the problem disappears.
Author: tqyou85    Time: 2013-08-12 09:00
arm-linux-gcc posted at 2013-08-10 11:58
PPC has PIPT caches, so cache aliasing cannot occur.
The OP's original code had pgprot_noncached, so the kernel was writing through its d ...


Thanks to the poster above for the explanation. I wasn't very familiar with cache-aliasing concepts, so I looked them up:
Caches can be divided into 4 types, based on whether the index or tag correspond to physical or virtual addresses:

    Physically indexed, physically tagged (PIPT) caches use the physical address for both the index and the tag. While this is simple and avoids problems with aliasing, it is also slow, as the physical address must be looked up (which could involve a TLB miss and access to main memory) before that address can be looked up in the cache.

    Virtually indexed, virtually tagged (VIVT) caches use the virtual address for both the index and the tag. This caching scheme can result in much faster lookups, since the MMU doesn't need to be consulted first to determine the physical address for a given virtual address. However, VIVT suffers from aliasing problems, where several different virtual addresses may refer to the same physical address. The result is that such addresses would be cached separately despite referring to the same memory, causing coherency problems. Another problem is homonyms, where the same virtual address maps to several different physical addresses. It is not possible to distinguish these mappings by only looking at the virtual index, though potential solutions include: flushing the cache after a context switch, forcing address spaces to be non-overlapping, tagging the virtual address with an address space ID (ASID), or using physical tags. Additionally, there is a problem that virtual-to-physical mappings can change, which would require flushing cache lines, as the VAs would no longer be valid.

    Virtually indexed, physically tagged (VIPT) caches use the virtual address for the index and the physical address in the tag. The advantage over PIPT is lower latency, as the cache line can be looked up in parallel with the TLB translation, however the tag can't be compared until the physical address is available. The advantage over VIVT is that since the tag has the physical address, the cache can detect homonyms. VIPT requires more tag bits, as the index bits no longer represent the same address.

    Physically indexed, virtually tagged (PIVT) caches are only theoretical as they would basically be useless.[13]

The speed of this recurrence (the load latency) is crucial to CPU performance, and so most modern level-1 caches are virtually indexed, which at least allows the MMU's TLB lookup to proceed in parallel with fetching the data from the cache RAM.

http://en.wikipedia.org/wiki/CPU_cache#Associativity




Welcome to Chinaunix (http://bbs.chinaunix.net/) Powered by Discuz! X3.2