write is atomic. The system has to guarantee that semantic; otherwise the system would be unusable.
Author: myworkstation  Time: 2015-09-11 23:30
Let me explain this from the POSIX point of view.
First, POSIX tells us which system calls are thread-safe: "A function that may be safely invoked concurrently by multiple threads. Each function defined in the System Interfaces volume of POSIX.1-2008 is thread-safe unless explicitly stated otherwise." So by the standard's definition, the common system calls are thread-safe unless stated otherwise. For write specifically, the specification does not address thread safety directly, but it does dwell on how multiple writes can overwrite each other's data.
What write actually does is specified as follows: "On a regular file or other file capable of seeking, the actual writing of data shall proceed from the position in the file indicated by the file offset associated with fildes. Before successful return from write(), the file offset shall be incremented by the number of bytes actually written. On a regular file, if the position of the last byte written is greater than or equal to the length of the file, the length of the file shall be set to this position plus one." From this description, write is not a single atomic operation: a successful write involves two actions, seeking and writing, so in principle write is not thread-safe.
The specification goes on to say: "This volume of POSIX.1-2008 does not specify behavior of concurrent writes to a file from multiple processes. Applications should use some form of concurrency control." Clearly, multiple processes operating on the same file need synchronization. Nothing further is specified for multiple threads, so one can only infer that write is not thread-safe (because it is not an atomic operation). Writing to a socket with write, on the other hand, is certainly thread-safe, because of this rule: "If fildes refers to a socket, write() shall be equivalent to send() with no flags set.", and send is thread-safe.
Author: alwaysR9  Time: 2015-09-12 09:49  Reply to 11# myworkstation
All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe.
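write is not in the list of functions that follows, so according to the standard it is thread-safe.
However, according to the Linux man page, write only conforms to the standard from kernel 3.14 onward.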
man7.org/linux/man-pages/man2/write.2.html
BUGS
According to POSIX.1-2008/SUSv4 Section XSI 2.9.7 ("Thread
Interactions with Regular File Operations"):
All of the following functions shall be atomic with respect to
each other in the effects specified in POSIX.1-2008 when they
operate on regular files or symbolic links: ...
Among the APIs subsequently listed are write() and writev(2). And
among the effects that should be atomic across threads (and
processes) are updates of the file offset. However, on Linux before
version 3.14, this was not the case: if two processes that share an
open file description (see open(2)) perform a write() (or writev(2))
at the same time, then the I/O operations were not atomic with
respect to updating the file offset, with the result that the blocks of
data output by the two processes might (incorrectly) overlap. This
problem was fixed in Linux 3.14.
Author: folklore  Time: 2015-09-12 14:10  Reply to 13# cokeboL
        ret = vfs_write(f.file, buf, count, &pos);  // write the file
        if (ret >= 0)                               // the next 3 lines update the file offset
            file_pos_write(f.file, pos);
        fdput_pos(f);
    }

    return ret;
}
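Clearly the three steps (get the offset, write the file, update the offset) are not atomic as a whole.
Suppose threads A and B write to the same file. Thread A gets the offset first and is then suspended. Thread B starts running, gets the same offset, writes a string at that offset, and is suspended. Thread A then writes its string at the same offset that B used, overwriting what B wrote.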
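The interleaving is easier to see when it is replayed by hand. The following is a minimal sketch, not the kernel's code: it runs in a single thread so the result is reproducible, uses pwrite() to stand in for "write at the offset I read earlier", and the names replay_lost_update and shared_pos are made up for illustration.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations (pread, pwrite, mkstemp) */
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: replays the A/B interleaving step by step.
 * Each "writer" does three separate steps: read the shared offset,
 * write at that offset, store the new offset.
 * Copies the file content into out (NUL-terminated) and returns the
 * final file size, or -1 on error. */
long replay_lost_update(char *out, size_t outsz)
{
    char path[] = "/tmp/write_raceXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                     /* the file goes away when fd is closed */

    off_t shared_pos = 0;             /* stands in for the shared file offset */

    off_t pos_a = shared_pos;         /* A: reads the offset, then is suspended */
    off_t pos_b = shared_pos;         /* B: reads the same offset */

    if (pwrite(fd, "BBBB", 4, pos_b) != 4) { close(fd); return -1; }
    shared_pos = pos_b + 4;           /* B: updates the offset */

    if (pwrite(fd, "AAAA", 4, pos_a) != 4) { close(fd); return -1; }
    shared_pos = pos_a + 4;           /* A: updates the offset, B's data is gone */

    off_t size = lseek(fd, 0, SEEK_END);
    ssize_t n = pread(fd, out, outsz - 1, 0);
    close(fd);
    if (size < 0 || n < 0)
        return -1;
    out[n] = '\0';
    return (long)size;
}
```

Calling replay_lost_update(buf, sizeof buf) returns 4 and leaves "AAAA" in buf: eight bytes were issued but only four survive, because A wrote at the offset it had read before B ran.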
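This interleaving can really happen: it showed up in my own multithreaded file-writing experiment. So I think the write function is not thread-safe. Am I understanding this correctly?
Author: irp  Time: 2015-09-21 10:12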
write(), thread safety, and POSIX
[Posted April 18, 2006 by corbet]
Dan Bonachea recently reported a problem. It seems that he has a program where multiple threads are simultaneously writing to the same file descriptor. Occasionally, some of that output disappears - overwritten by other threads. Random loss of output data is not generally considered to be a desirable sort of behavior, and, says Dan, POSIX requires that write() calls be thread-safe. So he would like to see this behavior fixed.
Andrew Morton quickly pointed out the source of this behavior. Consider how write() is currently implemented:
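asmlinkage ssize_t sys_write(unsigned int fd, const char __user *buf,
                             size_t count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;
    file = fget_light(fd, &fput_needed);
    if (file) {
        loff_t pos = file_pos_read(file);
        ret = vfs_write(file, buf, count, &pos);
        file_pos_write(file, pos);
        fput_light(file, fput_needed);
    }
    return ret;
}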
There is no locking around this function, so it is possible for two (or more) threads performing simultaneous writes to obtain the same value for pos. They will each then write their data to the same file position, and the thread which writes last wins.
Putting some sort of lock (using the inode lock, perhaps) around the entire function would solve the problem and make write() calls thread-safe. The cost of this solution would be high, however: an extra layer of locking when almost no application actually needs it. Serializing write() operations in this way would also rule out simultaneous writes to the same file - a capability which can be useful to some applications.
So some developers have questioned whether this behavior should be fixed at all. It is not something which causes problems for over 99.9% of applications, and, for those which need to be able to perform this sort of simultaneous write, there are other options available. These include user-space locking or using the O_APPEND option. So, it is asked, why add unnecessary overhead to the kernel?
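Of the alternatives mentioned, O_APPEND is the simplest to show. The sketch below is an illustration under assumptions, not code from the article: it opens one temporary file through two independent descriptors in a single thread so that the outcome is reproducible, and interleaved_writes is a hypothetical helper. With O_APPEND each write() moves to the end of the file before writing, so neither descriptor overwrites the other.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations (mkstemp) */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: opens the same file twice with the given extra
 * open() flags and alternates 4-byte writes between the two descriptors.
 * Returns the final file size in bytes, or -1 on error. */
long interleaved_writes(int extra_flags)
{
    char path[] = "/tmp/append_demoXXXXXX";
    int tmp = mkstemp(path);          /* create an empty scratch file */
    if (tmp < 0)
        return -1;
    close(tmp);

    int fd1 = open(path, O_WRONLY | extra_flags);
    int fd2 = open(path, O_WRONLY | extra_flags);
    long size = -1;

    if (fd1 >= 0 && fd2 >= 0) {
        int ok = 1;
        for (int i = 0; i < 3 && ok; i++)
            ok = write(fd1, "aaaa", 4) == 4 && write(fd2, "bbbb", 4) == 4;
        if (ok)
            size = (long)lseek(fd1, 0, SEEK_END);
    }

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    unlink(path);
    return size;
}
```

interleaved_writes(0) returns 12, since both descriptors start at offset 0 and keep overwriting each other, while interleaved_writes(O_APPEND) returns 24, since all six writes land at the end of the file. Threads sharing one descriptor would set O_APPEND on that descriptor instead.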
Linus responds that it is a "quality of implementation" issue, and that if there is a low-cost way of getting the system to behave the way users would like, it might as well be done. His proposal is to apply a lock to the file position in particular. His patch adds a f_pos_lock mutex to the file structure and uses that lock to serialize uses of and changes to the file position. This change will have the effect of serializing calls to write(), while leaving other forms (asynchronous I/O, pwrite()) unserialized.
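pwrite() can be left out of such a lock because the caller passes the offset explicitly and the call does not change the descriptor's file position. A minimal sketch, assuming each writer owns a fixed 4-byte slot (fill_slots and SLOT are illustrative names, not taken from the patch), shows that the writes can be issued in any order without disturbing each other.

```c
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations (pread, pwrite, mkstemp) */
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define SLOT 4              /* bytes owned by each writer (illustrative value) */

/* Hypothetical helper: writer i owns bytes [i*SLOT, (i+1)*SLOT).
 * The slots are filled in reverse order to show that ordering is irrelevant.
 * Stores the descriptor's file position after the writes in *pos_after and
 * the file content in out (NUL-terminated). Returns 0 on success, -1 on error. */
int fill_slots(char *out, size_t outsz, long *pos_after)
{
    static const char *data[] = { "AAAA", "BBBB", "CCCC" };
    char path[] = "/tmp/pwrite_demoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                     /* the file goes away when fd is closed */

    for (int i = 2; i >= 0; i--) {
        if (pwrite(fd, data[i], SLOT, (off_t)i * SLOT) != SLOT) {
            close(fd);
            return -1;
        }
    }

    *pos_after = (long)lseek(fd, 0, SEEK_CUR);   /* pwrite() left it untouched */
    ssize_t n = pread(fd, out, outsz - 1, 0);
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

After fill_slots(buf, sizeof buf, &pos) succeeds, buf holds "AAAABBBBCCCC" and pos is 0. The price is that the application has to decide in advance which byte range belongs to which writer.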
The patch has not drawn a lot of comments, and it has not been merged as of this writing. Its ultimate fate will probably depend on whether avoiding races in this obscure case is truly seen to be worth the additional cost imposed on all users.