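Note: for readability, this structure is often shown in the form given in Appendix 2 when discussing specific issues.
The union member of the siginfo_t structure lets the structure fit every kind of signal. For real-time signals, for example, the structure effectively takes the following form: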
typedef struct {
int si_signo;
int si_errno;
int si_code;
union sigval si_value;
} siginfo_t;
The fourth field of the structure is itself a union:
union sigval {
    int sival_int;
    void *sival_ptr;
};
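Because it is a union, the si_value member of siginfo_t holds either an integer (typically 4 bytes) or a pointer, and this is the data associated with the signal. The signal handler receives this signal-related data, but nothing prescribes how the data is to be used; the sender and the receiver must agree on its meaning in advance, according to the task at hand.
As discussed earlier for the sigqueue system call, the third argument to sigqueue is a sigval union. When sigqueue is called, the contents of that union are copied into the si_value member of the siginfo_t structure that the signal handler receives as its second argument. In this way a signal can carry additional information when it is sent, which is very useful in program development.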
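As a minimal sketch of this hand-off, the fragment below queues SIGUSR1 to the calling process with an integer payload and reads it back in a three-argument handler installed with SA_SIGINFO. The names on_sigusr1 and send_value_to_self are hypothetical, chosen only for illustration, and the sketch assumes a single-threaded process in which SIGUSR1 is not blocked, so that the signal is delivered before sigqueue returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

/* Written by the handler, read by the caller. */
static volatile sig_atomic_t got_signal = 0;
static volatile sig_atomic_t received_value = 0;

/* Three-argument handler form, selected by SA_SIGINFO.
   The payload passed to sigqueue arrives in info->si_value. */
static void on_sigusr1(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    received_value = info->si_value.sival_int;
    got_signal = 1;
}

/* Hypothetical helper: queue SIGUSR1 to this process carrying `payload`
   and return the value the handler saw, or -1 on failure. */
int send_value_to_self(int payload)
{
    struct sigaction sa, old_sa;
    union sigval value;
    int result = -1;

    sa.sa_sigaction = on_sigusr1;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old_sa) == -1)
        return -1;

    got_signal = 0;
    value.sival_int = payload;   /* the sender picks which union member to use */
    if (sigqueue(getpid(), SIGUSR1, value) == 0 && got_signal)
        result = received_value;

    sigaction(SIGUSR1, &old_sa, NULL);   /* restore the previous disposition */
    return result;
}
```

Calling send_value_to_self(42) from main should return 42 under the assumptions above. Whether the receiver reads sival_int or sival_ptr is purely a convention between sender and receiver, as the text notes.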
[Figure: the passing of the signal parameter]
Part 3: APUE2, Section 10.4, Unreliable Signals
10.4. Unreliable Signals
In earlier versions of the UNIX System (such as Version 7), signals were unreliable. By this we mean that signals could get lost: a signal could occur and the process would never know about it. Also, a process had little control over a signal: a process could catch the signal or ignore it. Sometimes, we would like to tell the kernel to block a signal: don't ignore it, just remember if it occurs, and tell us later when we're ready.
Changes were made with 4.2BSD to provide what are called reliable signals. A different set of changes was then made in SVR3 to provide reliable signals under System V. POSIX.1 chose the BSD model to standardize.
One problem with these early versions is that the action for a signal was reset to its default each time the signal occurred. (In the previous example, when we ran the program in Figure 10.2, we avoided this detail by catching each signal only once.) The classic example from programming books that described these earlier systems concerns how to handle the interrupt signal. The code that was described usually looked like
    int sig_int(); /* my signal handling function */
    ...
    signal(SIGINT, sig_int); /* establish handler */
    ...

sig_int()
{
    signal(SIGINT, sig_int); /* reestablish handler for next time */
    ... /* process the signal ... */
}
(The reason the signal handler is declared as returning an integer is that these early systems didn't support the ISO C void data type.)
The problem with this code fragment is that there is a window of time (after the signal has occurred, but before the call to signal in the signal handler) when the interrupt signal could occur another time. This second signal would cause the default action to occur, which for this signal terminates the process. This is one of those conditions that works correctly most of the time, causing us to think that it is correct, when it isn't.
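For contrast, here is a rough sketch of the same handler installed through sigaction, the reliable-signal interface that POSIX.1 standardized. As long as SA_RESETHAND is not set, the disposition is not reset when the signal is caught, so the handler does not have to reestablish itself and the window described above does not arise. The helper name count_caught_sigints is hypothetical, and raise is used only to generate the signals for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t sigint_count = 0;

static void sig_int(int signo)
{
    (void)signo;
    sigint_count++;   /* no call to signal() is needed here */
}

/* Hypothetical helper: install the handler once, raise SIGINT `times`
   times, and return how many deliveries the handler counted
   (or -1 on failure). */
int count_caught_sigints(int times)
{
    struct sigaction sa, old_sa;
    int i;
    int result;

    sa.sa_handler = sig_int;
    sa.sa_flags = 0;              /* SA_RESETHAND is not set */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, &old_sa) == -1)
        return -1;

    sigint_count = 0;
    result = 0;
    for (i = 0; i < times; i++) {
        /* raise() returns only after the handler has run */
        if (raise(SIGINT) != 0) {
            result = -1;
            break;
        }
    }
    if (result == 0)
        result = sigint_count;

    sigaction(SIGINT, &old_sa, NULL);   /* restore the previous disposition */
    return result;
}
```

With the handler still in place after each delivery, count_caught_sigints(3) should return 3 rather than terminating the process on the second SIGINT.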
Another problem with these earlier systems is that the process was unable to turn a signal off when it didn't want the signal to occur. All the process could do was ignore the signal. There are times when we would like to tell the system "prevent the following signals from occurring, but remember if they do occur." The classic example that demonstrates this flaw is shown by a piece of code that catches a signal and sets a flag for the process that indicates that the signal occurred:
int sig_int_flag; /* set nonzero when signal occurs */

main()
{
    int sig_int(); /* my signal handling function */
    ...
    signal(SIGINT, sig_int); /* establish handler */
    ...
    while (sig_int_flag == 0)
        pause(); /* go to sleep, waiting for signal */
    ...
}

sig_int()
{
    signal(SIGINT, sig_int); /* reestablish handler for next time */
    sig_int_flag = 1; /* set flag for main loop to examine */
}
Here, the process is calling the pause function to put it to sleep until a signal is caught. When the signal is caught, the signal handler just sets the flag sig_int_flag to a nonzero value. The process is automatically awakened by the kernel after the signal handler returns, notices that the flag is nonzero, and does whatever it needs to do. But there is a window of time when things can go wrong. If the signal occurs after the test of sig_int_flag, but before the call to pause, the process could go to sleep forever (assuming that the signal is never generated again). This occurrence of the signal is lost. This is another example of some code that isn't right, yet it works most of the time. Debugging this type of problem can be difficult.
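One way the window can be closed with the later POSIX interfaces is sketched below: SIGINT is blocked before the flag is tested, and sigsuspend then unblocks it and puts the process to sleep in a single atomic step. This is an illustration under stated assumptions, not code from the book. The helper wait_for_sigint and its raise_first parameter are hypothetical; raise_first makes the function generate SIGINT itself while the signal is blocked, standing in for a signal that arrives between the test and the wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t sig_int_flag = 0;   /* set nonzero by the handler */

static void sig_int(int signo)
{
    (void)signo;
    sig_int_flag = 1;
}

/* Hypothetical helper: wait until SIGINT has been caught.
   Returns 0 once the flag is seen, -1 on failure. */
int wait_for_sigint(int raise_first)
{
    struct sigaction sa, old_sa;
    sigset_t block_mask, old_mask, wait_mask;

    sa.sa_handler = sig_int;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGINT, &sa, &old_sa) == -1)
        return -1;

    /* Block SIGINT so it cannot be delivered between the test and the wait. */
    sigemptyset(&block_mask);
    sigaddset(&block_mask, SIGINT);
    if (sigprocmask(SIG_BLOCK, &block_mask, &old_mask) == -1)
        return -1;

    sig_int_flag = 0;
    if (raise_first)
        raise(SIGINT);            /* stays pending while SIGINT is blocked */

    /* Mask to use while sleeping: the original mask with SIGINT unblocked. */
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGINT);

    while (sig_int_flag == 0)
        sigsuspend(&wait_mask);   /* atomically unblock SIGINT and sleep */

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGINT, &old_sa, NULL);
    return 0;
}
```

Because the pending signal is remembered while blocked and delivered as soon as sigsuspend installs the wait mask, wait_for_sigint(1) returns instead of sleeping forever, which is the behavior the pause version cannot guarantee.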