Last edited by guguozhifeng on 2013-02-25 11:52
(gdb) bt
#0 0x0000003ac6c6f261 in _IO_str_overflow_internal () from /lib64/libc.so.6
#1 0x0000003ac6c6e404 in _IO_default_xsputn_internal () from /lib64/libc.so.6
#2 0x0000003ac6c429f0 in vfprintf () from /lib64/libc.so.6
#3 0x0000003ac6c63cb9 in vsprintf () from /lib64/libc.so.6
#4 0x0000003ac6c4d698 in sprintf () from /lib64/libc.so.6
#5 0x00002aaaad0662e5 in doLogEx (pContext=0x2aaaad272800, tv=<value optimized out>, caption=0x2aaaad06e423 "ERROR",
text=0x41258d30 "file: ../common/fdht_proto.c, line: 38, server: 192.168.32.162:11411, recv data fail, errno: 107, error info: Transport endpoint is not connected", text_len=145, bNeedSync=0 '\0') at ../common/logger.c:372
#6 0x00002aaaad066530 in doLog (pContext=0x2aaaad272800, caption=0x2aaaad06e423 "ERROR",
text=0x41258d30 "file: ../common/fdht_proto.c, line: 38, server: 192.168.32.162:11411, recv data fail, errno: 107, error info: Transport endpoint is not connected", text_len=145, bNeedSync=0 '\0') at ../common/logger.c:417
#7 0x00002aaaad0669ae in logError (format=<value optimized out>) at ../common/logger.c:617
#8 0x00002aaaace53ea1 in fdht_recv_header (pServer=0x15745a8, in_bytes=0x4125968c) at ../common/fdht_proto.c:35
#9 0x00002aaaace576c5 in fdht_get_ex1 (pGroupArray=<value optimized out>, bKeepAlive=1 '\001', pKeyInfo=0x412599b0, expires=-1, ppValue=0x412599a8,
value_len=0x412599a4, malloc_func=0x3ac6c74de0 <malloc>) at fdht_client.c:415
#10 0x00002aaaad274843 in my_fdfs_get_file_id (pContext=<value optimized out>, my_file_id=<value optimized out>, fdfs_file_id=0x41259b30 "\200\235%A",
file_id_size=144) at my_fdfs_client.c:350
#11 0x00002aaaad2749d4 in my_fdfs_file_exist (pContext=0x41258a20, my_file_id=0x5b <Address 0x5b out of bounds>) at my_fdfs_client.c:612
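I built this code into a service. After running for a while, the service restarts. Debugging with gdb shows the crash happens under my_fdfs_file_exist, even though the arguments passed to my_fdfs_file_exist are fine. my_fdfs_file_exist is called many times, and it only fails occasionally. I don't know what is going on. Has anyone run into this?
I traced the code and found that a sprintf call in the doLogEx function in fastdht's logger.c is what fails.
My initial guess is that pcurrent_buff and log_buff in LogContext are being handled incorrectly.
The code where the error occurs: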
    if ((pContext->pcurrent_buff - pContext->log_buff) + text_len + 64 \
            > LOG_BUFF_SIZE)
    {
        log_fsync(pContext, false);
    }
    if (pContext->time_precision == LOG_TIME_PRECISION_SECOND)
    {
        buff_len = sprintf(pContext->pcurrent_buff, \
                "[%04d-%02d-%02d %02d:%02d:%02d] ", \
                tm.tm_year+1900, tm.tm_mon+1, tm.tm_mday, \
                tm.tm_hour, tm.tm_min, tm.tm_sec);
    }
    else
    {
        buff_len = sprintf(pContext->pcurrent_buff, \
                "[%04d-%02d-%02d %02d:%02d:%02d.%03d] ", \
                tm.tm_year+1900, tm.tm_mon+1, tm.tm_mday, \
                tm.tm_hour, tm.tm_min, tm.tm_sec, time_fragment);
    }
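Could this be because the logger is used from multiple threads without a lock (around both the buffer-size check and the sprintf call), so that by the time sprintf runs there is no longer enough room in the buffer?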
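To make the suspected race concrete, here is a minimal sketch of how the size check and the write could be kept atomic. It is not FastDHT's actual code, and I have not checked whether the real logger.c already takes a lock somewhere else. The names DemoLogContext, demo_log_append and DEMO_BUFF_SIZE are made up for illustration. The idea is to hold one mutex across both the check and the write, and to use snprintf so that a wrong size estimate truncates instead of overrunning the buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size, kept small so a flush is easy to trigger in a test. */
#define DEMO_BUFF_SIZE 256

/* Hypothetical stand-in for LogContext; not the real FastDHT struct. */
typedef struct {
    char buff[DEMO_BUFF_SIZE];
    char *pcurrent;          /* next write position inside buff */
    pthread_mutex_t lock;    /* guards buff and pcurrent */
    int flush_count;         /* number of flushes, for demonstration only */
} DemoLogContext;

static void demo_log_init(DemoLogContext *ctx)
{
    ctx->pcurrent = ctx->buff;
    ctx->flush_count = 0;
    pthread_mutex_init(&ctx->lock, NULL);
}

/* Writes the buffered bytes to stderr. The caller must hold ctx->lock. */
static void demo_log_flush_locked(DemoLogContext *ctx)
{
    fwrite(ctx->buff, 1, (size_t)(ctx->pcurrent - ctx->buff), stderr);
    ctx->pcurrent = ctx->buff;
    ctx->flush_count++;
}

/* Appends "caption - text\n" to the buffer.
 * Returns the number of bytes appended, or -1 if the entry cannot fit
 * even in an empty buffer. The check and the write share one lock. */
static int demo_log_append(DemoLogContext *ctx, const char *caption,
        const char *text)
{
    int n;
    size_t remain;

    pthread_mutex_lock(&ctx->lock);
    remain = (size_t)(DEMO_BUFF_SIZE - (ctx->pcurrent - ctx->buff));
    n = snprintf(ctx->pcurrent, remain, "%s - %s\n", caption, text);
    if (n >= 0 && (size_t)n >= remain)
    {
        /* Output was truncated: flush what was there and retry once. */
        demo_log_flush_locked(ctx);
        remain = DEMO_BUFF_SIZE;
        n = snprintf(ctx->pcurrent, remain, "%s - %s\n", caption, text);
    }
    if (n < 0 || (size_t)n >= remain)
    {
        n = -1;              /* still does not fit: drop the entry */
    }
    else
    {
        ctx->pcurrent += n;
    }
    /* The write position must never leave the buffer. */
    assert(ctx->pcurrent >= ctx->buff);
    assert(ctx->pcurrent - ctx->buff < DEMO_BUFF_SIZE);
    pthread_mutex_unlock(&ctx->lock);
    return n;
}
```

With this layout, two threads can no longer both pass the size check and then both write at the same position, and an entry that is too large is dropped rather than written past the end of the buffer.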