yunzhongyue posted on 2007-12-08 19:33

RH 9.0+IDS 7.3 ERROR 25580,27001,349

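Recently users have reported that a very large program always fails about halfway through its run with errors 25580 and 349, and then exits.
Checking online.log, I found many errors like the following: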

18:44:46 Logical Log 14230 Complete.
18:45:08 Logical Log 14231 Complete.
18:45:34 Logical Log 14232 Complete.
18:48:56 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.
18:48:59 Logical Log 14233 Complete.
18:52:47 Logical Log 14234 Complete.
18:55:56 Logical Log 14235 Complete.
18:58:56 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.
19:00:24 Logical Log 14236 Complete.
19:05:55 Logical Log 14237 Complete.
19:08:56 listener-thread: err = -27001: oserr = 0: errstr = : Read error occurred during connection attempt.
19:09:22 Logical Log 14238 Complete.
19:11:14 Logical Log 14239 Complete.
19:12:01 Logical Log 14240 Complete.
19:12:47 Logical Log 14241 Complete.
19:14:10 Logical Log 14242 Complete.
19:15:14 Logical Log 14243 Complete.


The configuration file is as follows:

onstat -c

# Root Dbspace Configuration

ROOTNAME      rootdbs         # Root dbspace name
ROOTPATH      /dev/sda5       # Path for device containing root dbspace
ROOTOFFSET      100             # Offset of root dbspace into device (Kbytes)
###ROOTSIZE      2090000         # Size of root dbspace (Kbytes)
ROOTSIZE      2048000         # Size of root dbspace (Kbytes)

# Disk Mirroring Configuration Parameters

MIRROR          0               # Mirroring flag (Yes = 1, No = 0)
MIRRORPATH                      # Path for device containing mirrored root
MIRROROFFSET    0               # Offset into mirrored device (Kbytes)

# Physical Log Configuration

###PHYSDBS         rootdbs         # Location (dbspace) of physical log
###PHYSFILE      65536         # Physical log file size (Kbytes)
PHYSDBS         rootdbs         # Location (dbspace) of physical log
PHYSFILE      960000          # Physical log file size (Kbytes)
LOGFILES      150             # Number of logical log files
LOGSIZE         1024            # Logical log size (Kbytes)


# Diagnostics

MSGPATH         /u/informix/online.log # System message log file path
CONSOLE         /dev/console    # System console message path
ALARMPROGRAM    /u/informix/etc/log_full.sh # Alarm program path
SYSALARMPROGRAM /u/informix/etc/evidence.sh # System Alarm program path
TBLSPACE_STATS 1

# System Archive Tape Device

TAPEDEV         /dev/null       # Tape device path
#TAPEDEV         /dev/st0      # Tape device path
TAPEBLK         32            # Tape block size (Kbytes)
TAPESIZE      80000000      # Maximum amount of data to put on tape (Kbytes)

# Log Archive Tape Device

#LTAPEDEV      /dev/null       # Log tape device path
LTAPEDEV      /dev/null       # Log tape device path
LTAPEBLK      32            # Log tape block size (Kbytes)
LTAPESIZE       80000000      # Max amount of data to put on log tape (Kbytes)

# Optical

STAGEBLOB                     # Informix Dynamic Server/Optical staging area

# System Configuration

SERVERNUM       18            # Unique id corresponding to a Dynamic Server instance
DBSERVERNAME    on_nbtcp7       # Name of default database server
DBSERVERALIASES on_nbtcp7_intra # List of alternate dbservernames
NETTYPE         soctcp,1,150,NET # Configure poll thread(s) for nettype
NETTYPE         ipcshm,1,50,CPU # Configure poll thread(s) for nettype

DEADLOCK_TIMEOUT 60            # Max time to wait of lock in distributed env.
RESIDENT      0               # Forced residency flag (Yes = 1, No = 0)

MULTIPROCESSOR  0               # 0 for single-processor, 1 for multi-processor
NUMCPUVPS       4               # Number of user (cpu) vps
SINGLE_CPU_VP   0               # If non-zero, limit number of cpu vps to one

NOAGE         0               # Process aging
AFF_SPROC       0               # Affinity start processor
AFF_NPROCS      0               # Affinity number of processors

# Shared Memory Parameters

LOCKS         120000          # Maximum number of locks
BUFFERS         280000          # Maximum number of shared buffers
NUMAIOVPS       2               # Number of IO vps
PHYSBUFF      128             # Physical log buffer size (Kbytes)
LOGBUFF         128             # Logical log buffer size (Kbytes)
###LOGSMAX         128             # Maximum number of logical log files
LOGSMAX         200             # Maximum number of logical log files
CLEANERS      1               # Number of buffer cleaner processes
SHMBASE         0x401b7000      # Shared memory base address
###SHMVIRTSIZE   1064            # initial virtual shared memory segment size
SHMVIRTSIZE   1024000         # initial virtual shared memory segment size
SHMADD          307200          # Size of new shared memory segments (Kbytes)
###SHMADD          128000          # Size of new shared memory segments (Kbytes)
SHMTOTAL      2500000         # Total shared memory (Kbytes). 0=>unlimited
CKPTINTVL       7200            # Check point interval (in sec)
LRUS            12            # Number of LRU queues
LRU_MAX_DIRTY   50            # LRU percent dirty begin cleaning limit
LRU_MIN_DIRTY   5               # LRU percent dirty end cleaning limit
###LTXHWM          50            # Long transaction high water mark percentage
###LTXEHWM         60            # Long transaction high water mark (exclusive)
LTXHWM          45            # Long transaction high water mark percentage
LTXEHWM         50            # Long transaction high water mark (exclusive)
TXTIMEOUT       0x258             # Transaction timeout (in sec)
STACKSIZE       64            # Stack size (Kbytes)
# System Page Size
# BUFFSIZE - Dynamic Server no longer supports this configuration parameter.
#            To determine the page size used by Dynamic Server on your platform
#            see the last line of output from the command, 'onstat -b'.


# Recovery Variables
# OFF_RECVRY_THREADS:
# Number of parallel worker threads during fast recovery or an offline restore.
# ON_RECVRY_THREADS:
# Number of parallel worker threads during an online restore.

OFF_RECVRY_THREADS 10            # Default number of offline worker threads
ON_RECVRY_THREADS 1               # Default number of online worker threads

# Data Replication Variables
# DRAUTO: 0 manual, 1 retain type, 2 reverse type
DRAUTO          0               # DR automatic switchover
DRINTERVAL      30            # DR max time between DR buffer flushes (in sec)
DRTIMEOUT       30            # DR network timeout (in sec)
DRLOSTFOUND   /u/informix/etc/dr.lostfound # DR lost+found file path

# CDR Variables
CDR_LOGBUFFERS  2048            # size of log reading buffer pool (Kbytes)
CDR_EVALTHREADS 1,2             # evaluator threads (per-cpu-vp,additional)
CDR_DSLOCKWAIT  5               # DS lockwait timeout (seconds)
CDR_QUEUEMEM    4096            # Maximum amount of memory for any CDR queue (Kbytes)

# Backup/Restore variables
BAR_ACT_LOG   /tmp/bar_act.log
BAR_MAX_BACKUP  0
BAR_RETRY       1
BAR_NB_XPORT_COUNT 10
BAR_XFER_BUF_SIZE 31

# Informix Storage Manager variables
ISM_DATA_POOL   ISMData         # If the data pool name is changed, be sure to
                              # update $INFORMIXDIR/bin/onbar. Change to
ISM_LOG_POOL    ISMLogs

# Read Ahead Variables
RA_PAGES      12            # Number of pages to attempt to read ahead
RA_THRESHOLD    8               # Number of pages left before next group

# DBSPACETEMP:
# Dynamic Server equivalent of DBTEMP for SE. This is the list of dbspaces
# that the Dynamic Server SQL Engine will use to create temp tables etc.
# If specified it must be a colon separated list of dbspaces that exist
# when the Dynamic Server system is brought online. If not specified, or if
# all dbspaces specified are invalid, various ad hoc queries will create
# temporary files in /tmp instead.

DBSPACETEMP   tempdbs         # Default temp dbspaces

# DUMP*:
# The following parameters control the type of diagnostics information which
# is preserved when an unanticipated error condition (assertion failure) occurs
# during Dynamic Server operations.
# For DUMPSHMEM, DUMPGCORE and DUMPCORE 1 means Yes, 0 means No.

DUMPDIR         /tmp            # Preserve diagnostics in this directory
DUMPSHMEM       1               # Dump a copy of shared memory
DUMPGCORE       0               # Dump a core image using 'gcore'
DUMPCORE      0               # Dump a core image (Warning: this aborts Dynamic Server)
DUMPCNT         1               # Number of shared memory or gcore dumps for
                              # a single user's session

FILLFACTOR      90            # Fill factor for building indexes

# method for Dynamic Server to use when determining current time
USEOSTIME       0               # 0: use internal time(fast), 1: get time from OS(slow)

# Parallel Database Queries (pdq)
MAX_PDQPRIORITY 100             # Maximum allowed pdqpriority
DS_MAX_QUERIES                  # Maximum number of decision support queries
DS_TOTAL_MEMORY               # Decision support memory (Kbytes)
DS_MAX_SCANS    1048576         # Maximum number of decision support scans
DATASKIP      off             # List of dbspaces to skip

# OPTCOMPIND
# 0 => Nested loop joins will be preferred (where
#      possible) over sortmerge joins and hash joins.
# 1 => If the transaction isolation mode is not
#      "repeatable read", optimizer behaves as in (2)
#      below. Otherwise it behaves as in (0) above.
# 2 => Use costs regardless of the transaction isolation
#      mode. Nested loop joins are not necessarily
#      preferred. Optimizer bases its decision purely
#      on costs.
OPTCOMPIND      2               # To hint the optimizer

ONDBSPACEDOWN   2               # Dbspace down option: 0 = CONTINUE, 1 = ABORT, 2 = WAIT
LBU_PRESERVE    0               # Preserve last log for log backup
OPCACHEMAX      0               # Maximum optical cache size (Kbytes)

# HETERO_COMMIT (Gateway participation in distributed transactions)
# 1 => Heterogeneous Commit is enabled
# 0 (or any other value) => Heterogeneous Commit is disabled
HETERO_COMMIT   0

# Optimization goal: -1 = ALL_ROWS(Default), 0 = FIRST_ROWS
OPT_GOAL      -1

# Optimizer DIRECTIVES ON (1/Default) or OFF (0)
DIRECTIVES      1


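The system is Red Hat 9.0 + IDS 7.3, and normally about 100 users are online at the same time. Could someone please take a look? Any help is much appreciated!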

[ Last edited by yunzhongyue on 2007-12-11 17:01 ]

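yunzhongyue posted on 2007-12-10 08:56

Could anyone help, please?
When I checked this morning it had become worse: the error now appears once every ten minutes.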
$ onstat -m

Informix Dynamic Server Version 7.30.UC7   -- On-Line -- Up 1 days 12:34:39 -- 1621352 Kbytes

Message Log File: /u/informix/online.log

07:38:57 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.

07:48:57 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.

07:58:57 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.

08:08:57 listener-thread: err = -27001: oserr = 0: errstr = : Read error occurred during connection attempt.

08:18:40 Logical Log 14441 Complete.
08:18:57 listener-thread: err = -27001: oserr = 0: errstr = : Read error occurred during connection attempt.

08:21:21 Checkpoint Completed: duration was 1 seconds.
08:28:57 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.

08:36:44 Logical Log 14442 Complete.
08:38:57 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.

08:48:57 listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.
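
The regular spacing of these messages is itself a clue. As a rough sketch (not an Informix tool), the Python below pulls the listener-thread errors out of a few of the lines above and measures the gap between consecutive ones. The function name and the regular expression are illustrative assumptions; the sample lines are copied from the log excerpt.

```python
import re
from datetime import datetime

# Matches lines such as
#   "07:38:57 listener-thread: err = -25580: oserr = 0: ..."
# The whitespace after the timestamp is optional, because some pastes drop it.
LISTENER_ERR = re.compile(r"^(\d{2}:\d{2}:\d{2})\s*listener-thread: err = (-\d+):")

def listener_error_gaps(lines):
    """Return (error codes, gaps in seconds) for listener-thread errors.

    Assumption: all lines fall on the same day, so a wrap past midnight
    is not handled.
    """
    stamps, codes = [], []
    for line in lines:
        m = LISTENER_ERR.match(line.strip())
        if m:
            stamps.append(datetime.strptime(m.group(1), "%H:%M:%S"))
            codes.append(int(m.group(2)))
    gaps = [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]
    return codes, gaps

sample = [
    "07:38:57listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.",
    "07:48:57listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.",
    "07:58:57listener-thread: err = -25580: oserr = 0: errstr = : System error occurred in network function.",
    "08:08:57listener-thread: err = -27001: oserr = 0: errstr = : Read error occurred during connection attempt.",
    "08:18:40Logical Log 14441 Complete.",
    "08:18:57listener-thread: err = -27001: oserr = 0: errstr = : Read error occurred during connection attempt.",
]

codes, gaps = listener_error_gaps(sample)
print(codes)
print(gaps)
```

A constant gap between errors points to something scheduled, such as a cron job or a monitoring probe, rather than to ordinary user traffic.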

czw1413_cn posted on 2007-12-10 18:24

3sane posted on 2007-12-11 15:24

Reply to #1, yunzhongyue's post

MULTIPROCESSOR  0               # 0 for single-processor, 1 for multi-processor
NUMCPUVPS       4               # Number of user (cpu) vps
SINGLE_CPU_VP   0               # If non-zero, limit number of cpu vps to one

Does the server have multiple CPUs? If so, I suggest MULTIPROCESSOR=1; otherwise set NUMCPUVPS=1.

NETTYPE         soctcp,1,150,NET
If NUMCPUVPS=4, I suggest increasing the number of poll threads in NETTYPE and reducing the number of users per thread, for example:
NETTYPE         soctcp,3,50,NET
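
To compare the two settings, here is a small Python sketch that parses a NETTYPE line and multiplies poll threads by connections. It assumes the fields are protocol, poll threads, connections per poll thread, and VP class; check that reading against the documentation for your IDS version. The names NetType, parse_nettype and total_connections are made up for this illustration.

```python
from typing import NamedTuple

class NetType(NamedTuple):
    protocol: str
    poll_threads: int
    conns_per_thread: int  # assumed to be a per-poll-thread figure
    vp_class: str

def parse_nettype(line):
    """Parse one 'NETTYPE proto,threads,conns,vpclass  # comment' line."""
    # Drop any trailing comment, then the leading NETTYPE keyword.
    value = line.split("#", 1)[0].split(None, 1)[1].strip()
    protocol, threads, conns, vp_class = [f.strip() for f in value.split(",")]
    return NetType(protocol, int(threads), int(conns), vp_class)

def total_connections(nt):
    """Connections the setting allows in total, under the assumption above."""
    return nt.poll_threads * nt.conns_per_thread

current = parse_nettype("NETTYPE         soctcp,1,150,NET # Configure poll thread(s) for nettype")
suggested = parse_nettype("NETTYPE         soctcp,3,50,NET")

print(current, total_connections(current))
print(suggested, total_connections(suggested))
```

Under that reading, both settings allow the same total number of soctcp connections; the suggestion only spreads them over three poll threads instead of one.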

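yunzhongyue posted on 2007-12-11 16:52

Reply to #4, 3sane's post

The server has 2 CPUs, 4 cores.
I have found the cause of the 25580 error: a monitor connects to the server once every 10 minutes. The -349 error, however, is still unresolved. The program had always run without problems, but since the config file was changed to what is shown above, the error appears occasionally. It does not occur at the same place each time, and nothing is reported in the log. Because the program takes more than an hour to run, having to rerun it repeatedly wastes a lot of time.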

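liaosnet posted on 2007-12-12 09:17

Originally posted by yunzhongyue on 2007-12-11 16:52
The server has 2 CPUs, 4 cores.
I have found the cause of the 25580 error: a monitor connects to the server once every 10 minutes. The -349 error, however, is still unresolved. The program had always run without problems, but since the config file was changed to what is shown above, the error appears occasionally, ...

You found the cause of the 25580 error? How did you resolve it?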

liaosnet posted on 2007-12-12 09:18

Error 349 means that no database has been selected.

-349    Database not selected yet.

This statement cannot be executed because no current database exists.
Either no current database has been established yet, or the current
database was closed with a CLOSE DATABASE statement. You execute the
DATABASE or CREATE DATABASE statement to establish a current database.

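yunzhongyue posted on 2007-12-12 11:04

Reply to #6, liaosnet's post

Once the monitoring is removed, the 25580 error no longer appears.

The 349 error only appears occasionally, and not at the same place each time, so the application itself should be fine.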

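liaosnet posted on 2007-12-12 13:51

Originally posted by yunzhongyue on 2007-12-12 11:04
Once the monitoring is removed, the 25580 error no longer appears.

The 349 error only appears occasionally, and not at the same place each time, so the application itself should be fine.

Heh, what I want to know is why 25580 occurs. What is the mechanism behind it?
As you say, the monitor's connections cause the 25580 error, so why don't other connections cause it?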

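yunzhongyue posted on 2007-12-13 11:58

Reply to #9, liaosnet's post

Because the monitor only probes whether INFORMIX's TCP port is up, and disconnects as soon as it has detected it.
It is as if it says hello and walks straight off: INFORMIX has only just said "long time no see" and the other party is already gone.
That is why this error shows up in the INFORMIX log.
Other connections can produce this error as well.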
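
The "hello, then gone" behaviour can be reproduced with plain sockets. The Python below is only a sketch on the loopback interface and does not speak the Informix protocol: a stand-in listener waits for the client's first bytes, while the probe connects and closes without sending anything. All function names are invented for this example.

```python
import socket

def probe_and_hang_up(port, host="127.0.0.1"):
    """Mimic a monitor that only checks that the TCP port accepts connections."""
    with socket.create_connection((host, port), timeout=5):
        pass  # connected; close straight away without sending a byte

def accept_and_read_handshake(listener):
    """Stand-in for a server listener that expects the client to speak first."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(5)
        # An empty result means the peer closed before sending anything.
        return conn.recv(1024)

def demo():
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        listener.settimeout(5)
        port = listener.getsockname()[1]
        probe_and_hang_up(port)
        return accept_and_read_handshake(listener)

handshake = demo()
print(repr(handshake))
```

The listener sees a connection that is already closed when it tries to read the opening message. A real server has to report that somehow, and in this thread it surfaces as the -25580 and -27001 entries. A monitor that performs a full client connect and disconnect, instead of a bare port check, would avoid leaving half-finished connection attempts.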