http://rackerhacker.com/category/emergency/
Plesk authorization failed: HTTP request error [7]
Posted by: major in Emergency, Plesk
I found myself wrestling with a server where the Plesk interface suddenly became unavailable without any user intervention. An attempt to start the service was less than fruitful:
[root@server ~]# service psa start
Key file: /opt/drweb/drweb32.key - Key file not found!
A path to a valid license key file does not specified.
Plesk authorization failed: HTTP request error [7]
Error: Plesk Software not running.
[FAILED]
(Although I included the text from the drweb failure, I later found that it was not related to the issue. However, since it might appear in your logs prior to the HTTP request error, I included it anyways.)
This was a perfectly working server that had no other issues besides this peculiar Plesk issue. Another technician had upgraded the license a few weeks prior, and it was verified at the time to be working properly. After a bit of Google searching, I found that the solution was to completely stop Plesk and its related services and then start it all up again.
[root@server ~]# service psa stopall
/usr/local/psa/admin/bin/httpsdctl stop: httpd stopped
Stopping Plesk: [ OK ]
Stopping named: [ OK ]
Stopping MySQL: [ OK ]
Stopping : Stopping Courier-IMAP server:
Stopping imap [ OK ]
Stopping imap-ssl [ OK ]
Stopping pop3 [ OK ]
Stopping pop3-ssl [ OK ]
Stopping postgresql service: [ OK ]
Shutting down psa-spamassassin service: [ OK ]
Stopping httpd: [ OK ]
[root@server ~]# service psa start
Starting named: [ OK ]
Starting MySQL: [ OK ]
Starting qmail: [ OK ]
Starting Courier-IMAP server:
Starting imapd [ OK ]
Starting imap-ssl [ OK ]
Starting pop3 [ OK ]
Starting pop3-ssl [ OK ]
Starting postgresql service: [ OK ]
Starting psa-spamassassin service: [ OK ]
Processing config directory: /usr/local/psa/admin/conf/httpsd.*.include
/usr/local/psa/admin/bin/httpsdctl start: httpd started
Starting Plesk: [ OK ]
Starting up drwebd: [ OK ]
I couldn’t nail down anything within the Plesk log files that would explain the cause of the problem, but this solution corrected the issue instantly.
This issue occurred with Plesk 8.1.1 on Red Hat Enterprise Linux 4 Update 5
Nov 13 2007
Rackspace Outage
Posted by: major in Emergency
I’ve received a lot of IMs and e-mails from friends and readers of this blog about the Rackspace outages on November 11th and 12th. I work for a company that believes in full disclosure, so if you want the facts, they’re already available to the public:
Rackspace Information Center
TechCrunch - Quick, Plug The Internet Back In: Major Rackspace Outage
Laughing Squid - Massive Power Outage At Rackspace’s Dallas Data Center
37Signals - Downtime Explanation
ValleyWag - Truck driver in Texas kills all the websites you really use
I don’t know of any additional information besides what is contained within these articles and blog posts. However, I can tell you that I’ve never worked for a company before that pulled together in such large numbers to get on the phones and respond to tickets long after shifts should have been over. Obviously, it was a horrible “perfect storm” type of situation, and no one would wish for it to happen to anyone.
Over the last five years, I’ve had dedicated servers and VPS accounts with seven companies. Out of those seven, five have had major outages. After those outages, I can honestly say I received a timely and courteous response from only one of the companies in that list. In one situation with a certain Texas hosting company, I had no network connectivity for almost 72 hours with no response to phone calls or trouble tickets.
After it’s all said and done (and it’s not done yet), I find myself to be very proud of the company for which I work. Server parts will eventually fail, as will networks, generators and power grids - it’s inevitable. The important part is that the will of those who are providing the support never fails.
Our will is strong, and it continues to stay that way.
Oct 17 2007
Enforcing mode requested but no policy loaded. Halting now.
Posted by: major in Emergency, Kernel Panics, Security
Here’s a pretty weird kernel panic that I came across the other day:
Enforcing mode requested but no policy loaded. Halting now.
Kernel panic - not syncing: Attempted to kill init!
This usually means that you’ve set SELinux to enforcing mode in /etc/selinux/config (on Red Hat systems, /etc/sysconfig/selinux is a link to the same file) but you don’t have the appropriate SELinux policy packages installed. To fix the issue, boot the server into the Red Hat rescue environment and disable SELinux until you can install the packages that contain the SELinux targeted policy.
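As a rough sketch of the "disable it from rescue mode" step: the helper below rewrites only the SELINUX= line of a config file and keeps a backup. The function name set_selinux_mode is made up for illustration, and the /mnt/sysimage mount point in the usage comment is an assumption about where your rescue environment mounts the installed system, so check that first.

```shell
# Hypothetical helper: rewrite the SELINUX= line in an SELinux config file.
# $1 = path to the config file, $2 = mode (enforcing, permissive or disabled).
set_selinux_mode() {
    config="$1"
    mode="$2"
    case "$mode" in
        enforcing|permissive|disabled) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # Keep a copy of the original, then replace only the SELINUX= line.
    # SELINUXTYPE= and comment lines do not match the pattern and stay as-is.
    cp "$config" "$config.bak" &&
    sed "s/^SELINUX=.*/SELINUX=$mode/" "$config.bak" > "$config"
}

# From the rescue shell, assuming the system is mounted at /mnt/sysimage:
#   set_selinux_mode /mnt/sysimage/etc/selinux/config disabled
```

Once the policy packages are installed, the same helper can switch the mode back to enforcing.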
This kernel panic appeared on a Red Hat Enterprise Linux 4 Update 5 server.
Aug 25 2007
DB function failed with error number 1033
Posted by: major in Database, Emergency
One of these errors might appear on your website without warning:
Warning: DB function failed with error number 1033
Incorrect information in file: './database_name/table_name.frm' SQL=SELECT col1, col2 FROM table_name WHERE col3 = 'some_value' ORDER BY col1 ASC
MySQL is telling you that the table structure it has within data files doesn’t match the structure in the .frm file that’s on the disk. There are only a few scenarios where this can happen:
Different version of the .frm files
If the .frm files from an older or later version of the table are placed in MySQL’s data directory, MySQL will become confused and it won’t be able to determine the proper database structure.
Pending table alteration
A pending database operation that ran an ALTER TABLE may not have written changes to the disk. MySQL may have stopped running abruptly or the entire server may have crashed. The normal operation for MySQL is to make changes in memory first and then perform disk operations.
Complete weirdness
I cannot explain it, and I can’t figure out the logic that would allow it to happen, but some web application vulnerabilities can cause this problem. I’ve seen it happen with Joomla! sites running on fairly secure servers, and there was no Apache privilege escalation used to modify the .frm files directly.
How is it fixed? The only way to repair it is to import the table again from a mysqldump backup, find the correct .frm file and restore it on the server, or run an ALTER TABLE to bring the table back to its original state.
Aug 24 2007
Apache: No space left on device: Couldn’t create accept lock
Posted by: major in Emergency, Web
This error completely stumped me a couple of weeks ago. Apparently someone was adjusting the Apache configuration, then they checked their syntax and attempted to restart Apache. It went down without a problem, but it refused to start properly, and didn’t bind to any ports.
Within the Apache error logs, this message appeared over and over:
[emerg] (28)No space left on device: Couldn't create accept lock
Apache is basically saying “I want to start, but I need to write some things down before I can start, and I have nowhere to write them!” If this happens to you, check these items in order:
1. Check your disk space
This comes first because it’s the easiest to check, and sometimes the quickest to fix. If you’re out of disk space, then you need to fix that problem.
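A quick way to spot a full filesystem is to filter the output of df. The sketch below assumes the POSIX `df -P` layout (usage percentage in column 5, mount point in column 6); the function name full_filesystems and the 90% default are arbitrary choices for illustration, and mount points containing spaces would need extra handling.

```shell
# Hypothetical helper: print mount points whose usage is at or above a
# threshold (default 90%). Reads `df -P` style output on stdin.
full_filesystems() {
    threshold="${1:-90}"
    awk -v limit="$threshold" 'NR > 1 {
        use = $5
        sub(/%/, "", use)          # strip the percent sign before comparing
        if (use + 0 >= limit) print $6, $5
    }'
}

# Typical use:
#   df -P | full_filesystems 90
```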
2. Review filesystem quotas
If your filesystem uses quotas, you might be reaching a quota limit rather than a disk space limit. Use repquota / to review your quotas on the root partition. If you’re at the limit, raise your quota or clear up some disk space. Apache logs are usually the culprit in these situations.
3. Clear out your active semaphores
Semaphores? What the heck is a semaphore? Well, it’s actually an apparatus for conveying information by means of visual signals. But, when it comes to programming, semaphores are used for communicating between the active processes of a certain application. In the case of Apache, they’re used to communicate between the parent and child processes. If Apache can’t write these things down, then it can’t communicate properly with all of the processes it starts.
I’d assume if you’re reading this article, Apache has stopped running. Run this command as root:
# ipcs -s
If you see a list of semaphores, Apache has not cleaned up after itself, and some semaphores are stuck. Clear them out with this command:
# for i in `ipcs -s | awk '/httpd/ {print $2}'`; do (ipcrm -s $i); done
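The one-liner above only matches lines that contain "httpd", which works when the semaphores are owned by a user of that name. On systems where Apache runs under a different account (often apache on Red Hat style installs), a variant that takes the owner as an argument may be easier to adapt. This is a sketch: semids_owned_by is a made-up name, and it assumes the usual Linux `ipcs -s` layout with the semid in column 2 and the owner in column 3.

```shell
# Hypothetical helper: read `ipcs -s` output on stdin and print the IDs of
# semaphore arrays owned by the given user. Header lines never match
# because their third column is not a user name.
semids_owned_by() {
    awk -v owner="$1" '$3 == owner { print $2 }'
}

# Review the list first, then remove the arrays as root, for example:
#   ipcs -s | semids_owned_by apache
#   ipcs -s | semids_owned_by apache | while read -r id; do ipcrm -s "$id"; done
```

Matching on the owner column exactly avoids removing semaphores that belong to some other service whose line merely happens to contain the same text.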
Now, in almost all cases, Apache should start properly. If it doesn’t, you may just be completely out of available semaphores. You may want to increase your available semaphores, and you’ll need to tickle your kernel to do so. Add this to /etc/sysctl.conf:
kernel.msgmni = 1024
kernel.sem = 250 256000 32 1024
And then run sysctl -p to pick up the new changes.
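The four numbers in kernel.sem are, in order, SEMMSL, SEMMNS, SEMOPM and SEMMNI. A small sketch like the one below (describe_sem_limits is a name invented here) can label them so you can compare the running values with what you put in /etc/sysctl.conf.

```shell
# Hypothetical helper: label the four values of kernel.sem.
# Reads one line on stdin in the kernel's order: SEMMSL SEMMNS SEMOPM SEMMNI.
describe_sem_limits() {
    read -r semmsl semmns semopm semmni
    echo "max semaphores per array (SEMMSL): $semmsl"
    echo "max semaphores system-wide (SEMMNS): $semmns"
    echo "max operations per semop call (SEMOPM): $semopm"
    echo "max semaphore arrays (SEMMNI): $semmni"
}

# Current values on a running system:
#   describe_sem_limits < /proc/sys/kernel/sem
```

If Apache cannot get a semaphore array at all, the last value (the number of arrays) is usually the one to look at.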
Further reading:
Wikipedia: Semaphore (Programming)
Apache accept lock fix
Aug 23 2007
MySQL couldn’t find log file
Posted by: major in Database, Emergency
This error will pop up when binary logging is enabled, and someone thought it was a good idea to remove binary logs from the filesystem:
/usr/sbin/mysqld: File './mysql_bin.000025' not found (Errcode: 2)
[ERROR] Failed to open log (file './9531_mysql_bin.000025', errno 2)
[ERROR] Could not open log file
[ERROR] Can't init tc log
[ERROR] Aborting
InnoDB: Starting shutdown...
InnoDB: Shutdown completed; log sequence number 0 2423986213
[Note] /usr/sbin/mysqld: Shutdown complete
Basically, MySQL is looking in the mysql-bin.index file and it cannot find the log files that are listed within the index. This will keep MySQL from starting, but the fix is quick and easy. You have two options:
Edit the index file
You can edit the mysql-bin.index file in a text editor of your choice and remove the references to any logs which don’t exist on the filesystem any longer. Once you’re done, save the index file and start MySQL.
Take away the index file
Move or delete the index file and start MySQL. This will cause MySQL to reset its binary log numbering scheme, so if this is important to you, you may want to choose the previous option.
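For the first option, the editing can be scripted instead of done by hand. The following is only a sketch: existing_binlog_entries is a hypothetical name, it assumes relative entries in the index are relative to the directory that holds the index file, and the /var/lib/mysql path in the usage comment is an assumption about your data directory.

```shell
# Hypothetical helper: print only the index entries whose log file still
# exists. Relative entries (such as ./mysql-bin.000025) are resolved against
# the directory that holds the index file.
existing_binlog_entries() {
    index="$1"
    dir=$(dirname "$index")
    while IFS= read -r entry; do
        [ -n "$entry" ] || continue            # skip blank lines
        case "$entry" in
            /*) path="$entry" ;;               # absolute path, use as-is
            *)  path="$dir/$entry" ;;          # relative to the index file
        esac
        [ -f "$path" ] && echo "$entry"
    done < "$index"
    return 0
}

# With MySQL stopped, and assuming the data directory is /var/lib/mysql:
#   cd /var/lib/mysql
#   cp mysql-bin.index mysql-bin.index.bak
#   existing_binlog_entries mysql-bin.index.bak > mysql-bin.index
```

Writing the result back with a redirect keeps the ownership of the original index file, which matters because MySQL needs to be able to write to it.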
So how do you prevent this from happening? Use the PURGE MASTER LOGS statement and allow MySQL to delete its logs on its own terms. If you’re concerned about log files piling up, adjust the expire_logs_days variable in your /etc/my.cnf.
Further reading:
12.6.1.1. PURGE MASTER LOGS Syntax
5.2.3 System Variables
Aug 21 2007
Qmail-smtpd spawns many processes and uses 100% of CPU
Posted by: major in Emergency, Mail, Plesk
It’s not abnormal for qmail to act oddly at times with Plesk, and sometimes it can use 100% of the CPU. However, if you find qmail’s load to be higher than usual with a small volume of mail, there may be a fix that you need.
First off, check for two files in /var/qmail/control called dh512.pem and dh1024.pem. If they are present, well, then this article won’t be able to help you. You have a different issue that is causing increased CPU load (check for swap usage and upgrade your disk’s speed).
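The check can be done by eye with ls, or with a small helper along these lines (missing_dh_files is a name invented for this sketch; the default directory is the one mentioned above):

```shell
# Hypothetical helper: list which of the two .pem files are missing from a
# qmail control directory (default: /var/qmail/control).
missing_dh_files() {
    control_dir="${1:-/var/qmail/control}"
    for name in dh512.pem dh1024.pem; do
        [ -f "$control_dir/$name" ] || echo "$name"
    done
}

# Prints nothing when both files are present:
#   missing_dh_files
```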
If the files aren’t there, do the following:
# cd /var/qmail/control
# cp dhparam512.pem dh512.pem
# cp dhparam1024.pem dh1024.pem
# /etc/init.d/qmail restart
# /etc/init.d/xinetd restart
At this point, your CPU load should be reduced once the currently running processes for qmail clear out.
So why is this fix required? Without dh512.pem and dh1024.pem, qmail has to generate Diffie-Hellman parameters whenever other mail servers or mail users connect to qmail via TLS. Generating them on the fly is expensive, so you will get a big performance hit, and your load will be much higher than it needs to be. By copying the dhparam files over, you give qmail pre-generated parameters, and all it has to do is pick them up off the file system rather than regenerating them each time.
Further reading:
SWsoft Forums: Qmail-smtpd spawning many processes, using full cpu
Jul 01 2007
Repair auto_increment in MySQL
Posted by: major in Database, Emergency
Table corruption in MySQL can often wreak havoc on the auto_increment fields. I’m still unsure why it happens, but if you find a table tries to count from 0 after a table corruption, just find the highest key in the column and add 1 to it (in this example, I’ll say the highest key is 9500).
Just run this one SQL statement on the table:
ALTER TABLE brokentablename AUTO_INCREMENT=9501;
If you run a quick insert and then run SELECT last_insert_id(), the correct key number should be returned (9501 in this case).
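If you would rather not do the "highest key plus one" arithmetic by hand, it can be scripted. This is a sketch under assumptions: auto_increment_fix is a made-up helper, and the column name id and the database name database_name in the usage comment are placeholders for your own.

```shell
# Hypothetical helper: given a table name and the highest existing key,
# print the ALTER TABLE statement that sets the counter to the next value.
auto_increment_fix() {
    table="$1"
    highest="$2"
    printf 'ALTER TABLE %s AUTO_INCREMENT=%d;\n' "$table" "$((highest + 1))"
}

# Assuming the key column is called "id" and the database "database_name":
#   highest=$(mysql -N -B -e 'SELECT MAX(id) FROM brokentablename' database_name)
#   auto_increment_fix brokentablename "$highest" | mysql database_name
```

Printing the statement first, rather than running it directly, gives you a chance to check the number before the table is altered.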
Jun 18 2007
Corrupt /dev/null
Posted by: major in Command Line, Emergency
If you find that /dev/null is no longer a character device, and it causes issues during init on Red Hat boxes, you will need to follow these steps to return things to normal:
- Reboot the server
- When grub appears, edit your kernel line to include init=/bin/bash at the end
- Allow the server to boot into the emergency shell
- Run the following three commands
# rm -rf /dev/null
# mknod /dev/null c 1 3
# chmod 666 /dev/null
You should be back to normal. Make sure that the root users on your server don’t use cp or mv with /dev/null as this will cause some pretty ugly issues.
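To confirm the repair (or to detect the problem in the first place), you can test whether the path is a character device. A minimal sketch, with check_char_device as an invented name:

```shell
# Hypothetical helper: report whether a path is a character device,
# which /dev/null must be.
check_char_device() {
    if [ -c "$1" ]; then
        echo "$1 is a character device"
    else
        echo "$1 is NOT a character device" >&2
        return 1
    fi
}

# Example:
#   check_char_device /dev/null && ls -l /dev/null
```

In the ls -l output, a healthy /dev/null starts with crw-rw-rw- and shows the major and minor numbers 1, 3, matching the mknod and chmod commands above.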
Jun 14 2007
Rebuild RPM file permissions and ownerships
Posted by: major in Command Line, Emergency, Security
If you find that someone has done a recursive chmod or chown on a server, don’t fret. You can set almost everything back to its original permissions and ownership by doing the following:
rpm -qa | xargs rpm --setperms --setugids
Depending on how many packages are installed as well as the speed of your disk I/O, this may take a while to complete.
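If you want to see what rpm considers changed, before or after the reset, `rpm -Va` prints one line per file that differs from the RPM database. The sketch below filters that output for mode (M), owner (U) and group (G) differences. files_with_wrong_perms is a made-up name, and the filter assumes the flag string is the first field and the path is the last, so paths containing spaces would need more care.

```shell
# Hypothetical helper: read `rpm -Va` output on stdin and print the files
# whose mode (M), owner (U) or group (G) differs from the RPM database.
# Lines such as "missing ..." are skipped because their first field is not
# a verify flag string.
files_with_wrong_perms() {
    awk '$1 ~ /^[.?A-Z0-9]+$/ && $1 ~ /[MUG]/ { print $NF }'
}

# Preview the affected files:
#   rpm -Va | files_with_wrong_perms
```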
This article comes from the ChinaUnix blog. To view the original, go to: http://blog.chinaunix.net/u/25994/showart_426165.html