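Preface
The UPS I had been using was a CyberPower OLS1000E.
Starting in October 2025, the server began shutting down abruptly at around 15:30 every Wednesday and rebooting immediately. The server logs contained only boot messages and nothing at all about a shutdown, so I suspected that the UPS was cutting its output at that moment and restoring it right away. The UPS log, however, showed no errors whatsoever. I then switched the UPS to bypass mode and the problem disappeared, which confirmed that the UPS was at fault.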
redstone1024@myredstone-debian:~$ sudo journalctl --list-boots -r
IDX BOOT ID FIRST ENTRY LAST ENTRY
0 6452dedee14c4fc893b535810ac25779 Wed 2025-10-22 18:51:49 CST Thu 2025-10-30 18:53:36 CST
-1 a21fbae8486444b7a2f34c6354fc0ecf Wed 2025-10-22 15:32:00 CST Wed 2025-10-22 18:45:45 CST
-2 f86a791dcb104456a915c78a60799649 Wed 2025-10-15 15:38:35 CST Wed 2025-10-22 15:28:20 CST
-3 502c6a79302a4b02888e3f63d68d47d3 Wed 2025-10-08 15:32:55 CST Wed 2025-10-15 15:28:55 CST
-4 e22abe8dbf924ca79923b652af02e728 Thu 2025-10-02 18:19:42 CST Wed 2025-10-08 15:29:31 CST
-5 ff49f2f74a2c48be9f6412b3de4b6d83 Thu 2025-10-02 17:25:56 CST Thu 2025-10-02 18:15:14 CST
-6 1431417c43574cafaf7c5b768ce19207 Thu 2025-10-02 16:14:39 CST Thu 2025-10-02 17:18:58 CST
-7 4c564ecbdf0a4c0aa0a2ce72b8cd14ab Wed 2025-10-01 15:33:03 CST Thu 2025-10-02 16:09:08 CST
-8 674e335233f0444da865210239596234 Wed 2025-07-16 00:11:49 CST Wed 2025-10-01 15:29:59 CST
-9 7b783abea03d42eb983108d1bf50b335 Tue 2025-07-15 23:46:58 CST Tue 2025-07-15 23:55:43 CST
-10 1dab7080a9244ceeba7e6e6ab1e522e1 Fri 2024-07-05 21:19:23 CST Tue 2025-07-15 23:13:24 CST
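The weekly pattern can also be pulled out of the boot list mechanically. The following is a minimal sketch, assuming the column layout shown above (index, boot ID, then weekday, date, time and time zone of the first entry); the function name boots_in_window and its arguments are hypothetical and not part of journalctl.

```shell
# Print the date and time of every boot whose first journal entry falls on a
# given weekday inside a time window. Reads `journalctl --list-boots` output
# on stdin.
#   $1 = weekday abbreviation (e.g. Wed)
#   $2 = window start as HH:MM
#   $3 = window end as HH:MM
boots_in_window() {
    awk -v day="$1" -v from="$2" -v to="$3" '
        # Skip the header line: real rows start with a (possibly negative) index.
        $1 ~ /^-?[0-9]+$/ && $3 == day {
            hhmm = substr($5, 1, 5)          # HH:MM part of the first-entry time
            if (hhmm >= from && hhmm <= to)  # zero-padded strings compare correctly
                print $4, $5
        }'
}
```

It could be used as: sudo journalctl --list-boots -r | boots_in_window Wed 15:30 15:40. On the listing above this selects the boots of 10-01, 10-08, 10-15 and 10-22, each of which started a few minutes after the previous boot's last entry.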
I have therefore decided to replace the UPS with an APC SPM1K.
Removing the old CyberPower UPS
- Record information
redstone1024@myredstone-debian:~$ sudo pwrstat -status
The UPS information shows as following:
Properties:
Model Name................... OLS1000E
Firmware Number.............. 6.1.6
Rating Voltage............... 220 V
Rating Power................. 900 Watt(1000 VA)
Current UPS status:
State........................ Normal
Power Supply by.............. Utility Power
Utility Voltage.............. 223 V
Output Voltage............... 223 V
Utility Frequency............ 50.0 Hz
Output Frequency............. 50.0 Hz
Battery Capacity............. 100 %
Remaining Runtime............ 22 min.
Load......................... 207 Watt(23 %)
Test Result.................. Passed at 2025/10/22 16:04:32
Last Power Event............. Blackout at 2025/09/03 15:31:36 for 7 sec.
redstone1024@myredstone-debian:~$ sudo journalctl -u pwrstatd -r
Oct 22 18:53:23 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 22 18:53:23 myredstone-debian pwrstatd[1428]: Starting pwrstatd 1.4.1.
Oct 22 18:53:23 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot a21fbae8486444b7a2f34c6354fc0ecf --
Oct 22 18:45:27 myredstone-debian systemd[1]: pwrstatd.service: Consumed 3.750s CPU time, 6.9M memory peak.
Oct 22 18:45:27 myredstone-debian systemd[1]: Stopped pwrstatd.service - The monitor UPS software..
Oct 22 18:45:27 myredstone-debian systemd[1]: pwrstatd.service: Deactivated successfully.
Oct 22 18:45:27 myredstone-debian pwrstatd[439355]: Stopping pwrstatd 1.4.1.
Oct 22 18:45:27 myredstone-debian systemd[1]: Stopping pwrstatd.service - The monitor UPS software....
Oct 22 15:34:01 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 22 15:34:01 myredstone-debian pwrstatd[1471]: Starting pwrstatd 1.4.1.
Oct 22 15:34:01 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot f86a791dcb104456a915c78a60799649 --
Oct 15 15:40:24 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 15 15:40:24 myredstone-debian pwrstatd[1482]: Starting pwrstatd 1.4.1.
Oct 15 15:40:24 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot 502c6a79302a4b02888e3f63d68d47d3 --
Oct 08 15:34:49 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 08 15:34:49 myredstone-debian pwrstatd[1497]: Starting pwrstatd 1.4.1.
Oct 08 15:34:49 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot e22abe8dbf924ca79923b652af02e728 --
Oct 02 18:21:15 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 02 18:21:15 myredstone-debian pwrstatd[1374]: Starting pwrstatd 1.4.1.
Oct 02 18:21:15 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot ff49f2f74a2c48be9f6412b3de4b6d83 --
Oct 02 18:14:56 myredstone-debian systemd[1]: Stopped pwrstatd.service - The monitor UPS software..
Oct 02 18:14:56 myredstone-debian systemd[1]: pwrstatd.service: Deactivated successfully.
Oct 02 18:14:56 myredstone-debian pwrstatd[169383]: Stopping pwrstatd 1.4.1.
Oct 02 18:14:54 myredstone-debian systemd[1]: Stopping pwrstatd.service - The monitor UPS software....
Oct 02 17:27:04 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 02 17:27:04 myredstone-debian pwrstatd[1428]: Starting pwrstatd 1.4.1.
Oct 02 17:27:03 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot 1431417c43574cafaf7c5b768ce19207 --
Oct 02 17:18:39 myredstone-debian systemd[1]: Stopped pwrstatd.service - The monitor UPS software..
Oct 02 17:18:39 myredstone-debian systemd[1]: pwrstatd.service: Deactivated successfully.
Oct 02 17:18:39 myredstone-debian pwrstatd[275866]: Stopping pwrstatd 1.4.1.
Oct 02 17:18:39 myredstone-debian systemd[1]: Stopping pwrstatd.service - The monitor UPS software....
Oct 02 16:15:46 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 02 16:15:46 myredstone-debian pwrstatd[1347]: Starting pwrstatd 1.4.1.
Oct 02 16:15:46 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot 4c564ecbdf0a4c0aa0a2ce72b8cd14ab --
Oct 02 16:08:31 myredstone-debian systemd[1]: pwrstatd.service: Consumed 26.188s CPU time.
Oct 02 16:08:31 myredstone-debian systemd[1]: Stopped pwrstatd.service - The monitor UPS software..
Oct 02 16:08:31 myredstone-debian systemd[1]: pwrstatd.service: Deactivated successfully.
Oct 02 16:08:30 myredstone-debian pwrstatd[2720528]: Stopping pwrstatd 1.4.1.
Oct 02 16:08:30 myredstone-debian systemd[1]: Stopping pwrstatd.service - The monitor UPS software....
Oct 01 15:34:29 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Oct 01 15:34:29 myredstone-debian pwrstatd[1399]: Starting pwrstatd 1.4.1.
Oct 01 15:34:29 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot 674e335233f0444da865210239596234 --
Jul 16 00:12:54 myredstone-debian systemd[1]: Started pwrstatd.service - The monitor UPS software..
Jul 16 00:12:54 myredstone-debian pwrstatd[1254]: Starting pwrstatd 1.4.1.
Jul 16 00:12:54 myredstone-debian systemd[1]: Starting pwrstatd.service - The monitor UPS software....
-- Boot 7b783abea03d42eb983108d1bf50b335 --
- Uninstall the driver
redstone1024@myredstone-debian:~$ sudo -i command -v pwrstat
/usr/sbin/pwrstat
redstone1024@myredstone-debian:~$ sudo dpkg -S /usr/sbin/pwrstat
powerpanel: /usr/sbin/pwrstat
redstone1024@myredstone-debian:~$ sudo apt-get purge -y powerpanel
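- Shut down the server normally
- Remove the CyberPower UPS
Installing the new APC UPS
- Install the APC UPS
- Connect the server to the UPS via USB
- Boot the server normally
- Install the driver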
redstone1024@myredstone-debian:~$ sudo apt-get update
redstone1024@myredstone-debian:~$ sudo apt-get install apcupsd
- Configure the driver for a USB serial connection
redstone1024@myredstone-debian:~$ sudo vim /etc/apcupsd/apcupsd.conf
...
UPSCABLE smart
UPSTYPE apcsmart
DEVICE /dev/ttyUSB0
...
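- Modify the EEPROM settings
  - RETURNCHARGE 50: after power returns, restore output once the battery charge is at 50% or above.
  - LOWBATT 5: during a power failure, sound the continuous low-battery alarm once the estimated remaining runtime drops below 5 minutes.
  - SLEEP 180: after receiving the shutdown signal, wait a 180-second shutdown grace delay, then turn off output.
  - WAKEUP 60: after power returns, wait 60 seconds before restoring output.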
redstone1024@myredstone-debian:~$ sudo systemctl stop apcupsd
redstone1024@myredstone-debian:~$ sudo apctest
...
Config Current Permitted
Description Directive Value Values
===================================================================
Upper transfer voltage HITRANSFER 264 231 242 253 264
Lower transfer voltage LOTRANSFER 176 187 176 165 154
Return threshold RETURNCHARGE 50 00 15 50 90
Output voltage on batts OUTPUTVOLTS 220 220 230 240
Low battery warning LOWBATT 5 02 05 07 10
Shutdown grace delay SLEEP 180 020 180 300 600
Alarm delay BEEPSTATE 0 0 T L N
Wakeup delay WAKEUP 60 000 060 180 300
Self test interval SELFTEST 336 336 168 ON OFF
===================================================================
...
redstone1024@myredstone-debian:~$ sudo systemctl start apcupsd
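Once the daemon is running again, it is worth confirming that it can still talk to the UPS. The sketch below extracts a single field from the "NAME : value" lines that apcaccess status prints; the helper name apc_field is hypothetical, and the fields mentioned (STATUS, BCHARGE, TIMELEFT) are ones apcaccess normally reports, although the exact set depends on the UPS model.

```shell
# Print the value of one field from `apcaccess status` output read on stdin.
#   $1 = field name, e.g. STATUS, BCHARGE or TIMELEFT
apc_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")                  # first colon separates name and value
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            gsub(/^ +| +$/, "", name)             # trim the padding around the name
            if (name == key) {
                val = substr($0, pos + 1)
                gsub(/^ +| +$/, "", val)          # trim the padding around the value
                print val
                exit
            }
        }'
}
```

For example, apcaccess status | apc_field STATUS prints the current status word. If the daemon cannot reach the UPS, the field reports a communication problem instead of a normal status, which usually points at the UPSCABLE, UPSTYPE or DEVICE settings.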
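- Modify the driver settings
  - UPSCABLE smart: connect to the UPS using the Smart protocol.
  - UPSTYPE apcsmart: connect to the UPS using the Smart protocol.
  - DEVICE /dev/serial/by-id/usb-04e2_1410-if00-port0: a persistent, stable device path for the UPS.
  - ONBATTERYDELAY 30: during a power failure, wait 30 seconds before starting to evaluate the shutdown conditions.
  - BATTERYLEVEL 25: during a power failure, start the shutdown procedure once the battery charge is at or below 25%.
  - MINUTES 10: during a power failure, start the shutdown procedure once the estimated remaining runtime is at or below 10 minutes.
  - TIMEOUT 0: disable the fixed-time shutdown condition.
  - EVENTSFILEMAX 4096: limit the events log file to 4096 KB.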
redstone1024@myredstone-debian:~$ sudo vim /etc/apcupsd/apcupsd.conf
...
#
# ========= General configuration parameters ============
#
...
# UPSCABLE <cable>
# Defines the type of cable connecting the UPS to your computer.
#
# Possible generic choices for <cable> are:
# simple, smart, ether, usb
#
# Or a specific cable model number may be used:
# 940-0119A, 940-0127A, 940-0128A, 940-0020B,
# 940-0020C, 940-0023A, 940-0024B, 940-0024C,
# 940-1524C, 940-0024G, 940-0095A, 940-0095B,
# 940-0095C, 940-0625A, M-04-02-2000
#
UPSCABLE smart
# To get apcupsd to work, in addition to defining the cable
# above, you must also define a UPSTYPE, which corresponds to
# the type of UPS you have (see the Description for more details).
# You must also specify a DEVICE, sometimes referred to as a port.
# For USB UPSes, please leave the DEVICE directive blank. For
# other UPS types, you must specify an appropriate port or address.
#
# UPSTYPE DEVICE Description
# apcsmart /dev/tty** Newer serial character device, appropriate for
# SmartUPS models using a serial cable (not USB).
#
# usb <BLANK> Most new UPSes are USB. A blank DEVICE
# setting enables autodetection, which is
# the best choice for most installations.
#
# net hostname:port Network link to a master apcupsd through apcupsd's
# Network Information Server. This is used if the
# UPS powering your computer is connected to a
# different computer for monitoring.
#
# snmp hostname:port:vendor:community
# SNMP network link to an SNMP-enabled UPS device.
# Hostname is the ip address or hostname of the UPS
# on the network. Vendor can be can be "APC" or
# "APC_NOTRAP". "APC_NOTRAP" will disable SNMP trap
# catching; you usually want "APC". Port is usually
# 161. Community is usually "private".
#
# netsnmp hostname:port:vendor:community
# OBSOLETE
# Same as SNMP above but requires use of the
# net-snmp library. Unless you have a specific need
# for this old driver, you should use 'snmp' instead.
#
# dumb /dev/tty** Old serial character device for use with
# simple-signaling UPSes.
#
# pcnet ipaddr:username:passphrase:port
# PowerChute Network Shutdown protocol which can be
# used as an alternative to SNMP with the AP9617
# family of smart slot cards. ipaddr is the IP
# address of the UPS management card. username and
# passphrase are the credentials for which the card
# has been configured. port is the port number on
# which to listen for messages from the UPS, normally
# 3052. If this parameter is empty or missing, the
# default of 3052 will be used.
#
# modbus /dev/tty** Serial device for use with newest SmartUPS models
# supporting the MODBUS protocol.
# modbus <BLANK> Leave the DEVICE setting blank for MODBUS over USB
# or set to the serial number of the UPS to ensure
# that apcupsd binds to that particular unit
# (helpful if you have more than one USB UPS).
#
UPSTYPE apcsmart
DEVICE /dev/serial/by-id/usb-04e2_1410-if00-port0
...
#
# ======== Configuration parameters used during power failures ==========
#
# The ONBATTERYDELAY is the time in seconds from when a power failure
# is detected until we react to it with an onbattery event.
#
# This means that, apccontrol will be called with the powerout argument
# immediately when a power failure is detected. However, the
# onbattery argument is passed to apccontrol only after the
# ONBATTERYDELAY time. If you don't want to be annoyed by short
# powerfailures, make sure that apccontrol powerout does nothing
# i.e. comment out the wall.
ONBATTERYDELAY 30
#
# Note: BATTERYLEVEL, MINUTES, and TIMEOUT work in conjunction, so
# the first that occurs will cause the initation of a shutdown.
#
# If during a power failure, the remaining battery percentage
# (as reported by the UPS) is below or equal to BATTERYLEVEL,
# apcupsd will initiate a system shutdown.
BATTERYLEVEL 25
# If during a power failure, the remaining runtime in minutes
# (as calculated internally by the UPS) is below or equal to MINUTES,
# apcupsd, will initiate a system shutdown.
MINUTES 10
# If during a power failure, the UPS has run on batteries for TIMEOUT
# many seconds or longer, apcupsd will initiate a system shutdown.
# A value of 0 disables this timer.
#
# Note, if you have a Smart UPS, you will most likely want to disable
# this timer by setting it to zero. That way, you UPS will continue
# on batteries until either the % charge remaing drops to or below BATTERYLEVEL,
# or the remaining battery runtime drops to or below MINUTES. Of course,
# if you are testing, setting this to 60 causes a quick system shutdown
# if you pull the power plug.
# If you have an older dumb UPS, you will want to set this to less than
# the time you know you can run on batteries.
TIMEOUT 0
...
#
# ==== Configuration statements for Network Information Server ====
#
...
# EVENTSFILEMAX <kilobytes>
# By default, the size of the EVENTSFILE will be not be allowed to exceed
# 10 kilobytes. When the file grows beyond this limit, older EVENTS will
# be removed from the beginning of the file (first in first out). The
# parameter EVENTSFILEMAX can be set to a different kilobyte value, or set
# to zero to allow the EVENTSFILE to grow without limit.
EVENTSFILEMAX 4096
...
redstone1024@myredstone-debian:~$ sudo systemctl restart apcupsd
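According to the configuration comments, BATTERYLEVEL, MINUTES and TIMEOUT work together, and whichever condition is met first starts the shutdown. The rule can be sketched as below, assuming integer inputs and the values chosen in this setup; the function name should_shutdown and its arguments are hypothetical, and apcupsd performs this evaluation internally rather than through any such script.

```shell
# Return 0 (true) if any shutdown condition is met, 1 otherwise.
#   $1 = battery charge in percent (integer)
#   $2 = estimated remaining runtime in minutes (integer)
#   $3 = seconds spent on battery so far (integer)
should_shutdown() (
    charge=$1
    runtime=$2
    on_battery=$3

    # Thresholds copied from the apcupsd.conf settings used in this post.
    batterylevel=25
    minutes=10
    timeout=0

    # Charge at or below BATTERYLEVEL.
    if [ "$charge" -le "$batterylevel" ]; then exit 0; fi
    # Remaining runtime at or below MINUTES.
    if [ "$runtime" -le "$minutes" ]; then exit 0; fi
    # TIMEOUT only counts when it is non-zero; 0 disables the timer.
    if [ "$timeout" -gt 0 ] && [ "$on_battery" -ge "$timeout" ]; then exit 0; fi
    exit 1
)
```

With TIMEOUT set to 0, the time spent on battery never triggers a shutdown by itself: should_shutdown 80 11 100000 still returns 1, while should_shutdown 80 10 5 returns 0 because the runtime condition is met.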
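- Modify the KVM settings
  - ON_SHUTDOWN=shutdown: when the host shuts down, send a graceful shutdown request to the virtual machines.
  - PARALLEL_SHUTDOWN=4: when the host shuts down, shut down at most 4 virtual machines in parallel.
  - SHUTDOWN_TIMEOUT=180: when the host shuts down, wait at most 180 seconds for the virtual machines to shut down gracefully. With parallel shutdown enabled, this timeout covers all guests on a single URI rather than each guest individually.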
redstone1024@myredstone-debian:~$ sudo vim /etc/default/libvirt-guests
...
# action taken on host shutdown
# - suspend all running guests are suspended using virsh managedsave
# - shutdown all running guests are asked to shutdown. Please be careful with
# this settings since there is no way to distinguish between a
# guest which is stuck or ignores shutdown requests and a guest
# which just needs a long time to shutdown. When setting
# ON_SHUTDOWN=shutdown, you must also set SHUTDOWN_TIMEOUT to a
# value suitable for your guests.
ON_SHUTDOWN=shutdown
# Number of guests will be shutdown concurrently, taking effect when
# "ON_SHUTDOWN" is set to "shutdown". If Set to 0, guests will be shutdown one
# after another. Number of guests on shutdown at any time will not exceed number
# set in this variable.
PARALLEL_SHUTDOWN=4
# Number of seconds we're willing to wait for a guest to shut down. If parallel
# shutdown is enabled, this timeout applies as a timeout for shutting down all
# guests on a single URI defined in the variable URIS. If this is 0, then there
# is no time out (use with caution, as guests might not respond to a shutdown
# request). The default value is 300 seconds (5 minutes).
SHUTDOWN_TIMEOUT=180
...
redstone1024@myredstone-debian:~$ sudo systemctl restart libvirt-guests
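The apcupsd and libvirt-guests settings share one time budget: shutdown starts when the estimated runtime falls to MINUTES, and the guests may then use up to SHUTDOWN_TIMEOUT seconds before the host itself can go down. The sketch below computes what is left for the host, assuming both files use the plain "MINUTES n" and "SHUTDOWN_TIMEOUT=n" forms shown above; the function name shutdown_margin is hypothetical, and the result is only as good as the runtime estimate reported by the UPS.

```shell
# Print the seconds of estimated runtime left for the host after the guests
# have used their full shutdown timeout.
#   $1 = path to apcupsd.conf
#   $2 = path to the libvirt-guests defaults file
shutdown_margin() (
    # Take the value of the uncommented MINUTES directive.
    minutes=$(awk '$1 == "MINUTES" { print $2 }' "$1")
    # Take the value of the uncommented SHUTDOWN_TIMEOUT assignment.
    timeout=$(sed -n 's/^SHUTDOWN_TIMEOUT=//p' "$2")
    echo $(( minutes * 60 - timeout ))
)
```

With the values in this post, shutdown_margin /etc/apcupsd/apcupsd.conf /etc/default/libvirt-guests gives 10 * 60 - 180 = 420 seconds. If the battery charge condition (BATTERYLEVEL) triggers first, the actual margin differs, so this is a rough sanity check rather than a guarantee.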