Consoling Sun machines with our products.
When a serial device such as a terminal server, PC, or even a dumb terminal is used to "console" a Sun machine, the Sun machine will "halt" and stop running when the consoling device is power cycled.
So far, we've seen this problem only on Sun workstations and servers. Further, it happens not only with our products, but also with Cisco Access Servers, Specialix Jetstream servers, the serial ports on the back of most PCs, dumb terminals, etc. In other words, it appears to be a problem with what the Sun system considers to be an RS-232 "break condition", rather than a break actually being sent intentionally by the consoling device.
Once the system is halted, you can restore service by typing "c" for continue, but the downtime is unacceptable, and you also risk data loss or other problems as a result of halting UNIX so abruptly.
A BREAK is defined as a "space" condition held on an RS-232/V.24 line for some number of milliseconds (typically about 125ms to 500ms; a POSIX tcsendbreak() call with a duration of zero is specified to hold it for at least 250ms and no more than 500ms). The normal (idle) condition on the line is a "mark".
A space = logical 0 = positive voltage between +3 and +12V on RS-232/V.24
A mark = logical 1 = negative voltage between -3 and -12V on RS-232/V.24
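As a rough illustration of the two definitions above, the following Python sketch maps a measured line voltage to a signal state. The function name rs232_state and the simple 3V thresholds are assumptions made for this example. The band between -3V and +3V is not a valid level under the standard, and how a real receiver treats it varies by part.

```python
def rs232_state(volts):
    """Classify a line voltage as seen by an idealized RS-232 receiver.

    Illustrative only: real receivers add hysteresis and differ in how
    they resolve voltages inside the -3V..+3V transition band.
    """
    if volts >= 3.0:
        return "space"      # logical 0
    if volts <= -3.0:
        return "mark"       # logical 1, the idle state of the line
    return "undefined"      # transition band, not a valid level


for v in (9.0, -9.0, 0.5):
    print(v, rs232_state(v))
```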
A normal async character consists of:
- a start bit (space)
- 7 or 8 data bits (marks or spaces)
- an optional parity bit
- 1, 1.5 or 2 stop bits (mark)
The BREAK sequence starts off with a start bit (space), followed by a run of spaces (all zeros) lasting longer than the transmission time of a "normal" asynchronous character; the receiving side therefore detects it as a BREAK condition requiring attention.
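To put numbers on "longer than a normal character", here is a minimal Python sketch that computes how long one character frame lasts and checks whether a run of space would outlast it. The function names and the simple "longer than one frame" rule are assumptions for illustration. Actual UARTs detect a break from a framing error with all-zero data, and their exact thresholds differ.

```python
def frame_bits(data_bits=8, parity=False, stop_bits=1.0):
    """Bit times in one async character: start + data + optional parity + stop."""
    return 1 + data_bits + (1 if parity else 0) + stop_bits


def frame_time_ms(baud, data_bits=8, parity=False, stop_bits=1.0):
    """Duration of one character frame in milliseconds at the given baud rate."""
    return 1000.0 * frame_bits(data_bits, parity, stop_bits) / baud


def looks_like_break(space_ms, baud, data_bits=8, parity=False, stop_bits=1.0):
    """True if the line stayed at space for longer than one full frame."""
    return space_ms > frame_time_ms(baud, data_bits, parity, stop_bits)


# One 8N1 frame at 9600 baud is 10 bit times, a little over 1 ms.
print(frame_time_ms(9600))
print(looks_like_break(250.0, 9600))  # a deliberate 250 ms break
print(looks_like_break(2.0, 9600))    # a 2 ms disturbance
print(looks_like_break(0.5, 9600))    # shorter than one frame
```

Under these assumptions, a disturbance of only a couple of milliseconds is already longer than a full character at 9600 baud, so the receiver has no way to tell it from an intentional, much longer BREAK.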
When RS-232 line drivers lose and/or regain power, the RS-232/V.24 signal can "float" and cause false signaling that the Sun system interprets as a BREAK signal. This is why many vendors advise that BREAK be disabled on console ports by default if the system is to be consoled remotely.
Most (multi)serial port devices from vendors such as ourselves typically send a small "glitch" of energy at one or more of these events:
- The access server is powered on.
- The access server is powered off.
- The serial controller hardware of the access server is reset.
This "glitch" looks like a "BREAK" signal to the Sun system. By default, Sun Microsystems computers will halt execution of the operating system and drop into the ROM monitor upon receipt of a BREAK. This behavior is intended to facilitate debugging system hangs and serious performance issues. Other UNIX systems may have this behavior as well, but so far, we've only seen it in Suns.
Digi Product Solutions/Workarounds:
Because the spurious BREAK signal is an artifact of physical layer issues, a solution is required that either prevents the BREAK signal from reaching the Sun, or causes the Sun to not interpret the BREAK signal as a halt command.
- PortServer TS8, PortServer TS16 and the Digi CM products are Sun Break Safe.
- Power Control Modules for Etherlite Products: ASP Technologies (800) 516-0841 sells a PWR-001 which can be used to prevent the problem from occurring on Etherlite 2, 8, and 16 port units.
- A simple workaround on Solaris 2.6 and higher is to edit the /etc/default/kbd file to disable halt on break. This file is self documented.
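As a sketch of that last workaround: on the Solaris releases we are familiar with, the relevant entry in /etc/default/kbd is KEYBOARD_ABORT, which ships commented out. The excerpt below is illustrative rather than a copy of any particular release's file, so check the comments in your own copy before changing it.

```text
# /etc/default/kbd (excerpt)
# Uncomment the following line so the console ignores BREAK
# (this also disables the keyboard abort sequence, e.g. Stop-A):
KEYBOARD_ABORT=disable
```

The change normally takes effect after running "kbd -i" as root or after the next reboot.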
Mar 22, 2019