Troubleshooting a 7318 IPX Broadcast Boot


Contents

About this document
Prerequisite information
Checking the 7318's status lights
Checking the interface
Checking daemons
Verifying the IPX network
Remote configurations
Checking for unique IPX internal network numbers
7318 SMIT configuration
Checking file permissions
Checking the IPX communication channel
Setting up error logging
PTF level
Debugging from the BIOS console session

About this document

The purpose of this document is to assist with troubleshooting an IPX broadcast boot for an IBM 7318 model P10 or model S20. This document was designed for AIX System Administrators with a knowledge of AIX administration and the 7318. Using this document, IPX broadcast boot troubleshooting takes about 20 minutes.

In a broadcast load configuration, the 7318 broadcasts a boot request to every host on the network. Hosts that are configured to send the 7318 its load image and configuration file reply to the broadcast. This document contains modules that troubleshoot the broadcast load configuration on the host.

This document applies to AIX versions 3.2 through 4.3.

This document consists of modules that contain nondestructive troubleshooting steps. If a step requires a modification or interrupts regular system operation, the command is labeled with a warning message. This document does not contain detailed explanations of the modules and steps in the configuration methods.

Remote troubleshooting

The 7318 was designed to boot and operate locally to an IBM eServer pSeries or RS/6000 (that is, the pSeries or RS/6000 and 7318 are on the same LAN). However, frequently the 7318's functionality is needed in a remote network (that is, the pSeries or RS/6000 and 7318 are on separate networks, linked by a gateway or router). The 7318 is capable of booting while remote from the computer, but special care needs to be taken for the communication channel. Remote troubleshooting tips will be listed throughout this document when troubleshooting a remote 7318 varies from one that is local to the computer.


Prerequisite information

Assumptions

Prior to the use of this document, it is assumed that the following conditions apply:

Terms used in this document

Requirements

To use this document effectively, it is necessary to:


Checking the 7318's status lights

The lights on the front of the 7318 show its status. From left to right, the lights correspond to Power, Ready, AUI interface in use, and 10Base-T interface in use. Different light sequences will be shown during the 7318's power-on and hardware check, booting, and normal operation. For more information on the light sequences, refer to pages 2-5 in the 7318 Serial Communications Network Server Guide and Reference (SC23-2542-00).

  1. Check the light sequence on the front of the 7318 and compare it to the possible sequences listed below.
    Legend: 0 = light off, 1 = light on, B = light blinking

    Light sequence condition and resolution, if applicable:

    0 0 0 0  The 7318 is not powered on. The 7318 does not have a power switch. You must plug in or unplug the AC cord to power the 7318 on and off.

    1 0 0 0  The 7318 has a hardware problem if this sequence persists for more than one minute. In that case, the 7318 needs to be serviced or replaced.

    1 B 0 0  The 7318 is trying to boot but cannot communicate with the network. Check all cabling connecting the 7318 to the LAN. Replace cabling if necessary. Continue troubleshooting once the 7318 is able to transmit onto the LAN.

    1 B 1 0  The 7318 is trying to boot and can transmit onto the LAN.

    1 B 0 1  The 7318 is trying to boot and can transmit onto the LAN. The 7318 is trying to boot but cannot find a load host. Proceed to the next section to continue troubleshooting.

    1 1 1 0  The 7318 is booted and can transmit onto the LAN.

    1 1 0 1  The 7318 is booted and can transmit onto the LAN. The 7318 has booted and is operating as designed. If you are unable to communicate with the 7318, the problem is beyond booting and the scope of this document.

  2. Determine the next action. If the sequence 1 B 1 0 or 1 B 0 1 appears, continue with the next section. If any other sequence appears, correct or resolve the condition listed in the preceding discussion before continuing to the next section.
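
The decision in step 2 can be sketched as a small shell function. This is an illustration only: the function name light_action is hypothetical, the sequence is typed in by hand after reading the lights, and the notation is assumed to be 0 for off, 1 for on, and B for blinking.

```shell
# Hypothetical helper: map a status-light sequence to the next action.
# The sequence is entered by hand; nothing here reads the 7318 itself.
# Assumed notation: 0 = off, 1 = on, B = blinking.
light_action() {
    case "$1" in
        "0 0 0 0") echo "power on the 7318 using the AC plug" ;;
        "1 0 0 0") echo "service or replace the 7318 if this persists for more than one minute" ;;
        "1 B 0 0") echo "check the cabling between the 7318 and the LAN" ;;
        "1 B 1 0"|"1 B 0 1") echo "continue with the next section" ;;
        "1 1 1 0"|"1 1 0 1") echo "7318 is booted; the problem is outside the scope of this document" ;;
        *) echo "unknown sequence" ;;
    esac
}
```

For example, light_action "1 B 0 1" prints the instruction to continue with the next section.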

Checking the interface

The host communicates with the network through an interface (tr0, en0, et0, fi0). To configure an interface for IPX, the interface must be active (the words <UP, RUNNING> appear in the flags when the ifconfig command is run).

NOTES:

    ent0     Available 00-02     Ethernet High-Performance LAN Adapter (8ef5)
    fddi0    Available 00-03     FDDI Primary Card, Single Ring Fiber
    tok0     Available 00-04     Token-Ring High-Performance Adapter (8fc8)

Verify that the communications interface is active using the ifconfig command and the interface name that corresponds to the adapter (for example, en0 for adapter ent0, or tr0 for adapter tok0).
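
The flag check described above can be sketched as a filter that reads ifconfig output. The function name iface_is_active is hypothetical, and the sketch assumes that the flags appear as a comma-separated list between angle brackets on the first line of the output.

```shell
# Hypothetical helper: succeed if ifconfig output on stdin shows both
# the UP and RUNNING flags inside the <...> flag list.
iface_is_active() {
    # Keep only the text between < and >, and drop any spaces.
    flags=$(sed -n 's/.*<\(.*\)>.*/\1/p' | head -n 1 | tr -d ' ')
    case ",$flags," in
        *,UP,*) ;;
        *) return 1 ;;
    esac
    case ",$flags," in
        *,RUNNING,*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A possible use is: ifconfig en0 | iface_is_active && echo "en0 is active"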

Checking daemons

The Terminal Server depends on several daemons for operation. The sapd and npsd daemons (Novell Protocol Suite) manage the IPX channel for communication, and the cnsview daemon manages all Terminal Server devices and processes on the host. All three daemons must be active for a successful IPX boot and normal Terminal Server operation.

  1. Verify that the IPX npsd daemon is active. Enter:
           ps -ef | grep npsd
    

    Sample output when the daemon is active is as follows:

    root 5150      1 0 06:45:54 -      0:00 ./npsd 
    root 16170 15336 2 15:47:33 pts/1  0:00 grep npsd
    
  2. Check to see if the IPX sapd daemon is active. Enter:
           ps -ef | grep sapd
    

    Sample output when the daemon is active is as follows:

    root 5404      1 0 06:45:52 -     0:09 ./sapd
    root 16178 15336 5 15:47:39 pts/1 0:00 grep sapd
    
  3. Check to see if the cnsview daemon is active. Enter:
        ps -ef | grep cnsview
    

    Sample output when the daemon is active is as follows:

    root 3370      1 0 06:46:03 -     0:03 /usr/bin/cnsview -c daemon start 
    root 18978 15336 6 15:47:25 pts/1 0:00 grep cnsview
    

    If all three processes were listed in the output from the preceding ps commands, proceed to the section "Verifying the IPX network". If the cnsview daemon was active but the sapd or npsd daemon was not, stop the cnsview daemon and restart all three daemons using the sequence in step 4 below.

    To stop the cnsview daemon, enter:

             cnsview -c "daemon stop"
    

    If none of the three processes was listed in the output from the preceding ps commands, proceed to the next step to start the daemons.

  4. To start the sapd and npsd daemons, enter:

    /usr/lpp/netware/bin/startnps
    
  5. To start the cnsview daemon, enter:
    cnsview -c "daemon start"

    NOTE: The cnsview daemon cannot be active until both the sapd and npsd daemons are active. If you stop the daemons and they are still listed as active system processes, execute a kill -1 on the process ID and restart all three daemons, beginning with step 4 above. If this restart is unsuccessful, please contact an IBM AIX Technical Specialist.

  6. Once these daemons are active, proceed to the next section.
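
The three ps checks above can be combined into one pass. The following is a minimal sketch; the function name missing_daemons is hypothetical, and it matches each daemon name as a plain substring of the ps -ef output, so an interactive cnsview command running at the same time would also count as a match.

```shell
# Hypothetical helper: read "ps -ef" output on stdin and print the name
# of each required daemon that does not appear in it.
missing_daemons() {
    ps_output=$(cat)
    for d in npsd sapd cnsview; do
        # Ignore the grep processes themselves, as in the sample output.
        if ! printf '%s\n' "$ps_output" | grep -v ' grep ' | grep -q "$d"; then
            printf '%s\n' "$d"
        fi
    done
}
```

A possible use is: ps -ef | missing_daemons. No output means that all three daemons were found.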

Verifying the IPX network

The host's IPX routing table can be viewed by running the following program:

         /usr/lpp/netware/bin/drouter

If the IPX network is configured properly, it will be listed in the IPX routing table.


Remote configurations

All remote IPX networks will appear in the host's IPX routing table if there is an active IPX communication channel between the two IPX networks (that is, a router or gateway is configured for IPX and passes IPX packets).

  1. To view the host's IPX route table, enter:
             /usr/lpp/netware/bin/drouter
    

    Sample output is as follows:

    NETWORK   HOPS  TIME  NODE          NETWORK   HOPS  TIME  NODE 
    --------  ----  ----  ------------  --------  ----  ----  ---- 
    00000001  0000  0001  000000000001  00000002  0000  0001  02608C2F7119 
    00000003  0000  0001  02608C2F1591  00000004  0001  0002  00406E0002F5 
    00000005  0001  0002  00406E0002DB
    

    The column headings have the following meanings:

    NETWORK is the network number, internal or external. There should be one entry in this table for each network segment in the overall network.

    HOPS is the number of routers which must be passed through to get to this network.

    NODE is the Ethernet address of the station used to get to the network.

  2. Verify that the IPX network is shown in the listing.

    If the IPX network on which the 7318 resides is shown in the host's IPX routing table, proceed to the next section. If the IPX network is not shown and must be configured, refer to the documentation on configuring IPX networks or call an IBM AIX Technical Specialist.
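
Because drouter prints two routing entries per line, a network number is easy to miss when scanning by eye. The sketch below searches the NETWORK columns for one number. The function name route_has_network is hypothetical, and the sketch assumes the four-column layout shown in the sample output, repeated across each line.

```shell
# Hypothetical helper: succeed if the IPX network number given as $1
# appears in a NETWORK column of drouter output read from stdin.
# Assumes each entry occupies four fields: NETWORK HOPS TIME NODE.
route_has_network() {
    awk -v net="$1" '
        {
            # NETWORK values sit in fields 1, 5, 9, ... and are 8 hex digits.
            for (i = 1; i <= NF; i += 4)
                if (length($i) == 8 && $i ~ /^[0-9A-Fa-f]+$/ &&
                    toupper($i) == toupper(net))
                    found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

A possible use is: /usr/lpp/netware/bin/drouter | route_has_network 00000002 || echo "network not in routing table"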


Checking for unique IPX internal network numbers

If any two IPX hosts on the same inter-network have the same internal network number, the SPX link between the 7318 and the host will be unstable and the 7318 may not boot. Therefore, no two hosts can have the same IPX internal network number.

NOTE: The default internal network number is 00000001, but it can be any eight-digit hexadecimal number (it is usually the last eight digits of the host's MAC address). The internal network number must also be different from the LAN network numbers in the environment.

  1. Open the /etc/netware/NPSConfig file with your favorite editor.

  2. Search on "internal_network".

    Below is a sample internal network number from the /etc/netware/NPSConfig file.

           internal_network = "00000001"
    
  3. Check the internal_network number in the /etc/netware/NPSConfig file on every other IPX host on the network (repeat steps 1 and 2).

    If all the internal_network numbers are unique, continue to the next section. If any internal network numbers are the same, make each host's internal network number unique and continue with the next step.

  4. After making any modifications to the /etc/netware/NPSConfig file, save the changes and exit the file.

  5. Enter the command sequence below on each modified host to recycle the IPX daemons.

    WARNING: Recycling these daemons will disconnect any device communicating with the host via IPX, including P10 Style Ports.

        cnsview -c "daemon stop" 
        /usr/lpp/netware/bin/stopnps 
        /usr/lpp/netware/bin/startnps 
        cnsview -c "daemon start"
    
  6. Proceed to the next section.
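
Steps 1 through 3 can be sketched as two small filters: one extracts the internal_network value from an NPSConfig file, and one reports any number that is used by more than one host. Both function names are hypothetical, and the sketch assumes that you collect one "host number" pair per line yourself (for example, by running the first filter on each host).

```shell
# Hypothetical helper: print the internal_network value found in
# NPSConfig text on stdin (commented-out lines are ignored).
internal_network_of() {
    sed -n 's/^[[:space:]]*internal_network[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Hypothetical helper: read "host number" pairs on stdin and print each
# internal network number that appears more than once.
duplicate_networks() {
    awk '{ print toupper($2) }' | sort | uniq -d
}
```

A possible use on one host is: internal_network_of < /etc/netware/NPSConfig. If duplicate_networks prints nothing, all of the collected numbers are unique.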

7318 SMIT configuration

Verify that the 7318 device information, specified download image, and IPX network address are correct in its SMIT configuration.

  1. To start SMIT using the fast path ts7318_cs_mnu, enter:
         smitty ts7318_cs_mnu
    
  2. Choose Show/Change Configured ComNetServers.

  3. Select the appropriate 7318. A sample stanza of a 7318 configuration follows:
    [Entry Fields]
    ComNetServer Number                  01
    ComNetServer Network Address         [00000002]
    ComNetServer Ethernet Address        [00406ee00155]
    ComNetServer Bootfile                [/usr/lib/cns/cns-p10]
    
  4. Verify that the ComNetServer Network Address is correct.

    It should match an IPX network address listed in the host's IPX routing table. Change the network address if it is incorrect. If the network address is not listed in the host's IPX routing table, either change the 7318's network address in the ComNetServer configuration stanza or check with your network administrator to verify the correct IPX network configuration.

  5. Verify that the ComNetServer Ethernet Address is correct.

    This number is the 7318's hardware address and is labeled on the back of the 7318.

  6. Make sure the ComNetServer Bootfile is specified.

    The default boot image and path for the model P10 and S20 are /usr/lib/cns/cns-p10 and /usr/lib/cns/cns-s20 (or /usr/lib/cns/cns-s20e), respectively.

  7. If no modifications were made in SMIT, skip to step 8 in this section. Otherwise, press Enter to implement the changes.

    NOTE: SMIT automatically refreshes the cnsview daemons. Reboot the 7318. This can be done in two ways.

    If the 7318 still does not boot, proceed to the next section.

  8. If the 7318's SMIT configuration is correct, delete the 7318 device configuration, re-add the 7318 definitions, and reboot the 7318.

    NOTE: It is possible that the device data in the ODM is corrupt and a complete reconfiguration solves the problem. If the 7318 still fails to boot, proceed to the next section.
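
Before comparing the SMIT values against the routing table and the label on the 7318, it can help to rule out simple typing errors. The sketch below checks only the form of a value. The function name is_hex_of_length is hypothetical, and the lengths used (8 digits for the network address, 12 for the Ethernet address) are taken from the sample stanza above.

```shell
# Hypothetical helper: succeed if $1 consists of exactly $2 hexadecimal digits.
is_hex_of_length() {
    [ "${#1}" -eq "$2" ] || return 1
    case "$1" in
        *[!0-9A-Fa-f]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

For example, is_hex_of_length 00406ee00155 12 succeeds, while a value with a missing digit or a non-hexadecimal character fails.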


Checking file permissions

In a broadcast load configuration, the 7318 requests its boot image and configuration from the host. The boot image and configuration file must have permissions that allow the 7318 to download them.

  1. Change to the directory in which the load image and configuration file reside.

    The default is /usr/lib/cns.

  2. Verify that the permissions for the files are world-readable. Enter:
         ls -l | more
    

    Correct sample file permissions are as follows:

    -r--r--r--      1 root     system     442432 Jul 24 01:56 cns-p10 
    -r--r--r--      1 root     system    1240228 Jul 24 01:56 cns-s20e 
    -rw-r--r--      1 root     system      17539 Sep 09 1995 p10.cfg 
    -rw-r--r--      1 root     system      43120 Jul 24 01:56 s20.cfg
    
  3. If the file is not world-readable, change the permissions. Execute:
           chmod 444 <filename>
    
  4. Proceed to the next section.
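
The permission check in step 2 can be sketched as a filter over the long listing. The function name not_world_readable is hypothetical. It relies on the eighth character of the mode string being the read bit for "other", and it assumes file names without spaces.

```shell
# Hypothetical helper: read "ls -l" output on stdin and print the name of
# every regular file whose mode string lacks read permission for "other"
# (the 8th character of the mode string).
not_world_readable() {
    awk '$1 ~ /^-/ && substr($1, 8, 1) != "r" { print $NF }'
}
```

A possible use is: ls -l /usr/lib/cns | not_world_readable. Each file name printed is a candidate for the chmod command in step 3.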

Checking the IPX communication channel

Once it has been verified that an IPX network exists, check the IPX communication between the host and the 7318. To verify IPX communication between the 7318 and the host machine, send a broadcast IPX ping to every 7318 on the network.

  1. To send a broadcast IPX ping, enter:
           cnsview -c "ipxping -b"
    

    To IPX ping a specific 7318, enter:

            cnsview -c "ipxping external_ipx_network_number:7318's_ethernet_address"
    

    For example:

            cnsview -c "ipxping 00000002:00406ee00155"
    

    Sample output follows:

    [root@ivorye] / # cnsview -c "ipxping -b"
    00000002:00406ee00175 is responding but not online
    00000002:00406ee00155 is responding but not online
    00000002:00406ef000f0 is responding and online
    

    NOTE: The success of an IPX ping implies that the 7318 is powered on, that there is an IPX path to the 7318, and that the routers, if any, are routing the packets correctly.

    NOTE: The success of any IPX ping does NOT imply that the 7318 has booted or that the 7318 is configured so that the SPX link is present.

  2. Check whether the 7318 (Terminal Server) is listed in the output.
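
When many 7318 units answer a broadcast ping, the check in step 2 can be sketched as a filter that reports the state of one address. The function name ipx_status is hypothetical, and the sketch assumes that the output lines have the same wording as the sample output above.

```shell
# Hypothetical helper: read ipxping output on stdin and report the state
# of one address, given as network:ethernet (for example
# 00000002:00406ee00155). Assumes the wording of the sample output.
ipx_status() {
    awk -v addr="$1" '
        tolower($1) == tolower(addr) {
            if ($0 ~ /not online/)      state = "responding, not online"
            else if ($0 ~ /and online/) state = "responding and online"
        }
        END { print (state == "" ? "no response" : state) }
    '
}
```

A possible use is: cnsview -c "ipxping -b" | ipx_status 00000002:00406ee00155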


Setting up error logging

Often error reporting can show why the Terminal Server is not booting properly. When configured, these errors and their codes can be logged to a file. This section sets up error reporting for cns, the software that manages the Terminal Server.

  1. To set up error reporting, open the /usr/lib/cns/cnsd.conf file with a text editor.

  2. Search for the entry log:
         config  svclts  2sess    periodic method vpd     boot     stats
    
  3. If this line is commented out, remove the # sign at the beginning of the line.

  4. Save any modifications and exit the editor.

  5. Recycle the cnsview daemon to start error reporting. Enter:
    cnsview -c "daemon stop"
    cnsview -c "daemon start"
    

    WARNING: All other Terminal Server communication (IPX) will be stopped on the host when these daemons are recycled.

  6. Reboot the 7318. This can be done in two ways.

  7. The device driver and the cnsview daemon log errors to two places: the /usr/lib/cns/cnsd.log file and the AIX error log.

    NOTE: The resource names used in the AIX error log are as follows:

    cnsdd  Events logged by the cns device driver
    cnsview  Events logged by the cnsview daemon for the 7318 units
    cnld  Events logged by the cnsview daemon associated with downloading and booting

  8. To review the driver-logged errors, enter:
        errpt -aNcnsdd | more
    

    NOTE: Every 3-4 minutes, the cnsview daemon checks the status of the SPX links to those 7318 devices that were configured. When the SPX links are not present, an error log entry is made that is similar to the following:

        ERROR LABEL:            CNS_DISCONNECT 
        ERROR ID:               5EBD0D06 
        Date/Time               Fri June  9 22:34:41 
        Sequence Number:        9987 
        Machine Id:             0000001871800 
        Node Id:                        levesconte 
        Error Class:            S 
        Error Type:             PERM 
        Resource Name  cnsdd 
        Error Description 
        Driver for ComServer 
        Probable Causes 
        REMOTE NODE 
        Failure Causes 
        COMMUNICATIONS/REMOTE NODE 
        SOFTWARE PROGRAM 
        Recommended Actions 
        RUN STANDALONE DIAGNOSTICS 
        Detail Data 
        ERROR CODE 
        0000 0000 
        Communications Device Name: 
        00406e0002db
    
  9. If the problem is undetermined, make copies of the error reports and send them to an IBM AIX Technical Specialist.
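
Before sending the error reports, a quick tally of the logged error labels can show whether one kind of entry dominates. The sketch below counts the labels in detailed error-report text. The function name label_counts is hypothetical, and the sketch assumes that each entry contains a line with the text "LABEL:" followed by the label, as in the sample entry above.

```shell
# Hypothetical helper: read detailed error-report text on stdin and print
# each error label with the number of entries that carry it.
# Assumes each entry has a line containing "LABEL:" followed by the label.
label_counts() {
    awk '/LABEL:/ { count[$NF]++ }
         END { for (l in count) print l, count[l] }' | sort
}
```

A possible use is: errpt -aNcnsdd | label_counts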

PTF level

If the 7318 still fails to boot, there may be a problem with the cns software on the computer. If you suspect a software bug, run the following command to determine the level of the cns software on the system.

Contact your AIX support center or use the FixDist application to access a list of the latest PTFs for the 7318 software.


Debugging from the BIOS console session

The BIOS console allows you to view the 7318's boot process as it queries hosts for its load image and configuration file. A BIOS console session can be accessed by connecting a terminal (for example, an IBM 3151) to one of the ports on the front of the 7318 using an RJ-45 cable and a null-modem adapter.

  1. Verify the terminal configuration.

    The terminal should emulate an ASCII terminal with the following settings:

             9600 baud  8 data bits  no parity  1 stop bit
    
  2. Start a BIOS console session on the 7318.

    To start a BIOS console session, recycle the power on the 7318. When the 7318 is powered back on, hold the Shift key and press 3 (to get the # sign) repeatedly until four # signs scroll across the screen of the dumb terminal.

    A BIOS console may be accessed between the time when the 7318 is first powered on and when the ready light (second light) starts blinking. If the ready light begins blinking, you have missed the window to enter the BIOS console and must recycle the power and try again. Once four # signs scroll across the screen, the 7318 will begin a BIOS console session.

  3. Set the NVRAM to the manufactured default settings. Enter:
            admin 
            default 
            save
    
  4. Check if the BIOS code on the 7318 is at the latest BIOS level and verify that all the parameters are set to their default values. Enter:
    show
    

    NOTE: The latest BIOS level as of 04/16/99 is 5.23. If you are unsure whether the 7318 is at the latest BIOS level, please contact an IBM AIX Technical Specialist.

  5. Boot the 7318 using the load command. Enter:
            load
    

    The load command allows you to view the 7318's requests and the host's responses as the 7318 queries for a load host and boot files. Incorrect configuration may lead to a load host ignoring 7318 requests or not finding files and boot images. If the problem cannot be determined from the BIOS load information, make a copy of the error log file and send this information to an IBM AIX Technical Specialist for evaluation.




[ Doc Ref: 90605206414712     Publish Date: Jan. 26, 2001     4FAX Ref: 7447 ]